Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

Arthur Silva Wed, 14 Sep 2016 09:00:00 -0700

On Sep 14, 2016 5:18 PM, "Robert Haas" <robertmh...@gmail.com> wrote:
>
> On Wed, Sep 14, 2016 at 8:16 AM, Pavan Deolasee
> <pavan.deola...@gmail.com> wrote:
> > Ah, thanks. So MaxHeapTuplesPerPage sets the upper boundary for the per
page
> > bitmap size. Thats about 36 bytes for 8K page. IOW if on an average
there
> > are 6 or more dead tuples per page, bitmap will outperform the current
> > representation, assuming max allocation for bitmap. If we can use
additional
> > estimates to restrict the size to somewhat more conservative value and
then
> > keep overflow area, then probably the break-even happens even earlier
than
> > that. I hope this gives us a good starting point, but let me know if you
> > think it's still a wrong approach to pursue.
>
> Well, it's certainly a bigger change.  I think the big concern is that
> the amount of memory now becomes fixed based on the table size.  So
> one problem is that you have to figure out what you're going to do if
> the bitmap doesn't fit in maintenance_work_mem.  A related problem is
> that it might fit but use more memory than before, which could cause
> problems for some people.  Now on the other hand it could also use
> less memory for some people, and that would be good.
>
> I am kind of doubtful about this whole line of investigation because
> we're basically trying pretty hard to fix something that I'm not sure
> is broken.    I do agree that, all other things being equal, the TID
> lookups will probably be faster with a bitmap than with a binary
> search, but maybe not if the table is large and the number of dead
> TIDs is small, because cache efficiency is pretty important.  But even
> if it's always faster, does TID lookup speed even really matter to
> overall VACUUM performance? Claudio's early results suggest that it
> might, but maybe that's just a question of some optimization that
> hasn't been done yet.
>
> I'm fairly sure that our number one priority should be to minimize the
> number of cases where we need to do multiple scans of the indexes to
> stay within maintenance_work_mem.  If we're satisfied we've met that
> goal, then within that we should try to make VACUUM as fast as
> possible with as little memory usage as possible.  I'm not 100% sure I
> know how to get there, or how much work it's worth expending.  In
> theory we could even start with the list of TIDs and switch to the
> bitmap if the TID list becomes larger than the bitmap would have been,
> but I don't know if it's worth the effort.
>
> /me thinks a bit.
>
> Actually, I think that probably *is* worthwhile, specifically because
> it might let us avoid multiple index scans in cases where we currently
> require them.  Right now, our default maintenance_work_mem value is
> 64MB, which is enough to hold a little over ten million tuples.  It's
> also large enough to hold a bitmap for a 14GB table.  So right now if
> you deleted, say, 100 tuples per page you would end up with an index
> vacuum cycles for every ~100,000 pages = 800MB, whereas switching to
> the bitmap representation for such cases would require only one index
> vacuum cycle for every 14GB, more than an order of magnitude
> improvement!
>
> On the other hand, if we switch to the bitmap as the ONLY possible
> representation, we will lose badly when there are scattered updates -
> e.g. 1 deleted tuple every 10 pages.  So it seems like we probably
> want to have both options.  One tricky part is figuring out how we
> switch between them when memory gets tight; we have to avoid bursting
> above our memory limit while making the switch.  And even if our
> memory limit is very high, we want to avoid using memory gratuitously;
> I think we should try to grow memory usage incrementally with either
> representation.
>
> For instance, one idea to grow memory usage incrementally would be to
> store dead tuple information separately for each 1GB segment of the
> relation.  So we have an array of dead-tuple-representation objects,
> one for every 1GB of the relation.  If there are no dead tuples in a
> given 1GB segment, then this pointer can just be NULL.  Otherwise, it
> can point to either the bitmap representation (which will take ~4.5MB)
> or it can point to an array of TIDs (which will take 6 bytes/TID).
> That could handle an awfully wide variety of usage patterns
> efficiently; it's basically never worse than what we're doing today,
> and when the dead tuple density is high for any portion of the
> relation it's a lot better.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


I'd say it's an idea worth pursuing. It's the base idea behind roaring
bitmaps, arguably the best overall compressed bitmap implementation.

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

Reply via email to