On Wed, Jan 4, 2017 at 6:05 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> Okay, so this optimization can work only after all the active
> transactions operating on a page are finished.  If that is true, in
> some cases such a design can consume a lot of CPU traversing all the
> tuples in a page for un-setting the bit, especially when there are
> few such tuples.

I suppose.  I didn't think that cost was likely to be big enough to
worry about, but I might be wrong.  The worst case would be when you
modify one tuple on a page, let the transaction that did the
modification become all-visible, modify one tuple on the page again,
etc. and, at the same time, the page is entirely full of tuples.  So
you keep having to loop over all the bits to clear them (they are all
clear except one, but you don't know that) and then re-set just one of
them.  That's not free, but keep in mind that the existing system
would be forced to perform non-HOT updates in that situation, which
isn't free either.

Also, I'm thinking the bit could be stored in the line pointer rather
than the tuple, because with this design we don't need
LP_UNUSED/LP_NORMAL/LP_REDIRECT/LP_DEAD any more.  We could use one
bit to indicate dead or not-dead and the second bit to indicate
recently-modified or not-recently-modified.  With that approach,
clearing the bits only requires iterating over the line pointer array,
not the tuples themselves.
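
To make the cost concrete, here is a rough sketch of what clearing the
bits might look like if they lived in the line pointer.  Everything in
it is made up for illustration: SketchLinePointer, the LP_FLAG_* names
and the two helper functions are hypothetical, and this is not the
existing ItemIdData layout.  It only shows that the clearing pass
touches a small contiguous array rather than the tuples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-bit flags; NOT the existing LP_* states. */
#define LP_FLAG_DEAD    0x1u    /* dead vs. not-dead */
#define LP_FLAG_RECENT  0x2u    /* recently-modified vs. not */

/* Hypothetical line pointer; field sizes are assumptions. */
typedef struct SketchLinePointer
{
    uint16_t    offset;     /* offset of the tuple within the page */
    uint16_t    length;     /* length of the tuple */
    uint8_t     flags;      /* LP_FLAG_* bits */
} SketchLinePointer;

/* Set the recently-modified bit on one line pointer. */
static void
sketch_mark_recent(SketchLinePointer *lps, size_t nlps, size_t idx)
{
    assert(idx < nlps);
    lps[idx].flags |= LP_FLAG_RECENT;
}

/*
 * Clear the recently-modified bit on every line pointer of a page.
 * We cannot know in advance which bits are set, so the whole array is
 * visited, but the tuples themselves are never touched.
 * Returns how many bits were actually set.
 */
static size_t
sketch_clear_recent(SketchLinePointer *lps, size_t nlps)
{
    size_t      cleared = 0;

    for (size_t i = 0; i < nlps; i++)
    {
        if (lps[i].flags & LP_FLAG_RECENT)
        {
            lps[i].flags &= (uint8_t) ~LP_FLAG_RECENT;
            cleared++;
        }
    }
    return cleared;
}
```

In the worst case described above, each cycle would be one call to
sketch_clear_recent() over a full page followed by one call to
sketch_mark_recent().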

>> We don't necessarily need UNDO to clean up the indexes, although it
>> might be a good idea.  It would *work* to just periodically scan the
>> index for delete-marked items.  For each one, visit the heap and see
>> if there's a version of that tuple present in the heap or current UNDO
>> that matches the index entry.  If not, remove the index entry.
> I think this is somewhat similar to how we clean the index now and
> seems to be a workable design.  However, don't you think that it will
> have somewhat similar characteristics for index-bloat as we have now?

Yes, it would have similar characteristics.  Thus we might want to do better.
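
For reference, the periodic scan from the quoted paragraph could be
sketched roughly as below.  This is a toy model, not a proposal for
real code: the index is a flat array, each heap slot carries its
current key plus whatever older keys are still in UNDO, and all of the
Sketch* names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical index entry pointing at a heap slot. */
typedef struct SketchIndexEntry
{
    int         key;            /* indexed value */
    size_t      tid;            /* position of the heap slot */
    bool        delete_marked;  /* candidate for removal */
} SketchIndexEntry;

/* Hypothetical heap slot: current version plus versions still in UNDO. */
typedef struct SketchHeapSlot
{
    int         current_key;    /* key of the version in the heap */
    const int  *undo_keys;      /* keys of older versions in UNDO */
    size_t      n_undo;
} SketchHeapSlot;

/* Does any version of the tuple (heap or UNDO) match this key? */
static bool
sketch_version_matches(const SketchHeapSlot *slot, int key)
{
    if (slot->current_key == key)
        return true;
    for (size_t i = 0; i < slot->n_undo; i++)
    {
        if (slot->undo_keys[i] == key)
            return true;
    }
    return false;
}

/*
 * Scan the index; drop each delete-marked entry for which no version
 * in the heap or current UNDO matches.  Compacts the array in place
 * and returns the number of entries kept.
 */
static size_t
sketch_prune_index(SketchIndexEntry *entries, size_t n,
                   const SketchHeapSlot *heap)
{
    size_t      kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        const SketchIndexEntry *e = &entries[i];

        if (e->delete_marked &&
            !sketch_version_matches(&heap[e->tid], e->key))
            continue;           /* nothing matches: remove the entry */
        entries[kept++] = *e;
    }
    return kept;
}
```

Note that every delete-marked entry costs a heap visit here, which is
why the bloat characteristics end up similar to what we have today.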

> OTOH for heap, it will help us to take old versions away from the
> main heap, but still, we can't get rid of that space or reuse that
> space till we can clean corresponding index entries.

I don't think that's true.  If in-place update is ever allowed in
cases where indexed columns have been modified, then the index already
has to cope with the possibility that the heap tuple it can see
doesn't match the index.  And if it can cope with that, then why do we
have to remove the index entry before reusing the heap TID?
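
As a sketch of what "coping" might mean: an index scan that rechecks
the heap tuple against the index key simply ignores entries that no
longer match, whether the tuple was updated in place or the TID was
reused.  The types and function below are hypothetical, and the sketch
deliberately ignores visibility and UNDO lookups.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical heap tuple with a single indexed column. */
typedef struct SketchTuple
{
    bool        in_use;         /* is the slot occupied? */
    int         indexed_col;    /* current value of the indexed column */
} SketchTuple;

/*
 * Follow an index entry (index_key, tid) into the heap and recheck it.
 * Returns the tuple if it still matches the index key, or NULL if the
 * entry is stale (slot empty, TID reused, or key changed in place).
 */
static const SketchTuple *
sketch_fetch_with_recheck(const SketchTuple *heap, size_t nheap,
                          size_t tid, int index_key)
{
    if (tid >= nheap || !heap[tid].in_use)
        return NULL;
    if (heap[tid].indexed_col != index_key)
        return NULL;            /* stale index entry: skip it */
    return &heap[tid];
}
```

With a recheck like this, a stale entry pointing at a reused TID is
filtered out at scan time rather than having to be removed first.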

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)