On Thu, Aug 4, 2016 at 7:59 AM, Pavan Deolasee <pavan.deola...@gmail.com> wrote:
> We don’t want to force a recheck for every index fetch because that will
> slow everything down. So we would like a simple and efficient way of knowing
> about the existence of a WARM tuple in a chain and do a recheck in only
> those cases, ideally O(1). Having a HOT chain contain a WARM tuple is
> discussed below as being a “WARM chain”, implying it needs re-check.
> We could solve this by either 1) marking both old and new index tuples as
> "recheck-required" or 2) marking the HOT chain on the heap as
> "recheck-required" such that any tuple returned from the chain requires a
> recheck.
> The advantage of doing 1) is that the recheck is required only for the
> compromised index.
> Note that in presence of multiple index keys pointing to the same root line
> pointer, the chain needs re-check for both the old as well as the new index
> key. The old index key must not fetch the new tuple(s) and the new index key
> must not fetch the old tuple(s). So irrespective of whether an old or a new
> tuple is being returned, the caller must know about the WARM tuples in the
> chains. So marking the index would require us to mark both old and new index
> tuples as requiring re-check. That requires an additional index scan to
> locate the old row and then an additional write to force it to re-check,
> which is algorithmically O(NlogN) on table size.
> Marking the heap, (2) is O(1) per updated index, so seems better. But since
> we don't know with respect to which index or indexes the chain is broken,
> all index scans must do a recheck for a tuple returned from a WARM chain.
> How do we mark a chain as WARM? How do we clear the flag? There are a few
> possible approaches:
> 1. Mark the first WARM tuple which causes HOT chain to become WARM with a
> special HEAP_RECHECK_REQUIRED flag (this flag must then be carried over to
> any further additions to the chain. Then whenever a tuple if returned from a
> chain, we must look forward into the chain for WARM tuple and let the caller
> know about potential breakage. This seems simple but would require us to
> follow HOT chain every time to check if it’s HOT or WARM.
> 2. Mark the root line pointer (or the root tuple) with a special
> HEAP_RECHECK_REQUIRED flag to tell us about the presence of a WARM tuple in
> the chain. Since all indexes point to the root line pointer, it should be
> enough to just mark the root line pointer (or tuple) with this flag.
> 3. We could also adopt a hybrid approach and mark the new index tuple and
> new heap tuple with HEAP_RECHECK_REQUIRED flag and presence of either of the
> flag in either tuple forces recheck.

I've actually been thinking of another possibility in the context of
LITE, and what I was pondering looks eerily similar to WARM, so let me
summarize (I was going to write a lengthier POC but I thought I better
comment now since you're introducing the same concept):

Marks are troublesome becuase, as you mention, you would need one
"WARM" bit per index to be efficient. We also don't want a recheck on
every index access.

HOT is very close to WARM already, even the code. The only problem
with HOT is that it stops scanning the update chain when a tuple is
not HOT-updated. If heap_hot_search_buffer knew to keep on following
the update chain when there is a WARM update on the index in question,
this could work.

But it won't work if it's implemented naively, because if there was
another item pointer in another page of the index pointing to a heap
tuple that belongs to an already-visited chain, which is possible in
the below case, then the scan would produce duplicate rows:

N: Row versions:
->: update
* : causes a new index tuple on index I

1 -> 2 -> 3 -> 4 -> 5
*             *     *
a             b    a  <- key

If the scan qual is "key in ('a','b')", then the index scan will visit
both versions of the row, 1, 3 and 4. Suppose only 5 is visible, and
suppose the whole chain lies in the same page.

There is one update chain, 1..5, but three WARM chains: 1..2, 3, 4..5

When visiting item pointers, 1 will be found, the chain will be
followed and eventually version 5 will pass the visibility check and
the tuple will pass the scan quals, it will even match the key in the
index item, so no key check will fix this. 5 will be yielded when
visiting version 1.

Next, version 3 will be visited. There's no easy way to know that we
already visited 3, so we'll follow the chain all the way to 5, and,
again, it will be yielded. Here, the key won't match, so maybe this
case could be filtered out with a key check.

Finally, version 4 will be visited, the chain will be followed to 5,
and as in the first case, it will be yielded.

So the scan yielded the same heap tuple thrice, which isn't an ideal situation.

It would be totally acceptable for a bitmap index scan, so the
expensive logic I will be outlining below may be skipped when building
a bitmap (lossy or not).

To fix this, what I was pondering, and I believe it would be viable,
is to teach heap_hot_search_buffer about the WARM invariant. It will
make correctness heavily depend on the invariant being true, which
means either flagging item pointers as WARM chains, or requiring a
reindex during upgrade to make sure all indexes verify the invariant -
as old indices won't.

When an item pointer is WARM (either implicit or explicit),
heap_hot_search_buffer will only stop following the update chain if it
reaches not the end of a HOT chain, but rather a key change. For this,
it will need to recheck the key of all but the first TID in the chain,
so singleton chains (ie: not-updated or frozen tuples) won't incurr
this extra cost. When walking the chain, it will need to form an index
tuple for both the old and new versions, and compare them, and follow
the chain *only* if the keys are equal.

As an optimization, if the tuple is marked HOT-updated, the key check
may be skipped.

Vacuum will need to be taught not to break the invariant, but I
believe this is viable and efficient.

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to