On Mon, Mar 27, 2017 at 4:45 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Sat, Mar 25, 2017 at 1:24 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > On Fri, Mar 24, 2017 at 11:49 PM, Pavan Deolasee
> > <pavan.deola...@gmail.com> wrote:
> >> On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila <amit.kapil...@gmail.com>
> >> wrote:
> >> While looking at this problem, it occurred to me that the assumptions
> >> for hash indexes are also wrong :-( Hash index has the same problem as
> >> expression indexes have. A change in heap value may not necessarily
> >> cause a change in the hash key. If we don't detect that, we will end up
> >> having identical hash keys with the same TID pointer. This will cause the
> >> duplicate key scans problem since hashrecheck will return true for both
> >> hash entries.
> Isn't it possible to detect duplicate keys in hashrecheck if we
> compare both hashkey and tid stored in index tuple with the
> corresponding values from heap tuple?
Hmm, I don't think that will work. For example, say we have a tuple (X, Y, Z)
in the heap with a btree index on X and a hash index on Y. Suppose that is
updated to (X, Y', Z) as a WARM update, and we insert a new entry in the hash
index. If Y and Y' both generate the same hashkey, we will have two identical
<hashkey, TID> tuples in the hash index, leading to duplicate key scans.
Comparing hashkey and TID in hashrecheck cannot tell them apart, since both
fields match for both entries.
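
To make the scenario concrete, here is a toy model in Python. It is only a
sketch, not PostgreSQL code: toy_hash, index_insert, recheck and scan are
hypothetical names, the hash function is deliberately weak so that Y = 3 and
Y' = 11 collide, and the index is just a list of <hashkey, TID> pairs.

```python
# Toy model only: none of these names are PostgreSQL APIs.

def toy_hash(value):
    # Deliberately weak hash so that two different values collide.
    return value % 8

ROOT_TID = (0, 1)   # (block, offset) of the root of the WARM chain
HEAP_Y = 11         # current value of Y' in the live heap tuple

# The hash index stores <hashkey, TID> pairs; the TID is the root TID.
hash_index = []

def index_insert(value, tid):
    hash_index.append((toy_hash(value), tid))

# Original tuple (X, Y, Z) with Y = 3.
index_insert(3, ROOT_TID)
# WARM update to (X, Y', Z) with Y' = 11. Y changed, so a new entry is added.
index_insert(11, ROOT_TID)

def recheck(index_entry, heap_value, heap_root_tid):
    # Compare both the hashkey and the TID against the heap tuple.
    hashkey, tid = index_entry
    return hashkey == toy_hash(heap_value) and tid == heap_root_tid

def scan(search_value):
    # Return the TID once for every index entry that passes the recheck.
    results = []
    for entry in hash_index:
        if entry[0] == toy_hash(search_value) and recheck(entry, HEAP_Y, ROOT_TID):
            results.append(entry[1])
    return results
```

In this model scan(11) returns the same root TID twice, because the two index
entries are indistinguishable and the recheck passes for both.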
I think one way to solve this is to pass both old and new heap values to
amwarminsert and expect each AM to detect duplicates and avoid creating
a WARM pointer if index keys are exactly the same (we can do that since
there already exists another index tuple with the same keys pointing to the
same root TID).
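
Roughly, the check I have in mind could look like the following toy Python
sketch. The name hash_warm_insert and its signature are made up for
illustration and are not the actual amwarminsert interface; toy_hash is again a
deliberately weak stand-in for the real hash function.

```python
# Toy model only: hash_warm_insert is a hypothetical name, not a PostgreSQL API.

def toy_hash(value):
    # Deliberately weak hash so that two different values can collide.
    return value % 8

def hash_warm_insert(index, old_value, new_value, root_tid):
    """Insert a WARM entry unless the index key is unchanged.

    Returns True if a new <hashkey, TID> entry was added, False if it was
    skipped because an identical entry for the same root TID already exists.
    """
    old_key = toy_hash(old_value)
    new_key = toy_hash(new_value)
    if old_key == new_key:
        # The existing entry already has this key and this root TID.
        return False
    index.append((new_key, root_tid))
    return True

root_tid = (0, 1)
index = [(toy_hash(3), root_tid)]   # entry for the original tuple, Y = 3

# Y = 3 -> Y' = 11 collides under toy_hash, so no second entry is created.
skipped = hash_warm_insert(index, 3, 11, root_tid)
# Y = 3 -> Y' = 4 changes the hashkey, so a new entry is created.
inserted = hash_warm_insert(index, 3, 4, root_tid)
```

Here the first call leaves the index untouched and the second one adds a single
new entry, so no two entries ever share both hashkey and root TID.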
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services