Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

Amit Kapila Sat, 25 Mar 2017 00:55:23 -0700

On Fri, Mar 24, 2017 at 11:49 PM, Pavan Deolasee
<pavan.deola...@gmail.com> wrote:
>
>
> On Fri, Mar 24, 2017 at 6:46 PM, Amit Kapila <amit.kapil...@gmail.com>
> wrote:
>>
>>
>>
>> I was worried for the case if the index is created non-default
>> collation, will the datumIsEqual() suffice the need.  Now again
>> thinking about it, I think it will because in the index tuple we are
>> storing the value as in heap tuple.  However today it occurred to me
>> how will this work for toasted index values (index value >
>> TOAST_INDEX_TARGET).  It is mentioned on top of datumIsEqual() that it
>> probably won't work for toasted values.  Have you considered that
>> point?
>>
>
> No, I haven't and thanks for bringing that up. And now that I think more
> about it and see the code, I think the naive way of just comparing index
> attribute value against heap values is probably wrong. The example of
> TOAST_INDEX_TARGET is one such case, but I wonder if there are other varlena
> attributes that we might store differently in heap and index. Like
> index_form_tuple() ->heap_fill_tuple seem to some churning for varlena. It's
> not clear to me if index_get_attr will return the values which are binary
> comparable to heap values.. I wonder if calling index_form_tuple on the heap
> values, fetching attributes via index_get_attr on both index tuples and then
> doing a binary compare is a more robust idea.
>


I am not sure how do you want to binary compare two datums, if you are
thinking datumIsEqual(), that won't work.  I think you need to use
datatype specific compare function something like what we do in
_bt_compare().

> Or may be that's just
> duplicating efforts.
>

I think if we do something on the lines as mentioned by me above we
might not need to duplicate the efforts.

> While looking at this problem, it occurred to me that the assumptions made
> for hash indexes are also wrong :-( Hash index has the same problem as
> expression indexes have. A change in heap value may not necessarily cause a
> change in the hash key. If we don't detect that, we will end up having two
> hash identical hash keys with the same TID pointer. This will cause the
> duplicate key scans problem since hashrecheck will return true for both the
> hash entries. That's a bummer as far as supporting WARM for hash indexes is
> concerned,
>

Yeah, I also think so.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

Reply via email to