Re: [HACKERS] Hash Indexes

Amit Kapila Sun, 18 Sep 2016 23:45:07 -0700

On Mon, Sep 19, 2016 at 11:20 AM, Mark Kirkwood
<[email protected]> wrote:
>
>
> On 17/09/16 06:38, Andres Freund wrote:
>>
>> On 2016-09-16 09:12:22 -0700, Jeff Janes wrote:
>>>
>>> On Thu, Sep 15, 2016 at 7:23 AM, Andres Freund <[email protected]>
>>> wrote:
>>>>
>>>> One earlier question about this is whether that is actually a worthwhile
>>>> goal.  Are the speed and space benefits big enough in the general case?
>>>> Could those benefits not be achieved in a more maintainable manner by
>>>> adding a layer that uses a btree over hash(columns), and adds
>>>> appropriate rechecks after heap scans?
>>>>
>>>> Note that I'm not saying that hash indexes are not worthwhile, I'm just
>>>> doubtful that question has been explored sufficiently.
>>>
>>> I think that exploring it well requires good code.  If the code is good,
>>> why not commit it?
>>
>> Because getting there requires a lot of effort, debugging it afterwards
>> would take effort, and maintaining it would also takes a fair amount?
>> Adding code isn't free.
>>
>> I'm rather unenthused about having a hash index implementation that's
>> mildly better in some corner cases, but otherwise doesn't have much
>> benefit. That'll mean we'll have to step up our user education a lot,
>> and we'll have to maintain something for little benefit.
>>
>
> While I see the point of what you are saying here, I recall all previous
> discussions about has indexes tended to go a bit like this:
>
> - until WAL logging of hash indexes is written it is not worthwhile trying
> to make improvements to them
> - WAL logging will be a lot of work, patches 1st please
>
> Now someone has done that work, and we seem to be objecting that because
> they are not improved then the patches are (maybe) not worthwhile.
>


I think saying hash indexes are not improved after proposed set of
patches is an understatement.  The read performance has improved by
more than 80% as compare to HEAD [1] (refer data in Mithun's mail).
Also, tests by Mithun and Jesper has indicated that in multiple
workloads, they are better than BTREE by 30~60% (in fact Jesper
mentioned that he is seeing 40~60% benefit on production database,
Jesper correct me if I am wrong.).  I agree that when index column is
updated they are much worse than btree as of now, but no work has been
done improve it and I am sure that it can be improved for those cases
as well.

In general, I thought the tests done till now are sufficient to prove
the importance of work, but if still Andres and others have doubt and
they want to test some specific cases, then sure we can do more
performance benchmarking.

Mark, thanks for supporting the case for improving Hash Indexes.


[1] - 
https://www.postgresql.org/message-id/CAD__OugX0aOa7qopz3d-nbBAoVmvSmdFJOX4mv5tFRpijqH47A%40mail.gmail.com
-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Hash Indexes

Reply via email to