Re: [HACKERS] [PATCH]-hash index improving

Tom Lane Thu, 17 Jul 2008 22:37:48 -0700

"Jonah H. Harris" <[EMAIL PROTECTED]> writes:
> The real question now has been listed above; why are hash index
> queries, including this patch, no better than b-tree even for straight
> equality comparisons?


Well, the theoretical advantage is that a hash probe is O(1) while a
btree probe is O(log N) (ignoring a lot of possible nonoptimalities
on each side of course).  So you'd need a test case large enough for
log N to be noticeably more than 1, which I think means in practice that
the upper levels of the btree don't fit in memory.  So small test cases
aren't going to show any improvement.

I think the historical problem with our hash implementation has been
that it was terribly space-inefficient, because of the fixed (and large)
bucket size.  A dense btree index can probably beat a sparse hash up to
exceedingly large values of N.  It's not clear to me how much the patch
at hand does to fix that.

The patch might introduce a new problem, which is extra heap visits
caused by the lossiness of the hash value.  Again, that doesn't hurt
much ideally, but maybe you chanced to use a test case where it does
hurt.  It would be worth sticking in a bit of debug instrumentation to
see whether the number of failed heap visits is significant.

In any case, the reported test seems to be dealing with index sizes in
the tens-of-megabytes range, and it doesn't surprise me a whole lot that
an O(log N) penalty isn't showing up there.  Try a test case where the
index size is significantly larger than your RAM.

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH]-hash index improving

Reply via email to