On Wed, 2008-07-23 at 10:57 +0800, Xiao Meng wrote:
Well, I'll do it after I finish my second patch.
Hash index should be more efficient than btree when N is big enough.
It seems meaningful to find how big N is in an experiment way.
Agreed.
We should also examine the basic thinking of the
] [PATCH]-hash index improving
Well, I'll do it after I finish my second patch.
Hash index should be more efficient than btree when N is big enough.
It seems meaningful to find how big N is in an experiment way.
The savings will depend on many factors. Another thing (besides volume
I'm sorry for delay reply. I couldn't get access to the internet these
days for some reason.
I do apologize for my rough work and very bad readability. I posted it
in a hurry and I didn't mean to cause the reader so much
inconvenience.
I'll NEVER make such a mistake again.
Currently, I've made
Well, I'll do it after I finish my second patch.
Hash index should be more efficient than btree when N is big enough.
It seems meaningful to find how big N is in an experiment way.
On Fri, Jul 18, 2008 at 6:35 PM, Simon Riggs [EMAIL PROTECTED] wrote:
On Fri, 2008-07-18 at 11:07 +0100, Gregory
-Original Message-
From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
[EMAIL PROTECTED] On Behalf Of Xiao Meng
Sent: Tuesday, July 22, 2008 7:57 PM
To: Simon Riggs
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] [PATCH]-hash index improving
Well, I'll do it after I finish my
On Thu, 2008-07-17 at 16:37 -0700, Dann Corbit wrote:
Large table unique index equality search should be very fast with hashed
index (and the only place where any advantage will be seen). Hashed
indexes are useless for any search besides equality and gain more and
more when the levels of
Simon Riggs [EMAIL PROTECTED] writes:
On Thu, 2008-07-17 at 16:37 -0700, Dann Corbit wrote:
Large table unique index equality search should be very fast with hashed
index (and the only place where any advantage will be seen). Hashed
indexes are useless for any search besides equality and
On Fri, 2008-07-18 at 11:07 +0100, Gregory Stark wrote:
Simon Riggs [EMAIL PROTECTED] writes:
hash lookups can in theory be O(1).
I'm not sure whether that applies here? I'm interested in how *this*
patch will work, not in more generic algorithm theory.
To patch authors: Can we please see a
Gregory Stark escribió:
For cpu-bound databases with small indexes there might be a win if you can
avoid the binary search of all the elements on a page. (Have we modified btree
to do that or does it still scan sequentially on the leaf pages?)
Hmm? It has used binary search since as long as
Gregory Stark wrote:
For i/o-bound databases with very large indexes there should be an opportunity
where btree lookups are O(logn) and hash lookups can in theory be O(1).
Ignoring the big-O complexity, if a hash index only stores a 32-bit hash
code instead of the whole key, it could be a big
Heikki Linnakangas [EMAIL PROTECTED] writes:
Gregory Stark wrote:
For i/o-bound databases with very large indexes there should be an
opportunity
where btree lookups are O(logn) and hash lookups can in theory be O(1).
Ignoring the big-O complexity, if a hash index only stores a 32-bit hash
On Fri, Jul 18, 2008 at 10:44 AM, Heikki Linnakangas
[EMAIL PROTECTED] wrote:
Ignoring the big-O complexity, if a hash index only stores a 32-bit hash
code instead of the whole key, it could be a big win in storage size, and
therefore in cache-efficiency and performance, when the keys are very
Jonah H. Harris [EMAIL PROTECTED] writes:
Agreed. My thinking is that there's either something inherently wrong
with the implementation, or we're performing so many disk I/Os that
it's nearly equivalent to b-tree. Tom has a couple suggestions which
Xiao and I will explore.
I finally got a
I just ran my original 16M word test case against the patched
version, and like Tom noted below, the tuples per bucket
calculation is wrong which results in identical index sizes
for both the original version and the hash-value-only version.
I suppose that the main point of #1 is to reduce index
FYI,
I just patched the fill-factor calculation and re-ran my test.
The index size dropped from 513M to 43M which is the same disk
footprint as the corresponding btree index.
Have a nice weekend.
Ken
On Fri, Jul 18, 2008 at 12:23:14PM -0500, Kenneth Marshall wrote:
I just ran my original 16M
The patch store hash code only in the index tuple.
It based on Neil Conway's patch with an old version of PostgreSQL.
It passes the regression test but I didn't test the performance yet.
Anyone interested can make a performance test;-)
You can undefine the macro HASHVALUE_ONLY in hash.h to get
On Thu, Jul 17, 2008 at 5:26 AM, Xiao Meng [EMAIL PROTECTED] wrote:
The patch store hash code only in the index tuple.
It based on Neil Conway's patch with an old version of PostgreSQL.
It passes the regression test but I didn't test the performance yet.
Anyone interested can make a
Xiao Meng escribió:
The patch store hash code only in the index tuple.
It based on Neil Conway's patch with an old version of PostgreSQL.
It passes the regression test but I didn't test the performance yet.
Anyone interested can make a performance test;-)
You can undefine the macro
On Thu, Jul 17, 2008 at 12:42 PM, Alvaro Herrera
[EMAIL PROTECTED] wrote:
I think having the HASHVALUE_ONLY define is not a good idea -- it just
makes the patch harder to read. I suggest just removing the old code
and putting the new code in place. (That's why we have revision
control.)
On Thu, Jul 17, 2008 at 12:42:39PM -0400, Alvaro Herrera wrote:
Xiao Meng escribi?:
The patch store hash code only in the index tuple.
It based on Neil Conway's patch with an old version of PostgreSQL.
It passes the regression test but I didn't test the performance yet.
Anyone interested
On Thu, Jul 17, 2008 at 02:00:07PM -0400, Jonah H. Harris wrote:
On Thu, Jul 17, 2008 at 1:54 PM, Kenneth Marshall [EMAIL PROTECTED] wrote:
I think having the HASHVALUE_ONLY define is not a good idea -- it just
makes the patch harder to read. I suggest just removing the old code
and
On Thu, Jul 17, 2008 at 1:54 PM, Kenneth Marshall [EMAIL PROTECTED] wrote:
I think having the HASHVALUE_ONLY define is not a good idea -- it just
makes the patch harder to read. I suggest just removing the old code
and putting the new code in place. (That's why we have revision
control.)
Kenneth Marshall escribió:
On Thu, Jul 17, 2008 at 12:42:39PM -0400, Alvaro Herrera wrote:
I think having the HASHVALUE_ONLY define is not a good idea -- it just
makes the patch harder to read. I suggest just removing the old code
and putting the new code in place. (That's why we have
On Thu, Jul 17, 2008 at 5:26 AM, Xiao Meng [EMAIL PROTECTED] wrote:
The patch store hash code only in the index tuple.
It based on Neil Conway's patch with an old version of PostgreSQL.
It passes the regression test but I didn't test the performance yet.
Anyone interested can make a
On Thu, Jul 17, 2008 at 04:24:28PM -0400, Jonah H. Harris wrote:
On Thu, Jul 17, 2008 at 5:26 AM, Xiao Meng [EMAIL PROTECTED] wrote:
The patch store hash code only in the index tuple.
It based on Neil Conway's patch with an old version of PostgreSQL.
It passes the regression test but I
On Thu, 2008-07-17 at 16:24 -0400, Jonah H. Harris wrote:
I'm not really seeing any performance improvements over b-tree.
I'd like to see a theoretical analysis on the benefit case before we
spend too many teraflops on investigation.
In which cases do we expect that hash indexes will beat
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Simon Riggs
Sent: Thursday, July 17, 2008 4:10 PM
To: Jonah H. Harris
Cc: Xiao Meng; pgsql-hackers@postgresql.org; Kenneth Marshall
Subject: Re: [HACKERS] [PATCH]-hash index improving
On Thu
On Thu, Jul 17, 2008 at 02:11:20PM -0400, Alvaro Herrera wrote:
Kenneth Marshall escribió:
On Thu, Jul 17, 2008 at 12:42:39PM -0400, Alvaro Herrera wrote:
I think having the HASHVALUE_ONLY define is not a good idea --
it just makes the patch harder to read. I suggest just removing
On Thu, Jul 17, 2008 at 7:37 PM, Dann Corbit [EMAIL PROTECTED] wrote:
In which cases do we expect that hash indexes will beat btrees?
Large table unique index equality search should be very fast with hashed
index (and the only place where any advantage will be seen).
Yes, this is the exact
Alvaro Herrera [EMAIL PROTECTED] writes:
Xiao Meng escribió:
You can undefine the macro HASHVALUE_ONLY in hash.h to get the
original implementation.
I think having the HASHVALUE_ONLY define is not a good idea -- it just
makes the patch harder to read.
While we are griping about readability:
On Fri, Jul 18, 2008 at 1:00 AM, Tom Lane [EMAIL PROTECTED] wrote:
While we are griping about readability: failure to update the comments
to match the code is NOT, NOT, NOT acceptable. I had barely started
to read the patch before encountering this insult to the reader:
...
As this is Xiao's
Jonah H. Harris [EMAIL PROTECTED] writes:
The real question now has been listed above; why are hash index
queries, including this patch, no better than b-tree even for straight
equality comparisons?
Well, the theoretical advantage is that a hash probe is O(1) while a
btree probe is O(log N)
32 matches
Mail list logo