Re: [HACKERS] [PATCH]-hash index improving

2008-07-23 Thread Simon Riggs

On Wed, 2008-07-23 at 10:57 +0800, Xiao Meng wrote:
 Well, I'll do it after I finish my second patch.
 A hash index should be more efficient than a btree when N is big enough.
 It seems worthwhile to determine experimentally how big N is.

Agreed.

We should also examine the basic thinking of the index.

My understanding is that it dynamically resizes hash as the index grows.
If we already believe the only benefit would come when the index is
large, having special handling for small tables seems like a waste of
time because we will never use it in those contexts. 

So starting the hash at a fairly large size makes more sense than it
might otherwise seem to.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH]-hash index improving

2008-07-23 Thread Kenneth Marshall
On Tue, Jul 22, 2008 at 08:36:34PM -0700, Dann Corbit wrote:
  -Original Message-
  From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
  [EMAIL PROTECTED] On Behalf Of Xiao Meng
  Sent: Tuesday, July 22, 2008 7:57 PM
  To: Simon Riggs
  Cc: pgsql-hackers@postgresql.org
  Subject: Re: [HACKERS] [PATCH]-hash index improving
  
  Well, I'll do it after I finish my second patch.
  A hash index should be more efficient than a btree when N is big enough.
  It seems worthwhile to determine experimentally how big N is.
 
 The savings will depend on many factors.  Another thing (besides volume) 
 which is important is the sort of key data being indexed.
 
 Consider a unique key on six varchar(40) fields:
 
 1.  Country
 2.  State/Province
 3.  City
 4.  Division
 5.  Branch
 6.  Office
 
 Typically, a key such as this will have lots of clumps of similar data, only 
 being differentiated with the final segment.  This sort of index is often 
 used for reporting purposes.  To determine a unique entry, it is not unlikely 
 that more than 200 characters will be traversed.  A hash index gets a special 
 boost here because a much smaller data signature is stored.  Even a 64 bit 
 hash occupies only 8 bytes.  On the other hand, consider an index on a field 
 consisting of a single character.  Here, the pages of the b-tree will have a 
 huge volume of entries per page, requiring fewer pages to search, and the 
 hash index is many times larger and hence more pages will have to be loaded.
 
 These things make a hash index desirable:
 1. Unique index
 2. Long keys
 3. Huge data cardinality
 4. Equality search
 
 These things make a hash index undesirable:
 1.  Non-unique index
 2.  Short keys
 3.  Small data sets
 
I mentioned in a previous e-mail adding some additional informational settings
that can be used, like fillfactor, to improve the layout and performance of a
hash index. They are roughly:
   - number of elements
   - maximum number of elements
   - multiplicity - estimate of element repetition in a non-unique index
Knowing the number of elements in advance allows the index to be pre-created
with the optimal layout and disk footprint. For every factor of 256, you can
reduce the space needed by the stored hash value by 8 bits. For large indexes
you can store a 64-bit hash in the same space as the 32-bit hash of a small
index. This will allow the use of larger hash values, which will result in
better data distribution between the buckets and more O(1)-like behavior.

 These things render a hash index as worthless (except in COMBINATION with a 
 b-tree type index):
 1.  Need for range searches like BETWEEN
 2.  Need for ORDER BY on the column(s)
 
 As an aside:
 I guess it will also be nice if you can CLUSTER both index and data values on 
 the hash.  It may need a different algorithm than a b-tree clustering 
 concept.  I know that Rdb uses different kinds of storage areas for hashed 
 indexes versus b-tree indexes.
 
Clustering a hash index will allow for much smaller indexes through the use
of prefix compression of the common heap tuple ids. Currently an entry in the
hash index needs sizeof(hash) + sizeof(heap tuple id), which is 4 + 6 = 10 bytes,
before clustering. After clustering, and for large indexes, this could drop to
4 bytes per entry plus a constant value.

 This effort to create hashed indexes is very valuable.  Because it becomes 
 more and more dominant as the data scales up, right at the point when things 
 get tougher is when it becomes the most helpful.  If you have a tiny table, 
 it does not even matter if you index it, because (for instance) 10 rows will 
 probably always stay in memory and iteration will find what is needed 
 instantly.  But if you have hundreds of millions of rows or billions of rows, 
 now is when performance really matters.  So when the data scales to 
 preposterous size (which it has an uncanny ability to do) the boost of 
 performance becomes even more valuable.

Although there is a clear theoretical benefit from the O(1) lookup for large
indexes, I think that the cross-over point between btree and hash indexes may
occur at smaller index sizes than might be expected, due to the possibly smaller
memory footprint needed for the hash index. Of course, this will all need to be
tested.

Regards,
Ken



Re: [HACKERS] [PATCH]-hash index improving

2008-07-22 Thread Xiao Meng
I'm sorry for the delayed reply. I couldn't get access to the internet these
days for some reason.
I do apologize for my rough work and very bad readability. I posted it
in a hurry and I didn't mean to cause the reader so much
inconvenience.
I'll NEVER make such a mistake again.
Currently, I've made some of the optimizations Tom advised and removed the
macro HASHVALUE_ONLY.  And I'm working on fixing the problem that it
crashes on large data sets.
I'll post a new patch later.
Thank you for all your advice and testing.

-- 
Best Regards,
Xiao Meng

DKERC, Harbin Institute of Technology, China
Gtalk: [EMAIL PROTECTED]
MSN: [EMAIL PROTECTED]
http://xiaomeng.yo2.cn



Re: [HACKERS] [PATCH]-hash index improving

2008-07-22 Thread Xiao Meng
Well, I'll do it after I finish my second patch.
A hash index should be more efficient than a btree when N is big enough.
It seems worthwhile to determine experimentally how big N is.

On Fri, Jul 18, 2008 at 6:35 PM, Simon Riggs [EMAIL PROTECTED] wrote:

 On Fri, 2008-07-18 at 11:07 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:

 hash lookups can in theory be O(1).

 I'm not sure whether that applies here? I'm interested in how *this*
 patch will work, not in more generic algorithm theory.

 To patch authors: Can we please see a table showing expected number of
 logical I/Os (i.e., block accesses) for btrees and hash indexes

 e.g. for 100-byte rows...

 rows    btree   hash
 ----    -----   ----
 10^2
 10^3
 10^4
 10^5
 10^6
 10^7
 10^8

 --
  Simon Riggs   www.2ndQuadrant.com
  PostgreSQL Training, Services and Support





-- 
Best Regards,
Xiao Meng

DKERC, Harbin Institute of Technology, China
Gtalk: [EMAIL PROTECTED]
MSN: [EMAIL PROTECTED]
http://xiaomeng.yo2.cn



Re: [HACKERS] [PATCH]-hash index improving

2008-07-22 Thread Dann Corbit
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
 [EMAIL PROTECTED] On Behalf Of Xiao Meng
 Sent: Tuesday, July 22, 2008 7:57 PM
 To: Simon Riggs
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] [PATCH]-hash index improving
 
 Well, I'll do it after I finish my second patch.
 A hash index should be more efficient than a btree when N is big enough.
 It seems worthwhile to determine experimentally how big N is.

The savings will depend on many factors.  Another thing (besides volume) which 
is important is the sort of key data being indexed.

Consider a unique key on six varchar(40) fields:

1.  Country
2.  State/Province
3.  City
4.  Division
5.  Branch
6.  Office

Typically, a key such as this will have lots of clumps of similar data, only 
being differentiated with the final segment.  This sort of index is often used 
for reporting purposes.  To determine a unique entry, it is not unlikely that 
more than 200 characters will be traversed.  A hash index gets a special boost 
here because a much smaller data signature is stored.  Even a 64 bit hash 
occupies only 8 bytes.  On the other hand, consider an index on a field 
consisting of a single character.  Here, the pages of the b-tree will have a 
huge volume of entries per page, requiring fewer pages to search, and the hash 
index is many times larger and hence more pages will have to be loaded.

These things make a hash index desirable:
1. Unique index
2. Long keys
3. Huge data cardinality
4. Equality search

These things make a hash index undesirable:
1.  Non-unique index
2.  Short keys
3.  Small data sets

These things render a hash index as worthless (except in COMBINATION with a 
b-tree type index):
1.  Need for range searches like BETWEEN
2.  Need for ORDER BY on the column(s)

As an aside:
I guess it will also be nice if you can CLUSTER both index and data values on 
the hash.  It may need a different algorithm than a b-tree clustering concept.  
I know that Rdb uses different kinds of storage areas for hashed indexes versus 
b-tree indexes.

This effort to create hashed indexes is very valuable.  Because it becomes more 
and more dominant as the data scales up, right at the point when things get 
tougher is when it becomes the most helpful.  If you have a tiny table, it does 
not even matter if you index it, because (for instance) 10 rows will probably 
always stay in memory and iteration will find what is needed instantly.  But if 
you have hundreds of millions of rows or billions of rows, now is when 
performance really matters.  So when the data scales to preposterous size 
(which it has an uncanny ability to do) the boost of performance becomes even 
more valuable.




Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Simon Riggs

On Thu, 2008-07-17 at 16:37 -0700, Dann Corbit wrote:

 Large table unique index equality search should be very fast with hashed
 index (and the only place where any advantage will be seen).  Hashed
 indexes are useless for any search besides equality and gain more and
 more when the levels of the b-tree index increase.

I think a comparison with a btree using a functional index should be
shown.

 The only way to get better performance from hash based indexes is to
 read fewer index pages than if a tree-based index were used.  So I think
 that the scheme used to create the index pages is the focus to make them
 worthwhile.

Agreed. Some math on that, plus a clear focus on making this faster than
a btree is critical to this project.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Gregory Stark
Simon Riggs [EMAIL PROTECTED] writes:

 On Thu, 2008-07-17 at 16:37 -0700, Dann Corbit wrote:

 Large table unique index equality search should be very fast with hashed
 index (and the only place where any advantage will be seen).  Hashed
 indexes are useless for any search besides equality and gain more and
 more when the levels of the b-tree index increase.

 I think a comparison with a btree using a functional index should be
 shown.

To do that you'll have to think about the use cases you think hash should win
on. 

For cpu-bound databases with small indexes there might be a win if you can
avoid the binary search of all the elements on a page. (Have we modified btree
to do that or does it still scan sequentially on the leaf pages?)

For i/o-bound databases with very large indexes there should be an opportunity
where btree lookups are O(logn) and hash lookups can in theory be O(1).

However, to get O(1), hash lookups need to do extra work at insert time. If they
don't expand the hash as necessary then they end up just being a linear
speedup to whatever lookup algorithm is used to scan the buckets. That isn't
going to win over btree, which is already doing a binary search.

The extra work at insert time is O(n log n) amortized, but I'm not sure
good amortized performance is good enough for Postgres. Users are unhappy when
their average performance is good but 1 in 1000 inserts thrashes their i/o
rewriting the whole index... 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Simon Riggs

On Fri, 2008-07-18 at 11:07 +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:

 hash lookups can in theory be O(1).

I'm not sure whether that applies here? I'm interested in how *this*
patch will work, not in more generic algorithm theory.

To patch authors: Can we please see a table showing expected number of
logical I/Os (i.e., block accesses) for btrees and hash indexes

e.g. for 100-byte rows...

rows    btree   hash
----    -----   ----
10^2
10^3
10^4
10^5
10^6
10^7
10^8

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Alvaro Herrera
Gregory Stark wrote:

 For cpu-bound databases with small indexes there might be a win if you can
 avoid the binary search of all the elements on a page. (Have we modified btree
 to do that or does it still scan sequentially on the leaf pages?)

Hmm?  It has used binary search since as long as I can remember ... see
_bt_first and _bt_binsrch.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Heikki Linnakangas

Gregory Stark wrote:

For i/o-bound databases with very large indexes there should be an opportunity
where btree lookups are O(logn) and hash lookups can in theory be O(1).


Ignoring the big-O complexity, if a hash index only stores a 32-bit hash 
code instead of the whole key, it could be a big win in storage size, 
and therefore in cache-efficiency and performance, when the keys are 
very long.


Granted, it's not very common to use a 1K text field as a key column...

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Gregory Stark
Heikki Linnakangas [EMAIL PROTECTED] writes:

 Gregory Stark wrote:
 For i/o-bound databases with very large indexes there should be an 
 opportunity
 where btree lookups are O(logn) and hash lookups can in theory be O(1).

 Ignoring the big-O complexity, if a hash index only stores a 32-bit hash code
 instead of the whole key, it could be a big win in storage size, and therefore
 in cache-efficiency and performance, when the keys are very long.

I think it has to show an improvement over an expression index over
(hashany(col)) and not just an improvement over an index over col due to col
being large.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Jonah H. Harris
On Fri, Jul 18, 2008 at 10:44 AM, Heikki Linnakangas
[EMAIL PROTECTED] wrote:
 Ignoring the big-O complexity, if a hash index only stores a 32-bit hash
 code instead of the whole key, it could be a big win in storage size, and
 therefore in cache-efficiency and performance, when the keys are very long.

Agreed.  My thinking is that there's either something inherently wrong
with the implementation, or we're performing so many disk I/Os that
it's nearly equivalent to b-tree.  Tom has a couple suggestions which
Xiao and I will explore.

 Granted, it's not very common to use a 1K text field as a key column...

Especially for direct equality comparison :)

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Tom Lane
Jonah H. Harris [EMAIL PROTECTED] writes:
 Agreed.  My thinking is that there's either something inherently wrong
 with the implementation, or we're performing so many disk I/Os that
 it's nearly equivalent to b-tree.  Tom has a couple suggestions which
 Xiao and I will explore.

I finally got a chance to look through the patch in some detail.
If I haven't missed something, there are just two improvement ideas
embodied in it:

1. Store just the hash value, and not the index key value, in hash
index entries.  (This means that all hash indexscans become lossy
and have to be rechecked at the heap.)

2. Keep the contents of each index page ordered by hash value, and use
binary instead of linear search to find the matching item(s) during an
indexscan.  (This depends on #1 because recomputing the hash values
during the binary search would be too expensive --- although you could
also make it work by storing *both* the hash value and the original
key.)

I suppose that the main point of #1 is to reduce index size by
allowing more tuples to fit in each bucket.  However, the patch
neglects to adjust the target-fillfactor calculation in _hash_metapinit,
which means that the code won't actually put any more tuples per bucket
(on average) than it did before.  So the index isn't actually any
smaller and you get no I/O savings --- you just have more unused
space on a typical page.  Fixing that might help.

FWIW, I had always assumed that part of the solution to hash's woes
would involve decoupling the bucket size from the page size, so
that you could have multiple buckets per page.  But maybe the
binary-search idea makes that unnecessary.  I'm not sure.  A whole
lot depends on how evenly the buckets get filled.  You should probably
investigate how many tuples actually end up in each bucket with and
without the patch.

In the realm of micro-optimizations that might be significant, I think
you really need to get rid of all those _create_hash_desc calls,
particularly the one in _hash_checkqual which is part of the inner loop
of an indexscan.  Not only are they slow but they probably represent a
serious memory leak in a scan that returns many tuples.  For reading the
hash value out of an existing index tuple, I don't think you should be
bothering with a tupdesc at all --- don't use index_getattr, just map a
C struct onto the known layout of an index tuple with a single never-null
int field.  This would particularly make for a noticeable improvement in
the speed of _hash_binsearch.  The tupdescs used in storing an index
entry are probably needed, but you could just use a single statically
declared tupdesc (look at the ones in relcache.c for examples of
building one as a constant).

regards, tom lane



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Kenneth Marshall
I just ran my original 16M word test case against the patched
version, and like Tom noted below, the tuples per bucket
calculation is wrong which results in identical index sizes
for both the original version and the hash-value-only version.

On Fri, Jul 18, 2008 at 12:23:25PM -0400, Tom Lane wrote:
 Jonah H. Harris [EMAIL PROTECTED] writes:
  Agreed.  My thinking is that there's either something inherently wrong
  with the implementation, or we're performing so many disk I/Os that
  it's nearly equivalent to b-tree.  Tom has a couple suggestions which
  Xiao and I will explore.
 
 I finally got a chance to look through the patch in some detail.
 If I haven't missed something, there are just two improvement ideas
 embodied in it:
 
 1. Store just the hash value, and not the index key value, in hash
 index entries.  (This means that all hash indexscans become lossy
 and have to be rechecked at the heap.)
 
 2. Keep the contents of each index page ordered by hash value, and use
 binary instead of linear search to find the matching item(s) during an
 indexscan.  (This depends on #1 because recomputing the hash values
 during the binary search would be too expensive --- although you could
 also make it work by storing *both* the hash value and the original
 key.)
 
 I suppose that the main point of #1 is to reduce index size by
 allowing more tuples to fit in each bucket.  However, the patch
 neglects to adjust the target-fillfactor calculation in _hash_metapinit,
 which means that the code won't actually put any more tuples per bucket
 (on average) than it did before.  So the index isn't actually any
 smaller and you get no I/O savings --- you just have more unused
 space on a typical page.  Fixing that might help.
 
 FWIW, I had always assumed that part of the solution to hash's woes
 would involve decoupling the bucket size from the page size, so
 that you could have multiple buckets per page.  But maybe the
 binary-search idea makes that unnecessary.  I'm not sure.  A whole
 lot depends on how evenly the buckets get filled.  You should probably
 investigate how many tuples actually end up in each bucket with and
 without the patch.
 
I think that while the binary-search idea will improve lookups
over the original sequential scan of the bucket, it makes updates
much more expensive, particularly with buckets approaching 100%
full. The idea that I have been mulling over tries to improve access
times by breaking a bucket into mini virtual buckets within a page. We
restrict the size of the mini-bucket to pagesize/2^n. The
sweet spot should be around n=6 or 7, which for an 8k pagesize yields
a mini-bucket size of 32 or 64 bytes. Then the search for a value
in a page reads the virtual bucket corresponding to n bits
of the hash value.

The second piece is to take advantage of the fact that the size of
the mini-bucket is not an even multiple of the size of a hash index
tuple, and to aggregate all the lost space for use as the first overflow
page for all of a page's mini-buckets. This avoids the I/O needed to
read a full overflow page from disk and accommodates the imperfections
in the hash function distribution. The overflow pages, both the virtual
first page and subsequent real pages, would benefit from the binary lookups.
It may also be worth storing the high and low hash values specially, to
avoid the search in a page if the value would not be on the page.

 In the realm of micro-optimizations that might be significant, I think
 you really need to get rid of all those _create_hash_desc calls,
 particularly the one in _hash_checkqual which is part of the inner loop
 of an indexscan.  Not only are they slow but they probably represent a
 serious memory leak in a scan that returns many tuples.  For reading the
 hash value out of an existing index tuple, I don't think you should be
 bothering with a tupdesc at all --- don't use index_getattr, just map a
 C struct onto the known layout of an index tuple with a single never-null
 int field.  This would particularly make for a noticeable improvement in
 the speed of _hash_binsearch.  The tupdescs used in storing an index
 entry are probably needed, but you could just use a single statically
 declared tupdesc (look at the ones in relcache.c for examples of
 building one as a constant).
 
+1

   regards, tom lane
 

I think that this sort of virtual bucket would allow us to take
better advantage of the O(1) behavior. What do you all think?

Regards,
Ken



Re: [HACKERS] [PATCH]-hash index improving

2008-07-18 Thread Kenneth Marshall
FYI,

I just patched the fill-factor calculation and re-ran my test.
The index size dropped from 513M to 43M which is the same disk
footprint as the corresponding btree index.

Have a nice weekend.
Ken

On Fri, Jul 18, 2008 at 12:23:14PM -0500, Kenneth Marshall wrote:
 I just ran my original 16M word test case against the patched
 version, and like Tom noted below, the tuples per bucket
 calculation is wrong which results in identical index sizes
 for both the original version and the hash-value-only version.
 



[HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Xiao Meng
The patch stores the hash code only in the index tuple.
It is based on Neil Conway's patch against an old version of PostgreSQL.
It passes the regression tests, but I haven't tested the performance yet.
Anyone interested can run a performance test;-)
You can undefine the macro HASHVALUE_ONLY in hash.h to get the
original implementation.
It's a preliminary implementation and I'm looking for input here.
Hope to hear from you.

-- 
Best Regards,
Xiao Meng

DKERC, Harbin Institute of Technology, China
Gtalk: [EMAIL PROTECTED]
MSN: [EMAIL PROTECTED]
http://xiaomeng.yo2.cn
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 6a5c000..416bf57 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -129,7 +129,11 @@ hashbuildCallback(Relation index,
 	IndexTuple	itup;
 
 	/* form an index tuple and point it at the heap tuple */
+#ifdef HASHVALUE_ONLY
+	itup = _hash_form_tuple(index, values, isnull);
+#else
 	itup = index_form_tuple(RelationGetDescr(index), values, isnull);
+#endif
 	itup->t_tid = htup->t_self;
 
 	/* Hash indexes don't index nulls, see notes in hashinsert */
@@ -171,7 +175,12 @@ hashinsert(PG_FUNCTION_ARGS)
 	IndexTuple	itup;
 
 	/* generate an index tuple */
+#ifdef HASHVALUE_ONLY
+itup = _hash_form_tuple(rel, values, isnull);
+#else
 	itup = index_form_tuple(RelationGetDescr(rel), values, isnull);
+#endif
+
 	itup->t_tid = *ht_ctid;
 
 	/*
@@ -212,7 +221,11 @@ hashgettuple(PG_FUNCTION_ARGS)
 	bool		res;
 
 	/* Hash indexes are never lossy (at the moment anyway) */
-	scan->xs_recheck = false;
+#ifdef HASHVALUE_ONLY
+	scan->xs_recheck = true;
+#else
+	scan->xs_recheck = false;
+#endif
 
 	/*
 	 * We hold pin but not lock on current buffer while outside the hash AM.
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 3eb226a..b79d4f8 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -52,9 +52,15 @@ _hash_doinsert(Relation rel, IndexTuple itup)
 	 */
 	if (rel->rd_rel->relnatts != 1)
 		elog(ERROR, "hash indexes support only one index key");
+#ifdef HASHVALUE_ONLY
+	datum = index_getattr(itup, 1, _create_hash_desc(), &isnull);
+	Assert(!isnull);
+	hashkey = DatumGetUInt32(datum);
+#else
 	datum = index_getattr(itup, 1, RelationGetDescr(rel), &isnull);
 	Assert(!isnull);
 	hashkey = _hash_datum2hashkey(rel, datum);
+#endif
 
 	/* compute item size too */
 	itemsz = IndexTupleDSize(*itup);
@@ -195,13 +201,26 @@ _hash_pgaddtup(Relation rel,
 			   Size itemsize,
 			   IndexTuple itup)
 {
-	OffsetNumber itup_off;
-	Page		page;
+	OffsetNumberitup_off;
+	Page		page;
+Datum   datum;
+boolisNull;
+uint32  hashkey;
 
 	_hash_checkpage(rel, buf, LH_BUCKET_PAGE | LH_OVERFLOW_PAGE);
 	page = BufferGetPage(buf);
 
+#ifdef HASHVALUE_ONLY
+datum = index_getattr(itup,
+  1,
+  _create_hash_desc(),
+  &isNull);
+Assert(!isNull);
+hashkey = DatumGetUInt32(datum);
+itup_off = _hash_binsearch(page, hashkey);
+#else
 	itup_off = OffsetNumberNext(PageGetMaxOffsetNumber(page));
+#endif
 	if (PageAddItem(page, (Item) itup, itemsize, itup_off, false, false)
 		== InvalidOffsetNumber)
 		elog(ERROR, "failed to add index item to \"%s\"",
diff --git a/src/backend/access/hash/hashpage.c b/src/backend/access/hash/hashpage.c
index b0b5874..bba64c4 100644
--- a/src/backend/access/hash/hashpage.c
+++ b/src/backend/access/hash/hashpage.c
@@ -785,7 +785,12 @@ _hash_splitbucket(Relation rel,
 	OffsetNumber omaxoffnum;
 	Page		opage;
 	Page		npage;
-	TupleDesc	itupdesc = RelationGetDescr(rel);
+	TupleDesc	itupdesc;
+#ifdef HASHVALUE_ONLY 
+itupdesc = _create_hash_desc();
+#else
+itupdesc = RelationGetDescr(rel);
+#endif
 
 	/*
 	 * It should be okay to simultaneously write-lock pages from each bucket,
@@ -854,9 +859,13 @@ _hash_splitbucket(Relation rel,
 		itup = (IndexTuple) PageGetItem(opage, PageGetItemId(opage, ooffnum));
 		datum = index_getattr(itup, 1, itupdesc, &null);
 		Assert(!null);
-
+#ifdef HASHVALUE_ONLY
+		bucket = _hash_hashkey2bucket(DatumGetUInt32(datum),
+	  maxbucket, highmask, lowmask);
+#else
 		bucket = _hash_hashkey2bucket(_hash_datum2hashkey(rel, datum),
 	  maxbucket, highmask, lowmask);
+#endif
 
 		if (bucket == nbucket)
 		{
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index 258526b..b9a0307 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -178,6 +178,7 @@ _hash_first(IndexScanDesc scan, ScanDirection dir)
 		hashkey = _hash_datum2hashkey_type(rel, cur->sk_argument,
 		   cur->sk_subtype);
 
+	so->hashso_sk_hash = hashkey;
 	/*
 	 * Acquire shared split lock so we can compute the target bucket safely
 	 * (see README).
@@ -289,6 +290,57 @@ _hash_step(IndexScanDesc scan, Buffer *bufP, ScanDirection dir)
 	 * continue to 

Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Jonah H. Harris
On Thu, Jul 17, 2008 at 5:26 AM, Xiao Meng [EMAIL PROTECTED] wrote:
 The patch stores the hash code only in the index tuple.
 It is based on Neil Conway's patch against an old version of PostgreSQL.
 It passes the regression tests, but I haven't tested the performance yet.
 Anyone interested can run a performance test;-)
 You can undefine the macro HASHVALUE_ONLY in hash.h to get the
 original implementation.
 It's a preliminary implementation and I'm looking for input here.
 Hope to hear from you.

Tom, Simon, Heikki, Greg, we need to make sure this SoC project
succeeds and would appreciate your review and input.

Based on some of Kenneth Marshall and Tom Raney's past hash index test
cases, Xiao and I are going to work on some benchmarks for measuring
the performance-related aspects of this project.

Thanks!

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Alvaro Herrera
Xiao Meng escribió:
 The patch stores the hash code only in the index tuple.
 It is based on Neil Conway's patch against an old version of PostgreSQL.
 It passes the regression test, but I haven't tested the performance yet.
 Anyone interested can run a performance test ;-)
 You can undefine the macro HASHVALUE_ONLY in hash.h to get the
 original implementation.

I think having the HASHVALUE_ONLY define is not a good idea -- it just
makes the patch harder to read.  I suggest just removing the old code
and putting the new code in place.  (That's why we have revision
control.)


-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Jonah H. Harris
On Thu, Jul 17, 2008 at 12:42 PM, Alvaro Herrera
[EMAIL PROTECTED] wrote:
 I think having the HASHVALUE_ONLY define is not a good idea -- it just
 makes the patch harder to read.  I suggest just removing the old code
 and putting the new code in place.  (That's why we have revision
 control.)

Agreed.  I'm also getting a crash on it with large data sets, so we'll
make sure to get those fixed in the next version of the patch.

-Jonah



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Kenneth Marshall
On Thu, Jul 17, 2008 at 12:42:39PM -0400, Alvaro Herrera wrote:
Xiao Meng escribió:
  The patch stores the hash code only in the index tuple.
  It is based on Neil Conway's patch against an old version of PostgreSQL.
  It passes the regression test, but I haven't tested the performance yet.
  Anyone interested can run a performance test ;-)
  You can undefine the macro HASHVALUE_ONLY in hash.h to get the
  original implementation.
 
 I think having the HASHVALUE_ONLY define is not a good idea -- it just
 makes the patch harder to read.  I suggest just removing the old code
 and putting the new code in place.  (That's why we have revision
 control.)
 
One thing it helps is building an old version and a new version
for comparative testing. Otherwise, you could end up with an apples-to-
oranges comparison. I certainly think that the final patch should not
have it, but it is useful now for testing and comparisons.

My two cents,
Ken



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Kenneth Marshall
On Thu, Jul 17, 2008 at 02:00:07PM -0400, Jonah H. Harris wrote:
 On Thu, Jul 17, 2008 at 1:54 PM, Kenneth Marshall [EMAIL PROTECTED] wrote:
  I think having the HASHVALUE_ONLY define is not a good idea -- it just
  makes the patch harder to read.  I suggest just removing the old code
  and putting the new code in place.  (That's why we have revision
  control.)
 
  One thing it helps is building an old version and a new version
  for comparative testing. Otherwise, you could end up with an apples-to-
  oranges comparison. I certainly think that the final patch should not
  have it, but it is useful now for testing and comparisons.
 
 Yes, that's why Xiao did it that way.  However, we traditionally just
 submit a patch with only the changes and it's up to the person testing
 to have an identical build-tree without the patch for testing.
 Another reason for it is that even if you build without the define,
 the patch author may have mistakenly added something outside the ifdef
 which could impact testing.
 
 I agree with Alvaro that we should submit it as a standard change patch.

Okay, that makes sense.

Ken



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Jonah H. Harris
On Thu, Jul 17, 2008 at 1:54 PM, Kenneth Marshall [EMAIL PROTECTED] wrote:
 I think having the HASHVALUE_ONLY define is not a good idea -- it just
 makes the patch harder to read.  I suggest just removing the old code
 and putting the new code in place.  (That's why we have revision
 control.)

 One thing it helps is building an old version and a new version
 for comparative testing. Otherwise, you could end up with an apples-to-
 oranges comparison. I certainly think that the final patch should not
 have it, but it is useful now for testing and comparisons.

Yes, that's why Xiao did it that way.  However, we traditionally just
submit a patch with only the changes and it's up to the person testing
to have an identical build-tree without the patch for testing.
Another reason for it is that even if you build without the define,
the patch author may have mistakenly added something outside the ifdef
which could impact testing.

I agree with Alvaro that we should submit it as a standard change patch.

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Alvaro Herrera
Kenneth Marshall escribió:
 On Thu, Jul 17, 2008 at 12:42:39PM -0400, Alvaro Herrera wrote:

  I think having the HASHVALUE_ONLY define is not a good idea -- it just
  makes the patch harder to read.  I suggest just removing the old code
  and putting the new code in place.  (That's why we have revision
  control.)
  
 One thing it helps is building an old version and a new version
 for comparative testing. Otherwise, you could end up with an apples-to-
 oranges comparison. I certainly think that the final patch should not
 have it, but it is useful now for testing and comparisons.

For this purpose I think it would be easier to have a separate tree with
the patch, and one without it.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Jonah H. Harris
On Thu, Jul 17, 2008 at 5:26 AM, Xiao Meng [EMAIL PROTECTED] wrote:
 The patch stores the hash code only in the index tuple.
 It is based on Neil Conway's patch against an old version of PostgreSQL.
 It passes the regression test, but I haven't tested the performance yet.
 Anyone interested can run a performance test ;-)
 You can undefine the macro HASHVALUE_ONLY in hash.h to get the
 original implementation.
 It's a preliminary implementation, and I'm looking for input here.
 Hope to hear from you.

I've spent some time today performing tests similar to those mentioned
here (http://archives.postgresql.org/pgsql-hackers/2007-09/msg00208.php)

Using a word list of 2650024 unique words (maximum length is 118
bytes), build times are still high, but I'm not really seeing any
performance improvements over b-tree.  I haven't profiled it yet, but
my test is as follows:

- Created the dict table
- Loaded the dict table
- Counted the records in the dict table
- Created the index
- Shutdown the database
- Randomly selected 200 entries from the word list and built a file
full of (SELECT * FROM dict WHERE word = 'word') queries using them.
- Cleared out the kernel cache
- Started the database
- Ran the query file

The result of this is a 5-10ms improvement in the overall
execution time of all 200 queries.  The per-query difference is
practically unnoticeable.  As this is in the range of noise, methinks
there's a larger problem with hash indexes.  I haven't looked heavily
into their implementation, but do any of you know of any major design
flaws?
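
For what it's worth, the random-selection step in the recipe above can be
sketched like this (the table/column names `dict`/`word` follow the test
description; the actual harness may differ):

```python
import random

# Sketch of the "randomly select 200 entries" step in the recipe above.
def make_query_file(words, n=200, seed=42):
    rng = random.Random(seed)            # fixed seed => reproducible runs
    sample = rng.sample(words, n)        # without replacement
    return [f"SELECT * FROM dict WHERE word = '{w}';" for w in sample]

# e.g. against a toy word list:
words = [f"word{i}" for i in range(1000)]
queries = make_query_file(words)
print(len(queries))                      # 200
```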

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Kenneth Marshall
On Thu, Jul 17, 2008 at 04:24:28PM -0400, Jonah H. Harris wrote:
 On Thu, Jul 17, 2008 at 5:26 AM, Xiao Meng [EMAIL PROTECTED] wrote:
  The patch stores the hash code only in the index tuple.
  It is based on Neil Conway's patch against an old version of PostgreSQL.
  It passes the regression test, but I haven't tested the performance yet.
  Anyone interested can run a performance test ;-)
  You can undefine the macro HASHVALUE_ONLY in hash.h to get the
  original implementation.
  It's a preliminary implementation, and I'm looking for input here.
  Hope to hear from you.
 
 I've spent some time today performing tests similar to those mentioned
 here (http://archives.postgresql.org/pgsql-hackers/2007-09/msg00208.php)
 
 Using a word list of 2650024 unique words (maximum length is 118
 bytes), build times are still high, but I'm not really seeing any
 performance improvements over b-tree.  I haven't profiled it yet, but
 my test is as follows:
 
 - Created the dict table
 - Loaded the dict table
 - Counted the records in the dict table
 - Created the index
 - Shutdown the database
 - Randomly selected 200 entries from the word list and built a file
 full of (SELECT * FROM dict WHERE word = 'word') queries using them.
 - Cleared out the kernel cache
 - Started the database
 - Ran the query file
 
 The result of this is a 5-10ms improvement in the overall
 execution time of all 200 queries.  The per-query difference is
 practically unnoticeable.  As this is in the range of noise, methinks
 there's a larger problem with hash indexes.  I haven't looked heavily
 into their implementation, but do any of you know of any major design
 flaws?
 
Jonah,

Thank you for running these tests. I was trying to reproduce my initial
tests on the original system to make it more apples to apples, but the
latest release needs more semaphore resources than 8.2 did, and fixing
that on Solaris 8 will require a reboot. Would you mind posting the
timings for the hash_only index versus the hash_value index versus the
btree index for the same test? Also, what is the on-disk size of all
three indexes? This will allow us to figure out the bucket/page load
or fill-factor for each scenario.
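
The bucket/page load arithmetic is simple enough to sketch (the sizes
below are hypothetical placeholders, not measurements from these tests):

```python
PAGE_SIZE = 8192  # PostgreSQL's default block size

# Back-of-the-envelope fill-factor arithmetic; inputs are hypothetical.
def entries_per_page(index_bytes, n_entries):
    return n_entries / (index_bytes / PAGE_SIZE)

def fill_factor(index_bytes, n_entries, entry_bytes):
    return (n_entries * entry_bytes) / index_bytes

# e.g. a 100-page index holding 1,000 entries of ~400 bytes each:
print(entries_per_page(100 * PAGE_SIZE, 1000))              # 10.0
print(round(fill_factor(100 * PAGE_SIZE, 1000, 400), 3))    # 0.488
```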

The basic implementation looked reasonable. I will take a look at
the patch this evening.

Regards,
Ken



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Simon Riggs

On Thu, 2008-07-17 at 16:24 -0400, Jonah H. Harris wrote:
 I'm not really seeing any performance improvements over b-tree.

I'd like to see a theoretical analysis on the benefit case before we
spend too many teraflops on investigation.

In which cases do we expect that hash indexes will beat btrees?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Dann Corbit
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Simon Riggs
 Sent: Thursday, July 17, 2008 4:10 PM
 To: Jonah H. Harris
 Cc: Xiao Meng; pgsql-hackers@postgresql.org; Kenneth Marshall
 Subject: Re: [HACKERS] [PATCH]-hash index improving

 On Thu, 2008-07-17 at 16:24 -0400, Jonah H. Harris wrote:
  I'm not really seeing any performance improvements over b-tree.

 I'd like to see a theoretical analysis on the benefit case before we
 spend too many teraflops on investigation.

 In which cases do we expect that hash indexes will beat btrees?

Equality search on a large table's unique index should be very fast with
a hash index (and that is the only place where any advantage will be
seen).  Hash indexes are useless for any search besides equality, and
they gain more and more as the number of levels in the b-tree index
increases.

Here is a hash index lookup that is frightfully fast:
http://www.corpit.ru/mjt/tinycdb.html

It's a constant database, but the file format might be worth
examination.  Here is a quickie definition of the CDB format:
http://cr.yp.to/cdb/cdb.txt

I think that the problem with hashed indexes tends to be the blocking of
index pages on disk.  For instance, the FastDB/GigaBase author was able
to make FastDB memory based hash indexes that are faster than the memory
based b-tree versions, but not for the disk based GigaBase database.
The only way to get better performance from hash based indexes is to
read fewer index pages than if a tree-based index were used.  So I think
that the scheme used to create the index pages is the focus to make them
worthwhile.




Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread David Fetter
On Thu, Jul 17, 2008 at 02:11:20PM -0400, Alvaro Herrera wrote:
 Kenneth Marshall escribió:
  On Thu, Jul 17, 2008 at 12:42:39PM -0400, Alvaro Herrera wrote:
 
   I think having the HASHVALUE_ONLY define is not a good idea --
   it just makes the patch harder to read.  I suggest just removing
   the old code and putting the new code in place.  (That's why we
   have revision control.)
   
  One thing it helps is building an old version and a new version
  for comparative testing. Otherwise, you could end up with an
  apples-to-oranges comparison. I certainly think that the final
  patch should not have it, but it is useful now for testing and
  comparisons.
 
 For this purpose I think it would be easier to have a separate tree
 with the patch, and one without it.

Here's one tree.  Anyone can get an initial copy as follows:

git clone http://git.postgresql.org/git/~davidfetter/hash/.git

Xiao Meng, if you let me know where your git repo is, say by cloning
onto a machine I can see from the internet and applying your patches
to it, I can pull and let others see it :)

Yes, I know it's a little cumbersome, but we'll get something slicker
as we figure out what slicker really should mean.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Jonah H. Harris
On Thu, Jul 17, 2008 at 7:37 PM, Dann Corbit [EMAIL PROTECTED] wrote:

 In which cases do we expect that hash indexes will beat btrees?

 Large table unique index equality search should be very fast with hashed
 index (and the only place where any advantage will be seen).

Yes, this is the exact use-case.  Likewise, Dann has provided a good
description regarding the primary implementation goals of a disk-based
hash table.

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Xiao Meng escribió:
 You can undefine the macro HASHVALUE_ONLY in hash.h to get the
 original implementation.

 I think having the HASHVALUE_ONLY define is not a good idea -- it just
 makes the patch harder to read.

While we are griping about readability: failure to update the comments
to match the code is NOT, NOT, NOT acceptable.  I had barely started
to read the patch before encountering this insult to the reader:
 
 /* Hash indexes are never lossy (at the moment anyway) */
-scan->xs_recheck = false;
+#ifdef HASHVALUE_ONLY
+scan->xs_recheck = true;
+#else
+scan->xs_recheck = false;
+#endif

The fact that the patch doesn't touch backend/access/hash/README is
already grounds for rejection, but can't you be bothered to fix a
comment that is literally one line away from where you are making
it wrong?

regards, tom lane



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Jonah H. Harris
On Fri, Jul 18, 2008 at 1:00 AM, Tom Lane [EMAIL PROTECTED] wrote:
 While we are griping about readability: failure to update the comments
 to match the code is NOT, NOT, NOT acceptable.  I had barely started
 to read the patch before encountering this insult to the reader:
 ...

As this is Xiao's first patch to the community, I'd appreciate it if
you guys would chill a little.  I'm fully aware of what needs to be
done with it clean-up wise, but given that we're under some time
constraints, I asked that he submit his preliminary patch for those
who wanted to preview/test it.

Again, this patch is nowhere near acceptance quality, nor was it
intended to be.  I'll make sure he gets the next one cleaned up prior
to submitting it.

The real question, raised above, is why hash index queries, even with
this patch, are no better than b-tree for straight equality
comparisons.  Because the hash index is lossy, we could easily be
performing multiple I/Os for rechecks, and that may be causing some of
the performance problems.  I haven't checked the I/O counts yet, but I
was wondering if you (Tom) knew of any major design/implementation
deficiency we should be taking into consideration regarding PG's hash
index implementation?

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/



Re: [HACKERS] [PATCH]-hash index improving

2008-07-17 Thread Tom Lane
Jonah H. Harris [EMAIL PROTECTED] writes:
 The real question now has been listed above; why are hash index
 queries, including this patch, no better than b-tree even for straight
 equality comparisons?

Well, the theoretical advantage is that a hash probe is O(1) while a
btree probe is O(log N) (ignoring a lot of possible nonoptimalities
on each side of course).  So you'd need a test case large enough for
log N to be noticeably more than 1, which I think means in practice that
the upper levels of the btree don't fit in memory.  So small test cases
aren't going to show any improvement.
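
A back-of-the-envelope model of that O(1) vs. O(log N) argument (the
fanout of 256 is an assumed figure, not a measured one):

```python
import math

# Rough model only: a btree probe reads one page per level, roughly
# ceil(log_f(N)) pages for fanout f; an ideal hash probe reads the
# bucket page plus any overflow pages in its chain.
def btree_pages_probed(n, fanout=256):
    return math.ceil(math.log(n) / math.log(fanout))

def hash_pages_probed(overflow_chain_len=0):
    return 1 + overflow_chain_len

# The 2,650,024-key word-list test from earlier in the thread:
print(btree_pages_probed(2_650_024))   # 3
print(hash_pages_probed())             # 1
```

With all three btree levels cached, the difference collapses to noise,
which is consistent with the numbers reported above.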

I think the historical problem with our hash implementation has been
that it was terribly space-inefficient, because of the fixed (and large)
bucket size.  A dense btree index can probably beat a sparse hash up to
exceedingly large values of N.  It's not clear to me how much the patch
at hand does to fix that.

The patch might introduce a new problem, which is extra heap visits
caused by the lossiness of the hash value.  Again, that doesn't hurt
much ideally, but maybe you chanced to use a test case where it does
hurt.  It would be worth sticking in a bit of debug instrumentation to
see whether the number of failed heap visits is significant.
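
As a toy illustration of where those failed heap visits come from (all
names here are illustrative; this is not the patch's code):

```python
# The index maps hash(key) -> heap TIDs.  Colliding keys land in the
# same bucket, so each fetched tuple must be rechecked against the real
# key; a failed recheck is a wasted heap visit.
def probe(index, heap, key):
    matches, failed_visits = [], 0
    for tid in index.get(hash(key), []):
        if heap[tid] == key:        # the recheck
            matches.append(tid)
        else:
            failed_visits += 1      # heap I/O spent on a collision
    return matches, failed_visits

heap = {1: "apple", 2: "apricot"}
index = {}
for tid, word in heap.items():
    index.setdefault(hash(word), []).append(tid)

print(probe(index, heap, "apple"))  # ([1], 0) barring a hash collision
```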

In any case, the reported test seems to be dealing with index sizes in
the tens-of-megabytes range, and it doesn't surprise me a whole lot that
an O(log N) penalty isn't showing up there.  Try a test case where the
index size is significantly larger than your RAM.

regards, tom lane
