On Thu, Jul 17, 2008 at 04:24:28PM -0400, Jonah H. Harris wrote: > On Thu, Jul 17, 2008 at 5:26 AM, Xiao Meng <[EMAIL PROTECTED]> wrote: > > The patch store hash code only in the index tuple. > > It based on Neil Conway's patch with an old version of PostgreSQL. > > It passes the regression test but I didn't test the performance yet. > > Anyone interested can make a performance test;-) > > You can undefine the macro HASHVALUE_ONLY in hash.h to get the > > original implementation. > > It's a preliminary implementation and I'm looking for input here. > > Hope to hear from you. > > I've spent some time today performing tests similar to those mentioned > here (http://archives.postgresql.org/pgsql-hackers/2007-09/msg00208.php) > > Using a word list of 2650024 unique words (maximum length is 118 > bytes), build times are still high, but I'm not really seeing any > performance improvements over b-tree. I haven't profiled it yet, but > my test is as follows: > > - Created the dict table > - Loaded the dict table > - Counted the records in the dict table > - Created the index > - Shutdown the database > - Randomly selected 200 entries from the word list and built a file > full of (SELECT * FROM dict WHERE word = '<word>') queries using them. > - Cleared out the kernel cache > - Started the database > - Ran the query file > > The result of this is between 5-10ms improvement in the overall > execution time of all 200 queries. The time-per-query is practically > unnoticeable. As this is in the range of noise, methinks there's a > larger problem with hash indexes. I haven't looked heavily into their > implementation, but do you any of you know of any major design flaws? > Jonah,
Thank you for running these tests. I was trying to reproduce my initial tests on the original system to make it more apples to apples, but the latest release needs more resources semaphore-wise than the 8.2 and to fix it on Solaris 8 I will need a reboot. Would you mind posting the timings for the hash_only index versus the hash_value versus the btree index for the same test. Also, what is the on-disk size of all three indexes? This will allow us to figure out the bucket/page load or fill-factor for each scenario. The basic implementation looked reasonable. I will take a look at the patch this evening. Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers