On Mon, Oct 26, 2009 at 06:35:13PM -0700, Christophe Pettus wrote:
>
> On Oct 26, 2009, at 5:24 PM, Itagaki Takahiro wrote:
>
>> Hmmm, hashtext() returns int32.
>> Can you reduce the collision issue if we had hashtext64()?
>
> That would certainly reduce the chance of a collision considerably, assuming
> the right algorithm.
>
> --
> -- Christophe Pettus
>    x...@thebuild.com
>

The current hash function can already support generating a 64-bit hash
value by using both the b and c values. The function is called hashlittle2
and has this comment in the original Bob Jenkins 2006 code:

/*
 * hashlittle2: return 2 32-bit hash values
 *
 * This is identical to hashlittle(), except it returns two 32-bit hash
 * values instead of just one.  This is good enough for hash table
 * lookup with 2^^64 buckets, or if you want a second hash if you're not
 * happy with the first, or if you want a probably-unique 64-bit ID for
 * the key.  *pc is better mixed than *pb, so use *pc first.  If you want
 * a 64-bit value do something like "*pc + (((uint64_t)*pb)<<32)".
 */

This should be a simple change. It would be worth running it by the
developer community to figure out how to add this functionality in a way
that will make 64-bit hashes easily available to other code in the DB,
perhaps even using them in very large hash indexes.

Regards,
Ken
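
P.S. As a rough illustration of the combination the comment suggests, here
is a minimal sketch in C. The helper name combine_hash64 is hypothetical and
is not part of lookup3.c or PostgreSQL; pc and pb stand for the two 32-bit
values that hashlittle2 hands back through *pc and *pb.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in lookup3.c or PostgreSQL): build one 64-bit
 * value from the two 32-bit results of hashlittle2, following the
 * expression quoted in its comment.  pc is the better-mixed word and
 * becomes the low 32 bits; pb becomes the high 32 bits.
 */
static uint64_t
combine_hash64(uint32_t pc, uint32_t pb)
{
    /* Widen pb before shifting so the high bits are not lost. */
    return (uint64_t) pc + (((uint64_t) pb) << 32);
}
```

For example, combine_hash64(0x00000001, 0x00000002) gives
0x0000000200000001. How a hashtext64() would expose this at the SQL level
is a separate question and is not sketched here.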