On Mon, Oct 26, 2009 at 06:35:13PM -0700, Christophe Pettus wrote:
>
> On Oct 26, 2009, at 5:24 PM, Itagaki Takahiro wrote:
>
>> Hmmm, hashtext() returns int32. ,
>> Can you reduce the collision issue if we had hashtext64()?
>
> That would certainly reduce the chance of a collison considerably, assuming 
> the right algorithm.
>
> --
> -- Christophe Pettus
>    x...@thebuild.com
>
The current hash function can already support generating a 64-bit
hash value by using both the b and c values. The function is called
hashlittle2 and has this comment in the original Bob Jenkins 2006
code:

/*
 * hashlittle2: return 2 32-bit hash values
 *
 * This is identical to hashlittle(), except it returns two 32-bit hash
 * values instead of just one.  This is good enough for hash table
 * lookup with 2^^64 buckets, or if you want a second hash if you're not
 * happy with the first, or if you want a probably-unique 64-bit ID for
 * the key.  *pc is better mixed than *pb, so use *pc first.  If you want
 * a 64-bit value do something like "*pc + (((uint64_t)*pb)<<32)".
 */

This should be a simple change. It would be worth running it by
the developer community to figure out how to add this functionality
in a way that will make 64-bit hashes available easily to other
code in the DB, perhaps even using them in very large hash indexes.

Regards,
Ken

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to