Neil Conway <[EMAIL PROTECTED]> writes:
> On Fri, 2007-04-27 at 10:02 -0400, Tom Lane wrote:
>> Perhaps a sufficiently robust way would be to form the hash as the
>> XOR of each supplied digit, circular-shifted by say 3 times the
>> digit's weight.
> The only objection I have to this is that it means we need to have
> another hash function in the backend. The Jenkins hash we use in
> hash_any() has been studied and we can have at least some confidence in
> its collision-resistance, etc.

I'm still not very comfortable with that. You're proposing to add a
pretty obvious failure mechanism --- any numeric-returning function that
failed to "normalize" its output would now create a subtle, hard-to-find
bug. Even if you can promise that all the functions in numeric.c get it
right, what of user-written add-ons? And the only return for taking
this risk is speculation that the performance of the hash function might
be better.

I think if you want to go this way, you should at least provide some
evidence that we get a hashing performance benefit in exchange for
adding a new restriction on numeric-value validity. Perhaps a suitable
test would be to compare the number of hash collisions in a large set of
randomly-chosen-but-distinct numeric values.

regards, tom lane
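For readers following along, here is a rough Python sketch of the two ideas in this message: the XOR-of-rotated-digits hash quoted at the top, and the suggested collision count over distinct values. It is only an illustration under stated assumptions, not the backend code. A numeric is modelled as a hypothetical (weight, digits) pair of base-10000 digits, with sign and display scale ignored, and the names `rotl32`, `digit_hash`, `count_collisions` and `random_distinct_values` are made up for this example.

```python
import random

MASK32 = 0xFFFFFFFF


def rotl32(x, r):
    """Rotate a 32-bit value left by r bits (r is reduced mod 32)."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK32


def digit_hash(weight, digits):
    """XOR of each digit, circular-shifted by 3 times that digit's weight.

    `weight` is the weight of the first digit; each later digit weighs
    one less. Zero digits contribute nothing, so leading or trailing
    zeros do not change the result.
    """
    h = 0
    for i, d in enumerate(digits):
        h ^= rotl32(d, 3 * (weight - i))
    return h


def count_collisions(values, hash_fn):
    """Count values whose hash was already produced by an earlier value."""
    seen = set()
    collisions = 0
    for weight, digits in values:
        h = hash_fn(weight, digits)
        if h in seen:
            collisions += 1
        else:
            seen.add(h)
    return collisions


def random_distinct_values(n, seed=0):
    """Build n distinct values (assumed model: base-10000 digits).

    The first and last digits are nonzero, so two different tuples
    always stand for two different numbers.
    """
    rng = random.Random(seed)
    values = set()
    while len(values) < n:
        ndigits = rng.randint(1, 6)
        middle = [rng.randint(0, 9999) for _ in range(max(ndigits - 2, 0))]
        last = [rng.randint(1, 9999)] if ndigits > 1 else []
        digits = (rng.randint(1, 9999), *middle, *last)
        values.add((rng.randint(-4, 4), digits))
    return sorted(values)


if __name__ == "__main__":
    sample = random_distinct_values(100000)
    print("collisions:", count_collisions(sample, digit_hash))
```

Running the same `count_collisions` over the same sample with a second hash function (for instance one that hashes the normalized digit bytes) would give the side-by-side comparison asked for above. The sketch also shows why the XOR form does not depend on normalization: `digit_hash(1, [1, 0])` and `digit_hash(1, [1])` return the same value.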