On 4/6/08, Tom Lane <[EMAIL PROTECTED]> wrote: > So adopting the mixing changes would make it faster yet. What we need > to be certain of is that this doesn't expose us to poorer hashing. > We know that it is critical that all bits of the input affect all bits > of the hash fairly uniformly --- otherwise we are subject to very > serious performance hits at higher levels in hash join, for instance. > The comments in the new code led me to worry that Jenkins had > compromised on that property in search of speed. I looked at his > website but couldn't find any real discussion of the design principles > for the new mixing code ...
Scroll at the end of doobs.html, there is the longer discussion. My understanding is following: His design is based on 2 properties of mixing function: - reversible - that means for each input tuple of (a,b,c) corresponds exactly one output tuple of (a,b,c). Such property guarantees that after repeatedly applying mixing function, no bits get lost. - avalanche - that means any single bit change in input (a,b,c) half of the output bits are affected. His "insight" (as he called it) when creating lookup3 was that the bulk mixing that is applied repeatedly does not need to have avalanche, it only needs to be reversible, meaning all the bits that went in, are still there after mixing repeatedly. And only final mixing needs to have avalanche as it produces final result, but it does not need to be reversible as it wont be applied repeatedly and most of the result is dropped anyway. IMHO his choices are reasonable. -- marko -- Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-patches