On Sun, Apr 06, 2008 at 12:02:25PM -0400, Tom Lane wrote:
> Kenneth Marshall <[EMAIL PROTECTED]> writes:
> > Okay, I will strip the VALGRIND paths. I did not see a real need for them
> > either.
> I have a patch ready to commit (as soon as I fix the regression test
> issues) that incorporates all the word-wide-ness stuff. All you really
> need to look at is the question of hash quality.
> I did confirm that the mixing changes account for a noticeable chunk
> of the runtime improvement. For instance on a Xeon
> hash_any_old(32K): 4.386922 s (CVS HEAD)
> hash_any(32K): 3.853754 s (CVS + word-wide calcs)
> hashword(32K): 3.041500 s (from patch)
> hashlittle(32K): 3.092297 s (from patch)
> hash_any_old(32K unaligned): 4.390311 s
> hash_any(32K unaligned): 4.380700 s
> hashlittle(32K unaligned): 3.464802 s
> hash_any_old(8 bytes): 1.580008 s
> hash_any(8 bytes): 1.293331 s
> hashword(8 bytes): 1.137054 s
> hashlittle(8 bytes): 1.112997 s
> So adopting the mixing changes would make it faster yet. What we need
> to be certain of is that this doesn't expose us to poorer hashing.
> We know that it is critical that all bits of the input affect all bits
> of the hash fairly uniformly --- otherwise we are subject to very
> serious performance hits at higher levels in hash join, for instance.
> The comments in the new code led me to worry that Jenkins had
> compromised on that property in search of speed. I looked at his
> website but couldn't find any real discussion of the design principles
> for the new mixing code ...
> regards, tom lane
Here is a section from http://burtleburtle.net/bob/hash/doobs.html
describing some testing that Bob Jenkins did concerning the mixing
properties of both his original hash function (our current hash_any)
and the new version (the patch)--the last paragraph in particular.
> A hash I wrote nine years later designed along the same lines as
> "My Hash", see http://burtleburtle.net/bob/c/lookup3.c. It takes
> 2n instructions per byte for mixing instead of 3n. When fitting
> bytes into registers (the other 3n instructions), it takes advantage
> of alignment when it can (a trick learned from Paul Hsieh's hash).
> It doesn't bother to reserve a byte for the length. That allows
> zero-length strings to require no mixing. More generally, the
> length that requires additional mixes is now 13-25-37 instead of
> 12-24-36.
> One theoretical insight was that the last mix doesn't need to do
> well in reverse (though it has to affect all output bits). And the
> middle mixing steps don't have to affect all output bits (affecting
> some 32 bits is enough), though they do have to do well in reverse.
> So it uses different mixes for those two cases. "My Hash" (lookup2.c)
> had a single mixing operation that had to satisfy both sets of
> requirements, which is why it was slower.
> On a Pentium 4 with gcc 3.4.?, Paul's hash was usually faster than
> lookup3.c. On a Pentium 4 with gcc 3.2.?, they were about the same
> speed. On a Pentium 4 with icc -O2, lookup3.c was a little faster
> than Paul's hash. I don't know how it would play out on other chips
> and other compilers. lookup3.c is slower than the additive hash
> pretty much forever, but it's faster than the rotating hash for
> keys longer than 5 bytes.
> lookup3.c does a much more thorough job of mixing than any of my
> previous hashes (lookup2.c, lookup.c, One-at-a-time). All my
> previous hashes did a more thorough job of mixing than Paul Hsieh's
> hash. Paul's hash does a good enough job of mixing for most
> practical purposes.
> The most evil set of keys I know of are sets of keys that are all
> the same length, with all bytes zero, except with a few bits set.
> This is tested by frog.c. To be even more evil, I had my hashes
> return b and c instead of just c, yielding a 64-bit hash value.
> Both lookup.c and lookup2.c start seeing collisions after 2^53
> frog.c keypairs. Paul Hsieh's hash sees collisions after 2^17
> keypairs, even if we take two hashes with different seeds.
> lookup3.c is the only one of the batch that passes this test. It
> gets its first collision somewhere beyond 2^63 keypairs, which is
> exactly what you'd expect from a completely random mapping to
> 64-bit values.
I am ready to do some comparison runs between the old and new hash
functions to validate the new function's mixing ability against our
current one, although the results will seem almost anecdotal. Do you
happen to have particular hashing problems in mind that I could use
for testing? Depending upon the problem sizes for which you are
interested in gathering empirical data, it may take many hours of
CPU time. If I need more than a single CPU to perform the testing
in a timely fashion, I will need to gain access to our local
cluster resources.
Sent via pgsql-patches mailing list (firstname.lastname@example.org)