On Wed, May 23, 2012 at 12:44 AM, Branko Čibej <br...@apache.org> wrote:
>...
> I'd really like to see you explain why this change of yours (33 -> 33^4)
> is relevant in practice. It's not at all clear that this multiplier
> gives a better key distribution than the time-honoured 33.

Actually, there are some reasoned/studied arguments for 33 ("it works
well, but nobody knows why"). And 33^4 is likely a poor replacement :-P

For PoCore's hash table[1], I did a survey of the research around
hashing functions. I selected the FNV-1 hash function:
  http://www.isthe.com/chongo/tech/comp/fnv/

Comparisons of functions are here:
  http://www.eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx

The 33 variety is known as the "Bernstein hash".

> It's my considered opinion that this fiddling around with hash function
> implementations is way overboard. Just use apr_hashfunc_default already.
> Unless you can prove that using your "optimized" version results in
> significant savings in space and/or time, anything else is just piling
> on more lines of code that need to be maintained for no good reason.

I'm assuming Stefan ran some tests, and (iirc) saw a few percent
increase. For that, maybe a new hash function is okay. (it isn't like
he built a whole new type; just a new func)

Cheers,
-g

[1] http://pocore.googlecode.com/svn/trunk/src/hash.c
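
For reference, here is a minimal sketch of the two functions discussed above: the "times 33" Bernstein hash and 32-bit FNV-1. The function names are hypothetical and this is not the code from APR, Subversion, or PoCore. The Bernstein variant here starts from 0, which is my understanding of what apr_hashfunc_default does (Bernstein's original is usually quoted with a seed of 5381). The FNV-1 constants are the published 32-bit offset basis and prime from the FNV page linked above.

```c
#include <stddef.h>
#include <stdint.h>

/* Bernstein "times 33" hash: hash = hash * 33 + byte.
 * Seeded with 0 here (an assumption; some variants seed with 5381). */
uint32_t times33_hash(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t hash = 0;
    size_t i;

    for (i = 0; i < len; i++)
        hash = hash * 33 + p[i];

    return hash;
}

/* 32-bit FNV-1: multiply by the FNV prime, then XOR in the next byte.
 * (FNV-1a does the same two steps in the opposite order.) */
uint32_t fnv1_32_hash(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t hash = 2166136261u;      /* FNV-1 32-bit offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        hash *= 16777619u;            /* FNV 32-bit prime */
        hash ^= p[i];
    }

    return hash;
}
```

Both take a key pointer and a byte length and do one multiply plus one add/XOR per byte, so any difference in practice should come from how well the results spread across buckets, which is something to measure on real keys rather than assume.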