RSE has done the maths on why *33 is a magic string hashing function... this from his "str" library in str_hash.c:
/* ** Str - String Library ** Copyright (c) 1999-2000 Ralf S. Engelschall <[EMAIL PROTECTED]> ** ** This file is part of Str, a string handling and manipulation ** library which can be found at http://www.engelschall.com/sw/str/. ** ** Permission to use, copy, modify, and distribute this software for ** any purpose with or without fee is hereby granted, provided that ** the above copyright notice and this permission notice appear in all ** copies. ** ** THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED ** WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF ** MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. ** IN NO EVENT SHALL THE AUTHORS AND COPYRIGHT HOLDERS AND THEIR ** CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, ** SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT ** LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF ** USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ** ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, ** OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT ** OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF ** SUCH DAMAGE. ** ** str_hash.c: hashing functions */ #include "str_p.h" /* * DJBX33A (Daniel J. Bernstein, Times 33 with Addition) * * This is Daniel J. Bernstein's popular `times 33' hash function as * posted by him years ago on comp.lang.c. It basically uses a function * like ``hash(i) = hash(i-1) * 33 + string[i]''. This is one of the * best hashing functions for strings. Because it is both computed very * fast and distributes very well. * * The magic of the number 33, i.e. why it works better than many other * constants, prime or not, has never been adequately explained by * anyone. So I try an own RSE-explanation: if one experimentally tests * all multipliers between 1 and 256 (as I did it) one detects that * even numbers are not useable at all. The remaining 128 odd numbers * (except for the number 1) work more or less all equally well. They * all distribute in an acceptable way and this way fill a hash table * with an average percent of approx. 86%. * * If one compares the Chi/2 values resulting of the various * multipliers, the 33 not even has the best value. But the 33 and a * few other equally good values like 17, 31, 63, 127 and 129 have * nevertheless a great advantage over the remaining values in the large * set of possible multipliers: their multiply operation can be replaced * by a faster operation based on just one bit-wise shift plus either a * single addition or subtraction operation. And because a hash function * has to both distribute good and has to be very fast to compute, those * few values should be preferred and seems to be also the reason why * Daniel J. Bernstein also preferred it. */ ...
