On 2012-11-02 15:01, Martin Truebner wrote:
> I still do not see how changing a numbering scheme from based
> on ten to a system based on thirty-six does create any clusters.
>
It doesn't. Robin has pretty much acknowledged that if the input data are uniform, a modulus hash will likewise be uniform. Others in this thread are more fixated on defending prime moduli.

But consider: if the input data are binary strings and you choose 2^n as a modulus, the hash will merely be the last n bits of the input datum. That makes powers of two a bad choice for modulus-hashing EBCDIC text, where the last character is likely to be clustered around displayable code points. OTOH, if the input data are base-37 numbers, a modulus-37 hash merely returns the last digit; again a bad choice. But computer scientists have a cultural bias toward base 2 rather than toward any other prime such as 37, even as number theorists have some cultural bias toward base 10 (evident, at least, in recreational essays).

-- gil
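
A minimal Python sketch of the two cases above, offered as an illustration rather than anything posted in the thread. The helper names (mod_hash, low_bits, from_digits) and the sample keys are hypothetical; the only assumptions are that keys are treated as big-endian binary integers and that the text is encoded with EBCDIC code page 037.

```python
from collections import Counter


def mod_hash(key: bytes, modulus: int) -> int:
    """Treat the key as one big-endian binary integer and reduce it by the modulus."""
    return int.from_bytes(key, "big") % modulus


def low_bits(key: bytes, n: int) -> int:
    """Return the last n bits of the key, with no division involved."""
    return int.from_bytes(key, "big") & ((1 << n) - 1)


def from_digits(digits, base):
    """Build an integer from its digits in the given base, most significant first."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


# Hypothetical sample keys (made up for illustration), encoded as EBCDIC cp037.
names = ("SYS1", "PAYROLL", "USER07", "MASTER", "TAPE42", "JOBLIB")
keys = [name.encode("cp037") for name in names]

# Case 1: with a modulus of 2^n, the hash is just the last n bits of the key.
assert all(mod_hash(k, 16) == low_bits(k, 4) for k in keys)

# EBCDIC letters and digits all have a low nibble of 0..9, so with modulus 16
# these keys can never land in buckets 10..15.
buckets_16 = Counter(mod_hash(k, 16) for k in keys)
assert set(buckets_16) <= set(range(10))

# Case 2: for a number written in base 37, a modulus-37 hash is its last digit.
digits = [5, 36, 0, 12]
assert from_digits(digits, 37) % 37 == digits[-1]

# For comparison, a modulus unrelated to the byte width mixes in every byte.
buckets_37 = Counter(mod_hash(k, 37) for k in keys)
print("mod 16:", dict(buckets_16))
print("mod 37:", dict(buckets_37))
```

The point of the sketch is only that the modulus interacts with the radix of the input: 16 is a poor modulus for byte-oriented EBCDIC keys for the same reason 37 would be a poor one for base-37 keys.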
