On 2012-11-02 15:01, Martin Truebner wrote:
> I still do not see how changing a numbering scheme from based
> on ten to a system based on thirty-six does create any clusters.
>
It doesn't.  Robin has pretty much acknowledged that if the
input data are uniform, a modulus hash will likewise be uniform.
Others in this thread are more fixated on defending prime
moduli.

But consider: if the input data are binary strings and you
choose 2^n as a modulus, the hash will merely be the last
n bits of the input datum.  Powers of two are a bad choice
for modulus-hashing EBCDIC text, because that last character
is likely to be clustered around displayable code points.
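
A rough sketch of that effect in Python, for anyone who wants
to try it.  The word list and the helper name mod_hash are
made up for illustration, and cp037 is just one common EBCDIC
code page standing in for whatever the real data uses:

```python
def mod_hash(data: bytes, modulus: int) -> int:
    """Treat the key as one big-endian integer and reduce it modulo `modulus`."""
    return int.from_bytes(data, "big") % modulus

# Hypothetical sample keys, encoded as EBCDIC (code page 037).
words = ["hash", "modulus", "prime", "cluster",
         "bucket", "uniform", "digit", "thread"]
keys = [w.encode("cp037") for w in words]

# With modulus 2**8 the hash is exactly the final byte of the key,
# so every other byte of the key is ignored.
last_byte_hashes = [mod_hash(k, 2**8) for k in keys]

# Keys ending in the same letter ("bucket", "digit") land in the same bucket.
buckets_256 = set(last_byte_hashes)
print(sorted(hex(b) for b in buckets_256))
```

All of the occupied buckets fall in the narrow range that
cp037 assigns to lowercase letters, which is the clustering
described above.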

OTOH, if the input data are base-37 numbers, a modulus-37
hash merely returns the last digit; again a bad choice.
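
The same point as a small Python sketch.  The digit list and
the helper from_digits are arbitrary, chosen only to
illustrate:

```python
def from_digits(digits, base):
    """Build an integer from most-significant-first digits in the given base."""
    value = 0
    for d in digits:
        value = value * base + d
    return value

# An arbitrary five-digit base-37 number (each digit is in 0..36).
digits = [5, 0, 36, 12, 7]
n = from_digits(digits, 37)

# Every term except the last is a multiple of 37, so only the
# final digit survives the reduction.
print(n % 37, digits[-1])
```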

But computer scientists have a cultural bias toward base
2, not any other prime such as 37, just as number theorists
have some cultural bias toward base 10 (evident, at least,
in recreational essays).

-- gil
