On 2 Feb 2010, at 03:05, Richard O'Keefe wrote:

A simple hash-function for strings is to simply exclusive-or the bytes and then reduce modulo a prime number,

Simply exclusive-oring the bytes will give you at most 256 distinct
results.  (For an ASCII source, 128 distinct results.)  After that,
there hardly seems to be any point in reduction modulo a prime.

Right - I just gave an example of how simple hash functions may be created. The original question deals with Int32s, not strings.

As already mentioned before, there are more advance libraries here
  http://en.wikipedia.org/wiki/Perfect_hash_function
but they are not written in Haskell. So you have to rewrite them or use the FFI.

This approach can't tell a CAT from an ACT or a DOG from a GOD, which
is another strike against it. (It also can't tell a TITTLE from a TILE,
or a BOTTLE from a BOLE, for obvious reasons.)

The hash function just tries to produce random lookup values, which are used to flatten the average depth of the lookup table at the entries of the array. So there is a tradeoff between a fast hash function, and one that does a good job.

  Hans


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to