AUGER Cédric <> writes:

> More seriously, hashtables are here to store values, so you don't want
> want to have a hash value bigger than (2^30)-1.
> In fact even such a value is too big for that purpouse IMHO, since on a
> 4Gio RAM computer, it will take half it to store the table, assuming
> each entry only takes 4 octets, that is it would be ok only for storing
> just the structure of the table (each entry has to be a pointer),
> so that in practice, either your table is ridiculously big to store
> too few values either it is relevant to have such a size and in this
> case your computer will spend all its time to swap since you won't
> have enough RAM.

But I have 128GB ram. :) That allows for a hashtable of size 2^33
entries pointing at simple values (e.g. ints) without swapping. And ram
sizes get bigger and bigger.

[Doesn't a Hashtbl of ints store the ints directly? That would allow
2^34 entries.]

>  Clearly you cannot a bijection from 63 bits
> integers to 31 bits integers; so there are a lot of collisions. I do
> not have a 64 arch, but I guesse that if you hash "2^48+168" you will
> get "168". I guess such a function is ways to be ideal, since when
> playing with bitvectors having two integers which differs by only one
> bit is very common, but at least it is a fast function, and has a good
> repartition (you have 2^32 collisions exactly per bucket).

Ignoring the upper bits makes for a verry verry poor hash function. Also
means it is 100% trivial to create a DOS by tuneing your input values to
only differ in the upper bits.


Caml-list mailing list.  Subscription management and archives:
Beginner's list:
Bug reports:

Reply via email to