AUGER Cédric <cedric.au...@lri.fr> writes: > More seriously, hashtables are here to store values, so you don't want > want to have a hash value bigger than (2^30)-1. > > In fact even such a value is too big for that purpouse IMHO, since on a > 4Gio RAM computer, it will take half it to store the table, assuming > each entry only takes 4 octets, that is it would be ok only for storing > just the structure of the table (each entry has to be a pointer), > so that in practice, either your table is ridiculously big to store > too few values either it is relevant to have such a size and in this > case your computer will spend all its time to swap since you won't > have enough RAM.
But I have 128GB ram. :) That allows for a hashtable of size 2^33 entries pointing at simple values (e.g. ints) without swapping. And ram sizes get bigger and bigger. [Doesn't a Hashtbl of ints store the ints directly? That would allow 2^34 entries.] > Clearly you cannot a bijection from 63 bits > integers to 31 bits integers; so there are a lot of collisions. I do > not have a 64 arch, but I guesse that if you hash "2^48+168" you will > get "168". I guess such a function is ways to be ideal, since when > playing with bitvectors having two integers which differs by only one > bit is very common, but at least it is a fast function, and has a good > repartition (you have 2^32 collisions exactly per bucket). Ignoring the upper bits makes for a verry verry poor hash function. Also means it is 100% trivial to create a DOS by tuneing your input values to only differ in the upper bits. MfG Goswin -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs