I think it is a well known bug of the hash function: let list = ref [];; for i = 1 to 30 do list := i :: !list; Printf.printf "%d " (Hashtbl.hash !list) done;;
1 65601 8392706 797307010 578301955 797307010 8392706 65601 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 As you can see, long lists all have the same hash value. It is due to the fact that the hash function does a depth-first traversal of the datastructure, starting with the last field, instead of a breadth-first traversal, and, as it only visits 10 nodes, they are always the same ones for all lists of length > 10. Thus, you should consider using your own hash function, probably only considering the ints in the list and its length. Then, use the functor in the Hashtbl module. --Fabrice On 07/27/2011 11:07 PM, Toby Kelsey wrote: > I have written two versions of a small program, one using an (int list) data > structure and the other an ((int*int) list); and I tested using Hashtbl, Set > and (Jean-Christophe Filliatre's) Trie to cache these elements in each > version. > The relative run times of the programs turns out to be: > > Hashtbl Set Trie > (int list) 1 2.1 2.0 > ((int*int) list) 34.7 2.6 2.1 > > > There is a slight inherent speed difference between the 2 versions but the > major effect is the cache type (in fact the caching difference must be larger > than these ratios due to non-cache computations). I expected Hashtbl might be > a bit slower than more specialised data structures, but the large speed > difference with different data structures was unexpected. Presumably the > slow-down is due to excessive hash collisions. > > I had expected that the generic Hashtbl would be written to give adequate > speed for all types of data and when I look at OCaml code on the web Hashtbl > is usually used, so it seems most OCaml programmers believe the standard > Hashtbl is a reasonable choice for most data types as well. > > I haven't tested the Batteries or OCaml-Core hash tables so these may be more > consistent, but if not, my question is can you predict how well Hashtbl will > work for different types of data and so what to use it with, or it is just > not reliable enough for general-purpose caching/memoization? > > If anyone wants to look at the code, the (int list) Hashtbl version is at > <http://rosettacode.org/wiki/Sokoban#OCaml> and the other versions are in > <http://pastebin.com/hGn1AL9L> temporarily. Apart from the caching and the > int/int pair changes they should be identical. > > Toby > -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
<<attachment: fabrice_le_fessant.vcf>>
