Am Montag, den 19.09.2011, 13:09 -0700 schrieb Ian Zimmerman: > I need a somewhat large (thousands) set of strings, created once during > startup and never modified after. What is a better choice, a (string, > unit) Hashtbl.t or the Set module? If the Set module still uses trees > as it did when I was young :-), access will be logarithmic versus > constant for Hashtbl. But on the other hand a hash function must > examine all of every string while the comparison of 2 strings stops at > the first nonmatching character. > > I am thinking about the time to build the set as well as the probing > time.
Nothing has changed so far. Besides naive Hashtbl and Set you have at least two more options: Hashtbl with self-defined hash function (i.e. one that stops after the n-th char), and Set with a hash-based comparison function (i.e. compare first the hashes of the strings, and only if the hashes are equal, compare the strings). In my experience, for a large constant set a Hashtbl usually works best. Set is usually only better when you can profit from functional updates. Gerd > > -- > Ian Zimmerman > gpg public key: 1024D/C6FF61AD > fingerprint: 66DC D68F 5C1B 4D71 2EE5 BD03 8A00 786C C6FF 61AD > Rule 420: All persons more than eight miles high to leave the court. > -- ------------------------------------------------------------ Gerd Stolpmann, Darmstadt, Germany [email protected] Creator of GODI and camlcity.org. Contact details: http://www.camlcity.org/contact.html Company homepage: http://www.gerd-stolpmann.de *** Searching for new projects! Need consulting for system *** programming in Ocaml? Gerd Stolpmann can help you. ------------------------------------------------------------ -- Caml-list mailing list. Subscription management and archives: https://sympa-roc.inria.fr/wws/info/caml-list Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
