Am Montag, den 19.09.2011, 13:09 -0700 schrieb Ian Zimmerman:
> I need a somewhat large (thousands) set of strings, created once during
> startup and never modified after.  What is a better choice, a (string,
> unit) Hashtbl.t or the Set module?  If the Set module still uses trees
> as it did when I was young :-), access will be logarithmic versus
> constant for Hashtbl.  But on the other hand a hash function must
> examine all of every string while the comparison of 2 strings stops at
> the first nonmatching character.
> 
> I am thinking about the time to build the set as well as the probing
> time.

Nothing has changed so far. Besides naive Hashtbl and Set you have at
least two more options: Hashtbl with self-defined hash function (i.e.
one that stops after the n-th char), and Set with a hash-based
comparison function (i.e. compare first the hashes of the strings, and
only if the hashes are equal, compare the strings). 

In my experience, for a large constant set a Hashtbl usually works best.
Set is usually only better when you can profit from functional updates.

Gerd

> 
> -- 
> Ian Zimmerman
> gpg public key: 1024D/C6FF61AD
> fingerprint: 66DC D68F 5C1B 4D71 2EE5  BD03 8A00 786C C6FF 61AD
> Rule 420: All persons more than eight miles high to leave the court.
> 

-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    [email protected]
Creator of GODI and camlcity.org.
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
*** Searching for new projects! Need consulting for system
*** programming in Ocaml? Gerd Stolpmann can help you.
------------------------------------------------------------


-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Reply via email to