Romain Bardou wrote:
> On 13/03/2012 14:23, Gerd Stolpmann wrote:
> >
> >> The best compromise to me is to leave the default for Hashtbl, but
> >> properly document this aspect in the manual (with a succinct explanation
> >> and one relevant pointer). That way:
> >> - you don't break compatibility
> >> - you keep default reproducibility (which is a real feature)
> >> - you teach beginners like myself on tough aspects related to the use
> >> of a datastructure in some frequent use cases.
> >
> > Basically I like the idea of "teaching" users this way. The typical
> > user will understand the impact, and act accordingly. Nevertheless, I
> > would like it if it would be made as easy as possible to provide good
> > seeds if required. The Random module is definitely not good enough
> > (e.g. if you know when the program was started, as with a CGI, and the
> > CGI reveals information it had better not, such as the pid, the Random
> > seed is made from fewer than 10 unpredictable bits, and on some systems
> > even 0 bits).
> >
> > The ideal would be to guide the user to the decision whether
> > protection is necessary, and if the answer is yes, to give the
> > instructions how to do it (and provide all means for it, of course).
> 
> This teaching idea sounds great indeed, but on the other hand, where do
> we draw the line? If we push this reasoning too far, we could remove
> typing altogether and just tell the programmer to be careful. What is the
> difference here? Is a potential DoS attack "less bad" than a seg fault?
> 
> So although the idea of teaching the programmer through the documentation
> makes sense, I would put it the other way around: make the safer behavior
> the default, and give debugging tools with proper warnings. Here the tool
> is a "set_seed" function and the warning is in its documentation: "using
> the same seed every time can lead to DoS attacks".

+1. Surely in projects where repeatability is important, the change in
behaviour to randomly seeded tables would be quickly noticed through failing
unit tests and so on (and could be quickly fixed, if the appropriate
"set_seed" or whatever is there). Repeatability seems the more niche use of
a hash table, IMHO, even if it's one relied on by some of OCaml's bigger
players! One could even imagine arranging things so that programs linked
normally use a randomly seeded hashtable and programs *linked* with -g use a
fixed seed, for debugging (i.e. the current behaviour), again with suitable
documentation explaining why you don't put debug builds of software on live
web servers...
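
To make the trade-off concrete, here is a minimal sketch of a per-table
opt-in. It assumes a standard library that offers one, as OCaml 4.00 and
later do through the optional ~random argument to Hashtbl.create; the
"set_seed" name used in this thread is hypothetical and is not used here.
The names fixed, seeded, fill and sorted_bindings are made up for the
example.

```ocaml
(* Default table: deterministic hashing, so the internal layout is
   reproducible from one run to the next (the "fixed seed" behaviour). *)
let fixed : (string, int) Hashtbl.t = Hashtbl.create 16

(* Opt-in table: ~random:true picks a seed when the table is created.
   Assumes OCaml >= 4.00. *)
let seeded : (string, int) Hashtbl.t = Hashtbl.create ~random:true 16

let keys = [ "alpha"; "beta"; "gamma"; "delta" ]

(* Bind each key to its position in the list. *)
let fill tbl = List.iteri (fun i k -> Hashtbl.replace tbl k i) keys

(* Collect bindings in sorted order, so that callers (and unit tests)
   never depend on bucket order. *)
let sorted_bindings tbl =
  List.sort compare (Hashtbl.fold (fun k v acc -> (k, v) :: acc) tbl [])

let () =
  fill fixed;
  fill seeded;
  (* The contents are identical; only the internal layout may differ. *)
  assert (sorted_bindings fixed = sorted_bindings seeded);
  print_endline "same bindings"
```

Tests that compare sorted bindings like this, rather than raw iteration
order, should keep passing whichever default is chosen. The same library
versions also provide Hashtbl.randomize () to switch the default for all
tables created afterwards.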


David 



-- 
Caml-list mailing list.  Subscription management and archives:
https://sympa-roc.inria.fr/wws/info/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs
