[Caml-list] Hashtbl and security

2011-12-30 Thread Gerd Stolpmann
Hi, there was recently a security alert for web services that use hash tables to store web form parameters sent via POST (so that millions of such parameters can be sent in a single request). It is possible to keep the web service busy for hours with such a DoS (denial of service) attack. The

Re: [Caml-list] Hashtbl and security

2011-12-30 Thread Yaron Minsky
For just this reason, the hashtables in Core have been reimplemented to use an AVL tree in the buckets. That way, even when you have pathological collisions, you degrade gracefully to O(log n) per operation, instead of O(n), where n is the number of keys in the hashtable. y On Fri, Dec 30, 2011

Re: [Caml-list] Hashtbl and security

2011-12-30 Thread Xavier Leroy
On 12/30/2011 05:44 PM, Gerd Stolpmann wrote: 1) Avoid hash tables in contexts where security is relevant. The alternative is Set (actually a balanced binary tree), which does not show this problem. Highly recommended. Nothing beats guaranteed O(log n) operations. 2) Use cryptographically

Re: [Caml-list] Hashtbl and security

2011-12-30 Thread Edgar Friendly
Not a problem, other than memory waste (which doesn't matter the key). Hashtbl.add prepends to the front of the list, so it'll be fast. And Hashtbl.find will match the head of the list, and return quickly too. Hashtbl.find_all will have to go through the whole list, but that's expected. E. On

RE: [Caml-list] Hashtbl and security

2011-12-30 Thread David Allsopp
Yaron Minsky wrote: For just this reason, the hashtables in Core have been reimplemented to use an AVL tree in the buckets.  That way, even when you have pathological collisions, you degrade gracefully to O(log n) per operation, instead of O(n), where n is the number of keys in the

Re: [Caml-list] Hashtbl and security

2011-12-30 Thread Yaron Minsky
It's not clever in that way. It does try to do a good job of keeping the memory impact of the tree low, but you maintain O(1) by having a low load factor, and therefore trees of constant size. You can take a look at the code here:

Re: [Caml-list] Hashtbl and security

2011-12-30 Thread Gerd Stolpmann
Am Freitag, den 30.12.2011, 18:06 +0100 schrieb Xavier Leroy: 3) Use randomized hash tables. The trick here is that there is not a single hash function h anymore, but a family h(1)...h(n). When the hash table is created, one of the functions is picked randomly. This makes it impossible to

[Caml-list] working %.pp.ml target with ocamfind/ocamlbuild

2011-12-30 Thread Anil Madhavapeddy
There's a very useful %.pp.ml target in OCamlbuild that runs the source through camlp4 and outputs the result. This doesn't work when ocamlfind is used, as it includes the -pp flags within the ocamlfind invocation. I took a look at adding support for this into ocamlbuild (when -use-ocamlfind

[Caml-list] Understanding usage by the runtime

2011-12-30 Thread orbitz
I am running a fork of mfp's ocamlmq. I am trying to track down an apparent memory leak, after a week of heavy usage the process is using up 2GB of RAM, and after looking in all the obvious places (all of the containers tracked in the state object in the mq appear to be empty or steady in

Re: [Caml-list] Hashtbl and security

2011-12-30 Thread oliver
On Fri, Dec 30, 2011 at 06:06:26PM +0100, Xavier Leroy wrote: On 12/30/2011 05:44 PM, Gerd Stolpmann wrote: 1) Avoid hash tables in contexts where security is relevant. The alternative is Set (actually a balanced binary tree), which does not show this problem. Highly recommended.

Re: [Caml-list] Hashtbl and security

2011-12-30 Thread oliver
On Sat, Dec 31, 2011 at 01:57:21AM +0100, oliver wrote: On Fri, Dec 30, 2011 at 06:06:26PM +0100, Xavier Leroy wrote: On 12/30/2011 05:44 PM, Gerd Stolpmann wrote: 1) Avoid hash tables in contexts where security is relevant. The alternative is Set (actually a balanced binary tree),

Re: [Caml-list] Hashtbl and security

2011-12-30 Thread oliver
On Fri, Dec 30, 2011 at 06:40:30PM +0100, ri...@happyleptic.org wrote: I will probably tell something very stupid, but HTML specs do not prevent a client to post 1M values with the same name, so whatever your hash function you cannot do much, can you? [...] That's a feature. The simplest