Branko Čibej <br...@apache.org> writes: > On 21.03.2012 10:29, Philip Martin wrote: >> In passing I note that the way APR "randomises" the hash is not all >> that random. It just adds a fixed prefix to the keys passed to the >> default hash function. I thought the point of the APR change was to >> make it harder for the user to predict hash collisions and I do wonder >> whether this has been achieved. If two strings S1 and S2 collide then >> does adding a constant prefix stop the collision? > > Yes, it does; however, since the prefix is constant per hash table, it > won't change the total number of collisions. > > The point of the randomization is that collisions aren't predictable > amongst different hash table instances, which makes collision-based DoS > attacks a bit harder. Which was the whole point of the exercise.
I'm surprised, are you sure? If I put N elements into the hash and iterate over then using apr_hash_next they come out in some order: 0, 1, 2 ... N-1 Changing the seed/prefix changes the order to something like: K, K+1 ... N-1, 0 ... K-1 So if I have 0, 1 ... X, Y ... N-1 where X and Y collide I would expect changing the seed to give: K, K+1 ... X, Y ... N-1, 0 ... K-1 Is that going to remove the collision? > Anyway, if we're aiming for stable ordering in the output, and the data > are currently stored in a hash table, then the only sane option is to > sort the data, not make the hash function more predictable. What some people care about is stability from one run to the next when using the same executable, other people want stability from one executable to the next and some want predictable order. -- uberSVN: Apache Subversion Made Easy http://www.uberSVN.com