Re: random sort of svn status and svn diff

Philip Martin Wed, 21 Mar 2012 05:40:50 -0700

Branko Čibej <[email protected]> writes:

> On 21.03.2012 10:29, Philip Martin wrote:
>> In passing I note that the way APR "randomises" the hash is not all
>> that random. It just adds a fixed prefix to the keys passed to the
>> default hash function. I thought the point of the APR change was to
>> make it harder for the user to predict hash collisions and I do wonder
>> whether this has been achieved. If two strings S1 and S2 collide then
>> does adding a constant prefix stop the collision?
>
> Yes, it does; however, since the prefix is constant per hash table, it
> won't change the total number of collisions.
>
> The point of the randomization is that collisions aren't predictable
> amongst different hash table instances, which makes collision-based DoS
> attacks a bit harder. Which was the whole point of the exercise.


I'm surprised, are you sure?  If I put N elements into the hash and
iterate over then using apr_hash_next they come out in some order:

  0, 1, 2 ... N-1

Changing the seed/prefix changes the order to something like:

   K, K+1 ... N-1, 0 ... K-1

So if I have

   0, 1 ... X, Y ... N-1

where X and Y collide I would expect changing the seed to give:

   K, K+1 ... X, Y ... N-1, 0 ... K-1

Is that going to remove the collision?

> Anyway, if we're aiming for stable ordering in the output, and the data
> are currently stored in a hash table, then the only sane option is to
> sort the data, not make the hash function more predictable.

What some people care about is stability from one run to the next when
using the same executable, other people want stability from one
executable to the next and some want predictable order.

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com

Re: random sort of svn status and svn diff

Reply via email to