Hi Marc and Thomas,

I followed your discussion with great interest. I agree that Thomas's very
light proposal is worth putting in place, since it has almost no negative
impact and only benefits. I think it is also possible to mitigate the object
issue with something similar (checking the integrity of what we get, to at
least detect an issue), but that's not perfect of course.

That said, I would like to point you to this interesting question on
StackOverflow
(https://stackoverflow.com/questions/22029012/probability-of-64bit-hash-code-collisions)
and remind you that, based on the Birthday Paradox, the release of 4.x raised
our worrying threshold of documents/objects from 65535 to more than 4
billion… and it took a while (4 versions of XWiki) before we felt strongly
that we needed to raise it. So, while the worrying threshold before 4.x was
really low, actual collisions were already rare.

In my own experience, the risk before 4.x was really high with generated
names, much higher than with names used by real users. When I was hit by that
issue, I remember feeling really bad about it. This is also probably why you
have raised this thread. The previous hash was too small and also had a
questionable distribution.

The MD5 algorithm, like many crypto hashes, is particularly well suited to
providing a good distribution
(http://michiel.buddingh.eu/distribution-of-hash-values). Truncating it to 64
bits may degrade this, but I doubt it would be significant for us. So,
personally, I feel really comfortable with the current implementation, and I
think you can sleep in peace as well.
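
For readers who want to see what "MD5 cut at 64 bits" looks like, here is a
rough sketch. It is an assumption-laden illustration, not the XWiki code:
keeping the first 8 bytes, reading them as a signed big-endian integer, and
hashing the UTF-8 form of the name are all my choices, and the page names are
hypothetical.

```python
import hashlib


def truncated_md5_id(key: str) -> int:
    """Return a signed 64-bit id built from the first 8 bytes of MD5(key).

    Assumption: which 8 bytes are kept and how the key is serialized are
    illustrative choices, not necessarily what XWiki does.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()  # 16 bytes
    # Keep 8 of the 16 bytes and interpret them as a signed 64-bit integer.
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


if __name__ == "__main__":
    # Hypothetical page names, including two that differ by one character.
    for name in ["Main.WebHome", "Sandbox.TestPage1", "Sandbox.TestPage2"]:
        print(name, truncated_md5_id(name))
```

Near-identical generated names should end up with unrelated ids, which is the
property that matters for the generated-name case mentioned above.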

Just my thoughts on not raising fears when they are no longer really justified.
Regards,

--
Denis Gervalle
SOFTEC sa - CEO

On 7 Feb 2018, 16:10 +0100, Denis Gervalle <denis.gerva...@softec.lu>, wrote:
> [...]
