> On Feb 11, 2015, at 7:23 AM, Richard Hipp <d...@sqlite.org> wrote:
> 
> On 2/11/15, j. van den hoff <veedeeh...@googlemail.com> wrote:
>> 
>> whatever the reason, the netbsd example (a worst case scenario, really)
>> would suggest to chose 12 instead of 10 as the future default length
>> to avoid collisions these next some hundred years.
> 
> Maybe the default prefix lengths should auto-adjust depending on the
> number of artifacts in the repository?

That could work.

If you rearrange the square approximation formula from the Wikipedia article 
(p ≈ n^2 / (2m)) to solve for m, then take the base-2 log of that to get bits, 
then divide by 4 (bits per nibble) to get hex digits, you get:

   d = log2(n^2 / (2p)) / 4

That is to say, it gives you the number of hex digits d required to achieve a 
given chance of collision p in a hash set of size n.

So for my 97,000-hash repo, we need 13 digits to approach p=1e-6 (the formula 
gives d of about 13.02).
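
For anyone who wants to check the arithmetic, here is a rough Python sketch of 
the formula. The function names are made up for illustration and are not 
anything in Fossil; it also assumes the square approximation holds, which is 
only reasonable while p is small.

```python
import math

def hex_digits_needed(n, p=1e-6):
    """Fractional number of hex digits d such that n random hashes
    collide with probability roughly p (square approximation)."""
    if n < 2:
        return 0.0  # fewer than two hashes cannot collide
    m = n * n / (2.0 * p)      # required size of the hash space
    return math.log2(m) / 4.0  # bits, then 4 bits per hex digit

def collision_probability(n, d):
    """Approximate chance of any collision among n hashes
    truncated to d hex digits (square approximation)."""
    return n * n / (2.0 * 16.0 ** d)

if __name__ == "__main__":
    n = 97000
    d = hex_digits_needed(n, 1e-6)
    print("fractional digits:", d)
    print("rounded up:", math.ceil(d))
    print("p at 13 digits:", collision_probability(n, 13))
    print("p at 14 digits:", collision_probability(n, 14))
```

Rounding to the nearest digit gives 13, which lands just above 1e-6; rounding 
up gives 14, which is the first length that actually gets below it. Which 
rounding to use would be a design choice for the auto-adjust logic.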

I suggest making the p value configurable, with a reasonable default.  1e-6 
sounds good to me.
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
