Oh yeah, it's funny, I knew a choice of M = 13 gives you avg 8kb blocks, 
which is why it was chosen. But I didn't put it together and realize it 
means another algorithm choosing 13 isn't that much of a coincidence.

Thanks for the pointer to the rolling hash article! Somehow I hadn't come 
across it.

On Friday, July 31, 2020 at 4:43:48 PM UTC-4 ian wrote:

> Quoting Nick S (2020-07-31 16:03:00)
>
> > Is it actually a Rabin fingerprint?
>
> I haven't looked too carefully at the concrete implementation, it could
> be, but there are a number of different choices for a rolling hash
> function here:
>
> https://en.wikipedia.org/wiki/Rolling_hash
>
> > I noticed the number of bottom bits (M = 13) is the same as what they
> > give in the [1]Wikipedia article.
>
> Note that the choice of M is entirely orthogonal to the choice of hash
> function. M determines the average block size; in particular the
> probability of a split at a given point is 1/(2^M), so the average block
> size will be 2^M bytes. So for M = 13 you get 8KiB blocks on average.
>
> -Ian
>

-- 
You received this message because you are subscribed to the Google Groups 
"Perkeep" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/perkeep/79624c91-2494-4177-b9eb-0622a6203907n%40googlegroups.com.

Reply via email to