Oh yeah, it's funny, I knew a choice of M = 13 gives you avg 8kb blocks, which is why it was chosen. But I didn't put it together and realize it means another algorithm choosing 13 isn't that much of a coincidence.
Thanks for the pointer to the rolling hash article! Somehow I hadn't come across it. On Friday, July 31, 2020 at 4:43:48 PM UTC-4 ian wrote: > Quoting Nick S (2020-07-31 16:03:00) > > > Is it actually a Rabin fingerprint? > > I haven't looked too carefully at the concrete implementation, it could > be, but there are a number of different choices for a rolling hash > function here: > > https://en.wikipedia.org/wiki/Rolling_hash > > > I noticed the number of bottom bits (M = 13) is the same as what they > > give in the [1]Wikipedia article. > > Note that the choice of M is entirely orthogonal to the choice of hash > function. M determines the average block size; in particular the > probability of a split at a given point is 1/(2^M), so the average block > size will be 2^M bytes. So for M = 13 you get 8KiB blocks on average. > > -Ian > -- You received this message because you are subscribed to the Google Groups "Perkeep" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/perkeep/79624c91-2494-4177-b9eb-0622a6203907n%40googlegroups.com.
