> On Feb 10, 2015, at 4:34 PM, Richard Hipp <d...@sqlite.org> wrote:
> 
> On 2/10/15, Warren Young <w...@etr-usa.com> wrote:
>> 
>> Seems like a risky gamble to me.
> 
> Risk?  It's a low-probability of a minor ambiguity in the display

Further up the thread people were talking about parsing these numbers out of 
fossil output, then presumably acting on them in some scripted fashion.

So, the risk is basically one of ambiguous commands.  Does every fossil command 
that will accept a hash prefix do sensible things when you give it a hash that 
matches two different artifacts?

For example, will this:

   fossil tag add MYTAG 01234ABCDE

tag all objects with hashes beginning with 01234ABCDE, or will it complain?

I can see both behaviors causing problems for a script that just got the 
prefix back from fossil, because the script presumes it to be an unambiguous 
reference to a specific artifact in the repository.

Before I saw this thread, I assumed Fossil was monitoring the set of hashes it 
had generated and was writing out only as many characters as necessary to give 
unique values.  The fact that it’s using a hard-coded length makes me somewhat 
uneasy.

Consider the famous case of the NetBSD Fossil repo.  There are half a million 
checkins.  Working backwards from that probability table, it looks like we’d 
expect about a 20% chance of a collision when using 10-character hash prefixes 
against that particular repo.

If you go with 12 characters, the probability drops to less than 0.1%.
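
For anyone who wants to check figures like these rather than read them off a 
table, here is a minimal sketch of the usual birthday-bound approximation.  It 
assumes hash prefixes are uniformly random hex strings; the function name and 
the 500,000 count are illustrative only, not anything Fossil provides.  Under 
those assumptions the 10-character result comes out nearer 11% than 20%, and 
the 12-character result stays well under 0.1%.  A real repository holds more 
artifacts than checkins, which pushes the number up.

```python
import math


def collision_probability(n_artifacts: int, hex_chars: int) -> float:
    """Approximate chance that at least two of n random hashes share a prefix.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2 * N)),
    where N = 16**hex_chars is the number of possible prefixes.
    """
    space = 16 ** hex_chars
    pairs = n_artifacts * (n_artifacts - 1) / 2
    # -expm1(-x) is 1 - exp(-x), computed accurately for small x
    return -math.expm1(-pairs / space)


if __name__ == "__main__":
    # 500_000 is the checkin count mentioned above, used as a stand-in
    for chars in (10, 12, 16):
        p = collision_probability(500_000, chars)
        print(f"{chars} hex chars: {p:.4%}")
```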

If you monitor the list of generated hashes and emit only unambiguous values, 
the probability of ambiguous commands drops to 0.
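
To make that last idea concrete, here is a rough sketch of computing the 
shortest unambiguous prefix for each hash in a known set.  This is not how 
Fossil does anything; the function names and the minimum length of 4 are 
assumptions for illustration, and the input is assumed to be distinct 
full-length hex hashes of equal length.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def shortest_unique_prefixes(hashes, min_len=4):
    """Map each full hash to its shortest prefix that matches no other hash.

    After sorting, the hash sharing the longest prefix with any given hash
    is always one of its immediate neighbors, so only those are compared.
    """
    ordered = sorted({h.lower() for h in hashes})
    result = {}
    for i, h in enumerate(ordered):
        need = min_len
        if i > 0:
            need = max(need, common_prefix_len(h, ordered[i - 1]) + 1)
        if i + 1 < len(ordered):
            need = max(need, common_prefix_len(h, ordered[i + 1]) + 1)
        result[h] = h[:need]
    return result


if __name__ == "__main__":
    sample = ["01234abcde11", "01234abcde22", "fedcba987654"]
    for full, short in shortest_unique_prefixes(sample).items():
        print(short, "->", full)
```

Note that a prefix is only unambiguous against the hashes known when it was 
emitted; an artifact added later can still collide with a prefix printed 
earlier, so a script that stores prefixes for later use is not fully covered.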
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
