In the present hash-cache scheme, Leo will lose data if two external files have the same sha-1 hash: the file hashed later will *replace* the file hashed earlier. The question is, how likely is such a collision?
There are two conflicting points of view:

1. The number of all possible hashes is 16**40 == 2**160. This is a truly enormous number, much bigger than the total number of documents that will ever be written in all of human history, no matter how long that turns out to be :-)

2. The number of all possible documents is (almost) infinitely larger. For example, the number of all ascii files containing 1000 characters is approx 128**1000. Thus, there would be *lots* of collisions **if** all such files were hashed. There would be even more potential collisions in the set of all megabyte files. And so on.

Most discussions of sha-1 collisions focus on cryptanalysis attacks, which do not seem relevant here.

Can anyone resolve these conflicting points of view? It's important for Leo now, and may become even more important later. Offhand, I can think of no way to "recover" from an unexpected collision. I suspect, but do not know for sure, that collisions would create havoc in a git repository.

Can anyone say anything for sure on this topic?

Edward
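P.S. For what it's worth, here is a minimal back-of-the-envelope sketch of the standard "birthday bound", assuming sha-1 behaves like an ideal 160-bit hash on non-malicious inputs. The file counts in the loop are made up purely for illustration:

    # Birthday-bound estimate for accidental sha-1 collisions.
    # Assumption: sha-1 acts like an ideal 160-bit hash, so there
    # are N = 2**160 equally likely digests.
    import math

    N = 2 ** 160  # number of possible sha-1 digests

    def collision_probability(n):
        """Approximate probability that at least two of n distinct
        files share a digest, using the birthday approximation:
            p ~= 1 - exp(-n*(n-1) / (2*N))
        which for small p reduces to n**2 / (2*N).
        """
        return -math.expm1(-n * (n - 1) / (2 * N))

    if __name__ == '__main__':
        # Illustrative file counts, from a million to a quadrillion.
        for n in (10 ** 6, 10 ** 9, 10 ** 12, 10 ** 15):
            print('%g files -> collision probability %.3g'
                % (n, collision_probability(n)))

If this sketch is right, even 10**15 hashed files give an accidental-collision probability on the order of 10**-19, which suggests point 1 wins in practice: point 2 is correct that collisions must exist, but the chance of ever *encountering* one by accident is negligible.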
