Hi Warren, Thanks for replying!
On 10/29/2015 02:46 PM, Warren Young wrote:
> On Oct 28, 2015, at 6:37 PM, Eduard <[email protected]> wrote:
>>
>> I wish to discuss the issues surrounding the use of SHA1 in Fossil
>
> Have you read the prior discussions on this?
>
>   http://www.mail-archive.com/fossil-users%40lists.fossil-scm.org/msg18053.html
>   http://www.mail-archive.com/fossil-users%40lists.fossil-scm.org/msg05970.html
>   http://www.mail-archive.com/fossil-users%40lists.fossil-scm.org/msg21423.html

I had read 2/3 of them, yes. Thanks for the third one!

>> First I propose that the use of SHA1 in Fossil is a serious problem.
>
> The known attacks on SHA-1 are still computationally expensive, and will
> remain so for years. Not impossible, but still very difficult. We have time
> to move, if we need to.

I agree. I also believe that the best time to think about it is right now. The number of Fossil users will only increase with time (in fact I'm about to introduce four new people to Fossil), and so will the number of people potentially annoyed by a non-backwards compatible change in the specification.

> But also, and much more importantly, most of the attacks on SHA-1 only apply
> to standalone blob cases such as binary package validation, X.509 certificate
> signing, etc. In Fossil, most of the SHA-1 checksummed artifacts are chained
> in some way, so that you can only modify the leaves of branches.

And individual files (that are part of commits). That won't show up in the timeline.

>> If the attacker can intercept
>> communications between the server and a developer
>
> …then you did not run Fossil over TLS, like you should if MITM is a
> legitimate risk in your situation. :)
>
>> If the attacker is in control of the server
>
> …then he can serve you any content he likes, no matter how good your hash
> algorithm is.

True, but he shouldn't be able to convince me that ID "abcdef" corresponds to something other than the original artifact created with ID "abcdef".
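
A minimal sketch of the check being described here, in Python. The function names are hypothetical and this is not Fossil's actual code; it only illustrates that an ID derived from the content lets a client reject substituted content regardless of which server delivered it, as long as the hash itself resists collisions and preimages:

```python
import hashlib


def artifact_id(content: bytes) -> str:
    """Return the SHA-1 hex digest, used here as the artifact ID (assumption)."""
    return hashlib.sha1(content).hexdigest()


def verify_artifact(expected_id: str, content: bytes) -> bool:
    """Accept downloaded content only if it hashes to the ID we already trust."""
    return artifact_id(content) == expected_id.lower()


# The ID is learned out of band (e.g. from a signed email), not from the server.
genuine = b"int main(void) { return 0; }\n"
trusted_id = artifact_id(genuine)

# Content served by an untrusted mirror, modified in transit or at rest.
tampered = b"int main(void) { return 1; }\n"

print(verify_artifact(trusted_id, genuine))   # content matches the trusted ID
print(verify_artifact(trusted_id, tampered))  # substituted content is rejected
```

The guarantee is only as strong as the hash: if an attacker can produce different content with the same digest, this check passes for the malicious content too.
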
Again, I might know (through some other source, e.g. PGP-signed email) that artifact "abcdef" is genuine, and it shouldn't matter where I download it from. If artifact "abcdef" refers to "xyzzy", trusting the genuineness of "abcdef" should imply trusting that of "xyzzy". I also don't believe that the users and developers should have to trust the Fossil server (including mirrors) and its operator; I don't have to trust my Debian mirror to download packages (and their sources) from it. That would avoid happenings like the XcodeGhost incident.

> The correct solution here is something like TLS with certificate pinning, GPG
> signing, etc.

That's the thing, GPG signing covers the contents of the manifest, which itself refers to the files inside it only by their SHA1 hash. If someone substitutes a file with a malicious file that hashes the same, it won't change anything in the manifest and the GPG signature will remain valid.

>> The third solution is to change the Fossil specification to redefine the
>> artifact ID to be the concatenation of the SHA1 and BetterHash
>
> A fourth solution is to use Modular Crypt Format to declare the hash for each
> artifact, and for future Fossil versions to tolerate SHA-1 only in existing
> artifacts, accepting new ones using only known-good algorithms:
>
>   https://pythonhosted.org/passlib/modular_crypt_format.html
>
> This could be done without breaking the DB, simply because a 20-byte hash
> must be SHA-1, since even a 160-bit BetterHash will have the MCF wrapper on
> it, making it more than 20 bytes.
>
> The SQLite card format parser would have to be made more flexible, to make it
> understand that if it sees a leading dollar sign, the following hash can be
> variable-width.

That is a great (and extensible!) solution! There are a few issues though:

- Every artifact must be hashed by every known algorithm. The database size grows linearly with the number of hashing algorithms.
- There must be an additional mechanism for upgrading the older hash version artifacts. Consider a checkin manifest from 3 years ago. It is very likely that no new checkin/branch will ever refer to it directly, so nobody will ever refer to it by new-hash. Worse yet, it is likely nobody will ever refer to the files inside that checkin by new-hash. If a preimage attack on old-hash becomes possible (or even easy), one could mess with the artifacts that are only referred to using old-hash.

I don't believe the first issue will ever be a problem, though, since I personally don't think we'll ever need to go past BetterHash-512.

As for the second issue, one solution is to rehash all of the older artifacts using new-hash and rewrite all of the control artifacts in terms of new-hash (this operation is fully deterministic and can be verified independently). This won't play well at all with shunned content (since we can't recompute hashes on artifacts we don't have anymore), and will definitely do very badly if one tries to put back shunned content (since we've probably put in some sort of placeholder null value in the manifest). I don't know whether the adding-back shunned content part is really an issue; we only shun things when we want them truly gone forever. But there is still the annoying issue that if two people don't have the same shunning lists, they will end up with radically different new-hash artifact sets (one checkin will have a placeholder whereas the other one doesn't, and that will change the artifact IDs of all of the descendants). So I guess exactly one person should upgrade the hashes once per project (which I don't believe to be a really terrible limitation, especially since their work can be verified independently).
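
The rehash-and-rewrite step above can be sketched roughly as follows, in Python. Everything here is an assumption made for illustration: the manifest is reduced to a filename-to-ID mapping rather than Fossil's real card format, SHA-512 stands in for "BetterHash", and the `$sha512$` prefix is a made-up MCF-style tag. The sketch only aims to show why the operation is deterministic and where shunned content gets in the way:

```python
import hashlib
import re
from typing import Dict

# A bare 40-hex-character ID (20 bytes) is treated as legacy SHA-1.
LEGACY_SHA1 = re.compile(r"^[0-9a-f]{40}$")


def old_id(content: bytes) -> str:
    return hashlib.sha1(content).hexdigest()


def new_id(content: bytes) -> str:
    # SHA-512 stands in for "BetterHash"; the "$sha512$" tag is hypothetical.
    return "$sha512$" + hashlib.sha512(content).hexdigest()


def is_legacy(artifact_id: str) -> bool:
    return bool(LEGACY_SHA1.match(artifact_id))


def upgrade_manifest(files: Dict[str, str], store: Dict[str, bytes]) -> Dict[str, str]:
    """Rewrite every legacy file reference in terms of the new hash.

    `files` maps filename -> artifact ID; `store` maps artifact ID -> content.
    """
    upgraded = {}
    for name, ref in sorted(files.items()):
        if not is_legacy(ref):
            upgraded[name] = ref  # already tagged with a newer algorithm
            continue
        content = store.get(ref)
        if content is None:
            # Shunned or otherwise missing: the new hash cannot be recomputed.
            raise KeyError(f"cannot upgrade {name}: artifact {ref} is missing")
        if old_id(content) != ref:
            raise ValueError(f"stored content for {name} does not match {ref}")
        upgraded[name] = new_id(content)
    return upgraded


readme = b"hello\n"
store = {old_id(readme): readme}
manifest = {"README": old_id(readme)}
print(upgrade_manifest(manifest, store))
```

Two people running this over the same artifacts get the same output, which is what makes the upgrade independently verifiable. The `KeyError` branch is where differing shun lists would cause the results to diverge.
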
This also has the annoying side-effect of increasing the space taken up by control artifacts (since we're carrying both the old-hash and the new-hash versions), but I guess one could purge all of the old-hash control artifacts from the repository after a few years (once old-hash is no longer trusted at all).

PGP-clearsigned manifests would probably also need to be re-signed in a timely manner (in all likelihood the hash function PGP used when signing the manifest has also been deprecated). (This could be done after-the-fact using a signed tag.)

There is also the issue that (e.g. URL) references to old-hash artifacts will be broken. I'm not sure how I feel about that; one could say that they *should* be broken because we can no longer be certain about what they point to (assuming that we no longer trust old-hash's security). Intraproject references can always be fixed (in an automated manner), but interproject references will likely be much harder to upgrade. It might be highly useful to write a tool which scans a text file for artifact-referencing URLs and tries to resolve the hash-upgraded version automatically assuming that the referenced repository is available locally.

I'm not sure whether in this approach the new-hash control artifacts should explicitly list the old-hash artifacts as parents (or maybe as some new card type). It may be useful as a quick way to identify the old-hash corresponding control artifact (it may even resolve some ambiguity when verifying the transition from old-hash to new-hash).

Thoughts?

(Also are there any issues on any of the supported platforms with having dollar signs in filenames (or URLs)? Just a random thought.)

Best,
Eduard
_______________________________________________ fossil-users mailing list [email protected] http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users

