On Oct 29, 2015, at 3:40 PM, Eduard <[email protected]> wrote:
>
> On 10/29/2015 02:46 PM, Warren Young wrote:
>> On Oct 28, 2015, at 6:37 PM, Eduard wrote:
>>>
>>> I wish to discuss the issues surrounding the use of SHA1 in Fossil
>>
>> Have you read the prior discussions on this?
>>
>> http://www.mail-archive.com/fossil-users%40lists.fossil-scm.org/msg18053.html
>> http://www.mail-archive.com/fossil-users%40lists.fossil-scm.org/msg05970.html
>> http://www.mail-archive.com/fossil-users%40lists.fossil-scm.org/msg21423.html
>
> I had read 2/3 of them, yes. Thanks for the third one!
The third one’s the mother lode. Don’t be fooled by the mail-archive.com UI, which presents only 10 or so results at a time. That thread went on and on and on.

Hopefully we can avoid retreading some of the same ground in this one.

>> most of the attacks on SHA-1 only apply to standalone blob cases
>
> And individual files (that are part of commits). That won't show up in
> the timeline.

Do you mean newly-added files? They’re shown on the Check-in details screen, the most likely thing you’re going to click on on the Timeline page.

For a newly-added file to have an effect on a repo, it’ll probably also require modification to an existing file, such as the Makefile. (Exceptions are files included by wildcard.)

>>> If the attacker is in control of the server
>>
>> …then he can serve you any content he likes, no matter how good your hash
>> algorithm is.
>
> True, but he shouldn't be able to convince me that ID "abcdef"
> corresponds to something other than the original artifact created with
> ID "abcdef".

How are you going to know that the legitimate file has ID abcdef? Cross-reference to another repo? What if there is only one central repo?

If an evildoer has taken over the central server, they are just providing a pile of artifacts, and you are trusting that those artifacts are legitimate.

Granted, you can’t do such a swap to people with existing checkouts, since that will break the sync algorithm, but an evil Fossil instance could probably be made to detect whether it is being asked for a clean checkout or a sync update of an existing one.

> I might know (through some other source, e.g.
> PGP-signed email) that artifact "abcdef" is genuine, and it shouldn't
> matter where I download it from.

How many people will be doing such cross-checking?

Again I bring up the XcodeGhost example. People do foolish things in the name of expediency.

> I don't have to
> trust my Debian mirror to download packages (and their sources) from it.
You’re referring to the fact that DEBs are GPG-signed, I assume?

That works because the Debian gatekeepers can sign the packages on an assumed-secure box. (Such central package repos have been compromised in the past.) The distro includes a copy of the central source’s public key, so if the package signature doesn’t decrypt correctly, it isn’t legitimate.

Where can you put such a root of trust in the Fossil case? There is no central presumed-secure site with Fossil. Remember, you were just positing that the central repo’s server got rooted.

You also can’t solve it by having people with checkin bits submit a GPG public key to the repo along with their login creds and sign checkins, because those keys live on the same compromised server. The evildoer can just generate a new set of keys, re-sign the compromised artifacts, and store the new keys in the user table instead of the original keys.

It’s the problem with all PKIs: who do you trust?

> That would avoid happenings like the XcodeGhost incident.

Apple has a code-signing mechanism, too, and packages from Apple are always signed. But, the client-side checker (Gatekeeper) is not mandatory, and developers often turn it off entirely, since it gets in the way of developing software.

Plus, you can bypass Gatekeeper for $99: get a code signing cert from Apple and sign your evil packages with it. It’ll work until Apple catches you and revokes your cert. Almost no one checks *who* signed the package; all they know is that the OS let them install it when they double-clicked it.

> - Every artifact must be hashed by every known algorithm.

I’m assuming it's possible to change from one algorithm to another mid-stream, as long as the client knows all of the algorithms in use, and is told where the change points occur.

Do you know for a fact that you cannot do this?

> The database
> size grows linearly with the number of hashing algorithms.

If so, it’s only a handful of bytes per artifact, per algorithm.
The real cost would be the computation time.

> Consider a checkin manifest from 3 years ago. It is
> very likely that no new checkin/branch will ever refer to it directly,
> so nobody will ever refer to it by new-hash. Worse yet, it is likely
> nobody will ever refer to the files inside that checkin by new-hash. If
> a preimage attack on old-hash becomes possible (or even easy), one could
> mess with the artifacts that are only referred to using old-hash.

Yes, if you want old artifacts to be unassailable, you’d have to recompute all the hashes.

But, I think you’re not realizing that artifact chaining removes the attraction of replacing old artifacts. As I understand it, you can’t replace an artifact 10 checkins back from the tip of the branch without recomputing the 9 other hashes on the way back to the tip. Therefore, an attack that takes a year of CPU time to attack a leaf node takes 10 years to attack a node 10 checkins back from the tip.

This came up in that third-linked thread.

> I personally don't think we'll ever need to go past BetterHash-512.

I’m not sure if you’re saying that 512 bits will be enough forever, or that we already have the last hash algorithm we will ever need. History says either is a foolish prediction, and that the best hash algorithms remain state-of-the-art for only about a dozen years.

Maybe the sunsetting of Moore’s Law will break us out of that pattern. But if Fossil is going to go through the pain of replacing SHA-1, it should be done in a way that makes it easier to do again later.

> Thoughts?

tl;dr. >:)

As I said in my previous email, I don’t see a reason to work out the migration strategy before we work out the *ifs* and *whys*.

> (Also are there any issues on any of the supported platforms with having
> dollar signs in filenames (or URLs)? Just a random thought.)

Why do file names come into it? The MCF tag would be in one of the cards, which live in the DB.
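
To make the "tag lives in a card" idea concrete, here is a minimal Python sketch of reading a hash value that may or may not carry an MCF-style prefix. The `$algorithm$digest` layout, the fallback to SHA-1 for untagged values, and the function name `parse_artifact_id` are all assumptions made for illustration; none of this is Fossil's actual card syntax.

```python
import re

# Hypothetical tag layout: "$<algorithm>$<hex digest>".
# An untagged hex string is treated as a legacy SHA-1 ID (an assumption).
_TAGGED = re.compile(r"^\$([a-z0-9-]+)\$([0-9a-f]+)$")
_BARE_HEX = re.compile(r"^[0-9a-f]+$")


def parse_artifact_id(value):
    """Split a stored ID into (algorithm, hex_digest)."""
    tagged = _TAGGED.match(value)
    if tagged:
        return tagged.group(1), tagged.group(2)
    if _BARE_HEX.match(value):
        return "sha1", value
    raise ValueError(f"not a recognizable artifact ID: {value!r}")
```

In a scheme like this the dollar signs would only ever appear inside stored card values, never in a file name or URL.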
I don’t even see that it has to be reported in the UI, or accepted on the command line, since the chance of two algorithms having a conflicting hash is near zero.

Even if you do come across, say, a 10 hex digit prefix of two hashes that are the same under, say, SHA-1 and SHA-512, Fossil already knows how to stop and make you be specific about which one you mean, if it can’t see that one is obviously correct.

Therefore, command line usage will remain unchanged in this scheme: “fossil up EA5D538D23A7C” will most often uniquely identify one artifact, or none, not 2+.
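
The prefix argument above can be sketched in a few lines of Python. This is only an illustration of the idea, not Fossil's lookup code: `resolve_prefix` and the in-memory set of IDs are hypothetical stand-ins for the real artifact table.

```python
import hashlib


def resolve_prefix(prefix, artifact_ids):
    """Return the single ID starting with prefix, or raise LookupError."""
    prefix = prefix.lower()
    matches = sorted(a for a in artifact_ids if a.startswith(prefix))
    if not matches:
        raise LookupError(f"no artifact matches {prefix!r}")
    if len(matches) > 1:
        # Stop and make the user be more specific.
        raise LookupError(
            f"ambiguous prefix {prefix!r}: {len(matches)} candidates"
        )
    return matches[0]


# A mixed pool: one ID from each algorithm, stored side by side.
sha1_id = hashlib.sha1(b"file one").hexdigest()
sha512_id = hashlib.sha512(b"file two").hexdigest()
ids = {sha1_id, sha512_id}
```

Because the lookup only compares hex prefixes, it does not need to know which algorithm produced an ID; `resolve_prefix(sha1_id[:12], ids)` and `resolve_prefix(sha512_id[:12], ids)` each pick out their own artifact.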

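
The artifact chaining point made earlier in this message can also be illustrated with a toy model. The sketch below assumes each check-in ID is a hash over the parent's ID plus the check-in's own content, which is a simplification of how a manifest names its parent; `checkin_id` and `chain_ids` are hypothetical names. It shows only the dependency structure, and says nothing about how much work a preimage attack would take.

```python
import hashlib


def checkin_id(content: bytes, parent_id: str) -> str:
    """Toy check-in ID: hash of the parent ID plus this check-in's content."""
    return hashlib.sha1(parent_id.encode() + b"\n" + content).hexdigest()


def chain_ids(contents):
    """Compute IDs for a linear branch, oldest check-in first."""
    ids, parent = [], ""
    for content in contents:
        parent = checkin_id(content, parent)
        ids.append(parent)
    return ids


contents = [b"checkin %d" % i for i in range(11)]
original = chain_ids(contents)

# Alter the check-in 10 steps back from the tip and recompute the branch.
tampered_contents = [b"evil"] + contents[1:]
tampered = chain_ids(tampered_contents)

# Count how many IDs differ between the two branches.
changed = sum(a != b for a, b in zip(original, tampered))
```

In this model, changing the oldest check-in changes its own ID and the ID of every check-in after it, while changing only the tip leaves every other ID alone.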
