> Getting back towards topic, the hash function employed by Git is showing
> signs of bitrot, which, given people's desire to introduce malware
> backdoors and legal backdoors into Linux, could well become a problem in
> the very near future.
>
> "James A. Donald" <jam...@echeque.com>
> I believe attacks on Git's use of SHA-1 would require second pre-image
> attacks, and I don't think anyone has demonstrated such a thing for
> SHA-1 at this point. None the less, I agree that it would be better if
> Git eventually used better hash functions. Attacks only get better with
> time, and SHA-1 is certainly creaking.
>
> Emphasis on "eventually", however. This is a "as soon as convenient, not
> as soon as possible" sort of situation -- more like within a year than
> within a week.
>
> Yet another reason why you always should make the crypto algorithms you
> use pluggable in any system -- you *will* have to replace them some day.
> --
> Perry E. Metzger pe...@piermont.com

> Of course, I still believe in hash algorithm agility: regardless of how
> preimage attacks will be found, we need to be able to deal with them
> immediately.
>
> --Paul Hoffman, Director

I tried telling this to Linus within a few weeks of the design, while he was still writing git. He rejected the advice. Perhaps a delegation of cryptographers should approach him -- before it's too late.

His biggest argument was that the important git trees would be "off-net" and would not depend on public trees. I think git is getting enough use (e.g. by thousands of development projects other than the Linux kernel) that those assumptions are probably no longer valid.

His secondary argument was that git only uses the hash as a collision-free oracle, not a cryptographic hash. But that's exactly the problem. If malicious people can make his oracle produce collisions, other parts of the git code will make false assumptions that can be exploited.

His final argument is the same one I heard NSA make to Diffie and Hellman about DES in 1976: "the crypto will never be the weakest link in the system, so it doesn't really have to be that strong". That argument was wrong then and it's wrong now.
The cost of using a strong cryptosystem isn't significantly greater than the cost of using a weak cryptosystem; and cracking the crypto HAS become the weakest link in the overall security of many systems (CSS is an obvious one). See:

  http://www.toad.com/des-stanford-meeting.html

	John

To: torva...@osdl.org, g...@toad.com
Subject: SHA1 is broken; be sure to parameterize your hash function
Date: Sat, 23 Apr 2005 15:21:07 -0700
From: John Gilmore <g...@new.toad.com>

It's interesting watching git evolve. I have one comment, which is that the code and the contributors are throwing around the term "SHA1 hash" a lot. They shouldn't.

SHA1 has been broken; it's possible to generate two different blobs that hash to the same SHA1 hash. (MD5 has totally failed; there's a one-machine one-day crack. SHA1 is still *hard* to crack.) But as Jon Callas and Bruce Schneier said: "Attacks always get better; they never get worse. It's time to walk, but not run, to the fire exits. You don't see smoke, but the fire alarms have gone off. It's time for us all to migrate away from SHA-1." See the summary with bibliography at:

  http://www.schneier.com/crypto-gram-0503.html

Since we don't have a reliable long-term hash function today, you'll have to change hash functions a few years out. Some foresight now will save much later pain in keeping big trees like the kernel secure. Either that, or you'll want to re-examine git's security assumptions now: what are the implications if multiple different blobs can be intentionally generated that have the same hash? My initial guess is that changing hash functions will be easier than making git work in the presence of unreliable hashing.

In the git sources, you'll need to install a better hash function when one is invented. For now, just make sure the code and the repositories are modular -- they don't care what hash function is in use.
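
One way to read the modularity request above, as a minimal sketch and not anything taken from the git sources: tag every object name with the algorithm that produced it, so that neither the name nor the length of the hash is wired in. The registry `HASHES`, the functions `object_id` and `verify`, and the `algo:hex` naming scheme are all hypothetical, chosen only for illustration.

```python
import hashlib

# Hypothetical registry mapping an algorithm name to its constructor.
# Adding a future hash means adding one entry here, nothing else.
HASHES = {
    "sha1": hashlib.sha1,
    "sha256": hashlib.sha256,
}


def object_id(data: bytes, algo: str = "sha1") -> str:
    """Return a self-describing name of the form '<algo>:<hex digest>'."""
    try:
        new_hash = HASHES[algo]
    except KeyError:
        raise ValueError(f"unknown hash algorithm: {algo}") from None
    return f"{algo}:{new_hash(data).hexdigest()}"


def verify(data: bytes, oid: str) -> bool:
    """Check data against an id, using whichever algorithm the id names."""
    algo, sep, _ = oid.partition(":")
    if not sep or algo not in HASHES:
        return False
    return object_id(data, algo) == oid


if __name__ == "__main__":
    blob = b"hello"
    for name in HASHES:
        oid = object_id(blob, name)
        # Callers never assume a digest length; they only compare ids.
        print(oid, verify(blob, oid))
```

Because the algorithm travels with the id, callers such as `verify` never need to know how long a digest is or which function produced it.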
Whether that means making a single git repository able to use several hash functions, or merely making it possible to have one repository that uses SHA1 and another that uses some future WonderHash, is a system design decision for you and the git contributors to make.

The simplest case -- copying a repository with one hash function into a new repository using a different hash function -- will change not only all the hashes, but also the contents of objects that use hash values to point to other objects. If any of those objects are signed (e.g. by PGP keys) then those signatures will not be valid in the new copy.

Adding support now for SHA256 as well as SHA1 would make it likely that at least git has no wired-in dependencies on the *names* or *lengths* of hashes, and let you explore the system level issues. (I wouldn't build in the assumption that each different hash function produces a different length output, either, though these two happen to.)

Enjoy,

	John Gilmore

Date: Mon, 25 Apr 2005 13:38:40 -0700 (PDT)
From: Linus Torvalds <torva...@osdl.org>
To: Seth David Schoen <sch...@eff.org>
cc: John Gilmore <g...@toad.com>, Kees Cook <k...@osdl.org>
Subject: Re: John Gilmore on SHA-1 [...@toad.com: Pls forward to Linus: SHA1 is broken]

...

As to your SHA1 concerns:

> It's interesting watching git evolve. I have one comment, which is
> that the code and the contributors are throwing around the term "SHA1
> hash" a lot. They shouldn't. SHA1 has been broken; it's possible to
> generate two different blobs that hash to the same SHA1 hash.

Actually, even the theoretical breaking has not been proven for a pre-existing SHA1 hash (ie you need to control both the starting point for it), and more importantly, git really uses the SHA1 as a _hash_, not necessarily as a cryptographically secure one.

IOW, security doesn't actually depend on the hash being cryptographic, and all git really wants is to avoid collisions, ie it wants it to hash the contents well.
That, sha1 definitely does, and even an md5sum would suffice (but having 160 bits instead of "just" 128 obviously adds to the space, so that's always a bonus).

Of course, the fact that sha1 is also very expensive to try to fool is a big bonus, since it means that it's just another layer on the real security model.

But the _real_ security comes from the fact that git is distributed, which means that a developer should never actually use a public tree for his development. For example, I've got two separate firewall layers (and a NAT) in between me and the internet, and my personal tree is on that machine. I never actually trust or use the external trees - I just push the result to them.

This is something you cannot do with a centralized SCM server like SVN or other traditional crud. A centralized one obviously has to be accessible to all the developers, which means that it's forced to be open enough to be much more easily attackable, and also means that there is a single point of failure also from a security standpoint.

In contrast, even if somebody were to compromise my machine, that does _not_ automatically compromise the trees of other developers. They'd still have all the pristine objects, and never even fetch an object from me that has the same name (ie sha1 hash) as one they already have.

In other words, to really break a git archive, you need to

 - be able to replace an existing SHA1 hash'ed object with one that hashes to the same thing (_not_ the breakage that has been shown to be possible already)

 - the replacement has to still honor all the other git consistency checks (even "blob" objects have them: they need to have a valid header with a valid length, so it's not sufficient to just find another object that hashes to the right thing, you have to find an object with a valid header that hashes to the right thing)

 - you have to break in to _all_ archives that already have that object and replace it quietly enough that nobody notices.
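
The header check in the second point can be sketched roughly as follows, assuming the `<type> <length>\0<body>` object layout that git uses today for SHA-1 repositories. The helper names `blob_id` and `check_object` are hypothetical; this illustrates the kind of check described, and is not git's own code.

```python
import hashlib

# Object types assumed for this sketch.
KNOWN_TYPES = (b"blob", b"tree", b"commit", b"tag")


def blob_id(content: bytes) -> str:
    """Name a blob by the SHA-1 of its header plus its content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def check_object(raw: bytes, expected_id: str) -> bool:
    """Accept an uncompressed object only if its header is well formed,
    its declared length matches the body, and it hashes to expected_id."""
    header, sep, body = raw.partition(b"\0")
    if not sep:
        return False  # no NUL terminator after the header
    parts = header.split(b" ")
    if len(parts) != 2:
        return False
    kind, length = parts
    if kind not in KNOWN_TYPES:
        return False
    if not length.isdigit() or int(length) != len(body):
        return False  # declared length must match the actual body
    return hashlib.sha1(raw).hexdigest() == expected_id


if __name__ == "__main__":
    good = b"blob 3\0abc"
    print(check_object(good, blob_id(b"abc")))
    # A forged object must pass the header test as well as match the hash.
    bad = b"blob 4\0abc"
    print(check_object(bad, hashlib.sha1(bad).hexdigest()))
```

A replacement object therefore has to satisfy two constraints at once: the right hash and a header whose declared length agrees with the body.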
Quite frankly, it's not worth worrying about. It's a hell of a lot easier to just break a source archive with other means (ie pay a developer ten million dollars to just insert the back door you want inserted).

		Linus

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majord...@metzdowd.com