From: Zooko O'Whielacronx zo...@zooko.com
Subject: [tahoe-dev] SHA-1 broken! (was: Request for hash-dependency in
Tahoe security.)
To: nejuc...@gmail.com, tahoe-...@allmydata.org
Date: Wed, 29 Apr 2009 15:59:05 -0600
Reply-To: tahoe-...@allmydata.org
On Apr 29, 2009, at 11:51 AM, Nathan wrote:
http://eurocrypt2009rump.cr.yp.to/837a0a8086fa6ca714249409ddfae43d.pdf
Wow! These slides say that they discovered a way to find collisions
in SHA-1 at a cost of only 2^52 computations. If this turns out to
be right (and the authors are respected cryptographers -- the kind of
people who really hate to be wrong about something like this) then it
is very exciting! SHA-1 was already known to be vulnerable to attack
by a moderately well-funded organization such as a national security
agency, national military, corporation, or organized criminal group.
Now it turns out that finding SHA-1 collisions is in the reach of a
dedicated hobbyist or an eccentric genius [1]. Let's put a rough
number on it. I might be a little bit off, but you can build a
COPACOBANA machine for about $10,000 [2], and it can brute-force a 56-
bit DES key in about six and a half days. 2^52 SHA-1 operations
should take roughly the same amount of time and money. As another
example I guess that distributed computation engines [3] and botnets
[4] might be able to generate a SHA-1 collision in seconds.
Plus of course the amplifying effects of birthday attacks and rainbow
tables and so on mean that the longer you keep your COPACOBANA or
your botnet generating SHA-1 collisions, the more SHA-1 users around
the world become vulnerable to you. So basically, if these slides are
right then relying on SHA-1 collision-resistance has been revealed as
a major vulnerability!
Almost all hash functions in civilian, open use are either MD5 or
SHA-1. For example, decentralized revision control tools such as
monotone, git, and hg rely on SHA-1. Interesting times!
As Shawn already correctly pointed out (and as Nathan probably
already knew), Tahoe doesn't use SHA-1, so we're not affected by this
new discovery. Tahoe-LAFS uses SHA-256 (in the double-hashing mode
suggested by Ferguson and Schneier and named SHA-256d). We also
add our own tagging and salting prefix to avoid certain problems. We
aren't currently vulnerable to hash collision attacks, and we plan
never to get into that position (about which more below).
Nonetheless, it would be a very good exercise to spell out what sorts
of problems could result if attackers could violate what sorts of
properties of the hash function(s) used in Tahoe. The basic uses of
secure hashes in Tahoe are for integrity-checking of immutable files
and for digital signatures on mutable files and directories.
If an attacker could generate two different inputs which yielded the
same hash output (that is, to find a hash collision), then they
could give you a single immutable file cap that produced two (or
more) different files when you used it to download the file. We
believe that nobody is currently able to do that, so currently if
someone gives you an immutable file cap, you can rely on there being
at most one file which can be downloaded using that cap.
For mutable files it is even safer. If an attacker could find an
input which yielded the same output as someone *else*'s input (that
is, to find a second pre-image), then that attacker could write
changes to a mutable file or directory that they were not authorized
to write to. Finding a second pre-image is probably much harder than
finding a collision -- for example nobody has yet figured out how to
find a second pre-image in SHA-1. That's why I say it is even
safer. You already assume that the person who can write to a mutable
file can make it so that two or more different file contents would be
downloaded from using the same mutable-file read cap, but for
immutable files we hold them to a higher standard and prevent even
the original uploader of the immutable file from being able to make
more than one file that matches the immutable-file read cap.
There are a lot more details of how Tahoe uses hash functions that I
would be happy to work out when I have time, but those are the most
important ones, and the immutable file caps are the most likely to
turn out to be vulnerable. (Although, as I've said, even the
immutable file caps are extremely unlikely to be vulnerable.)
(Hm, this puts an interesting twist on Vincent and Nathan's idea of
layering Mercurial-or-Bazaar on top of Tahoe. Tahoe uses stronger
cryptography (and also more flexible cryptography, by the way), so if
you have uploaded your Mercurial repository to Tahoe then even when
SHA-1 turns out to be weak (as it has), you can still rely on the
integrity of your repository.)
ps: For the case of Merkel Trees, are any security guarantees
preserved in the face of hash collision