On Wednesday, 2009-08-26, at 19:49, Brian Warner wrote:

> Attack B is where Alice uploads a file, Bob gets the filecap and
> downloads it, Carol gets the same filecap and downloads it, and
> Carol desires to see the same file that Bob saw. ... The attackers
> (who may be Alice and/or other parties) get to craft the filecap
> and the shares however they like. The attackers win if Bob and
> Carol accept different documents.

Right, and if we add algorithm agility then this attack is possible
even if both SHA-2 and SHA-3 are perfectly secure!

Consider this variation of the scenario: Alice generates a filecap and
gives it to Bob. Bob uses it to fetch a file, reads the file and sends
the filecap to Carol along with a note saying that he approves this
file. Carol uses the filecap to fetch the file. The Bob-and-Carol team
loses if she gets a different file than the one he got.

Now suppose Alice is malicious and knows how to produce output which
appears to come from Tahoe-SHA2-SHA3, suppose Bob uses Tahoe-SHA3, and
suppose Carol uses Tahoe-SHA2. Then Alice can generate two files, one
which shows Bob what Alice wants him to see and the other which shows
Carol what Alice wants her to see.

So by adding an optional new hash algorithm intended to strengthen
Tahoe-LAFS against the possibility that someone can break SHA2, we
might (if we're not careful) open up a hole that can be exploited even
by someone who can't break SHA2.

One defense against the attack above would be to make sure that, as
long as you might want to share files with someone who might still use
Tahoe-SHA2, then you don't upgrade to Tahoe-SHA3 -- instead you have to
stick with the intermediate bi-lingual version, Tahoe-SHA2-SHA3, which
produces both kinds of hashes and checks both kinds of hashes. But how
can you tell whether there are still some Tahoe-SHA2 users out there
somewhere that you might eventually want to share a file with? Also,
might this approach somehow accidentally prolong Tahoe-LAFS's
vulnerability to a flaw in SHA2?

So to use this defense, Bob would use Tahoe-SHA2-SHA3, and he would
always verify both hashes before approving the file. If one hash
matched but the other didn't, his Tahoe-LAFS software would warn him
that something is very wrong with Alice or her Tahoe-LAFS software.
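
A minimal sketch of the mixed-hash attack and of the check-both-hashes
defense, in Python. Everything named here is an assumption made for
illustration: the cap is modelled as a dict with hypothetical "sha2"
and "sha3" fields, the helpers are made up, and hashlib.sha3_256 stands
in for whatever SHA-3 ends up being. Real Tahoe-LAFS caps are not a
flat hash of the file contents, so this only shows the shape of the
argument.

```python
import hashlib


def sha2_hex(data: bytes) -> str:
    """Hex SHA-256 digest, standing in for the 'old' Tahoe-SHA2 check."""
    return hashlib.sha256(data).hexdigest()


def sha3_hex(data: bytes) -> str:
    """Hex SHA3-256 digest, standing in for the 'new' Tahoe-SHA3 check."""
    return hashlib.sha3_256(data).hexdigest()


def check_bilingual(cap: dict, data: bytes) -> str:
    """Hypothetical Tahoe-SHA2-SHA3 client: verify BOTH hashes in the cap.

    Returns "ok" if both match, "inconsistent" if exactly one matches
    (the case where the software should warn the user), and "mismatch"
    if neither matches.
    """
    sha2_ok = sha2_hex(data) == cap["sha2"]
    sha3_ok = sha3_hex(data) == cap["sha3"]
    if sha2_ok and sha3_ok:
        return "ok"
    if sha2_ok or sha3_ok:
        return "inconsistent"
    return "mismatch"


# Malicious Alice builds one cap whose two hashes name two different files.
for_bob = b"what Alice wants Bob to see"
for_carol = b"what Alice wants Carol to see"
evil_cap = {"sha2": sha2_hex(for_carol), "sha3": sha3_hex(for_bob)}

# A SHA3-only Bob and a SHA2-only Carol each check just the hash they
# understand, so each accepts a different file under the same cap.
bob_sha3_only_accepts = sha3_hex(for_bob) == evil_cap["sha3"]
carol_sha2_only_accepts = sha2_hex(for_carol) == evil_cap["sha2"]

# A bilingual Bob checks both hashes and sees that only one of them matches.
bob_bilingual_verdict = check_bilingual(evil_cap, for_bob)

# An honest cap, where both hashes describe the same file, passes cleanly.
honest_cap = {"sha2": sha2_hex(for_bob), "sha3": sha3_hex(for_bob)}
honest_verdict = check_bilingual(honest_cap, for_bob)
```

Note that no collision in either hash function is needed to build
evil_cap; the two single-hash clients simply disagree about which field
of the cap identifies the file.
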
(This means that we have to spend the CPU cycles verifying
old-fashioned hashes, and worse that we have to make file capabilities
twice as big in order to hold both hashes, which could negatively
impact the user experience.)

Another, complementary, defense against this sort of attack would be
that if you receive a filecap which has a hash in it that you don't
know how to check, then you should *erase* that hash from the filecap
before you pass that filecap on to your friend. Then if Alice has a
malicious Tahoe-SHA2-SHA3, Bob has Tahoe-SHA2, and Carol has
Tahoe-SHA3, Bob will give Carol a filecap with only a SHA2 hash in it,
which Carol will not know how to check, thus defeating Alice's evil
scheme.

The bottom line is that the whole idea of adding algorithm agility and
an optional hash algorithm seems to entail complication and danger, and
Tahoe-LAFS is very likely going to take the alternate route: a future
version of Tahoe-LAFS will probably define a completely different type
of immutable file capability which is syntactically distinct from the
current type (i.e. it starts with a distinct leading character or it is
a different length so that it cannot be confused with the old kind by a
program and hopefully not by a human either), and which uses only SHA3.
Then you will not be able to produce a single filecap which can be
verified with both SHA2 and SHA3. You can, of course, produce two
different filecaps, one in the old format and one in the new format.
This sounds good to me because if Alice sends a pair of filecaps to Bob
then it will be obvious to Bob that the two could point to different
files, at Alice's discretion.

> I always get confused about the difference between first-preimage
> and second-preimage, but I think there's a correspondence here. In
> Attack A, the attacker doesn't get to choose the filecap (i.e. the
> hash of the message): they've got to create shares to match a
> specific pre-determined cap.
> In Attack B, Alice can craft an
> arbitrarily complex message, taking advantage of a known collision
> or whatever.

Pre-image is figuring out the input x that someone used to compute
y = H(x), when they give you only y. Second-pre-image is when someone
else chooses an x and tells you x and then you find a different
x2 != x such that H(x) = H(x2). Collision is when you come up with any
two values, x and x2 != x, such that H(x) = H(x2).

Tahoe-LAFS's semantics of immutable file caps is that the cap is an
*identifier* of the file, not just a digital signature or message
authentication code on the file, as demonstrated in the
Alice->Bob->Carol scenario above. Therefore, Tahoe-LAFS requires
collision-resistance from its secure hash algorithm and not just
second-preimage-resistance. It is too bad we can't make do with
second-preimage-resistance, because we have much greater confidence in
the second-preimage-resistance of our hash functions than in their
collision-resistance. SHA1, for example, has second-preimage-resistance
(as far as we know) but not collision-resistance.

(By the way, I believe that git has the same semantics for its hashes
that Tahoe-LAFS has for its immutable file caps and that, contrary to
what Linus Torvalds, Perry Metzger, et al. have said, git users are
vulnerable to exploitation by collisions. I'll try to write up my
reasoning at some point.)

Regards,

Zooko
_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
