Folks:

`We're going to be deploying a new crypto scheme in Tahoe-LAFS next`

`year -- the year 2010. Tahoe-LAFS is used for long-term storage, and`

`I won't be surprised if people store files on Tahoe-LAFS in 2010 and`

`then rely on the confidentiality and integrity of those files for`

`many years or even decades to come. (People started storing files on`

`Tahoe-LAFS in 2008 and so far they show no signs of losing interest`

`in the integrity and confidentiality of those files.)`

`This long-term focus makes Tahoe-LAFS's job harder than the job of`

`protecting transient network packets. If someone figures out in 2020`

`or 2030 how to spoof a network transaction that you sent in 2010 (see`

`[1]), it'll be far too late to do you any harm, but if they figure`

`out in 2030 how to alter a file that you uploaded to a Tahoe-LAFS`

`grid in 2010, that might harm you.`

`Therefore I've been thinking about how to make Tahoe-LAFS robust`

`against the possibility that SHA-256 will turn out to be insecure.`

`A very good way to do this is to design Tahoe-LAFS so that it relies`

`as little as possible on SHA-256's security properties. The property`

`that seems to be the hardest for a secure hash function to provide is`

`collision-resistance. We are analyzing new crypto schemes to see how`

`many security properties of Tahoe-LAFS we can continue to guarantee`

`even if the collision-resistance of the underlying secure hash`

`function fails, and similarly for the other properties of the secure`

`hash function which might fail [2].`

`This note is not about that design process, though, but about how to`

`maximize the chance that the underlying hash function does provide`

`the desired security properties.`

`We could use a different hash function than SHA-256 -- there are many`

`alternatives. SHA-512 would probably be safer, but it is extremely`

`expensive on the cheap, low-power 32-bit ARM CPUs that are one of our`

`design targets [3], and the output size of 512 bits is too large to`

`fit into Tahoe-LAFS capabilities. There are fourteen candidates left`

`in the SHA-3 contest at the moment. Several of them have`

`conservative designs and good performance, but there is always the`

`risk that they will be found to have catastrophic design flaws or`

`that a great advance in hash function cryptanalysis will suddenly`

`show how to crack them. Of course, a similar risk applies to SHA-256!`

`So I turn to the question of how to combine multiple hash functions`

`to build a hash function which is secure even if one or more of the`

`underlying hash functions turns out to be weak.`

`I've read several interesting papers on the subject -- such as [4, 5]`

`and especially "Robust Multi-Property Combiners for Hash Functions`

`Revisited" by Marc Fischlin, Anja Lehmann, and Krzysztof Pietrzak`

`[6]. The good news is that it turns out to be doable! The latter`

`two papers show nice strong theoretical results -- ways to combine`

`hash functions so that the resulting combination is as strong or`

`stronger than the two underlying hash functions. The bad news is`

`that the proposal in [6] builds a combined function whose output is`

`twice the size of the output of a single hash function. There is a`

`good theoretical reason for this [4], but it won't work for our`

`practical engineering requirements -- we need hash function outputs`

`as small as possible (partially due to usability issues)`

`The other bad news is that the construction proposed in [6] is`

`complicated, underspecified, and for the strongest version of it, it`

`imposes a limit on the length of the inputs that you can feed to your`

`hash function. It grows to such complexity and incurs such`

`limitations because it is, if I may call it this, "too theoretical".`

`It is designed to guarantee certain output properties predicated on`

`minimal theoretical assumptions about the properties of the`

`underlying hash functions. This is a fine goal, but in practice we`

`don't want to pay such a high cost in complexity and performance in`

`order to gain such abstract improvement. We should be able to "hedge`

`our bets" and achieve a comfortable margin of safety with a very`

`simple and efficient scheme by making stronger, less formal, but very`

`plausible assumptions about the underlying hash functions. Read on.`

`I propose the following combined hash function C, built out of two`

`hash functions H1 and H2:`

C(x) = H1(H1(x) || H2(x))

`The first observation is that if H1 is collision-resistant then so is`

`C. In practice I would expect to use SHA-256 for H1, so the`

`resulting combiner C[SHA-256, H2] will be at least as strong as`

`SHA-256. (One could even think of this combiner C as just being a`

`tricky way to strengthen SHA-256 by using the output of H2(x) as a`

`randomized salt -- see [7].)`

`The next observation is that finding a pair of inputs x1, x2 which`

`collide in *both* H1 and in H2 is likely to be much harder than`

`finding a pair of inputs that collide in H1 and finding a pair of`

`inputs that collide in H2 (see [5]).`

`Now the reason that a combiner like this one is not published in`

`theoretical crypto literature is that it obviously could fail if the`

`outer hash function H1 fails. For example, even if H2 is collision-`

`resistant, if H1 turns out to be susceptible to collisions, then`

`theoretically speaking C[H1, H2] might be susceptible to collisions.`

`However, in real life C[H1, H2] would most likely still be collision`

`resistant!`

`All practical attacks on real hash functions so far (if I understand`

`correctly) are multi-block attacks in which the attacker is able to`

`feed a sufficiently long and unconstrained input to the hash`

`functions that the effects of the later parts of his inputs are able`

`to manipulate the state generated by the earlier parts of his`

`inputs. My combiner C uses H1 in its outer invocation on a single-`

`block-sized input, which means no such multi-block attacks are`

`possible on the outer invocation. In addition, the inputs that the`

`attacker gets to feed to the outer invocation of H1 are highly`

`constrained. Basically, he would already have to be very good at`

`manipulating the inner invocations H1 and H2 in ways that he isn't`

`supposed to before he can even begin to manipulate the outer`

`invocation of H1.`

`A measure of the practical security of a combiner like this one would`

`be "how safe would it be if it were instantiated using broken`

`practical hash functions such as MD5 and SHA1?". It appears to me`

`(from an admittedly cursory analysis) that there is no realistic way`

`to find collisions in C[MD5, SHA1] even though there are realistic`

`ways to find collisions in MD5 and in SHA1. Of course, I'm not`

`proposing to use C[MD5, SHA1]! I'm proposing to use C[SHA-256, _]`

`where _ is some other hash function which is believed to be strong.`

`The example of instantiating C with MD5 and SHA1 just goes to show`

`that C is a hash function which is stronger than either of its two`

`underlying hash functions.`

`The other desirable security properties such as second-preimage`

`resistance and pre-image resistance seem to follow the same pattern`

`as collision-resistance -- C[H1, H2] seems to be much stronger than`

`H1 or H2 alone.`

Regards, Zooko [1] http://extendedsubset.com/Renegotiating_TLS.pdf [2] http://allmydata.org/trac/tahoe/wiki/NewCaps/WhatCouldGoWrong [3] http://bench.cr.yp.to/results-hash.html#arm-apollo

`[4] Krzysztof Pietrzak: "Non-Trivial Black-Box Combiners for`

`Collision-Resistant Hash-Functions don't Exist"`

`[5] Jonathan J. Hoch, Adi Shamir: "On the Strength of the`

`Concatenated Hash Combiner when All the Hash Functions are Weak"`

`[6] Marc Fischlin, Anja Lehmann, Krzysztof Pietrzak: "Robust Multi-`

`Property Combiners for Hash Functions Revisited"`

[7] http://webee.technion.ac.il/~hugo/rhash/rhash.pdf --------------------------------------------------------------------- The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to majord...@metzdowd.com