### Re: hedging our bets -- in case SHA-256 turns out to be insecure

On 11/12/09, David-Sarah Hopwood david-sa...@jacaranda.org wrote: Sandy Harris wrote: On 11/8/09, Zooko Wilcox-O'Hearn zo...@zooko.com wrote: Therefore I've been thinking about how to make Tahoe-LAFS robust against the possibility that SHA-256 will turn out to be insecure. [...] Since you are encrypting the files anyway, I wonder if you could use one of the modes developed for IPsec where a single pass with a block cipher gives both encrypted text and a hash-like authentication output. That gives you a free value to use as H3 in my scheme or H2 in yours, and its security depends on the block cipher, not on any hash. Tahoe is intended to provide resistance to collision attacks by the creator of an immutable file: the creator should not be able to generate files with different contents, that can be read and verified by the same read capability. An authenticated encryption mode won't provide that -- unless, perhaps, it relies on a collision-resistant hash. I was suggesting using the authentication data in the construction: C(x) = H1(H2(x)||A(x)) where H1 is a hash with he required output size, H2 a hash with a large block size and A the authentication data from your encryption. This is likely a very bad idea if you already use that data in some other way, e.g. for authenticating stored data. However, if C is going to be your authentication mechanism, then this might be a cheap way to get one input to it. - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com

### Re: hedging our bets -- in case SHA-256 turns out to be insecure

On Wed, Nov 11, 2009 at 10:03:45AM +0800, Sandy Harris wrote: C(x) = H1(H1(x) || H2(x)) This requires two hash(x) operations. A naive implementation needs two passes through the data and avoiding that does not appear to be trivial. This is not ideal since you seem very concerned about overheads. If performance is really an issue one could code a combined H1/H2 function which would interleave the operations, which would prevent needing two passes (which _is_ really important since memory and disk latencies are usually the biggest factor in performance). Direct interleaving would also offer better ILP. But even updating both hashes at the same time would prevent needing two full passes; something like the code below would offer much better cache utilization, and would not be at all difficult to implement: while data_left: block = input.read_one_block() h1.compress(block) h2.compress(block) If it was really important, choosing a nonstandard H2 could offer even better performance; for instance let H1=SHA-256 and H2=SHA-~256, where SHA-~256 is precisely SHA-256 but with all of its constants bitwise inverted. One could compute both functions in parallel using SIMD (SSE2, ARM's NEON, etc) [and they could share the message expansion, which is quite costly in SHA-2]. It's not clear from a quick read of the paper Zooko referenced (On the Strength of the Concatenated Hash Combiner when All the Hash Functions are Weak) if this would actually meet the requirements of sufficiently different for the results there to apply, though. What about this construction: C(x) = H1(H2(x) || H3(x)) One trouble with this construction that Zooko's does not have is that it can fail even if H1 is collision resistant due to an inner collision. H3 might be some really cheap fast function invented for the situation. As I recall, the GOST hash just used a sum of input blocks, and that's enough to defeat the multi-block attacks. If it is simple enough, you can code it into your implementation of H2 so you only need one pass. The GOST hash does use the sum of input blocks (as the final input to the compression function) but it has a number of other components; it is actually quite slow compared to modern hashes. -Jack - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com

### Re: hedging our bets -- in case SHA-256 turns out to be insecure

On Nov 11, 2009, at 10:03 AM, Sandy Harris wrote: On 11/8/09, Zooko Wilcox-O'Hearn zo...@zooko.com wrote: Therefore I've been thinking about how to make Tahoe-LAFS robust against the possibility that SHA-256 will turn out to be insecure. NIST are dealing with that via the AHS process. Shouldn't you just use their results? We could use a different hash function ... There are fourteen candidates left in the SHA-3 contest at the moment. Several of them have conservative designs and good performance, but there is always the risk that they will be found to have catastrophic design flaws or that a great advance in hash function cryptanalysis will suddenly show how to crack them. Yes, but there's also a risk that whatever you come up with will turn out to be flawed. I agree. The logic of a unknown flaw being fixed flies in the face of prudent cryptanalysis. If you don't know the flaw, how can do you know you can or have fixed it. What if there is an unknown flaw in the fix? Wrap that again? Turtles all the way down. Putting multiple insecure algorithms together does guarantee a secure one. The only solution that works is a new hash algorithm that is secure against this (and all other) vulnerabilities. It may include SHA 256 as a primitive, but a true fix is fundamentally a new hash algorithm. This process is being worked on by a large number of smart people. I can guarantee you that this kind of construction has been looked at. It is my opinion that putting a bandaid around SHA 256 just in case is not cryptanalysis, it's marketing. Jim P.S. once Sha-3 comes out, your bandaid will look silly. - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com

### Re: hedging our bets -- in case SHA-256 turns out to be insecure

On 11/8/09, Zooko Wilcox-O'Hearn zo...@zooko.com wrote: Therefore I've been thinking about how to make Tahoe-LAFS robust against the possibility that SHA-256 will turn out to be insecure. NIST are dealing with that via the AHS process. Shouldn't you just use their results? We could use a different hash function ... There are fourteen candidates left in the SHA-3 contest at the moment. Several of them have conservative designs and good performance, but there is always the risk that they will be found to have catastrophic design flaws or that a great advance in hash function cryptanalysis will suddenly show how to crack them. Yes, but there's also a risk that whatever you come up with will turn out to be flawed. I propose the following combined hash function C, built out of two hash functions H1 and H2: C(x) = H1(H1(x) || H2(x)) This requires two hash(x) operations. A naive implementation needs two passes through the data and avoiding that does not appear to be trivial. This is not ideal since you seem very concerned about overheads. What about this construction: C(x) = H1(H2(x) || H3(x)) H1 is something that gives the output size you require. Use SHA-256 or choose an AHS candidate conservatively. This only hashes a few blocks so you need not worry much about overheads here. H2 is the 512-bit variant of a different AHS candidate, or Whirlpool, or even Skein-1024. Here speed is a criterion, though of course not the only one. H3 might be some really cheap fast function invented for the situation. As I recall, the GOST hash just used a sum of input blocks, and that's enough to defeat the multi-block attacks. If it is simple enough, you can code it into your implementation of H2 so you only need one pass. Since you are encrypting the files anyway, I wonder if you could use one of the modes developed for IPsec where a single pass with a block cipher gives both encrypted text and a hash-like authentication output. That gives you a free value to use as H3 in my scheme or H2 in yours, and its security depends on the block cipher, not on any hash. - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com

### Re: hedging our bets -- in case SHA-256 turns out to be insecure

On Nov 8, 2009, at 6:30 AM, Zooko Wilcox-O'Hearn wrote: I propose the following combined hash function C, built out of two hash functions H1 and H2: C(x) = H1(H1(x) || H2(x)) I'd worry about using this construction if H1's input block and output size were the same, since one might be able to leverage some kind of extension attack. That's not a problem for SHA256 or SHA512, but it's something to keep in mind if this is supposed to be a general construction, given that all likely hash functions will be constructed by some kind of iteration over fixed-size blocks. Rather than simply concatenating H1(x) and H2(x), you might do better to interlace them. Even alternating bytes - cheap enough that you'd never notice - should break up any structure that designs of practical hash functions might exhibit. (As a matter of theory, a vulnerability of alternate bytes is as likely as a vulnerability of leading bytes; but given the way we actually build hash functions, as a practical matter the latter seems much more likely.) -- Jerry - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com

### hedging our bets -- in case SHA-256 turns out to be insecure

Folks: We're going to be deploying a new crypto scheme in Tahoe-LAFS next year -- the year 2010. Tahoe-LAFS is used for long-term storage, and I won't be surprised if people store files on Tahoe-LAFS in 2010 and then rely on the confidentiality and integrity of those files for many years or even decades to come. (People started storing files on Tahoe-LAFS in 2008 and so far they show no signs of losing interest in the integrity and confidentiality of those files.) This long-term focus makes Tahoe-LAFS's job harder than the job of protecting transient network packets. If someone figures out in 2020 or 2030 how to spoof a network transaction that you sent in 2010 (see [1]), it'll be far too late to do you any harm, but if they figure out in 2030 how to alter a file that you uploaded to a Tahoe-LAFS grid in 2010, that might harm you. Therefore I've been thinking about how to make Tahoe-LAFS robust against the possibility that SHA-256 will turn out to be insecure. A very good way to do this is to design Tahoe-LAFS so that it relies as little as possible on SHA-256's security properties. The property that seems to be the hardest for a secure hash function to provide is collision-resistance. We are analyzing new crypto schemes to see how many security properties of Tahoe-LAFS we can continue to guarantee even if the collision-resistance of the underlying secure hash function fails, and similarly for the other properties of the secure hash function which might fail [2]. This note is not about that design process, though, but about how to maximize the chance that the underlying hash function does provide the desired security properties. We could use a different hash function than SHA-256 -- there are many alternatives. SHA-512 would probably be safer, but it is extremely expensive on the cheap, low-power 32-bit ARM CPUs that are one of our design targets [3], and the output size of 512 bits is too large to fit into Tahoe-LAFS capabilities. There are fourteen candidates left in the SHA-3 contest at the moment. Several of them have conservative designs and good performance, but there is always the risk that they will be found to have catastrophic design flaws or that a great advance in hash function cryptanalysis will suddenly show how to crack them. Of course, a similar risk applies to SHA-256! So I turn to the question of how to combine multiple hash functions to build a hash function which is secure even if one or more of the underlying hash functions turns out to be weak. I've read several interesting papers on the subject -- such as [4, 5] and especially Robust Multi-Property Combiners for Hash Functions Revisited by Marc Fischlin, Anja Lehmann, and Krzysztof Pietrzak [6]. The good news is that it turns out to be doable! The latter two papers show nice strong theoretical results -- ways to combine hash functions so that the resulting combination is as strong or stronger than the two underlying hash functions. The bad news is that the proposal in [6] builds a combined function whose output is twice the size of the output of a single hash function. There is a good theoretical reason for this [4], but it won't work for our practical engineering requirements -- we need hash function outputs as small as possible (partially due to usability issues) The other bad news is that the construction proposed in [6] is complicated, underspecified, and for the strongest version of it, it imposes a limit on the length of the inputs that you can feed to your hash function. It grows to such complexity and incurs such limitations because it is, if I may call it this, too theoretical. It is designed to guarantee certain output properties predicated on minimal theoretical assumptions about the properties of the underlying hash functions. This is a fine goal, but in practice we don't want to pay such a high cost in complexity and performance in order to gain such abstract improvement. We should be able to hedge our bets and achieve a comfortable margin of safety with a very simple and efficient scheme by making stronger, less formal, but very plausible assumptions about the underlying hash functions. Read on. I propose the following combined hash function C, built out of two hash functions H1 and H2: C(x) = H1(H1(x) || H2(x)) The first observation is that if H1 is collision-resistant then so is C. In practice I would expect to use SHA-256 for H1, so the resulting combiner C[SHA-256, H2] will be at least as strong as SHA-256. (One could even think of this combiner C as just being a tricky way to strengthen SHA-256 by using the output of H2(x) as a randomized salt -- see [7].) The next observation is that finding a pair of inputs x1, x2 which collide in *both* H1 and in H2 is likely to be much harder than finding a pair of inputs that collide in H1 and finding a pair