Re: hedging our bets -- in case SHA-256 turns out to be insecure

2009-11-17 Thread Sandy Harris
On 11/12/09, David-Sarah Hopwood david-sa...@jacaranda.org wrote:
 Sandy Harris wrote:
   On 11/8/09, Zooko Wilcox-O'Hearn zo...@zooko.com wrote:
  
Therefore I've been thinking about how to make Tahoe-LAFS robust against
   the possibility that SHA-256 will turn out to be insecure.

 [...]

  Since you are encrypting the files anyway, I wonder if you could
   use one of the modes developed for IPsec where a single pass
   with a block cipher gives both encrypted text and a hash-like
   authentication output.  That gives you a free value to use as
   H3 in my scheme or H2 in yours, and its security depends on
   the block cipher, not on any hash.


 Tahoe is intended to provide resistance to collision attacks by the
  creator of an immutable file: the creator should not be able to generate
  files with different contents, that can be read and verified by the same
  read capability.

  An authenticated encryption mode won't provide that -- unless, perhaps,
  it relies on a collision-resistant hash.

I was suggesting using the authentication data in the construction:

 C(x) = H1(H2(x)||A(x))

where H1 is a hash with he required output size, H2 a hash with
a large block size and A the authentication data from your
encryption.

This is likely a very bad idea if you already use that data in some
other way, e.g. for authenticating stored data. However, if C is
going to be your authentication mechanism, then this might be
a cheap way to get one input to it.

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com


Re: hedging our bets -- in case SHA-256 turns out to be insecure

2009-11-16 Thread Jack Lloyd
On Wed, Nov 11, 2009 at 10:03:45AM +0800, Sandy Harris wrote:

   C(x) = H1(H1(x) || H2(x))
 
 This requires two hash(x) operations. A naive implementation needs
 two passes through the data and avoiding that does not appear to
 be trivial. This is not ideal since you seem very concerned about
 overheads.

If performance is really an issue one could code a combined H1/H2
function which would interleave the operations, which would prevent
needing two passes (which _is_ really important since memory and disk
latencies are usually the biggest factor in performance). Direct
interleaving would also offer better ILP.

But even updating both hashes at the same time would prevent needing
two full passes; something like the code below would offer much better
cache utilization, and would not be at all difficult to implement:

while data_left:
  block = input.read_one_block()
  h1.compress(block)
  h2.compress(block)

If it was really important, choosing a nonstandard H2 could offer even
better performance; for instance let H1=SHA-256 and H2=SHA-~256, where
SHA-~256 is precisely SHA-256 but with all of its constants bitwise
inverted. One could compute both functions in parallel using SIMD
(SSE2, ARM's NEON, etc) [and they could share the message expansion,
which is quite costly in SHA-2]. It's not clear from a quick read of
the paper Zooko referenced (On the Strength of the Concatenated Hash
Combiner when All the Hash Functions are Weak) if this would actually
meet the requirements of sufficiently different for the results
there to apply, though.

 What about this construction:
 
   C(x) = H1(H2(x) || H3(x))

One trouble with this construction that Zooko's does not have is that
it can fail even if H1 is collision resistant due to an inner
collision.

 H3 might be some really cheap fast function invented for the situation.
 As I recall, the GOST hash just used a sum of input blocks, and that's
 enough to defeat the multi-block attacks. If it is simple enough, you
 can code it into your implementation of H2 so you only need one
 pass.

The GOST hash does use the sum of input blocks (as the final input to
the compression function) but it has a number of other components; it
is actually quite slow compared to modern hashes.

-Jack

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com


Re: hedging our bets -- in case SHA-256 turns out to be insecure

2009-11-16 Thread james hughes

On Nov 11, 2009, at 10:03 AM, Sandy Harris wrote:

 On 11/8/09, Zooko Wilcox-O'Hearn zo...@zooko.com wrote:
 
 Therefore I've been thinking about how to make Tahoe-LAFS robust against
 the possibility that SHA-256 will turn out to be insecure.
 
 NIST are dealing with that via the AHS process. Shouldn't you just use
 their results?
 
 We could use a different hash function ...
 There are fourteen candidates left in the SHA-3
 contest at the moment.  Several of them have conservative designs and good
 performance, but there is always the risk that they will be found to have
 catastrophic design flaws or that a great advance in hash function
 cryptanalysis will suddenly show how to crack them.
 
 Yes, but there's also a risk that whatever you come up with will turn
 out to be flawed.

I agree.

The logic of a unknown flaw being fixed flies in the face of prudent 
cryptanalysis. If you don't know the flaw, how can do you know you can or have 
fixed it. 

What if there is an unknown flaw in the fix? Wrap that again? Turtles all the 
way down. 

Putting multiple insecure algorithms together does guarantee a secure one.

The only solution that works is a new hash algorithm that is secure against 
this (and all other) vulnerabilities. It may include SHA 256 as a primitive, 
but a true fix is fundamentally a new hash algorithm. 

This process is being worked on by a large number of smart people. I can 
guarantee you that this kind of construction has been looked at. 

It is my opinion that putting a bandaid around SHA 256 just in case is not 
cryptanalysis, it's marketing.

Jim

P.S. once Sha-3 comes out, your bandaid will look silly.

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com


Re: hedging our bets -- in case SHA-256 turns out to be insecure

2009-11-11 Thread Sandy Harris
On 11/8/09, Zooko Wilcox-O'Hearn zo...@zooko.com wrote:

  Therefore I've been thinking about how to make Tahoe-LAFS robust against
 the possibility that SHA-256 will turn out to be insecure.

NIST are dealing with that via the AHS process. Shouldn't you just use
their results?

  We could use a different hash function ...
 There are fourteen candidates left in the SHA-3
 contest at the moment.  Several of them have conservative designs and good
 performance, but there is always the risk that they will be found to have
 catastrophic design flaws or that a great advance in hash function
 cryptanalysis will suddenly show how to crack them.

Yes, but there's also a risk that whatever you come up with will turn
out to be flawed.

  I propose the following combined hash function C, built out of two hash
 functions H1 and H2:

  C(x) = H1(H1(x) || H2(x))

This requires two hash(x) operations. A naive implementation needs
two passes through the data and avoiding that does not appear to
be trivial. This is not ideal since you seem very concerned about
overheads.

What about this construction:

  C(x) = H1(H2(x) || H3(x))

H1 is something that gives the output size you require. Use SHA-256 or
choose an AHS candidate conservatively. This only hashes a few blocks
so you need not worry much about overheads here.

H2 is the 512-bit variant of a different AHS candidate, or Whirlpool, or
even Skein-1024. Here speed is a criterion, though of course not the
only one.

H3 might be some really cheap fast function invented for the situation.
As I recall, the GOST hash just used a sum of input blocks, and that's
enough to defeat the multi-block attacks. If it is simple enough, you
can code it into your implementation of H2 so you only need one
pass.

Since you are encrypting the files anyway, I wonder if you could
use one of the modes developed for IPsec where a single pass
with a block cipher gives both encrypted text and a hash-like
authentication output.  That gives you a free value to use as
H3 in my scheme or H2 in yours, and its security depends on
the block cipher, not on any hash.

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com


Re: hedging our bets -- in case SHA-256 turns out to be insecure

2009-11-09 Thread Jerry Leichter

On Nov 8, 2009, at 6:30 AM, Zooko Wilcox-O'Hearn wrote:
I propose the following combined hash function C, built out of two  
hash functions H1 and H2:


C(x) = H1(H1(x) || H2(x))
I'd worry about using this construction if H1's input block and output  
size were the same, since one might be able to leverage some kind of  
extension attack.  That's not a problem for SHA256 or SHA512, but it's  
something to keep in mind if this is supposed to be a general  
construction, given that all likely hash functions will be constructed  
by some kind of iteration over fixed-size blocks.


Rather than simply concatenating H1(x) and H2(x), you might do better  
to interlace them.  Even alternating bytes - cheap enough that you'd  
never notice - should break up any structure that designs of practical  
hash functions might exhibit.  (As a matter of theory, a vulnerability  
of alternate bytes is as likely as a vulnerability of leading bytes;  
but given the way we actually build hash functions, as a practical  
matter the latter seems much more likely.)

-- Jerry

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to majord...@metzdowd.com


hedging our bets -- in case SHA-256 turns out to be insecure

2009-11-08 Thread Zooko Wilcox-O'Hearn

Folks:

We're going to be deploying a new crypto scheme in Tahoe-LAFS next  
year -- the year 2010.  Tahoe-LAFS is used for long-term storage, and  
I won't be surprised if people store files on Tahoe-LAFS in 2010 and  
then rely on the confidentiality and integrity of those files for  
many years or even decades to come.  (People started storing files on  
Tahoe-LAFS in 2008 and so far they show no signs of losing interest  
in the integrity and confidentiality of those files.)


This long-term focus makes Tahoe-LAFS's job harder than the job of  
protecting transient network packets.  If someone figures out in 2020  
or 2030 how to spoof a network transaction that you sent in 2010 (see  
[1]), it'll be far too late to do you any harm, but if they figure  
out in 2030 how to alter a file that you uploaded to a Tahoe-LAFS  
grid in 2010, that might harm you.


Therefore I've been thinking about how to make Tahoe-LAFS robust  
against the possibility that SHA-256 will turn out to be insecure.


A very good way to do this is to design Tahoe-LAFS so that it relies  
as little as possible on SHA-256's security properties.  The property  
that seems to be the hardest for a secure hash function to provide is  
collision-resistance.  We are analyzing new crypto schemes to see how  
many security properties of Tahoe-LAFS we can continue to guarantee  
even if the collision-resistance of the underlying secure hash  
function fails, and similarly for the other properties of the secure  
hash function which might fail [2].


This note is not about that design process, though, but about how to  
maximize the chance that the underlying hash function does provide  
the desired security properties.


We could use a different hash function than SHA-256 -- there are many  
alternatives.  SHA-512 would probably be safer, but it is extremely  
expensive on the cheap, low-power 32-bit ARM CPUs that are one of our  
design targets [3], and the output size of 512 bits is too large to  
fit into Tahoe-LAFS capabilities.  There are fourteen candidates left  
in the SHA-3 contest at the moment.  Several of them have  
conservative designs and good performance, but there is always the  
risk that they will be found to have catastrophic design flaws or  
that a great advance in hash function cryptanalysis will suddenly  
show how to crack them.  Of course, a similar risk applies to SHA-256!


So I turn to the question of how to combine multiple hash functions  
to build a hash function which is secure even if one or more of the  
underlying hash functions turns out to be weak.


I've read several interesting papers on the subject -- such as [4, 5]  
and especially Robust Multi-Property Combiners for Hash Functions  
Revisited by Marc Fischlin, Anja Lehmann, and Krzysztof Pietrzak  
[6].  The good news is that it turns out to be doable!  The latter  
two papers show nice strong theoretical results -- ways to combine  
hash functions so that the resulting combination is as strong or  
stronger than the two underlying hash functions.  The bad news is  
that the proposal in [6] builds a combined function whose output is  
twice the size of the output of a single hash function.  There is a  
good theoretical reason for this [4], but it won't work for our  
practical engineering requirements -- we need hash function outputs  
as small as possible (partially due to usability issues)


The other bad news is that the construction proposed in [6] is  
complicated, underspecified, and for the strongest version of it, it  
imposes a limit on the length of the inputs that you can feed to your  
hash function.  It grows to such complexity and incurs such  
limitations because it is, if I may call it this, too theoretical.   
It is designed to guarantee certain output properties predicated on  
minimal theoretical assumptions about the properties of the  
underlying hash functions.  This is a fine goal, but in practice we  
don't want to pay such a high cost in complexity and performance in  
order to gain such abstract improvement.  We should be able to hedge  
our bets and achieve a comfortable margin of safety with a very  
simple and efficient scheme by making stronger, less formal, but very  
plausible assumptions about the underlying hash functions.  Read on.


I propose the following combined hash function C, built out of two  
hash functions H1 and H2:


C(x) = H1(H1(x) || H2(x))

The first observation is that if H1 is collision-resistant then so is  
C.  In practice I would expect to use SHA-256 for H1, so the  
resulting combiner C[SHA-256, H2] will be at least as strong as  
SHA-256.  (One could even think of this combiner C as just being a  
tricky way to strengthen SHA-256 by using the output of H2(x) as a  
randomized salt -- see [7].)


The next observation is that finding a pair of inputs x1, x2 which  
collide in *both* H1 and in H2 is likely to be much harder than  
finding a pair of inputs that collide in H1 and finding a pair