More problems with hash functions

"Hal Finney" Fri, 20 Aug 2004 14:09:22 -0700

Another of the Crypto talks that was relevant to hash function security
was by Antoine Joux, discoverer of the SHA-0 collision that required
2^51 work.  Joux showed how most modern hash functions depart from the
ideal of a random function.


The problem is with the iterative nature of most hash functions, which
are structured like this (view with a fixed with font):


IV  --->  COMP   --->   COMP   --->   COMP   --->  Output
           ^             ^             ^
           |             |             |
           |             |             |
         Block 1       Block 2       Block 3

The idea is that there is a compression function COMP, which takes
two inputs: a state vector, which is the size of the eventual hash
output; and an input block.  The hash input is padded and divided
into fixed-size blocks, and they are run through the hash function
via the pattern above.

This pattern applies to pretty much all the hashes in common use,
including MDx, RIPEMDx, and SHA-x.

The problem Joux notes is that this structure makes the hash more
vulnerable than a random function should be to multicollisions:
cases where we can find many values which hash to the same value.
Ordinary collisions require two values with the same hash and for an
n-bit hash will take at most 2^(n/2) work to find via the birthday attack.

Multicollisions would find 3, 4, or generally k values all with the
same hash, and in a true random function would take 2^(n*(k-1)/k) work.
So to find a 4-collision should take 2^(n*3/4) work.

Joux points out that due to the iterative structure, things are much
simpler.  To find a 4-collision, we could use a two-block message,
and first find a 2-collision in the first block.  This will take at
most 2^(n/2) work (much less with the newly broken hashes).  We then
know the output of the first block, and using that as the input to the
second block, we search for a one-block collision in the second block,
using that input.  This will take the same amount of work, 2^(n/2)
at most.  With the resulting two single-block collisions we can create
4 collisions of the hash: if the colliding first-block messages are M1
and M1', and likewise the second block messages are M2 and M2', then the
four collisions are M1 M2, M1' M2, M1 M2', and M1' M2'.  The total amount
of work was only twice that of finding a single collision, 2*(2^(n/2))
at most, instead of 2^(n*3/4) as it would have been for a true random
function.

In general, if we use k blocks and find a (2-) collision in each one,
we can create a 2^k collision with only k times the work to find a
single collision, rather than 2^(n*(k-1)/k).

Joux then extended this argument to point out that attempts to increase
the security of hash functions by concatenating the outputs of two
independent functions don't actually increase their theoretical security.
For example, defining H(x) = SHA1(x) || RIPEMD160(x) still gives you only
about 160 bits of strength, not 320 as you might have hoped.  The reason
is because you can find a 2^80 multicollision in SHA1 using only 80*2^80
work at most, by the previous paragraph.  And among all of these 2^80
values you have a good chance that two of them will collide in RIPEMD160.
So that is the total work to find a collision in the construction.

It was pointed out in the questions that another reason for concatenating
hashes is not to try to increase the theoretical security, but for
practical considerations in case one of them gets broken.  This is
probably why SSL, for example, used MD5 along with SHA1.  That is still
a valid reason.

Nevertheless, Joux's results cast doubt on the very strategy of building
hashes out of iterating compression functions.  It appears that there is
no hope of creating hashes in this way which approximate the theoretical
model of a random function, which is the usual design goal for hash
functions.  This will probably further motivate researchers to explore
new directions in hash function design.

By the way, another comment after the talk came from Dr. Wang, who
discovered the collisions in MD5 etc.  She said her techniques could be
extended to generate multicollisions in the broken hashes very easily,
without having to go through the construction Joux described.  That
doesn't change Joux's fundamental point about the weakness in iterative
hashes, but further demonstrates the power of the Wang technique.

Hal Finney

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]

More problems with hash functions

Reply via email to