On 09/03/2010 03:45 AM, Ben Laurie wrote:
> That's the whole point - a hash function used on an arbitrary message
> produces one of its possible outputs. Feed that hash back in and it
> produces one of a subset of its possible outputs. Each time you do this,
> you lose a little entropy (I can't remember how much, but I do remember
> David Wagner explaining it to me when I discovered this for myself quite
> a few years ago).
I found this to be interesting:
Danilo Gligoroski, Vlastimil Klima: Practical consequences of the
aberration of narrow-pipe hash designs from ideal random functions,
IACR ePrint Report 2010/384.
http://eprint.iacr.org/2010/384.pdf
The theoretical loss is -log2(1 - 1/e), about 0.66 bits of entropy, per
log2(N additional iterations); in other words, roughly 0.66 bits for each
doubling of the iteration count.
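As a sanity check on that figure (this is just my own back-of-the-envelope
calculation, not taken from the paper): under the ideal random-function
model, a single application hits a fraction (1 - 1/e) of the possible
outputs, and the expected image fraction f_k after k iterations follows the
recurrence f_{k+1} = 1 - exp(-f_k), so you can tabulate the loss directly:

import math

# Loss from a single application of an ideal random function on a large set:
# it hits a fraction (1 - 1/e) of the possible outputs.
print(-math.log2(1 - 1/math.e))        # ~0.66 bits

# Cumulative picture: in the random-mapping model the expected image fraction
# f_k after k iterations follows f_{k+1} = 1 - exp(-f_k), so the loss after
# k iterations is -log2(f_k).
f = 1.0
for k in range(1, 257):
    f = 1 - math.exp(-f)
    if k in (1, 2, 4, 16, 64, 256):
        print(k, round(-math.log2(f), 2), "bits lost")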
This assumes that there is no systematic correlation between the hash
input and the calculation of the output, which is not really a good
assumption for the MDs and SHAs in current use. They accept, process,
and output vectors of 32- or 64-bit words, even preserving their order
to some extent. So it seems reasonable to expect that, to the extent
these actual functions differ from an ideal random function, they could
easily have the kind of systematic bias that would be amplified by
repeated iteration.
I played with some simulations using randomly generated mappings, and the
observed loss would at times wander above 1.0 bits of entropy per log2(N).
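For anyone who wants to reproduce this, here is a minimal sketch of the
kind of simulation I mean (the domain size and reporting points are
arbitrary, and entropy is approximated by log2 of the image size):

import math
import random

N = 1 << 16                                  # small domain so we can track the whole image
f = [random.randrange(N) for _ in range(N)]  # one randomly generated mapping on {0..N-1}

image = set(range(N))                        # start from the full domain
for k in range(1, 33):
    image = {f[x] for x in image}            # image after k iterations
    if k & (k - 1) == 0:                     # report at powers of two
        lost = math.log2(N) - math.log2(len(image))
        print(k, round(lost, 2), "bits lost")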
It seems like this entropy loss could be largely eliminated by hashing
the previous two intermediate results on each iteration instead of just
one. But this basically amounts to widening the data path, so perhaps it
would be cheating for the purposes of this discussion.
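A sketch of what I mean, using SHA-256 via hashlib (the two-result chaining
scheme is just my illustration of the suggestion above, not a vetted
construction):

import hashlib

def iterate_narrow(seed: bytes, rounds: int) -> bytes:
    # Plain feedback: h_{i+1} = H(h_i).
    h = seed
    for _ in range(rounds):
        h = hashlib.sha256(h).digest()
    return h

def iterate_wide(seed: bytes, rounds: int) -> bytes:
    # Feed back the previous *two* results: h_{i+1} = H(h_i || h_{i-1}).
    # Effectively a wider internal state, which is the "cheating" noted above.
    prev, cur = seed, hashlib.sha256(seed).digest()
    for _ in range(rounds - 1):
        prev, cur = cur, hashlib.sha256(cur + prev).digest()
    return cur

print(iterate_narrow(b"seed", 1000).hex())
print(iterate_wide(b"seed", 1000).hex())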
- Marsh