All hash functions I'm aware of consist of an inner compression function that hashes a fixed-size block of data into a smaller fixed-size block, and an outer composition function that applies the inner function iteratively to the variable-length data being hashed. Essentially, you're proposing a modification to the outer layer of the hash construction.
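For concreteness, here is a toy sketch of that two-layer structure in Python. The compression function shown is illustrative only (it is not the real inner function of SHA-1 or MD5), and the padding is a simplified length-strengthening pad, not any standard's exact format:

```python
import hashlib

def merkle_damgard(message: bytes, block_size: int = 64) -> bytes:
    """Toy outer composition: iterate a fixed-size compression
    function over fixed-size blocks of the length-padded message."""

    def compress(state: bytes, block: bytes) -> bytes:
        # Illustrative stand-in for a real compression function:
        # maps (20-byte state, 64-byte block) -> 20-byte state.
        return hashlib.sha1(state + block).digest()

    # Simplified length-strengthening pad: 0x80, zeros, 64-bit length.
    padded = message + b"\x80"
    padded += b"\x00" * (-(len(padded) + 8) % block_size)
    padded += len(message).to_bytes(8, "big")

    state = b"\x00" * 20  # fixed initial value (IV)
    for i in range(0, len(padded), block_size):
        state = compress(state, padded[i:i + block_size])
    return state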
All of the standard hash functions since MD4 have been constructed so that a collision in the inner compression function is likely to lead to a collision in the hash function. MD2 did not have that property. It computed a cheap checksum of the variable-length data in parallel with the digesting process and digested the checksum following the data. I have often wondered whether such a cheap addition would strengthen the newer hashes. (It would fix the suffixing attacks that motivated the development of HMAC.) It's not obvious whether this would make the functions more secure or just make them harder to analyze. Perhaps someone from the research community could comment on why the checksum was removed in the evolution from MD2 to MD4.

Your proposed encoding has the disadvantage that it would require two passes over the message being digested. This would be bad news for hardware implementations and should be avoided if possible.

You note with the construction: H'(x)=Random || H(Random || x) (reminiscent of the salted hash calculation for UNIX passwords) that the hash gets longer. The hash need not get longer. If you have 40 random bits and the first 120 bits of H(Random || x), you match the size of SHA-1 and get improved security against most practical attacks. If your system depends on a fixed-length hash, you're in trouble already, because the fixed length is probably 128 bits and the world is headed toward 256.

A problem that does exist with this construction is that some uses of hash functions assume that if you hash the same data you get the same hash (or indirectly, that if you sign the same data you get the same signature). In particular, you now need separate functions for generating a hash and for checking one.
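A sketch of that truncated variant, including the separate generate/check functions the last paragraph calls for. The function names are mine; SHA-1 stands in for H, with 40 random bits plus the first 120 bits of the digest giving a 160-bit tag, the same length as plain SHA-1:

```python
import hashlib
import os

def salted_truncated_hash(message: bytes) -> bytes:
    """Return Random || first 120 bits of H(Random || message).
    40 + 120 = 160 bits total, matching SHA-1's output length."""
    salt = os.urandom(5)                          # 40 random bits
    digest = hashlib.sha1(salt + message).digest()
    return salt + digest[:15]                     # 5 + 15 = 20 bytes

def check_salted_truncated_hash(message: bytes, tag: bytes) -> bool:
    """Verification must be a separate function: extract the embedded
    salt and recompute, since hashing is no longer deterministic."""
    salt, truncated = tag[:5], tag[5:]
    return hashlib.sha1(salt + message).digest()[:15] == truncated
```

Note that two calls to `salted_truncated_hash` on the same message yield different tags, which is exactly the property that breaks uses assuming hash-equals-hash.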
--Charlie Kaufman

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Ben Laurie
Sent: Monday, March 21, 2005 3:57 AM
To: Cryptography; [EMAIL PROTECTED]
Subject: Propping up SHA-1 (or MD5)

It was suggested at the SAAG meeting at the Minneapolis IETF that a way to deal with weakness in hash functions was to create a new hash function from the old like so:

H'(x) = Random || H(Random || x)

However, this allows an attacker to play with Random (the advice I've seen is that if one is going to use an IV with a hash function, then one should transfer the IV with integrity checks to deny attackers this freedom).

Another objection is that this construction changes the API at the sender end, which could lead to a great deal of complexity when the use of the hash API is deeply embedded. A third is that the length of the hash is changed, which could break existing protocols.

Musing on these points, I wondered about the construction:

H'(x) = H(H(x) || H(H(x) || x))

which doesn't allow an attacker any choice, doesn't change APIs and doesn't change the length of the hash. Does this have any merit? Note that this is essentially an HMAC where the key is H(x). I omitted the padding because it seems to me that this actually makes HMAC weaker against the current attacks.

Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html
http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
---------------------------------------------------------------------
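For reference, the construction Ben proposes in the quoted message can be sketched as follows. SHA-1 stands in for H, and the function names are illustrative; note that, unlike the salted variant, this is deterministic, keeps the output length of H, and keeps the one-argument API:

```python
import hashlib

def h(data: bytes) -> bytes:
    """H: here SHA-1, standing in for the underlying hash."""
    return hashlib.sha1(data).digest()

def h_prime(x: bytes) -> bytes:
    """Ben's construction: H'(x) = H( H(x) || H( H(x) || x ) ).
    Essentially HMAC with key H(x), minus HMAC's key padding.
    Computing H(x) first is why two passes over x are needed."""
    hx = h(x)
    return h(hx + h(hx + x))
```

The inner H(x) must be finished before the outer pass over x can begin, which is the two-pass property Charlie objects to above.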