While I agree that it is somewhat lax of Apple to be using MD5 for checking its updates it's far from clear to me that an attack of the sort described above would ever be practical. The problem is that the while there are methods for finding has collisions by brute force, not only do they require a work factor around the size of the square root of the output space, and a large amount of storage, but the work factor is multiplied by a factor linear in the size of the data that follows the part of the message that you can manipulate. When all you are doing is trying to find a single collision on the hash function (or perhaps trying to forge a matching pair of small key exchange messages) this can be made quite small but when it comes to breaking a package distribution system it is much harder.
Consider that the real package is represented by M, the subtly modified version is M1 and the bogus package is represented by M2. Each of these has some high resolution TIFF image T that we can get away with tweaking into T1 and T2. We probably don't have control where this image comes out in the package because the file ordering is deterministic.
The file ordering may be deterministic, but someone who is well versed in the configuration control and release engineering process might well be able to have a chosen file placed at the end of of the package. The method for getting stuff at the end needn't be perfect. The attacker can keep trying until he succeeds.
The package can be broken down into three parts: the bit before the tweakable part, the tweakable image and the bit after that:
M = M1a | T | M1b
M1 = M1a | T1 | M1b
M2 = M2a | T2 | M2b
For the purposes of finding a collision we can pre-compute the hash blocks of M1a and M2a but we can't do the same for M1b and M2b. This means if L(x) is the function for the number of hash blocks in message x then the cost of finding a hash collision in our packages is L(T1|M1b) + L(T2|M2b) times harder than just finding a simple collision by brute force. In MD5 the blocks are 64 bytes, so for every 1K in M1b the process of finding a collision gets 16 times harder. Finding a collision in SHA1 is about 2^16 times harder than finding a collision in MD5, so unless you find something that you can alter without it being noticed in the last 2MB of the package then this sort of forgery is actually going to be harder than finding a simple hash collision in SHA1! In practice you'll probably find something that you can alter in the last few hundred KB but still the raw processing cost will be a few orders of magnitude harder than a simple hash collision problem. Furthermore, since you have to process the "tail" of the message any specialised hardware for doing this will have a hard time doing it's processing in a usefully pipelined fashion and will need access to large amounts of very fast RAM to store the tail of the message, and this will also cost you another order of magnitude or so.
Having a tail 2 MB or longer may make the processing time comparable to finding an SHA1 collision, but it is still a 64-bit problem and thus requires far less memory than finding an SHA1 collision.
I am not saying that my attack is easy, but that it is feasible for a large organization and very dangerous and stealthy if it succeeds. History has shown, over and over and over again, the folly of ignoring cryptographic attacks that are theoretically possible but seem too hard to implement. On the other hand, defending against my attack certainly is easy, just publish an SHA1 (or stronger) hash alongside the MD5 hash.
--------------------------------------------------------------------- The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]