Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
[Moderator's note: I'm ending forwards/posts on this topic, unless someone has something stunningly new to say. --Perry] --- begin forwarded text To: [EMAIL PROTECTED] From: "Arnold G. Reinhold" <[EMAIL PROTECTED]> Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography List-Post: <mailto:[EMAIL PROTECTED]> List-Help: <mailto:[EMAIL PROTECTED]> List-Subscribe: <http://www.vmeng.com/mailman/listinfo/mac_crypto>, <mailto:[EMAIL PROTECTED]> List-Archive: <http://www.vmeng.com/pipermail/mac_crypto/> Date: Tue, 13 Apr 2004 23:53:35 -0400 >On 13 Apr 2004, at 4:04, R. A. Hettinga wrote: >>From: "Joseph Ashwood" <[EMAIL PROTECTED]> >>>It's not clear to me that you need all this complexity. All you need >>>if to arrange that the attacker does not know exactly what will be >>>signed until it has been signed. So you append some randomness from a >>>good random number source to the end of the file just before you sign >>>it, and you're safe. >> >>I'm not quite sure that's a good solution, that random tail provides exactly >>what the attacker needs to make this as easy as possible. Since the random >>tail cannot be know beforehand it cannot be known by the user of the patch. >>If anything this would actually make an attack easier. It is only if the >>random data is from a _bad_ random source that you might actually gain some >>security (a bad source would at the very least have redundancy, internal or >>external, that could be verified by the end user, making it more complex to >>compute valid numbers). Instead it would probably be more useful to include >>the same random number between each file, this should short circuit all but >>the most fatal of hash flaws, but might open up other possibilities (I don't >>have the time right now to prove things about it). > >It's true that the putting randomness on the tail is a bad idea if >the attacker stands any chance of controlling the supposedly random >data. That's why you need to buy a good solid hardware security >module with the ability to limit the use of the signing keys to only >be used by your custom code that adds hardware generated random >padding! > I suppose it is educational to think about different ways to solve a problem like this, but this is getting silly. All that is needed to prevent my attack is to use calculate and publish the SHA1 hash along with the MD5 hash. Easy as can be. No special hardware required. Arnold Reinhold ___ mac_crypto mailing list [EMAIL PROTECTED] http://www.vmeng.com/mailman/listinfo/mac_crypto --- end forwarded text -- - R. A. Hettinga The Internet Bearer Underwriting Corporation <http://www.ibuc.com/> 44 Farquhar Street, Boston, MA 02131 USA "... however it may deserve respect for its usefulness and antiquity, [predicting the end of the world] has not been found agreeable to experience." -- Edward Gibbon, 'Decline and Fall of the Roman Empire' - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
On Mon, Apr 12, 2004 at 06:00:26PM -0700, Joseph Ashwood wrote: > > From: Nicko van Someren <[EMAIL PROTECTED]> > > > > It's not clear to me that you need all this complexity. All you need > > if to arrange that the attacker does not know exactly what will be > > signed until it has been signed. So you append some randomness from a > > good random number source to the end of the file just before you sign > > it, and you're safe. > > I'm not quite sure that's a good solution, that random tail provides exactly > what the attacker needs to make this as easy as possible. Since the random > tail cannot be know beforehand it cannot be known by the user of the patch. > If anything this would actually make an attack easier. It is only if the > random data is from a _bad_ random source that you might actually gain some > security (a bad source would at the very least have redundancy, internal or > external, that could be verified by the end user, making it more complex to > compute valid numbers). Instead it would probably be more useful to include > the same random number between each file, this should short circuit all but > the most fatal of hash flaws, but might open up other possibilities (I don't > have the time right now to prove things about it). You and Nicko are discussing different attacks. One attack is "Find two patches which have the same MD5". This falls to a birthday-paradox attack with 2^64 work, and is somewhat easier if the patch format includes a random tail. (The collision attack needs 64 bits of controllable input in the message.) Short of a compromise of the developer's machine, it's hard to see how this type of hash collision can be leveraged into getting a malicious patch distributed as genuine. Another attack is "Submit a carefully sculpted change through normal channels which results in the new patch having an attacker-determined hash value". The work factor is somewhat difficult to estimate, but is probably a significant multiple higher than 2^64, primarily due to the more-complex iterator due to having to emulate the workflow of the developer when computing hash values. In this attack, the assumption is that the attacker submits a diff (a very innocuous one, but with a special structure), which is integrated, and the result is a (binary) patch which is signed by the developer. The attacker was able to craft the diff so as to cause MD5(patchA) to match MD5(patchB) and then distributes patchB with the signature from patchA. This second attack is made infeasible by simply appending 64 bits of randomness to the patch before signing it. Since the attacker could not possibly know those bits, so his work required increases by a *power* (I believe -- haven't done the math carefully) of 64. These are straightforward extensions of Van Oorschot & Wiener's 1994 paper "Parallel Collision Search with Application to Hash Functions and Discrete Logarithms". The obvious conclusion is that nobody should be using MD5 any more; follow the OpenBSD project's lead and distribute SHA1 alongside your MD5 distribution. Or SHA-256 if you're really paranoid. > On a related note does anyone happen to know of any useful papers on > patching, specifically patch integrity/source verification? I don't see how this is any different from the generic source validation problem. -andy - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
Sorry about being late to the party, I've been a bit busy lately. > From: Nicko van Someren <[EMAIL PROTECTED]> > Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to > authenticate software releases > To: [EMAIL PROTECTED] > Sender: [EMAIL PROTECTED] > Reply-To: [EMAIL PROTECTED] > List-Id: Macintosh Cryptography > List-Post: <mailto:[EMAIL PROTECTED]> > List-Help: <mailto:[EMAIL PROTECTED]> > List-Subscribe: <http://www.vmeng.com/mailman/listinfo/mac_crypto>, > <mailto:[EMAIL PROTECTED]> > List-Archive: <http://www.vmeng.com/pipermail/mac_crypto/> > Date: Wed, 7 Apr 2004 12:53:56 +0100 > > It's not clear to me that you need all this complexity. All you need > if to arrange that the attacker does not know exactly what will be > signed until it has been signed. So you append some randomness from a > good random number source to the end of the file just before you sign > it, and you're safe. I'm not quite sure that's a good solution, that random tail provides exactly what the attacker needs to make this as easy as possible. Since the random tail cannot be know beforehand it cannot be known by the user of the patch. If anything this would actually make an attack easier. It is only if the random data is from a _bad_ random source that you might actually gain some security (a bad source would at the very least have redundancy, internal or external, that could be verified by the end user, making it more complex to compute valid numbers). Instead it would probably be more useful to include the same random number between each file, this should short circuit all but the most fatal of hash flaws, but might open up other possibilities (I don't have the time right now to prove things about it). On a related note does anyone happen to know of any useful papers on patching, specifically patch integrity/source verification? Joe - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
--- begin forwarded text From: Nicko van Someren <[EMAIL PROTECTED]> Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases To: [EMAIL PROTECTED] Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography List-Post: <mailto:[EMAIL PROTECTED]> List-Help: <mailto:[EMAIL PROTECTED]> List-Subscribe: <http://www.vmeng.com/mailman/listinfo/mac_crypto>, <mailto:[EMAIL PROTECTED]> List-Archive: <http://www.vmeng.com/pipermail/mac_crypto/> Date: Wed, 7 Apr 2004 12:53:56 +0100 It's not clear to me that you need all this complexity. All you need if to arrange that the attacker does not know exactly what will be signed until it has been signed. So you append some randomness from a good random number source to the end of the file just before you sign it, and you're safe. Nicko On 6 Apr 2004, at 16:43, R. A. Hettinga wrote: > --- begin forwarded text > > Date: Tue, 6 Apr 2004 10:33:54 +0100 > To: "R. A. Hettinga" <[EMAIL PROTECTED]> > Cc: [EMAIL PROTECTED] > Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to > authenticate software releases > User-Agent: Mutt/1.3.28i > From: Zefram <[EMAIL PROTECTED]> ... > This disucssion suggests a simple countermeasure: put something at the > beginning of the package that depends on all the significant content > of the package. For example, the first file in the archive could be > a list of digests of individual files. This means an attacker has to > either process the entire archive in searching for a collision or find > tweakable space not covered by the leading checksum file. > > Having such dead space (not trivially checksummed) is quite likely in > an uncompressed archive. For example, tar pads each file to an > integral > multiple of its block size. However, compression of the entire archive > will make such space more difficult to arrange. > > A more extreme approach that removes that risk entirely is to take > the digest of two concatenated copies of the package. Precalculation > can still be used to speed up the search for a collision, but the > unoptimisable tail is now guaranteed to be at least as long as the > entire package. I see a couple of potentially useful variations on > this: > > 0. publish Digest(Digest(Archive) || Archive) > (similar to Digest(Archive || Archive)) > > 1. publish Digest(Archive || Tail) where Tail is large and public > (increases difficulty by a fixed amount that depends on > the length of Tail -- suitable for small packages) > > Are these approaches sound? I'm a crypto-plumber, not a cryptographer. > > I'm wondering whether we can define a new type of cryptographic > primitive > here: the non-precomputable message digest. The interesting feature of > an NPCMD would be that computing the digests of two related messages > cannot be optimised to take any less time than the computation of the > digests of two unrelated messages of the same sizes. The suggestions > I made above are clearly not NPCMDs, but if one does > > M[0] = M ; original message > M[i+1] = Digest(M[i]) || M[i] > D = Digest(M[n]) ; final digest > > with Digest being a good conventional message digest, then this > appears to > approach a NPCMD as n approaches infinity. Of course, computation time > for this construction is linear in n (for a particular message size), > so this is not a practical way to achieve this goal. Can a NPCMD be > done in a more reasonable time? I imagine structures that might lead > to O(l*log(l)) time, where l is the message length, but that's just > my speculation. > > -zefram > > --- end forwarded text ___ mac_crypto mailing list [EMAIL PROTECTED] http://www.vmeng.com/mailman/listinfo/mac_crypto --- end forwarded text -- - R. A. Hettinga The Internet Bearer Underwriting Corporation <http://www.ibuc.com/> 44 Farquhar Street, Boston, MA 02131 USA "... however it may deserve respect for its usefulness and antiquity, [predicting the end of the world] has not been found agreeable to experience." -- Edward Gibbon, 'Decline and Fall of the Roman Empire' - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
> > But if you are given the choice between using MD5 and SHA1, I'd prefer > > SHA1, but I wouldn't be concerned with someone using MD5 isntead of SHA1 > > for the time being. In other words, if I were to do a risk analysis, I would > > identify > > the use of MD5 instead of SHA1 as one of the major risks. > > > > "were" or "were not"? I wanted to write "I would *not* identify the use of MD5 instead of SHA1 as one of the major risks". In other words, using MD5 instead of SHA1 would be low risk compared to the other threats that exist. Sorry, the mistake changes to whole sense of the phrase. --Anton - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
R. A. Hettinga wrote: > In practice you'll probably find something that you can alter in >the last few hundred KB but still the raw processing cost will be a few >orders of magnitude harder than a simple hash collision problem. [etc.] This disucssion suggests a simple countermeasure: put something at the beginning of the package that depends on all the significant content of the package. For example, the first file in the archive could be a list of digests of individual files. This means an attacker has to either process the entire archive in searching for a collision or find tweakable space not covered by the leading checksum file. Having such dead space (not trivially checksummed) is quite likely in an uncompressed archive. For example, tar pads each file to an integral multiple of its block size. However, compression of the entire archive will make such space more difficult to arrange. A more extreme approach that removes that risk entirely is to take the digest of two concatenated copies of the package. Precalculation can still be used to speed up the search for a collision, but the unoptimisable tail is now guaranteed to be at least as long as the entire package. I see a couple of potentially useful variations on this: 0. publish Digest(Digest(Archive) || Archive) (similar to Digest(Archive || Archive)) 1. publish Digest(Archive || Tail) where Tail is large and public (increases difficulty by a fixed amount that depends on the length of Tail -- suitable for small packages) Are these approaches sound? I'm a crypto-plumber, not a cryptographer. I'm wondering whether we can define a new type of cryptographic primitive here: the non-precomputable message digest. The interesting feature of an NPCMD would be that computing the digests of two related messages cannot be optimised to take any less time than the computation of the digests of two unrelated messages of the same sizes. The suggestions I made above are clearly not NPCMDs, but if one does M[0] = M ; original message M[i+1] = Digest(M[i]) || M[i] D = Digest(M[n]) ; final digest with Digest being a good conventional message digest, then this appears to approach a NPCMD as n approaches infinity. Of course, computation time for this construction is linear in n (for a particular message size), so this is not a practical way to achieve this goal. Can a NPCMD be done in a more reasonable time? I imagine structures that might lead to O(l*log(l)) time, where l is the message length, but that's just my speculation. -zefram - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
On 5 Apr 2004, at 23:43, Arnold G. Reinhold wrote: Having a tail 2 MB or longer may make the processing time comparable to finding an SHA1 collision, but it is still a 64-bit problem and thus requires far less memory than finding an SHA1 collision. Just because SHA-1 is O(2^80) and this problem is O(2^64) does NOT necessarily mean that this problem is easier in practice. Complexity orders are helpful for comparing like with like, and helpful when working out what's best in the limit as the sizes go up, but they give no immediate information about the absolute cost. Incidentally, there are attacks on hash functions that do not require masses of memory. The amount of memory only determines how much effort you have to apply to find the exact solution once you've found where to look. In particular, changing the amount of memory does not change the O() order of the complexity! Upon reflection, if the attacker can arrange the specific case in which the original and the bogus files are exactly the same length, and that all the information in the file after the part that the "tweakable" section is the the same in both files, then you can make use of the limited memory of the hash functions and look for collisions right after the tweakable part, which means you get to ignore the length of the tail of the file. In these circumstances the attack you describe is probably practical, at least for governments if not for Microsoft. Nicko - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
At 4:51 PM +0100 4/5/04, Nicko van Someren wrote: ... While I agree that it is somewhat lax of Apple to be using MD5 for checking its updates it's far from clear to me that an attack of the sort described above would ever be practical. The problem is that the while there are methods for finding has collisions by brute force, not only do they require a work factor around the size of the square root of the output space, and a large amount of storage, but the work factor is multiplied by a factor linear in the size of the data that follows the part of the message that you can manipulate. When all you are doing is trying to find a single collision on the hash function (or perhaps trying to forge a matching pair of small key exchange messages) this can be made quite small but when it comes to breaking a package distribution system it is much harder. Consider that the real package is represented by M, the subtly modified version is M1 and the bogus package is represented by M2. Each of these has some high resolution TIFF image T that we can get away with tweaking into T1 and T2. We probably don't have control where this image comes out in the package because the file ordering is deterministic. The file ordering may be deterministic, but someone who is well versed in the configuration control and release engineering process might well be able to have a chosen file placed at the end of of the package. The method for getting stuff at the end needn't be perfect. The attacker can keep trying until he succeeds. The package can be broken down into three parts: the bit before the tweakable part, the tweakable image and the bit after that: M = M1a | T | M1b M1 = M1a | T1 | M1b M2 = M2a | T2 | M2b For the purposes of finding a collision we can pre-compute the hash blocks of M1a and M2a but we can't do the same for M1b and M2b. This means if L(x) is the function for the number of hash blocks in message x then the cost of finding a hash collision in our packages is L(T1|M1b) + L(T2|M2b) times harder than just finding a simple collision by brute force. In MD5 the blocks are 64 bytes, so for every 1K in M1b the process of finding a collision gets 16 times harder. Finding a collision in SHA1 is about 2^16 times harder than finding a collision in MD5, so unless you find something that you can alter without it being noticed in the last 2MB of the package then this sort of forgery is actually going to be harder than finding a simple hash collision in SHA1! In practice you'll probably find something that you can alter in the last few hundred KB but still the raw processing cost will be a few orders of magnitude harder than a simple hash collision problem. Furthermore, since you have to process the "tail" of the message any specialised hardware for doing this will have a hard time doing it's processing in a usefully pipelined fashion and will need access to large amounts of very fast RAM to store the tail of the message, and this will also cost you another order of magnitude or so. Having a tail 2 MB or longer may make the processing time comparable to finding an SHA1 collision, but it is still a 64-bit problem and thus requires far less memory than finding an SHA1 collision. I am not saying that my attack is easy, but that it is feasible for a large organization and very dangerous and stealthy if it succeeds. History has shown, over and over and over again, the folly of ignoring cryptographic attacks that are theoretically possible but seem too hard to implement. On the other hand, defending against my attack certainly is easy, just publish an SHA1 (or stronger) hash alongside the MD5 hash. Arnold Reinhold - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
The attacks by Dobbertin on MD5 only allow to find collisions in the compression function, not the whole MD5 hash. But it is a sign that something might be fishy about MD5. MD5 output is 128 bits. There are two types of collision finding attacks that can be applied. In the first you are given a hash value y = H(x), for some x, and try to find a different input x' that hashes to the same output: H(x) = H(x') = y. This relates to 2nd-preimage resistance. This can be done on MD5 in 2^128 work factor. The other attack is to find to arbitrary inputs x, x' such that H(x) = H(x'). This relates to collision resistance. This can be done with good probability in 2^64 work factor. Now, the problem of having a malicious source code hash to the same value as good/valid source code seems to be related more to the former, that is you have some code that is checked-in, that gives some hash value Y, and you want to find a different code (malicious one) that hashes to the same value. You might be able to play with the valid code as well, giving you more flexibility for the search of a collision, but you can't play to much without having this noticed by other developers. I think that there are many other problems that are more of concern. For example hacking a web site (or mirror site) that contains code for download, and changing the code along with the hash value of the code, or preventing a developer from inserting some kind of trap door or Trojan. But if you are given the choice between using MD5 and SHA1, I'd prefer SHA1, but I wouldn't be concerned with someone using MD5 isntead of SHA1 for the time being. In other words, if I were to do a risk analysis, I would identify the use of MD5 instead of SHA1 as one of the major risks. --Anton - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
--- begin forwarded text To: [EMAIL PROTECTED] From: Vinnie Moscaritolo <[EMAIL PROTECTED]> Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography List-Post: <mailto:[EMAIL PROTECTED]> List-Help: <mailto:[EMAIL PROTECTED]> List-Subscribe: <http://www.vmeng.com/mailman/listinfo/mac_crypto>, <mailto:[EMAIL PROTECTED]> List-Archive: <http://www.vmeng.com/pipermail/mac_crypto/> Date: Mon, 5 Apr 2004 08:10:26 -0800 one more thing for all it's worth.. MD5 is not a FIPS-140-2 approved algorithm. http://csrc.nist.gov/cryptval/ this would technically prevent osx from being used in any Federal or Mil environment. Apple will eventually have to address this concern. At 6:17 AM -0500 4/4/04, Arnold G. Reinhold wrote: >The cryptographic hash function MD5 has long been used to >authenticate software packages, particularly in the Linux/Unix/open >source community. This has carried over to Apple's OS-X. The MD5 >hash of an entire package is calculated and its value is transmitted >separately from the package. Users who download the package compute >the hash of the copy they received and match that value against the >original. -- Vinnie Moscaritolo ITCB-IMSH PGP: 3F903472C3AF622D5D918D9BD8B100090B3EF042 --- "When the pin is pulled, Mr. Grenade is not our friend." - USMC training bulletin. ___ mac_crypto mailing list [EMAIL PROTECTED] http://www.vmeng.com/mailman/listinfo/mac_crypto --- end forwarded text -- - R. A. Hettinga The Internet Bearer Underwriting Corporation <http://www.ibuc.com/> 44 Farquhar Street, Boston, MA 02131 USA "... however it may deserve respect for its usefulness and antiquity, [predicting the end of the world] has not been found agreeable to experience." -- Edward Gibbon, 'Decline and Fall of the Roman Empire' - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
--- begin forwarded text From: Nicko van Someren <[EMAIL PROTECTED]> Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases To: [EMAIL PROTECTED] Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography List-Post: <mailto:[EMAIL PROTECTED]> List-Help: <mailto:[EMAIL PROTECTED]> List-Subscribe: <http://www.vmeng.com/mailman/listinfo/mac_crypto>, <mailto:[EMAIL PROTECTED]> List-Archive: <http://www.vmeng.com/pipermail/mac_crypto/> Date: Mon, 5 Apr 2004 16:51:56 +0100 On 4 Apr 2004, at 12:17, Arnold G. Reinhold wrote: ... > A safer attack would be for the agent to insert an apparently innocent > modification to the package, with the modification selected so that > the MD5 hash of the package with the malicious code matches the hash > of the officially released package. Since the attacker (or whomever he > is working for) controls the malicious code, calculating the value of > this modification is subject to a meet-in-the-middle attack and > presents presents a 64-bit problem. Solving such a problem is within > the means of a well-funded attacker today* and will become easier in > the future. > > The modification could be designed to get past code reviews in a > number of ways. For example, 64 low order bits in a JPEG icon might be > altered. The agent would have to make the last modification to the > software package prior to release and perhaps send a final pre-release > version of the package to someone on the outside who does the > collision calculation, but those are hardly insurmountable hurdles. > In situations where new releases are relatively frequent, it may > suffice for this attack to succeed only occasionally, allowing > periodic entry into selected systems to recover private keys, for > example. The attacker merely submits modifications late in the > release cycle and if his happens to be last then the full attack is > mounted. While I agree that it is somewhat lax of Apple to be using MD5 for checking its updates it's far from clear to me that an attack of the sort described above would ever be practical. The problem is that the while there are methods for finding has collisions by brute force, not only do they require a work factor around the size of the square root of the output space, and a large amount of storage, but the work factor is multiplied by a factor linear in the size of the data that follows the part of the message that you can manipulate. When all you are doing is trying to find a single collision on the hash function (or perhaps trying to forge a matching pair of small key exchange messages) this can be made quite small but when it comes to breaking a package distribution system it is much harder. Consider that the real package is represented by M, the subtly modified version is M1 and the bogus package is represented by M2. Each of these has some high resolution TIFF image T that we can get away with tweaking into T1 and T2. We probably don't have control where this image comes out in the package because the file ordering is deterministic. The package can be broken down into three parts: the bit before the tweakable part, the tweakable image and the bit after that: M = M1a | T | M1b M1 = M1a | T1 | M1b M2 = M2a | T2 | M2b For the purposes of finding a collision we can pre-compute the hash blocks of M1a and M2a but we can't do the same for M1b and M2b. This means if L(x) is the function for the number of hash blocks in message x then the cost of finding a hash collision in our packages is L(T1|M1b) + L(T2|M2b) times harder than just finding a simple collision by brute force. In MD5 the blocks are 64 bytes, so for every 1K in M1b the process of finding a collision gets 16 times harder. Finding a collision in SHA1 is about 2^16 times harder than finding a collision in MD5, so unless you find something that you can alter without it being noticed in the last 2MB of the package then this sort of forgery is actually going to be harder than finding a simple hash collision in SHA1! In practice you'll probably find something that you can alter in the last few hundred KB but still the raw processing cost will be a few orders of magnitude harder than a simple hash collision problem. Furthermore, since you have to process the "tail" of the message any specialised hardware for doing this will have a hard time doing it's processing in a usefully pipelined fashion and will need access to large amounts of very fast RAM to store the tail of the message, and this will also cost you another order of magnitude or so. Of course none of this is an argument for not using SHA1. All of these complexities apply just as well to SHA1, meaning that forging packages checked with SHA1 will be four or five orders of magnitude harder than finding a sin
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
Dobbertin's 1996 collision demonstration is another good reason not to use md5, but is obviously hasn't gotten the open source community or Apple to stop. Whether my attack will be any more successful in effecting change remains to be seen. Publishing SHA1 hashes in parallel with md5 seems like such an inexpensive thing to do, but one should never underestimate cryptographic inertia. For the record, I first published my attack on Perry Metzger's cryptography list in February, 2002. Arnold Reinhold At 5:56 PM -0400 4/4/04, Don Davis wrote: hi, mr. reinhold -- there's stronger reason than the ones you cite, to distrust md5 as a message-digest. see these old sci.crypt threads, and the google-search below, for discussions of hans dobbertin's 1996 crack of md5: http://tinyurl.com/2ox7g http://tinyurl.com/3x446 http://google.com/search?q=dobbertin+md5&num=30 btw, in a phone conversation, dobbertin emphasized to me that his attack only works when md5 is used as a message-digest; it doesn't work when md5 is used with a key to prepare a MAC. he also mentioned that while sha-1 may be vulnerable to an attack of a similar style (because sha-1 is similar in struc- ture to md5), he himself was forbiddden by german law to work to cryptanalyze sha-1, because he worked at that time for the german federal security service, and so wasn't allowed to attack the USG's standard ciphers. now he's at ruhr university (in bochum), but i don't know whether he's more of a free agent. - don davis, boston To: [EMAIL PROTECTED] From: "Arnold G. Reinhold" <[EMAIL PROTECTED]> Subject: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography List-Archive: <http://www.vmeng.com/pipermail/mac_crypto/> Date: Sun, 4 Apr 2004 06:17:55 -0500 The cryptographic hash function MD5 has long been used to authenticate software packages, particularly in the Linux/Unix/open source community. This has carried over to Apple's OS-X. The MD5 hash of an entire package is calculated and its value is transmitted separately from the package. Users who download the package compute the hash of the copy they received and match that value against the original. ... - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED] - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
hi, mr. reinhold -- there's stronger reason than the ones you cite, to distrust md5 as a message-digest. see these old sci.crypt threads, and the google-search below, for discussions of hans dobbertin's 1996 crack of md5: http://tinyurl.com/2ox7g http://tinyurl.com/3x446 http://google.com/search?q=dobbertin+md5&num=30 btw, in a phone conversation, dobbertin emphasized to me that his attack only works when md5 is used as a message-digest; it doesn't work when md5 is used with a key to prepare a MAC. he also mentioned that while sha-1 may be vulnerable to an attack of a similar style (because sha-1 is similar in struc- ture to md5), he himself was forbiddden by german law to work to cryptanalyze sha-1, because he worked at that time for the german federal security service, and so wasn't allowed to attack the USG's standard ciphers. now he's at ruhr university (in bochum), but i don't know whether he's more of a free agent. - don davis, boston > To: [EMAIL PROTECTED] > From: "Arnold G. Reinhold" <[EMAIL PROTECTED]> > Subject: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate > software > releases > Sender: [EMAIL PROTECTED] > Reply-To: [EMAIL PROTECTED] > List-Id: Macintosh Cryptography > List-Archive: <http://www.vmeng.com/pipermail/mac_crypto/> > Date: Sun, 4 Apr 2004 06:17:55 -0500 > > The cryptographic hash function MD5 has long been used to > authenticate software packages, particularly in the Linux/Unix/open > source community. This has carried over to Apple's OS-X. The MD5 hash > of an entire package is calculated and its value is transmitted > separately from the package. Users who download the package compute > the hash of the copy they received and match that value against the > original. ... - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
[Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
--- begin forwarded text To: [EMAIL PROTECTED] From: "Arnold G. Reinhold" <[EMAIL PROTECTED]> Subject: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography List-Post: <mailto:[EMAIL PROTECTED]> List-Help: <mailto:[EMAIL PROTECTED]> List-Subscribe: <http://www.vmeng.com/mailman/listinfo/mac_crypto>, <mailto:[EMAIL PROTECTED]> List-Archive: <http://www.vmeng.com/pipermail/mac_crypto/> Date: Sun, 4 Apr 2004 06:17:55 -0500 The cryptographic hash function MD5 has long been used to authenticate software packages, particularly in the Linux/Unix/open source community. This has carried over to Apple's OS-X. The MD5 hash of an entire package is calculated and its value is transmitted separately from the package. Users who download the package compute the hash of the copy they received and match that value against the original. Putting aside the question of how the the hash value is safely transmitted, there is a potential attack on this method due to the 128 bit length of the MD5 hash output. If all the individuals having input to the creation of the original software package are trustworthy, then 128 bits provides adequate security. Someone trying to substitute a version of that package containing a malicious modification (Trojan horse, virus, backdoor) would have to solve a 128 bit problem to create an infected package that passed the hash verification. That is considered computationally infeasible, at least until the advent of quantum computing. One might think the above argument proves MD5 is sufficient. After all, if an attacker had an agent working inside the organization that produced the package, that agent could simply incorporate the malicious software patch in the original package. However such an insertion is very risky. A sophisticated software company would likely have code reviews that would make introduction of the malicious code difficult. Use of a source control system makes is easy to track down whoever inserted the malicious change once it is discovered. The malicious code would be distributed to everyone, increasing the likelihood of detection. In an open source model, a defect in source code is particularly hard to hide. The agent risks being uncovered and and perhaps prosecuted, the organization he works for risks being identified and the technical means that the malicious code employed would be compromised. A safer attack would be for the agent to insert an apparently innocent modification to the package, with the modification selected so that the MD5 hash of the package with the malicious code matches the hash of the officially released package. Since the attacker (or whomever he is working for) controls the malicious code, calculating the value of this modification is subject to a meet-in-the-middle attack and presents presents a 64-bit problem. Solving such a problem is within the means of a well-funded attacker today* and will become easier in the future. The modification could be designed to get past code reviews in a number of ways. For example, 64 low order bits in a JPEG icon might be altered. The agent would have to make the last modification to the software package prior to release and perhaps send a final pre-release version of the package to someone on the outside who does the collision calculation, but those are hardly insurmountable hurdles. In situations where new releases are relatively frequent, it may suffice for this attack to succeed only occasionally, allowing periodic entry into selected systems to recover private keys, for example. The attacker merely submits modifications late in the release cycle and if his happens to be last then the full attack is mounted. The obvious solution to this problem is to use a wider hash for package authentication. For example, SHA-256 would present an attacker using this approach with a 128-bit problem. Even SHA1 would be preferable, making such an attack an 80 bit problem. If both MD5 and SHA1 hashes are provided, the attacker faces the problem of forging them both. It costs almost nothing to provide a wider hash along with the MD5 hash whenever a new package is released. It seems the prudent thing for Apple to do. Arnold Reinhold * From: http://www.rsasecurity.com/rsalabs/faq/3-6-6.html "Van Oorschot and Wiener [VW94] have considered a brute-force search for collisions (see Question 2.1.6) in hash functions, and they estimate a collision search machine designed specifically for MD5 (costing $10 million in 1994) could find a collision for MD5 in 24 days on average. The general techniques can be applied to other hash functions." VW94] P. van Oorschot and M. Wiener, Parallel collision search with application to hash functions and discrete logarithms, Proceedings of 2nd ACM Conference on Computer and Communication Security(1994). *