Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
--- begin forwarded text From: Nicko van Someren [EMAIL PROTECTED] Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases To: [EMAIL PROTECTED] Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography mac_crypto.vmeng.com List-Post: mailto:[EMAIL PROTECTED] List-Help: mailto:[EMAIL PROTECTED] List-Subscribe: http://www.vmeng.com/mailman/listinfo/mac_crypto, mailto:[EMAIL PROTECTED] List-Archive: http://www.vmeng.com/pipermail/mac_crypto/ Date: Wed, 7 Apr 2004 12:53:56 +0100 It's not clear to me that you need all this complexity. All you need if to arrange that the attacker does not know exactly what will be signed until it has been signed. So you append some randomness from a good random number source to the end of the file just before you sign it, and you're safe. Nicko On 6 Apr 2004, at 16:43, R. A. Hettinga wrote: --- begin forwarded text Date: Tue, 6 Apr 2004 10:33:54 +0100 To: R. A. Hettinga [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases User-Agent: Mutt/1.3.28i From: Zefram [EMAIL PROTECTED] ... This disucssion suggests a simple countermeasure: put something at the beginning of the package that depends on all the significant content of the package. For example, the first file in the archive could be a list of digests of individual files. This means an attacker has to either process the entire archive in searching for a collision or find tweakable space not covered by the leading checksum file. Having such dead space (not trivially checksummed) is quite likely in an uncompressed archive. For example, tar pads each file to an integral multiple of its block size. However, compression of the entire archive will make such space more difficult to arrange. A more extreme approach that removes that risk entirely is to take the digest of two concatenated copies of the package. Precalculation can still be used to speed up the search for a collision, but the unoptimisable tail is now guaranteed to be at least as long as the entire package. I see a couple of potentially useful variations on this: 0. publish Digest(Digest(Archive) || Archive) (similar to Digest(Archive || Archive)) 1. publish Digest(Archive || Tail) where Tail is large and public (increases difficulty by a fixed amount that depends on the length of Tail -- suitable for small packages) Are these approaches sound? I'm a crypto-plumber, not a cryptographer. I'm wondering whether we can define a new type of cryptographic primitive here: the non-precomputable message digest. The interesting feature of an NPCMD would be that computing the digests of two related messages cannot be optimised to take any less time than the computation of the digests of two unrelated messages of the same sizes. The suggestions I made above are clearly not NPCMDs, but if one does M[0] = M ; original message M[i+1] = Digest(M[i]) || M[i] D = Digest(M[n]) ; final digest with Digest being a good conventional message digest, then this appears to approach a NPCMD as n approaches infinity. Of course, computation time for this construction is linear in n (for a particular message size), so this is not a practical way to achieve this goal. Can a NPCMD be done in a more reasonable time? I imagine structures that might lead to O(l*log(l)) time, where l is the message length, but that's just my speculation. -zefram --- end forwarded text ___ mac_crypto mailing list [EMAIL PROTECTED] http://www.vmeng.com/mailman/listinfo/mac_crypto --- end forwarded text -- - R. A. Hettinga mailto: [EMAIL PROTECTED] The Internet Bearer Underwriting Corporation http://www.ibuc.com/ 44 Farquhar Street, Boston, MA 02131 USA ... however it may deserve respect for its usefulness and antiquity, [predicting the end of the world] has not been found agreeable to experience. -- Edward Gibbon, 'Decline and Fall of the Roman Empire' - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
Dobbertin's 1996 collision demonstration is another good reason not to use md5, but is obviously hasn't gotten the open source community or Apple to stop. Whether my attack will be any more successful in effecting change remains to be seen. Publishing SHA1 hashes in parallel with md5 seems like such an inexpensive thing to do, but one should never underestimate cryptographic inertia. For the record, I first published my attack on Perry Metzger's cryptography list in February, 2002. Arnold Reinhold At 5:56 PM -0400 4/4/04, Don Davis wrote: hi, mr. reinhold -- there's stronger reason than the ones you cite, to distrust md5 as a message-digest. see these old sci.crypt threads, and the google-search below, for discussions of hans dobbertin's 1996 crack of md5: http://tinyurl.com/2ox7g http://tinyurl.com/3x446 http://google.com/search?q=dobbertin+md5num=30 btw, in a phone conversation, dobbertin emphasized to me that his attack only works when md5 is used as a message-digest; it doesn't work when md5 is used with a key to prepare a MAC. he also mentioned that while sha-1 may be vulnerable to an attack of a similar style (because sha-1 is similar in struc- ture to md5), he himself was forbiddden by german law to work to cryptanalyze sha-1, because he worked at that time for the german federal security service, and so wasn't allowed to attack the USG's standard ciphers. now he's at ruhr university (in bochum), but i don't know whether he's more of a free agent. - don davis, boston To: [EMAIL PROTECTED] From: Arnold G. Reinhold [EMAIL PROTECTED] Subject: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography mac_crypto.vmeng.com List-Archive: http://www.vmeng.com/pipermail/mac_crypto/ Date: Sun, 4 Apr 2004 06:17:55 -0500 The cryptographic hash function MD5 has long been used to authenticate software packages, particularly in the Linux/Unix/open source community. This has carried over to Apple's OS-X. The MD5 hash of an entire package is calculated and its value is transmitted separately from the package. Users who download the package compute the hash of the copy they received and match that value against the original. ... - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED] - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
--- begin forwarded text From: Nicko van Someren [EMAIL PROTECTED] Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases To: [EMAIL PROTECTED] Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography mac_crypto.vmeng.com List-Post: mailto:[EMAIL PROTECTED] List-Help: mailto:[EMAIL PROTECTED] List-Subscribe: http://www.vmeng.com/mailman/listinfo/mac_crypto, mailto:[EMAIL PROTECTED] List-Archive: http://www.vmeng.com/pipermail/mac_crypto/ Date: Mon, 5 Apr 2004 16:51:56 +0100 On 4 Apr 2004, at 12:17, Arnold G. Reinhold wrote: ... A safer attack would be for the agent to insert an apparently innocent modification to the package, with the modification selected so that the MD5 hash of the package with the malicious code matches the hash of the officially released package. Since the attacker (or whomever he is working for) controls the malicious code, calculating the value of this modification is subject to a meet-in-the-middle attack and presents presents a 64-bit problem. Solving such a problem is within the means of a well-funded attacker today* and will become easier in the future. The modification could be designed to get past code reviews in a number of ways. For example, 64 low order bits in a JPEG icon might be altered. The agent would have to make the last modification to the software package prior to release and perhaps send a final pre-release version of the package to someone on the outside who does the collision calculation, but those are hardly insurmountable hurdles. In situations where new releases are relatively frequent, it may suffice for this attack to succeed only occasionally, allowing periodic entry into selected systems to recover private keys, for example. The attacker merely submits modifications late in the release cycle and if his happens to be last then the full attack is mounted. While I agree that it is somewhat lax of Apple to be using MD5 for checking its updates it's far from clear to me that an attack of the sort described above would ever be practical. The problem is that the while there are methods for finding has collisions by brute force, not only do they require a work factor around the size of the square root of the output space, and a large amount of storage, but the work factor is multiplied by a factor linear in the size of the data that follows the part of the message that you can manipulate. When all you are doing is trying to find a single collision on the hash function (or perhaps trying to forge a matching pair of small key exchange messages) this can be made quite small but when it comes to breaking a package distribution system it is much harder. Consider that the real package is represented by M, the subtly modified version is M1 and the bogus package is represented by M2. Each of these has some high resolution TIFF image T that we can get away with tweaking into T1 and T2. We probably don't have control where this image comes out in the package because the file ordering is deterministic. The package can be broken down into three parts: the bit before the tweakable part, the tweakable image and the bit after that: M = M1a | T | M1b M1 = M1a | T1 | M1b M2 = M2a | T2 | M2b For the purposes of finding a collision we can pre-compute the hash blocks of M1a and M2a but we can't do the same for M1b and M2b. This means if L(x) is the function for the number of hash blocks in message x then the cost of finding a hash collision in our packages is L(T1|M1b) + L(T2|M2b) times harder than just finding a simple collision by brute force. In MD5 the blocks are 64 bytes, so for every 1K in M1b the process of finding a collision gets 16 times harder. Finding a collision in SHA1 is about 2^16 times harder than finding a collision in MD5, so unless you find something that you can alter without it being noticed in the last 2MB of the package then this sort of forgery is actually going to be harder than finding a simple hash collision in SHA1! In practice you'll probably find something that you can alter in the last few hundred KB but still the raw processing cost will be a few orders of magnitude harder than a simple hash collision problem. Furthermore, since you have to process the tail of the message any specialised hardware for doing this will have a hard time doing it's processing in a usefully pipelined fashion and will need access to large amounts of very fast RAM to store the tail of the message, and this will also cost you another order of magnitude or so. Of course none of this is an argument for not using SHA1. All of these complexities apply just as well to SHA1, meaning that forging packages checked with SHA1 will be four or five orders of magnitude harder than finding a single simple SHA1 collision. On the other hand, it tells you that the prospect of any successful hash collision attack on Apple's packages are still some way
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
--- begin forwarded text To: [EMAIL PROTECTED] From: Vinnie Moscaritolo [EMAIL PROTECTED] Subject: Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography mac_crypto.vmeng.com List-Post: mailto:[EMAIL PROTECTED] List-Help: mailto:[EMAIL PROTECTED] List-Subscribe: http://www.vmeng.com/mailman/listinfo/mac_crypto, mailto:[EMAIL PROTECTED] List-Archive: http://www.vmeng.com/pipermail/mac_crypto/ Date: Mon, 5 Apr 2004 08:10:26 -0800 one more thing for all it's worth.. MD5 is not a FIPS-140-2 approved algorithm. http://csrc.nist.gov/cryptval/ this would technically prevent osx from being used in any Federal or Mil environment. Apple will eventually have to address this concern. At 6:17 AM -0500 4/4/04, Arnold G. Reinhold wrote: The cryptographic hash function MD5 has long been used to authenticate software packages, particularly in the Linux/Unix/open source community. This has carried over to Apple's OS-X. The MD5 hash of an entire package is calculated and its value is transmitted separately from the package. Users who download the package compute the hash of the copy they received and match that value against the original. -- Vinnie Moscaritolo ITCB-IMSH PGP: 3F903472C3AF622D5D918D9BD8B100090B3EF042 --- When the pin is pulled, Mr. Grenade is not our friend. - USMC training bulletin. ___ mac_crypto mailing list [EMAIL PROTECTED] http://www.vmeng.com/mailman/listinfo/mac_crypto --- end forwarded text -- - R. A. Hettinga mailto: [EMAIL PROTECTED] The Internet Bearer Underwriting Corporation http://www.ibuc.com/ 44 Farquhar Street, Boston, MA 02131 USA ... however it may deserve respect for its usefulness and antiquity, [predicting the end of the world] has not been found agreeable to experience. -- Edward Gibbon, 'Decline and Fall of the Roman Empire' - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
The attacks by Dobbertin on MD5 only allow to find collisions in the compression function, not the whole MD5 hash. But it is a sign that something might be fishy about MD5. MD5 output is 128 bits. There are two types of collision finding attacks that can be applied. In the first you are given a hash value y = H(x), for some x, and try to find a different input x' that hashes to the same output: H(x) = H(x') = y. This relates to 2nd-preimage resistance. This can be done on MD5 in 2^128 work factor. The other attack is to find to arbitrary inputs x, x' such that H(x) = H(x'). This relates to collision resistance. This can be done with good probability in 2^64 work factor. Now, the problem of having a malicious source code hash to the same value as good/valid source code seems to be related more to the former, that is you have some code that is checked-in, that gives some hash value Y, and you want to find a different code (malicious one) that hashes to the same value. You might be able to play with the valid code as well, giving you more flexibility for the search of a collision, but you can't play to much without having this noticed by other developers. I think that there are many other problems that are more of concern. For example hacking a web site (or mirror site) that contains code for download, and changing the code along with the hash value of the code, or preventing a developer from inserting some kind of trap door or Trojan. But if you are given the choice between using MD5 and SHA1, I'd prefer SHA1, but I wouldn't be concerned with someone using MD5 isntead of SHA1 for the time being. In other words, if I were to do a risk analysis, I would identify the use of MD5 instead of SHA1 as one of the major risks. --Anton - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
At 4:51 PM +0100 4/5/04, Nicko van Someren wrote: ... While I agree that it is somewhat lax of Apple to be using MD5 for checking its updates it's far from clear to me that an attack of the sort described above would ever be practical. The problem is that the while there are methods for finding has collisions by brute force, not only do they require a work factor around the size of the square root of the output space, and a large amount of storage, but the work factor is multiplied by a factor linear in the size of the data that follows the part of the message that you can manipulate. When all you are doing is trying to find a single collision on the hash function (or perhaps trying to forge a matching pair of small key exchange messages) this can be made quite small but when it comes to breaking a package distribution system it is much harder. Consider that the real package is represented by M, the subtly modified version is M1 and the bogus package is represented by M2. Each of these has some high resolution TIFF image T that we can get away with tweaking into T1 and T2. We probably don't have control where this image comes out in the package because the file ordering is deterministic. The file ordering may be deterministic, but someone who is well versed in the configuration control and release engineering process might well be able to have a chosen file placed at the end of of the package. The method for getting stuff at the end needn't be perfect. The attacker can keep trying until he succeeds. The package can be broken down into three parts: the bit before the tweakable part, the tweakable image and the bit after that: M = M1a | T | M1b M1 = M1a | T1 | M1b M2 = M2a | T2 | M2b For the purposes of finding a collision we can pre-compute the hash blocks of M1a and M2a but we can't do the same for M1b and M2b. This means if L(x) is the function for the number of hash blocks in message x then the cost of finding a hash collision in our packages is L(T1|M1b) + L(T2|M2b) times harder than just finding a simple collision by brute force. In MD5 the blocks are 64 bytes, so for every 1K in M1b the process of finding a collision gets 16 times harder. Finding a collision in SHA1 is about 2^16 times harder than finding a collision in MD5, so unless you find something that you can alter without it being noticed in the last 2MB of the package then this sort of forgery is actually going to be harder than finding a simple hash collision in SHA1! In practice you'll probably find something that you can alter in the last few hundred KB but still the raw processing cost will be a few orders of magnitude harder than a simple hash collision problem. Furthermore, since you have to process the tail of the message any specialised hardware for doing this will have a hard time doing it's processing in a usefully pipelined fashion and will need access to large amounts of very fast RAM to store the tail of the message, and this will also cost you another order of magnitude or so. Having a tail 2 MB or longer may make the processing time comparable to finding an SHA1 collision, but it is still a 64-bit problem and thus requires far less memory than finding an SHA1 collision. I am not saying that my attack is easy, but that it is feasible for a large organization and very dangerous and stealthy if it succeeds. History has shown, over and over and over again, the folly of ignoring cryptographic attacks that are theoretically possible but seem too hard to implement. On the other hand, defending against my attack certainly is easy, just publish an SHA1 (or stronger) hash alongside the MD5 hash. Arnold Reinhold - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
[Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
--- begin forwarded text To: [EMAIL PROTECTED] From: Arnold G. Reinhold [EMAIL PROTECTED] Subject: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography mac_crypto.vmeng.com List-Post: mailto:[EMAIL PROTECTED] List-Help: mailto:[EMAIL PROTECTED] List-Subscribe: http://www.vmeng.com/mailman/listinfo/mac_crypto, mailto:[EMAIL PROTECTED] List-Archive: http://www.vmeng.com/pipermail/mac_crypto/ Date: Sun, 4 Apr 2004 06:17:55 -0500 The cryptographic hash function MD5 has long been used to authenticate software packages, particularly in the Linux/Unix/open source community. This has carried over to Apple's OS-X. The MD5 hash of an entire package is calculated and its value is transmitted separately from the package. Users who download the package compute the hash of the copy they received and match that value against the original. Putting aside the question of how the the hash value is safely transmitted, there is a potential attack on this method due to the 128 bit length of the MD5 hash output. If all the individuals having input to the creation of the original software package are trustworthy, then 128 bits provides adequate security. Someone trying to substitute a version of that package containing a malicious modification (Trojan horse, virus, backdoor) would have to solve a 128 bit problem to create an infected package that passed the hash verification. That is considered computationally infeasible, at least until the advent of quantum computing. One might think the above argument proves MD5 is sufficient. After all, if an attacker had an agent working inside the organization that produced the package, that agent could simply incorporate the malicious software patch in the original package. However such an insertion is very risky. A sophisticated software company would likely have code reviews that would make introduction of the malicious code difficult. Use of a source control system makes is easy to track down whoever inserted the malicious change once it is discovered. The malicious code would be distributed to everyone, increasing the likelihood of detection. In an open source model, a defect in source code is particularly hard to hide. The agent risks being uncovered and and perhaps prosecuted, the organization he works for risks being identified and the technical means that the malicious code employed would be compromised. A safer attack would be for the agent to insert an apparently innocent modification to the package, with the modification selected so that the MD5 hash of the package with the malicious code matches the hash of the officially released package. Since the attacker (or whomever he is working for) controls the malicious code, calculating the value of this modification is subject to a meet-in-the-middle attack and presents presents a 64-bit problem. Solving such a problem is within the means of a well-funded attacker today* and will become easier in the future. The modification could be designed to get past code reviews in a number of ways. For example, 64 low order bits in a JPEG icon might be altered. The agent would have to make the last modification to the software package prior to release and perhaps send a final pre-release version of the package to someone on the outside who does the collision calculation, but those are hardly insurmountable hurdles. In situations where new releases are relatively frequent, it may suffice for this attack to succeed only occasionally, allowing periodic entry into selected systems to recover private keys, for example. The attacker merely submits modifications late in the release cycle and if his happens to be last then the full attack is mounted. The obvious solution to this problem is to use a wider hash for package authentication. For example, SHA-256 would present an attacker using this approach with a 128-bit problem. Even SHA1 would be preferable, making such an attack an 80 bit problem. If both MD5 and SHA1 hashes are provided, the attacker faces the problem of forging them both. It costs almost nothing to provide a wider hash along with the MD5 hash whenever a new package is released. It seems the prudent thing for Apple to do. Arnold Reinhold * From: http://www.rsasecurity.com/rsalabs/faq/3-6-6.html Van Oorschot and Wiener [VW94] have considered a brute-force search for collisions (see Question 2.1.6) in hash functions, and they estimate a collision search machine designed specifically for MD5 (costing $10 million in 1994) could find a collision for MD5 in 24 days on average. The general techniques can be applied to other hash functions. VW94] P. van Oorschot and M. Wiener, Parallel collision search with application to hash functions and discrete logarithms, Proceedings of 2nd ACM Conference on Computer and Communication Security(1994). ** SHA1 is available in OS-X as part of openssl. Type openssl
Re: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases
hi, mr. reinhold -- there's stronger reason than the ones you cite, to distrust md5 as a message-digest. see these old sci.crypt threads, and the google-search below, for discussions of hans dobbertin's 1996 crack of md5: http://tinyurl.com/2ox7g http://tinyurl.com/3x446 http://google.com/search?q=dobbertin+md5num=30 btw, in a phone conversation, dobbertin emphasized to me that his attack only works when md5 is used as a message-digest; it doesn't work when md5 is used with a key to prepare a MAC. he also mentioned that while sha-1 may be vulnerable to an attack of a similar style (because sha-1 is similar in struc- ture to md5), he himself was forbiddden by german law to work to cryptanalyze sha-1, because he worked at that time for the german federal security service, and so wasn't allowed to attack the USG's standard ciphers. now he's at ruhr university (in bochum), but i don't know whether he's more of a free agent. - don davis, boston To: [EMAIL PROTECTED] From: Arnold G. Reinhold [EMAIL PROTECTED] Subject: [Mac_crypto] Apple should use SHA! (or stronger) to authenticate software releases Sender: [EMAIL PROTECTED] Reply-To: [EMAIL PROTECTED] List-Id: Macintosh Cryptography mac_crypto.vmeng.com List-Archive: http://www.vmeng.com/pipermail/mac_crypto/ Date: Sun, 4 Apr 2004 06:17:55 -0500 The cryptographic hash function MD5 has long been used to authenticate software packages, particularly in the Linux/Unix/open source community. This has carried over to Apple's OS-X. The MD5 hash of an entire package is calculated and its value is transmitted separately from the package. Users who download the package compute the hash of the copy they received and match that value against the original. ... - The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]