Re: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]
On 8/28/06, Dave Korn [EMAIL PROTECTED] wrote:

> The author has made the *exact* same error as when someone comes up with a magical compression algorithm that they say can compress absolutely any data down to a tiny size. They always get the data to compress, sure, but they always have problems with the decompression stage. They tend to think that this is just a minor bug in their decompressor they need more time to work on, but no amount of time will let them work around the fundamental mathematical fact that they've not got sufficient information in their compressed file to reconstruct the original.

Thanks, and sorry for the hassle; I (and others) should have checked it before asking.

Regarding the compression algorithm: I conjecture there exists an algorithm (not necessarily *finite*) that can compress large numbers (strings/files/...) into small space; more precisely, it can compress a number that is N bytes long into O(P(log N)) bytes, for some polynomial P.

Here's a lead: take as an example the group Z_p* with p prime (in other words, the DLP setting). The triplet (Z, p, generator g) is a compression of a string of p-1 numbers, each number about log2(p) bits. (Legend: DTM - deterministic Turing machine, NTM - nondeterministic Turing machine.) There exists a way (not necessarily fast/polynomial on a DTM) in which a lot of strings can be compressed into the mentioned triplets. There are \phi(p-1) different strings that can be compressed with these triplets; the exact number of course depends on the factorization of p-1. Decompression is simple: take the generator g and compute g, g^2, g^3, g^4, ... in Z_p*.

I conjecture that for every permutation of 1..N there exists a function that compresses the permutation into a short representation. It is possible that only an NTM, possibly with an infinite algorithm (e.g. a human), can do it in some short finite time. Again, I could have/should have proven or disproven the conjecture, I just don't want to do it - yet ;-)

Thanks
OM
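To make the described decompression step concrete, here is a minimal sketch (the prime 11 and the generator 2 are illustrative example values, not taken from the message above):

    # Expand the triplet (p, g) back into the sequence g, g^2, g^3, ... mod p.
    # When g generates Z_p*, the p-1 outputs are exactly the elements 1..p-1,
    # each about log2(p) bits, so the triplet "stands for" that whole string.
    def decompress(p, g):
        x, out = 1, []
        for _ in range(p - 1):
            x = (x * g) % p
            out.append(x)
        return out

    seq = decompress(11, 2)                    # 2 is a generator of Z_11*
    print(seq)                                 # [2, 4, 8, 5, 10, 9, 7, 3, 6, 1]
    assert sorted(seq) == list(range(1, 11))   # all of Z_11*, in a g-dependent order

Note that only \phi(p-1) of the (p-1)! possible orderings of 1..p-1 can arise this way (one per generator), which is exactly the counting gap pointed out in the reply below.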
Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]
On 28 August 2006 15:30, Ondrej Mikle wrote:

> Regarding the compression algorithm: I conjecture there exists an algorithm (not necessarily *finite*) that can compress large numbers (strings/files/...) into small space; more precisely, it can compress a number that is N bytes long into O(P(log N)) bytes, for some polynomial P.

[ My maths isn't quite up to this. Is it *necessarily* the case that /any/ polynomial of log N grows slower than N? If not, there's nothing to discuss, because this is then a conventional compression scheme in which some inputs lead to larger, not smaller, outputs. Hmm. It would seem to me that if P(x)==e^(2x) -- which is of course an exponential rather than a polynomial -- then P(log N)==N^2, which grows faster than N. I'm using log to mean natural log here; substitute 10 for e in that formula if we're talking about log10, the base isn't important. However, assuming that we're only talking about P that *do* grow more slowly, I'll discuss the problem with this theory anyway. ]

There are many, but there are no algorithms that can compress *all* large numbers into small space; for all compression algorithms, some sets of input data must result in *larger* output than input. There is *no* way round the sheer counting-theory aspect of this. There are only 2^N unique files of N bits. If you compress large files of M bits into small files of N bits, and your decompression algorithm produces deterministic output, then you can only possibly generate 2^N files from the compressed ones.

> Here's a lead: take as an example the group Z_p* with p prime (in other words, the DLP setting). The triplet (Z, p, generator g) is a compression of a string of p-1 numbers, each number about log2(p) bits. (Legend: DTM - deterministic Turing machine, NTM - nondeterministic Turing machine.) There exists a way (not necessarily fast/polynomial on a DTM) in which a lot of strings can be compressed into the mentioned triplets. There are \phi(p-1) different strings that can be compressed with these triplets; the exact number of course depends on the factorization of p-1. Decompression is simple: take the generator g and compute g, g^2, g^3, g^4, ... in Z_p*.

This theory has been advanced many times before. The "Oh, if I can just find the right parameters for a PRNG, maybe I can get it to reconstruct my file as if by magic" idea. (Or substitute FFT, or wavelet transform, or key-expansion algorithm, or ... etc.) However, there are only as many unique generators as (2 to the power of the number of bits it takes to specify your generator) in this scheme. And that is the maximum number of unique output files you can generate. There is ***NO*** way round the counting theory.

> I conjecture that for every permutation of 1..N there exists a function that compresses the permutation into a short representation.

I'm afraid to tell you that, as always, you will find the compression stage easy and the decompression impossible. There are many, many, many more possible permutations of 1..N than the number of unique short representations in the scheme. There is no way that the smaller number of unique representations can possibly produce any more than the same (smaller) number of permutations once expanded. There is no way to represent the other (vast majority) of permutations.

> It is possible that only an NTM, possibly with an infinite algorithm (e.g. a human), can do it in some short finite time.

Then it's not really an algorithm, it's an ad-hoc collection of different schemes.
If you're allowed to use a completely different compression scheme for every file, I can do better than that: for every file, define a scheme where the bit '1' stands for the content of that file, and everything else is represented by a '0' bit followed by the file itself. Sure, most files grow 1 bit bigger, but the only file we care about is compressed to just a single bit! Great! Unfortunately, all you've done is moved information around. The amount of information you'd have to have in the decompressor to know which file to expand any particular '1' bit into is equal to the amount you've saved by compressing the original to a single bit.

There is ***NO*** way round the counting theory.

> Again, I could have/should have proven or disproven the conjecture, I just don't want to do it - yet ;-)

I seriously advise you not to waste much time on it. Because ... there is ***NO*** way round the counting theory.

cheers,
DaveK
--
Can't think of a witty .sigline today
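A tiny worked illustration of the counting argument in this message (just a sketch; N = 16 is an arbitrary small example):

    # There are 2**N distinct files of exactly N bits, but only 2**N - 1 distinct
    # strings of fewer than N bits.  By the pigeonhole principle, any "compressor"
    # that maps every N-bit input to a shorter output must send two different
    # inputs to the same output, so deterministic decompression cannot recover both.
    N = 16
    n_inputs = 2 ** N
    n_shorter_outputs = sum(2 ** k for k in range(N))   # all lengths 0 .. N-1
    print(n_inputs, n_shorter_outputs)                  # 65536 versus 65535
    assert n_shorter_outputs < n_inputs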
Re: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]
We are both talking about the same thing :-) I am not saying there is a finite deterministic algorithm to compress every string into small space; there isn't. BTW, thanks for "There is ***NO*** way round the counting theory." :-)

All I wanted to say is: for a specific structure (e.g. movie, picture, sound) there is some good compression algorithm. E.g.: if you take a 65536x65536 GIF, all white, with just one pixel black, it can be compressed into 35 bytes, see here: http://i.iinfo.cz/urs-att/gif3_6-115626056883166.gif

If you wanted to compress the same picture using JPEG (i.e. the discrete cosine transform), then two things would happen: a) the compressed JPEG file would be much bigger, and b) the decompressed image would have artifacts, because the Fourier transform of a pulse is a sinc (infinitely many frequencies). Sure, JPEG is a lossy compression, but good enough for photos and images that don't have a lot of high frequencies.

Cheers,
OM
RE: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]
On 28 August 2006 17:12, Ondrej Mikle wrote:

> We are both talking about the same thing :-)

Oh!

> I am not saying there is a finite deterministic algorithm to compress every string into small space; there isn't. BTW, thanks for "There is ***NO*** way round the counting theory." :-)
>
> All I wanted to say is: for a specific structure (e.g. movie, picture, sound) there is some good compression algorithm. E.g.: if you take a 65536x65536 GIF, all white, with just one pixel black, it can be compressed into 35 bytes, see here: http://i.iinfo.cz/urs-att/gif3_6-115626056883166.gif
>
> If you wanted to compress the same picture using JPEG (i.e. the discrete cosine transform), then two things would happen: a) the compressed JPEG file would be much bigger, and b) the decompressed image would have artifacts, because the Fourier transform of a pulse is a sinc (infinitely many frequencies). Sure, JPEG is a lossy compression, but good enough for photos and images that don't have a lot of high frequencies.

Absolutely, absolutely! A compression algorithm achieves the best results if it is designed with statistical knowledge of the target domain taken into account. In any compression scheme you're balancing the set of inputs that grow smaller on compression against the necessary counterpart of inputs that grow larger. Whatever you gain in the first set, you lose in the second. The secret is to arrange that the inputs that tend to grow larger are the ones that are less common in real-world usage, and thus that the ones that are more common tend to grow smaller. In practice, this means eliminating 'redundancy', where 'redundancy' is defined as 'whatever similar properties you can tease out from the more-common real-world cases'.

Of course, I could point out that there are only a few bytes of information in that huge GIF -- essentially the coordinates of the single black pixel -- so even compressing it to 35 bytes isn't a great achievement... it's one of the set of less-common inputs that grow bigger as a compromise so that real pictures, which tend to have at least *some* variation, grow smaller.

cheers,
DaveK
--
Can't think of a witty .sigline today
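A toy run-length encoder makes the trade-off concrete: inputs the scheme "expects" (long uniform runs, like the nearly-all-white image) shrink dramatically, while inputs it does not expect (random bytes) grow. This is only an illustrative sketch, unrelated to the actual GIF/LZW format:

    import os

    def rle(data: bytes) -> bytes:
        """Encode data as (run length, byte value) pairs, runs capped at 255."""
        out = bytearray()
        i = 0
        while i < len(data):
            run = 1
            while i + run < len(data) and data[i + run] == data[i] and run < 255:
                run += 1
            out += bytes([run, data[i]])
            i += run
        return bytes(out)

    sparse = b"\xff" * 100_000 + b"\x00" + b"\xff" * 100_000   # "one black pixel" analogue
    noise = os.urandom(1_000)

    print(len(sparse), "->", len(rle(sparse)), "bytes")   # shrinks to well under 1%
    print(len(noise), "->", len(rle(noise)), "bytes")     # roughly doubles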
Bleichenbacher's RSA signature forgery based on implementation error
At the evening rump session at Crypto last week, Daniel Bleichenbacher gave a talk showing how it is possible under some circumstances to easily forge an RSA signature, so easily that it could almost be done with just pencil and paper. This depends on an implementation error, a failure to check a certain condition while verifying the RSA signature. Daniel found at least one implementation (I think it was some Java crypto code) which had this flaw. I wanted to report on his result here so that other implementers can make sure they are not vulnerable. Be aware that my notes were hurried, as Daniel had only a few minutes to talk.

The attack is only good against keys with an exponent of 3. There are not too many of these around any more, but you still run into them occasionally. It depends on an error in verifying the PKCS-1 padding of the signed hash.

An RSA signature is created in several steps. First the data to be signed is hashed. Then the hash gets a special string of bytes in ASN.1 format prepended, which indicates what hash algorithm is used. This data is then PKCS-1 padded to be the width of the RSA modulus. The PKCS-1 padding consists of a byte of 0, then 1, then a string of 0xFF bytes, then a byte of zero, then the payload, which is the ASN.1+hash data. Graphically:

00 01 FF FF FF ... FF 00 ASN.1 HASH

The signature verifier first applies the RSA public exponent to reveal this PKCS-1 padded data, checks and removes the PKCS-1 padding, then compares the hash with its own hash value computed over the signed data.

The error that Bleichenbacher exploits arises if the implementation does not check that the hash+ASN.1 data is right-justified within the PKCS-1 padding. Some implementations apparently remove the PKCS-1 padding by looking for the high bytes of 0 and 1, then the 0xFF bytes, then the zero byte; and then they start parsing the ASN.1 data and hash. The ASN.1 data encodes the length of the hash within it, so this tells them how big the hash value is. These broken implementations go ahead and use the hash, without verifying that there is no more data after it. Failing to add this extra check makes implementations vulnerable to a signature forgery, as follows.

Daniel forges the RSA signature for an exponent of 3 by constructing a value which is a perfect cube; then he can use its cube root as the RSA signature. He starts by putting the ASN.1+hash in the middle of the data field instead of at the right side as it should be. Graphically:

00 01 FF FF ... FF 00 ASN.1 HASH GARBAGE

This gives him complete freedom to put anything he wants to the right of the hash, and that gives him enough flexibility to arrange for the value to be a perfect cube.

In more detail, let D represent the numeric value of the 00 byte, the ASN.1 data, and the hash, considered as a byte string. In the case of SHA-1 this will be 36 bytes or 288 bits long. Define N = 2^288 - D. We will assume that N is a multiple of 3, which can easily be arranged by slightly tweaking the message if necessary. Bleichenbacher uses an example of a 3072-bit key, and he will position the hash 2072 bits over from the right. This improperly padded version can be expressed numerically as 2^3057 - 2^2360 + D*2^2072 + garbage, which is equivalent to 2^3057 - N*2^2072 + garbage.

Then, it turns out that a cube root of this is simply 2^1019 - (N * 2^34 / 3), and that is a value which broken implementations accept as an RSA signature. You can cube this mentally, remembering that the cube of (A-B) is A^3 - 3(A^2)B + 3A(B^2) - B^3.
Applying that rule gives 2^3057 - N*2^2072 + (N^2 * 2^1087 / 3) - (N^3 * 2^102 / 27), and this fits the pattern above of 2^3057 - N*2^2072 + garbage. This is what Daniel means when he says that this attack is simple enough that it could be carried out by pencil and paper (except for the hash calculation itself).

Implementors should review their RSA signature verification carefully to make sure that they are not being sloppy here. Remember the maxim that in cryptography, verification checks should err on the side of thoroughness. This is no place for laxity or permissiveness.

Daniel also recommends that people stop using RSA keys with exponents of 3. Even if your own implementation is not vulnerable to this attack, there's no telling what the other guy's code may do. And he is the one relying on your signature.

Hal Finney
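Here is a minimal Python sketch of the construction just described, checked against a deliberately sloppy verifier of the kind Hal describes, with a strict verifier included for contrast. The message-tweaking loop, function names and example message are illustrative choices; the constants (2^288, 2^34, 2^1019, the 3072-bit size and the SHA-1 DigestInfo prefix) are as described above.

    import hashlib

    # Standard ASN.1 DigestInfo prefix for SHA-1.
    SHA1_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")
    MOD_BYTES = 3072 // 8            # 3072-bit modulus, public exponent 3

    def forge(base_msg: bytes):
        """Build a forgery; it works against any 3072-bit e=3 key, because the
        cube of the returned value is smaller than the modulus and is never reduced."""
        for i in range(1000):        # tweak the message until 2^288 - D is divisible by 3
            msg = base_msg + b" #" + str(i).encode()
            D = int.from_bytes(b"\x00" + SHA1_PREFIX + hashlib.sha1(msg).digest(), "big")
            N = 2**288 - D
            if N % 3 == 0:
                sig = 2**1019 - (N // 3) * 2**34   # exact cube root of the bogus padding
                return msg, sig
        raise RuntimeError("no suitable tweak found")

    def broken_verify(msg: bytes, sig: int) -> bool:
        """Sloppy check: parse 00 01 FF..FF 00 ASN.1 HASH from the left, ignore the rest."""
        em = pow(sig, 3).to_bytes(MOD_BYTES, "big")
        if em[:2] != b"\x00\x01":
            return False
        i = 2
        while i < len(em) and em[i] == 0xFF:
            i += 1
        if em[i] != 0x00 or em[i + 1:i + 16] != SHA1_PREFIX:
            return False
        return em[i + 16:i + 36] == hashlib.sha1(msg).digest()
        # BUG: never checks that i + 36 == len(em); the "garbage" tail is ignored.

    def strict_verify(msg: bytes, sig: int) -> bool:
        """Correct check: compare against the full, right-justified encoding."""
        t = SHA1_PREFIX + hashlib.sha1(msg).digest()
        expected = b"\x00\x01" + b"\xff" * (MOD_BYTES - len(t) - 3) + b"\x00" + t
        return pow(sig, 3).to_bytes(MOD_BYTES, "big") == expected

    msg, sig = forge(b"transfer one million dollars")
    print(broken_verify(msg, sig))   # True  -- forgery accepted, no private key needed
    print(strict_verify(msg, sig))   # False -- the right-justification check stops it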
Re: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor
Dave Korn wrote:

> Of course, I could point out that there are only a few bytes of information in that huge GIF, so even compressing it to 35 bytes isn't a great achievement... it's one of the set of less-common inputs that grow bigger as a compromise so that real pictures, which tend to have at least *some* variation, grow smaller.

OK, according to Shannon's definition of entropy, how much information is there in \pi or e or another transcendental number? AFAIK it is conjectured (though not proven) that every finite string occurs somewhere in the fractional expansion of \pi (take a positive integer base other than 10 if you want). Sure, the digits are correlated, because we can express \pi or e as an infinite series that takes only a couple of bytes to write down. But: if you take any finite string, there (conjecturally) exists a finite natural number as the index where the string can be found in \pi.

So are there any random strings at all? What if I can express the digits starting at the 2^(2^k)-th index analytically, in a short, fast algorithm? In case no one else can, then I have a perfect one-time pad, at least for some time, because conventional computers will not reach that place in any reasonable time (yes, I know, there may be other correlations). The point I am trying to make: if I take e.g. a transcendental number and can analytically express a part of its fractional expansion, I *might* have a strong key.

The point of this: I am researching an authentication scheme where keys are *programs*, not data. That means a key can change itself over time, for example. The strongest keys would therefore be somewhat recursive polymorphic code that enumerates all possible permutations of some finite set in some deterministic, though seemingly random, order. I know that there are short descriptions for lots of infinite structures: the mentioned transcendental numbers, the Mandelbrot set, various IFSs (iterated function systems), quaternions, non-measurable sets -- there are lots of possibilities. What I know for sure is that for many mathematical structures short descriptions (programs or (partially) recursive functions) exist. What I conjecture is that for each finite ordered set a short compression function exists. The whole trick is that it is almost impossible (meaning computationally infeasible) to look for the compression functions using a finite deterministic algorithm, though a human can do it. It will yet take some time until I have some more results about the categories of key programs.

Cheers,
OM
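For what it is worth, formulas of exactly this flavour are known for \pi in base 16: the Bailey-Borwein-Plouffe formula computes the hex digit at position n without computing any of the earlier digits. A minimal sketch follows (a standard float-precision implementation, good for modest n; whether such a formula yields a usable key, as conjectured above, is a separate question):

    def pi_hex_digit(n):
        """Hex digit of pi at fractional position n (0-indexed): pi = 3.243F6A88...,
        so pi_hex_digit(0) == '2', pi_hex_digit(1) == '4', and so on."""
        def S(j):
            s = 0.0
            for k in range(n + 1):                        # head, via modular exponentiation
                s = (s + pow(16, n - k, 8 * k + j) / (8 * k + j)) % 1.0
            k, t = n + 1, 0.0
            while 16.0 ** (n - k) / (8 * k + j) > 1e-17:  # rapidly vanishing tail
                t += 16.0 ** (n - k) / (8 * k + j)
                k += 1
            return s + t
        frac = (4 * S(1) - 2 * S(4) - S(5) - S(6)) % 1.0
        return "0123456789abcdef"[int(16 * frac)]

    print("".join(pi_hex_digit(i) for i in range(8)))     # 243f6a88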
Re: A security bug in PGP products?
On 8/23/06, Dave Korn [EMAIL PROTECTED] wrote:

> Given that, whatever passphrase you use, you will decrypt the EDK block and get /something/ that looks like a key, this comparison of hashes is a sanity test. If you bypass it but enter the wrong passphrase, you'll get an incorrectly-decrypted EDK, which will lead your disk to look like every sector is full of random garbage. Rather than decrypt the entire disk and run chkdsk to see if it looks sane, comparing the hashes of the passphrase is a quick and dirty way of testing if the resulting EDK is going to be the correct one.

PGP email encryption has two known-plaintext bytes for that purpose. This accepts a bad key only about 1 time in 2^16, but ensures that brute-forcing must mount a more extensive unknown-plaintext attack at that rate for any potentially-correct key. This reminds me a little of the suggestions that MACs should be truncated, although it seems to me that it's better to encrypt a hash of the plaintext.

--
If you're not part of the solution, you're part of the precipitate.
Unix guru for rent or hire -- http://www.lightconsulting.com/~travis/
GPG fingerprint: 9D3F 395A DAC5 5CCC 9066 151D 0A6B 4098 0C55 1484
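A toy simulation of such a 16-bit quick check (a sketch of the idea only, not the actual OpenPGP packet format): with a wrong key the decrypted check bytes are effectively random, so a wrong guess slips past the check with probability about 2^-16, which is what pushes the attacker into the slower full-plaintext test for the survivors.

    import random

    random.seed(1)
    TRIALS = 2_000_000
    CHECK = 0x1234                       # the 16 bits of known plaintext; arbitrary here

    # Model: decrypting the check bytes with a wrong key yields 16 random bits.
    false_accepts = sum(1 for _ in range(TRIALS) if random.getrandbits(16) == CHECK)
    print(false_accepts, "wrong keys passed out of", TRIALS,
          "(expected about", round(TRIALS / 2**16, 1), ")")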
Re: Hypothesis: PGP backdoor (was: A security bug in PGP products?)
On 8/23/06, Ondrej Mikle [EMAIL PROTECTED] wrote:

> We discussed with V. Klima the recent bug in PGPdisk that allowed extraction of the key and data without knowledge of the passphrase.

I skimmed the URL and it appears this claim was answered several times in the original thread. Did you not read it, or not understand it? You have to have a valid passphrase from before the change, because the passphrase unlocks the disk key, which doesn't change unless you explicitly tell it to.

--
If you're not part of the solution, you're part of the precipitate.
Unix guru for rent or hire -- http://www.lightconsulting.com/~travis/
GPG fingerprint: 9D3F 395A DAC5 5CCC 9066 151D 0A6B 4098 0C55 1484
Re: compressing randomly-generated numbers
On Mon, 28 Aug 2006, Travis H. wrote:

> On 8/23/06, Alexander Klimov [EMAIL PROTECTED] wrote:
>> A random bit stream should have two properties: no bias and no dependency between bits. If one has biased but independent bits he can use the von Neumann algorithm to remove the bias, but if there is a dependency no algorithm will be able to `strip' it.
>
> Interesting claim;

Well, it is not really a claim, since there was no definition; here it is: a ``dependency stripping'' algorithm is a deterministic algorithm that gets a stream of unbiased (but not necessarily independent) bits and produces a stream of several independent bits.

> what if I XOR it with a truly random stream,

That is not a deterministic algorithm.

> or a computationally-pseudorandom stream,

This requires a random key. Recall that the whole point was to extract random bits -- if we already have useful random bits we can simply output them and discard the input stream.

> or if I filter out every other bit?

Consider ``x a x b x c x ...'': sampling every other bit throws away most of the entropy.

In general, there is no ``dependency stripping'' algorithm, because an input x,x,x,..., where all x are the same (either all 0 or all 1), is possible. There is simply not enough entropy in the input to extract even two bits.

Interestingly, an additional requirement that the input stream has at least two bits of entropy does not change the situation. Consider the first two bits of the output. There are only four possibilities (00, 01, 10, and 11), so if the input is long enough then it is possible to find four different inputs that give the same bits, e.g.,

A(0001) = 00...
A(0010) = 00...
A(0100) = 00...
A(1000) = 00...

An input that contains 0001, 0010, 0100, and 1000 with the same probability has two bits of entropy, but it is always transformed by the algorithm into 00..., and thus the first two bits are not independent. (Strictly speaking, if we recall the requirement that the input bits are unbiased, we should also include 1110, 1101, ..., in the set of inputs, but this does not change the result.)

> Seems like these would, if not eliminate the dependency, weaken its effect. The last is equivalent to sampling at a slower rate.

See above about ``sampling at a slower rate.''

--
Regards,
ASK
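For reference, the von Neumann procedure mentioned at the top of the thread: read the input in non-overlapping pairs, output 0 for the pair (0,1), output 1 for (1,0), and discard (0,0) and (1,1). A small sketch showing it removing bias from independent but biased bits (it offers no such guarantee when the bits are dependent):

    import random

    def von_neumann(bits):
        """Debias a sequence of independent bits: (0,1) -> 0, (1,0) -> 1, else discard."""
        return [a for a, b in zip(bits[0::2], bits[1::2]) if a != b]

    random.seed(0)
    biased = [1 if random.random() < 0.8 else 0 for _ in range(100_000)]  # independent, 80% ones
    out = von_neumann(biased)
    print(len(out), sum(out) / len(out))   # far fewer bits, but the fraction of ones is ~0.5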
Re: compressing randomly-generated numbers
On 8/29/06, Alexander Klimov [EMAIL PROTECTED] wrote:

> Well, it is not really a claim, since there was no definition; here it is: a ``dependency stripping'' algorithm is a deterministic algorithm that gets a stream of unbiased (but not necessarily independent) bits and produces a stream of several independent bits.

Well, this is a fairly strict definition; I think uniform and e-close to independent is still potentially suitable for use in cryptography.

>> or a computationally-pseudorandom stream,
>
> This requires a random key.

I can't use e-close to (uniform and independent)?

> Recall that the whole point was to extract random bits -- if we already have useful random bits we can simply output them and discard the input stream.

I hear this argument often, and it appears that the people who say it don't care about the rate of random bits, nor about the desirability of not having to trust any one source to be truly unpredictable, nor have they understood the point of an extractor or of hashing the output of multiple sources. For example, I'd like to use the Quantis HWRNG, but since it is an opaque container, I cannot trust it fully. If I had another source that I could combine with it, then I would not have to trust it blindly; the opponent would have to compromise both in order to be able to predict the combined output.

> Consider ``x a x b x c x ...'': sampling every other bit throws away most of the entropy. In general, there is no ``dependency stripping'' algorithm, because an input x,x,x,..., where all x are the same (either all 0 or all 1), is possible. There is simply not enough entropy in the input to extract even two bits.

No, you have no idea of the unpredictability of that source, because it is unspecified and unpredictability is untestable. That could very well be the output of a perfect HWRNG. So could 01010101, or 000, or 111. Each output is equally likely, and no amount of testing the output can say whether the source is imperfectly random or truly random. This has been stated several times on this list; entropy depends on the model of the source, not on the output. The sequence 0, 1, 2, 3, 4, ... 255, 0, 1, 2, 3, 4, ... 255, ... has zero entropy if the source is defined as an 8-bit counter starting at zero, and it has an entropy of one bit per bit (eight bits per output) if the source is defined as a HWRNG that generates 8-bit outputs in a uniform and independent manner. Re-read the definition of entropy, and pay particular attention to the calculation of probability for a given event.

> Interestingly, an additional requirement that the input stream has at least two bits of entropy does not change the situation

I didn't understand this example.

> See above about ``sampling at a slower rate.''

I can take an analog random source and sample it at one rate, and have a nearly independent stream. If I increase the sample rate, the bits will start to become more and more dependent on the prior bits. So, if that is true, then logically a serially-correlated stream will become less and less correlated as I decrease the sample rate. Taking every other bit corresponds to sampling at half the speed, but doesn't require modifying the hardware.

It seems that you are saying there is no general solution, given a total lack of knowledge about the source other than the fact that there is some dependency between the bits. I can agree with that. However, if you understand the physics of the source, then you do know something about the random phenomenon used as a source, and in that case you can eliminate specific kinds of dependencies that it may exhibit.
The easiest way to eliminate (computationally) bias and dependency in one step is to combine with a CSPRNG. You can reseed it periodically with the combined output.

--
If you're not part of the solution, you're part of the precipitate.
Unix guru for rent or hire -- http://www.lightconsulting.com/~travis/
GPG fingerprint: 9D3F 395A DAC5 5CCC 9066 151D 0A6B 4098 0C55 1484
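One simple way to realize that idea (a sketch under my own assumptions, not a vetted design): keep a hash-based generator keyed by a secret state, XOR its keystream with the raw source bytes, and fold the combined output back into the state so the generator is periodically reseeded. An attacker then has to predict both the raw source and the generator state.

    import hashlib, hmac, os

    class CombinedRNG:
        def __init__(self, seed: bytes):
            self.state = hashlib.sha256(seed).digest()
            self.counter = 0

        def _keystream(self, n: int) -> bytes:
            # HMAC-SHA256 in counter mode as a simple keyed generator.
            out = b""
            while len(out) < n:
                out += hmac.new(self.state, self.counter.to_bytes(8, "big"),
                                hashlib.sha256).digest()
                self.counter += 1
            return out[:n]

        def get(self, raw_source_bytes: bytes) -> bytes:
            ks = self._keystream(len(raw_source_bytes))
            combined = bytes(a ^ b for a, b in zip(raw_source_bytes, ks))
            # Reseed: fold the combined output back into the internal state.
            self.state = hashlib.sha256(self.state + combined).digest()
            return combined

    rng = CombinedRNG(os.urandom(32))
    sample = rng.get(os.urandom(64))      # os.urandom stands in for the hardware source here
    print(sample.hex())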