... >From: Philipp G�hring <[EMAIL PROTECTED]> >Sent: Jan 3, 2006 12:13 PM >To: "Steven M. Bellovin" <[EMAIL PROTECTED]>, [email protected] >Subject: Re: RNG quality verification
... >Hi, > >Ok, now I did the first test. >I took OpenSSL, generated 10000 RSA keys, and took them apart. >First I analyzed the raw keys: It's helpful if you try to determine what you're testing here. What is the ideal situation which you hope to approximate by your RNG? For example, RSA moduli will always be odd, since they're the product of two odd primes. So you're going to get some statistical flaws when you look at the low bit. Similarly, the high few bits will be somewhat skewed by our choices of p and q (you usually set the high bits so that you get the desired size of modulus). And you may be choosing p and q according to some other criteria--I don't know what OpenSSL does exactly. Further, you're looking at the result of using the PRNG outputs in a really complicated way. There might be all kinds of nonrandom behavior in the outputs that would not be apparent at all in the result of generating RSA keys. Think about what's happening--you're generating a random starting point of 512 bits with a certain form (high bit set), and then sieving it, and then running a primality test for a bunch of rounds. Then you're doing the same thing a second time. Then you're multiplying the numbers together. To assess a cryptographic PRNG, you need to know two things: a. If it had a starting point or seed which was impossible to guess, would you be able to find any problems with its outputs? b. Does it get a starting point or seed which is impossible to guess? Assessing (a) is about cryptanalysis; statsitics can help there, but mostly, you're looking at the output from some cryptographic function like SHA1 or AES or 3DES. Assessing (b) is about data analysis--you're going to look at the sources for seed material, and try to determine what makes them ultimately unpredictable, and to model them somehow. You can't assess how much entropy some variable has without some kind of probability model for it. It's really hard to get a satisfactory model for most of the OS/software sources, unfortunately. If you can't get a simple model for it, you can at least attempt to determine some kind of bound on its entropy. See Peter Gutmann's wonderful paper from USENIX a few years ago on his Cryptlib's PRNG for some flavor of what this kind of analysis looks like. The critical thing to understand here is where your statistical tools are and are not useful. There's no harm in running some big statistical tests on the outputs from a cryptographic mechanism, and you may in principle even find something this way, but you probably won't. To the extent that you have a probability model for the source of seed material or entropy you're using, you can use statistical tests to check the plausibility of your model. (That is, if your model says that successive samples of some variable are independent, it's really a good idea to run some statistical tests to check that--Chi Square tests of independence, autocorrelation, whatever makes sense with the kind of data you have.) ... >Perhaps I should have stated the quality demands for possible solutions: >Since I am working on a practical solution, and not a theoretical solution, >the following demands apply: >* A 99.999% solution is ok. Well, the analysis of your entropy source is ultimately a data analysis problem. You go back and forth between your best understanding of the underlying process that generates entropy, your data, and what kind of models you can work with, refining your understanding of your entropy source until your money or time run out. If some attacker has a much better model than yours, he may be able to predict your seeds and thus break your PRNG. (This is a good reason to seed the PRNG with more estimated entropy than the minimum you can get away with--there are other reasons to do this, too.) ... >> However -- and it's a big "however" -- numbers that are "random enough" >> for statistical purposes are not necessarily good enough for >> cryptographic purposes. As several people have pointed out already, >> there are processes involving cryptographic algorithms that produce >> very "random" sequences, but are in fact deterministic to someone who >> knows a secret. In other words, if you don't control the generator, >> it's not possible to distinguish these two cases. > >Has anyone tested yet, how much samples are needed to detect those PRNGs? Suppose I'm using AES128 with a key of 0 in counter mode, starting with a counter of 0. This will pass all the statistical tests you run against it. But with a single 128-bit output, I am nearly certain of identifying this source. If you don't know the key I'm using for AES, then distinguishing counter mode outputs is as hard as breaking AES, for some very precisely defined meaning of the work "breaking." But it has no more than 128 bits of entropy, because if you ever guessed the right 128 bit AES key, you could predict all future outputs from my PRNG forever. That means it would be silly to use, say, AES128 to generate 256-bit AES keys, since the attacker would only need to guess the 128-bit AES key to learn all the 256-bit AES keys you were generating. For what it's worth, we're working on a standard for cryptographic random number generation in X9. There's a draft document (SP800-90) up on the NIST website http://csrc.nist.gov/publications/drafts.html#sp800-90 discussing some hopefully good crypto PRNGs, and some guidelines for their use. This doesn't discuss much about analyzing entropy sources (we're still hashing that out). If you want to understand the security of crypto PRNG algorithms, you can look at some papers I did on this with Bruce Schneier, Niels Ferguson, David Wagner, and Chris Hall: http://www.schneier.com/paper-yarrow.html www.cs.berkeley.edu/~daw/papers/prngs-fse98.ps Also, Peter's PRNG paper is at his website: http://www.cs.auckland.ac.nz/~pgut001/ And Lisa Yin did a paper with Desai and Hevia for Eurocrypt 2002 trying to do reduction proofs for various PRNGs--basically showing that if certain properties of the hash function or block cipher used hold, then the PRNG is secure. I don't know if there's an online version available. If you want to understand how to do the data analysis for a hardware entropy source, I recommend looking at the analysis of the Intel and VIA C3 hardware RNGs, both on Cryptography Research's site. http://www.cryptography.com/research/evaluations.html The big thing to understand here is how much nicer life is when you design the source of entropy from the beginning to follow some kind of reasonable probability model, rather than looking for some way to adapt a mathematically useful model to some complicated process like OS loading or something. There's a reasonably nice suite of statistical tests available from NIST, though I've heard some complaints about the portability of the code. More broadly, there are a gazillion statistical packages out there. But if you don't understand your model, you'll just shoot yourself in the foot by throwing sophisticated statistical tools at the problem. >Best regards, >Philipp G�hring --John Kelsey, NIST --------------------------------------------------------------------- The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
