Re: building a true RNG
On 2002-07-27, David Wagner uttered to [EMAIL PROTECTED]:

> In particular, there is no function f which doesn't waste entropy,
> unless |Y|=1. The proof is simple enough that you can probably find it
> with a moment's contemplation, but let me know if you want to see it.

I would be led to the pigeonhole principle; still, this does seem like an
eminently teachable moment to me.

However, what we're working with in the case of a typical RNG isn't
functions between finite buffer-fulls of data, but functions between
infinite sets of entire bitstreams, which need to be implemented within a
finite memory constraint. Whatever the algorithm, it can have state. I'm
thinking that is the reasoning which leads people to think that the
problem is simple. In this framework one can very well gather statistics
and try to normalize them based on what one has seen so far. (The
minimalist example here is bias removal.) After that, what goes into the
hash is (approximately) white, and even with minimal assumptions (i.e.
the hash is a one-way permutation) our RNG would seem to approach
ideality.

True, such an approach will always become a race in modelling the
statistics. But still, I would think the generator has the advantage if
sufficient margin is left for error. (That is, hash some constant times
the number of bits containing the entropy you suspect to be there, based
on the best model you've got.)

-- 
Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
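[The "minimalist example" of bias removal mentioned above is classically the von Neumann trick. A sketch for illustration (my code, not from the original post), assuming independent input bits with a constant, possibly unknown, bias:]

```python
def von_neumann_debias(bits):
    """Von Neumann bias removal: consume input bits in pairs.
    01 -> output 0, 10 -> output 1, 00 and 11 -> output nothing.
    If the input bits are independent with constant bias p, both
    01 and 10 occur with probability p*(1-p), so the output is
    exactly unbiased -- at the cost of discarding most of the input."""
    out = []
    for a, b in zip(bits[::2], bits[1::2]):
        if a != b:
            out.append(a)
    return out
```

Note how this "wastes entropy" in exactly Wagner's sense: unequal pairs are thrown away wholesale.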
Re: building a true RNG
On 2002-07-28, Sampo Syreeni uttered to David Wagner:

[Answering my own mail. Sorry.]

> and discard every 1/(p(x)-1/256)'th sample with value x.

Actually the pedantic solution would be to put an arithmetic
compressor/coder between the input and output, using the best model we've
got. That still leaves model adaptation to be dealt with, but if we
discard a sufficient number of output bits at the start (estimable from
the model), we *will* end up with (very nearly) flat statistics on the
output. Asymptotic optimality and all that... (The qualification comes
from limited-precision arithmetic.)

-- 
Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Re: building a true RNG
> An example: presume we take a simple first-order statistical model. If
> our input is an 8-bit sample value from a noise source, we will build a
> 256-bin histogram. When we see an input value, we look its probability
> up in the model, and discard every 1/(p(x)-1/256)'th sample with value
> x. When this happens, the sample is just eaten and nothing appears in
> the output; otherwise we copy.

I understand what you're trying to say, but this will not give a
general-purpose function that doesn't waste entropy regardless of the
input distribution. This only works when the distribution on the input
stream consists of independent, memoryless samples from some distribution
on 8-bit values.
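[The discard rule can be phrased as plain rejection sampling: accept sample x with probability p_min/p(x), where p_min is the smallest modelled probability. A toy sketch of that variant (function and parameter names are mine), subject to the same i.i.d. caveat raised above:]

```python
import random
from collections import Counter


def flatten(samples):
    """Rejection-sampling take on the histogram-discard idea: estimate a
    first-order model from the batch itself, then accept sample x with
    probability p_min / p(x).  The accepted stream has (asymptotically)
    flat first-order statistics -- but ONLY if the input really is a
    sequence of independent, identically distributed samples."""
    total = len(samples)
    p = {x: c / total for x, c in Counter(samples).items()}
    p_min = min(p.values())
    return [x for x in samples if random.random() < p_min / p[x]]
```

On already-uniform input nothing is discarded; on biased input, the over-represented values are thinned down to the rate of the rarest one.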
1024-bit RSA key safety still unknown
Dan Bernstein has a response to the June 2002 Lenstra-Shamir-Tomlinson-
Tromer paper (and, similarly, to Bruce Schneier's comments) about his
research into the cost of circuits for integer factorization.

http://cr.yp.to/nfscircuit.html

-- 
-russ nelson            http://russnelson.com    | New Internet Acronym:
Crynwr sells support for free software  | PGPok  |
521 Pleasant Valley Rd. | +1 315 268 1925 voice  | IANAE
Potsdam, NY 13676-3213  | +1 315 268 9201 FAX    | I Am Not An Economist
Re: building a true RNG
Barney Wolff wrote:

> This leads me to ask what may be a laughably naive question: Do we even
> know that the popular hash functions can actually generate all 2^N
> values of their outputs?

It seems very unlikely that they can generate all 2^N outputs (under
current knowledge). However, they satisfy the next-best thing: their
output appears to be indistinguishable from uniform to
computationally-bounded observers, hence it's as good as if they could
generate all 2^N outputs for most purposes.
Re: building a true RNG
Yes, but is there any lower bound known? 2^(N-1)? Or worse? Clearly a
hash with N-bit output that can't generate all 2^N values can never
actually have N bits of entropy. Depending on how many forbidden values
there are, this may or may not be significant. Of course, if the
forbidden values cannot be found, I'd agree with the "as good as".

On Mon, Jul 29, 2002 at 03:39:32PM +, David Wagner wrote:

> > Do we even know that the popular hash functions can actually generate
> > all 2^N values of their outputs?
>
> It seems very unlikely that they can generate all 2^N outputs (under
> current knowledge). However, they satisfy the next-best thing: their
> output appears to be indistinguishable from uniform to
> computationally-bounded observers, hence it's as good as if they could
> generate all 2^N outputs for most purposes.

-- 
Barney Wolff
I never met a computer I didn't like.
Re: building a true RNG
Oh dear. On re-reading your message I now suspect that what you asked is
not what I originally thought you asked. I see two questions here:

Q1: If we cycle through all N-bit messages, are all 2^N output values
    possible?
Q2: If we cycle through all messages (possibly very long or very short),
    are all 2^N output values possible?

On first reading I thought you were asking Q1, but it now occurs to me
you were probably asking Q2. I apologize profusely for any confusion I
may have caused.

Anyway, the answer to Q1 is likely to be "No". I'd guess that the answer
to Q2 is probably "Yes", or close to it.

For the first scenario, if the hash behaves like a random oracle, then
one would expect only 2^N * (1 - 1/e) of the possible outputs to be
attainable. Of course, the forbidden outputs are not easy to find; the
best way I know is by exhaustive search over all 2^N N-bit messages.

For the second scenario, if the hash behaves like a random oracle, then
one would expect all outputs to be attainable.

No significant deviation from the random oracle model is known for SHA1
(well, with one exception that doesn't appear to be relevant here). That
said, this is not a proof, and my answers just represent my best guesses.
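[The 2^N * (1 - 1/e) figure for Q1 is easy to check empirically at toy sizes by standing in a random function for the hash, per the random-oracle heuristic. A small simulation (my code, parameters arbitrary):]

```python
import random


def coverage_fraction(n_bits, trials=200):
    """Model Q1 with a random oracle: map all 2^n inputs of width n
    through a random function onto n-bit outputs, and measure what
    fraction of the output space gets hit.  For large n the expected
    fraction tends to 1 - 1/e ~ 0.632, matching the estimate above."""
    size = 2 ** n_bits
    rng = random.Random(0)  # fixed seed so the estimate is reproducible
    total = 0.0
    for _ in range(trials):
        hit = {rng.randrange(size) for _ in range(size)}
        total += len(hit) / size
    return total / trials
```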
Re: building a true RNG
Barney Wolff asked:

> Do we even know that the popular hash functions can actually generate
> all 2^N values of their outputs?

David Wagner replied:

> It seems very unlikely that they can generate all 2^N outputs (under
> current knowledge).

I was temporarily astonished, but he clarified as follows:

> Q2: If we cycle through all messages (possibly very long or very
> short), are all 2^N output values possible? ... I'd guess that the
> answer to Q2 is probably Yes, or close to it.

1) Consider the following function H0: Divide the input into chunks N
bits long. Calculate the XOR of all the chunks, and use that as the
output. This meets the definition of a hash function, although it would
not be a one-way hash function. And it would most certainly be capable of
generating all 2^N possible outputs.

2) I can't prove that a standard hash function such as SHA1 generates all
possible codes, but I consider it likely. It would be quite shocking if a
strong hash function such as SHA1 generated fewer codes than a weak
function such as H0.

3) For a one-way hash function, one should not expect a _constructive_
proof that it generates all possible codes; such a construction would
violate the one-way property.

4) Here is a rough plausibility argument. Consider two hash functions H_1
and H_2 that are independent. We define

    H'(x) := H_1(x) XOR H_2(x)                    (1)

which implies

    H_1(x) = H'(x) XOR H_2(x)                     (2)

Now let's look at the truth table for equation (2), where the row index
is the H' code, the column index is the H_2 code, and each entry
represents the H_1 code required to uphold the equation:

          00   01   10   11
    00    00   01   10   11
    01    01   00   11   10
    10    10   11   00   01
    11    11   10   01   00

Now let's suppose H_2 is missing one code (say the 10 code) and H' is
missing one code (say the 11 code). Then H_1 must be missing at least
three codes! Otherwise there would be a way of combining a non-missing
H_1 code with a non-missing H_2 code to create the missing H' code.
(Missing codes are marked with "m":)

          00   01   10m  11
    00    00   01   10   11
    01    01   00   11   10
    10    10   11   00   01
    11m   11m  10m  01   00m

We can extend this argument by combining lots and lots of independent
hash functions H_i. The combination has far fewer missing codes than any
of the ingredients. So either you conclude that a) there is a conspiracy
that prevents us from constructing independent hash functions to use as
ingredients, or b) we can produce hash functions with very few missing
codes.
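[The H0 of point 1 is simple enough to write down directly. Since H0(y) = y for any single-chunk input y, surjectivity is immediate. A sketch (my code; the byte-oriented chunk width is an arbitrary choice):]

```python
def h0(data, n_bytes=4):
    """The weak hash H0 from the post: XOR the input together in
    fixed-width chunks (here n_bytes bytes, zero-padding the tail).
    Not one-way, but trivially onto: for any target output y, the
    single-chunk input y hashes to y itself."""
    data = data + bytes(-len(data) % n_bytes)  # pad to whole chunks
    acc = bytes(n_bytes)
    for i in range(0, len(data), n_bytes):
        chunk = data[i:i + n_bytes]
        acc = bytes(a ^ b for a, b in zip(acc, chunk))
    return acc
```

Any input equal to two copies of the same chunk XORs to zero, illustrating how easy collisions are for H0.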
Re: building a true RNG
David Wagner wrote:

> Oh dear. On re-reading your message I now suspect that what you asked
> is not what I originally thought you asked. I see two questions here:
> Q1: If we cycle through all N-bit messages, are all 2^N output values
> possible? Q2: If we cycle through all messages (possibly very long or
> very short), are all 2^N output values possible? On first reading I
> thought you were asking Q1, but it now occurs to me you were probably
> asking Q2. I apologize profusely for any confusion I may have caused.
> Anyway, the answer to Q1 is likely to be No. I'd guess that the answer
> to Q2 is probably Yes, or close to it.

I think the interesting question is whether, for M-bit hash inputs and an
N-bit hash, with a lower bound Q on entropy per input batch, so
M > Q > N, we can show, as I think Denker is claiming to have done, that
the entropy of hash(M) must be >= N - epsilon, for some epsilon small
enough to ignore. That would imply not only that all 2^N values occur,
but that they are reasonably evenly distributed.
Re: building a true RNG
> 3) For a one-way hash function, one should not expect a _constructive_
> proof that it generates all possible codes; such a construction would
> violate the one-way property.

Nitpick: the last statement does not seem quite right to me. I'm thinking
of the notion of a one-way permutation. For instance, the RSA function
f(x) = x^3 mod n is conjectured to be a one-way permutation, assuming n
is an RSA modulus with unknown factorization and the RSA assumption
holds. (I'm being a little fast and loose with the formal details, but I
hope this conveys the idea.) A permutation is onto by definition, so
there all outputs are provably attainable even though the function
remains hard to invert.

That said, I'm not claiming that the RSA function would make a good
cryptographic hash function. Cryptographic hash functions are expected to
be a lot more than just one-way and collision-free.

I don't know of any good cryptographic hash function that comes with a
proof that all outputs are possible. However, it might not be too hard to
come up with plausible examples. For example, if we apply the
Luby-Rackoff construction (i.e., 3 rounds of a Feistel cipher), with
ideal hash functions in each round, does this have the desired
properties? It might.

On the gripping hand, I don't think this is a real issue in practice.
SHA1 is probably good enough for all practical purposes that I can think
of.
Re: building a true RNG
On Mon, 29 Jul 2002, David Wagner wrote:

> > DES, being extremely hardware friendly, can be (ab)used to make a
> > strong one-way hash. (E.g., raw input into both key and data maps
> > 56+64 -> uniformly distributed 64 bits.)
>
> However, when used in this way, DES is not an especially good hash
> function. For instance, it is easy to find collisions, to find
> pre-images, and so on.

Somewhat related to that, are there any block-cipher hash function
methods that are actually secure? Every one I've ever read about seems to
have been broken.

Regards,
Jack
Re: building a true RNG
> 2) I can't prove that a standard hash function such as SHA1 generates
> all possible codes, but I consider it likely. It would be quite
> shocking if a strong hash function such as SHA1 generated fewer codes
> than a weak function such as H0.

I think you could do a probabilistic proof similar to the "DES is not a
group" quasi-proof. To test a hash function h() whose range is S, let F
be the set of balanced functions from S -> {0, 1}. (Balanced meaning that
each f in F maps exactly half of S to 0 and half to 1.) If you can
contrive to choose many members of F at random, and compose them with h
for many arguments of h, you should be able to set confidence limits on
how much of S is covered by h.
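[A toy version of this test, run against a "hash" with a deliberately tiny range so the statistics are tractable (all names and parameters are mine; as the follow-up message notes, the sample sizes needed do not scale to a real hash):]

```python
import random


def balanced_test(h, domain, range_bits, trials=50, seed=0):
    """Sketch of the proposed probabilistic test: draw random balanced
    functions f : S -> {0, 1}, and for each one measure the fraction of
    inputs x with f(h(x)) = 1.  If h covers S evenly, each fraction
    should sit near 1/2; systematic deviation would hint at missing or
    over-represented codes.  Returns the mean fraction over all trials."""
    rng = random.Random(seed)
    S = list(range(2 ** range_bits))
    fractions = []
    for _ in range(trials):
        ones = set(rng.sample(S, len(S) // 2))  # a random balanced f
        hits = sum(1 for x in domain if h(x) in ones)
        fractions.append(hits / len(domain))
    return sum(fractions) / trials
```

For a toy h that covers its 8-bit range uniformly, e.g. `h = lambda x: x % 256` over `range(4096)`, the mean comes out at exactly 1/2.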
Re: building a true RNG
At 12:20 PM -0700 7/29/02, David Honig wrote:

> Whether there is a need for very high bandwidth RNGs was discussed on
> cypherpunks a few months ago, and no examples were found. (Unless
> you're using something like a one-time pad where you need a random bit
> for every cargo bit.) Keeping in mind that a commercial crypto server
> can often accumulate entropy during off-peak hours.

It's been discussed here some time back as well. If you believe your
crypto primitives are infeasible to break, a crypto-based PRNG with a
long enough random seed should be indistinguishable from a true, perfect
RNG. If you are only confident that your crypto primitives are expensive
to break, then using a true RNG for keys and nonces, rather than deriving
them all from one PRNG, adds security.

This suggests a continuum of solutions: construct a crypto PRNG and
periodically (once enough has accumulated) stir your entropy source into
its state in some safe way. If you extract entropy more slowly than you
put it in, you can expect the equivalent of a true RNG. If you extract
entropy faster than you put it in, the system degrades gracefully, in the
sense that someone who expends the effort to break the number-generation
scheme only gets to read messages since the last entropy update.

The reason for batching entropy input is to prevent someone who has
broken your system once from discovering each small entropy input by
exhaustive search. (There was a nice paper pointing this out. If someone
has the reference...)

Arnold Reinhold
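[A minimal sketch of the batched-stirring idea, using SHA-256 as both the mixing and output function (my toy code under those assumptions, not a vetted design such as Yarrow or Fortuna):]

```python
import hashlib


class PoolPRNG:
    """Hash-based PRNG whose state is re-keyed only with a *batch* of
    pooled entropy.  Batching, rather than stirring in each tiny sample
    as it arrives, is what frustrates the attack described above: an
    attacker who has captured the state must guess a whole batch at
    once, not one small sample at a time."""

    BATCH = 64  # stir only once this many bytes have accumulated

    def __init__(self, seed: bytes):
        self.state = hashlib.sha256(seed).digest()
        self.pool = b""
        self.counter = 0

    def add_entropy(self, sample: bytes):
        self.pool += sample
        if len(self.pool) >= self.BATCH:
            # fold the entire accumulated batch into the state at once
            self.state = hashlib.sha256(self.state + self.pool).digest()
            self.pool = b""

    def read(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            self.counter += 1
            out += hashlib.sha256(
                self.state + self.counter.to_bytes(8, "big")).digest()
        return out[:n]
```

Outputs are derived from the state via a counter, so reading never feeds back into the pool; only `add_entropy` changes the state.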
Re: building a true RNG
> To test a hash function h() whose range is S, let F be the set of
> balanced functions from S -> {0, 1}. (Balanced meaning that each f in F
> maps exactly half of S to 0 and half to 1.) If you can contrive to
> choose many members of F at random, and compose them with h for many
> arguments of h, you should be able to set confidence limits on how much
> of S is covered by h.

Can you elaborate? What would the computational complexity of this be?
Estimating the size of {x : f(h(x))=1} to within relative error e
requires O(1/e^2) evaluations of h, if I understand correctly. If we
consider the difference between a hash function that can hit all of S and
another hash function that misses only one element of S, we'll need to
resolve the size of these sets to within relative error 1/|S|. That
suggests we'll need |S|^2 evaluations of h, which is infeasible for SHA1
and slower than the naive O(|S|) algorithm.
Re: building a true RNG
> The reason for batching entropy input is to prevent someone who has
> broken your system once from discovering each small entropy input by
> exhaustive search. (There was a nice paper pointing this out. If
> someone has the reference...)

I believe you are referring to the state-compromise attacks described in
the following paper:

J. Kelsey, B. Schneier, D. Wagner, C. Hall, "Cryptanalytic Attacks on
Pseudorandom Number Generators", FSE '98.
http://www.counterpane.com/pseudorandom_number.html

I once wrote a short note about the relevance of this to IPSec:
http://www.cs.berkeley.edu/~daw/my-posts/using-prngs
Re: building a true RNG
David Wagner wrote:

> Amir Herzberg wrote:
> > So I ask: is there a definition of this `no wasted entropy` property,
> > which hash functions can be assumed to have (and tested for), and
> > which ensures the desired extraction of randomness?
>
> None that I know of. I'm not aware of much work in the crypto
> literature on this topic.
>
> Actually, there is not much hope for such a property. It is pretty easy
> to see that, if we make no assumptions on the entropy inputs other than
> that they have sufficient entropy, then no single deterministic
> algorithm can ever be good at extracting randomness. If we fix any
> single extraction algorithm A, there exists a distribution D on the
> inputs which makes it give non-uniform output (for example, D might be
> the uniform distribution on the set {x : A(x) has its first ten bits
> zero}). This is a standard observation from the theory community's work
> on derandomization.

That's the kind of thing mathematicians like to say - but how much does
it apply to the real world? For example, you can't produce your example
set for SHA-1 or MD5, can you? If you are prepared to factor in cost,
then the landscape changes, I would contend. Mathematicians generally
assume uncountable resources. Even constructivists assume countable
resources. However, we can assume _finite_ resources. Big difference.

Cheers,
Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/
Available for contract work.

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff
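[One empirical data point for this cost question: while *enumerating* Wagner's set {x : A(x) has its first ten bits zero} is indeed infeasible for a real hash, individual members can be sampled by plain search at about 2^10 hash calls each. A sketch (my code and parameter names), using SHA-1 as the extractor A:]

```python
import hashlib


def bad_inputs(count, bits=10):
    """Find members of {x : SHA-1(x) has its first `bits` bits zero}
    by brute-force search, ~2^bits trials per member.  The uniform
    distribution on this set is exactly the adversarial D in the
    derandomization observation: an extractor fed from it emits
    outputs whose first ten bits are always zero."""
    found = []
    x = 0
    while len(found) < count:
        msg = x.to_bytes(8, "big")
        digest = hashlib.sha1(msg).digest()
        if int.from_bytes(digest[:2], "big") >> (16 - bits) == 0:
            found.append(msg)
        x += 1
    return found
```

Whether an attacker can *impose* such a distribution on a physical noise source is, of course, a separate question.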