Re: building a true RNG
On 2 Aug 2002, Paul Crowley wrote:
> I meant to say, another example of a believed one-way function that is
> guaranteed to be able to produce any output is one based on the
> difficulty of discrete log: f(x) = g^x mod p is bijective if the domain
> and range are 1..p-1, but finding preimages is the discrete log
> problem. Of course this doesn't compress. I don't know of any examples
> which compress and have collision resistance.

Choose a group of prime order. Choose t random group elements g_i (1 <= i <= t) different from 1. Write input x as x_1 || x_2 || ... || x_t, prepend a 1 bit to each x_i giving z_i, and let

    f(x) = g_1^z_1 * g_2^z_2 * ... * g_t^z_t

For details: M. Bellare, O. Goldreich, S. Goldwasser, "Incremental cryptography: the case of hashing and signing", Crypto '94, after earlier work by D. Chaum, E. van Heijst, B. Pfitzmann, "Cryptographically strong undeniable signatures, unconditionally secure for the signer", Crypto '91, and S. Brands.

--Bart
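To make the construction concrete, here is a minimal Python sketch of the multi-exponentiation hash Bart describes, using a toy subgroup of prime order q = 1019 inside Z_p^* with p = 2q + 1 = 2039 (both prime); all parameters are illustrative only and far too small to offer any security:

    import random

    q = 1019            # prime order of the subgroup
    p = 2 * q + 1       # 2039, a safe prime: squares mod p form a group of order q
    t = 4               # number of input chunks
    CHUNK_BITS = 8      # chosen so each z_i (chunk with a 1 bit prepended) stays below q

    def random_group_element():
        # Random square mod p different from 1, i.e. an element of the
        # order-q subgroup.
        while True:
            g = pow(random.randrange(2, p), 2, p)
            if g != 1:
                return g

    g = [random_group_element() for _ in range(t)]

    def dlog_hash(x):
        # Split x into chunks x_1..x_t, prepend a 1 bit to each chunk
        # to get z_i, and return the product of the g_i^z_i.
        out = 1
        for i in range(t):
            x_i = (x >> (CHUNK_BITS * i)) & ((1 << CHUNK_BITS) - 1)
            z_i = x_i | (1 << CHUNK_BITS)
            out = (out * pow(g[i], z_i, p)) % p
        return out

    print(dlog_hash(0xdeadbeef))   # 32-bit input -> one group element

This compresses 32 input bits to one element of a roughly 10-bit group, so collisions certainly exist; at real parameter sizes, exhibiting one would yield a discrete-log relation among the g_i.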
Re: building a true RNG
David Wagner [EMAIL PROTECTED] writes:
> I don't know of any good cryptographic hash function that comes with a
> proof that all outputs are possible.

What about the scheme Pad -> Encipher -> Contract described at
http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#sec-uniform-hash ?

David Wagner also wrote:
> if we apply the Luby-Rackoff construction (i.e., 3 rounds of a Feistel
> cipher), with ideal hash functions in each round, does this have the
> desired properties? It might.

Paul Crowley wrote:
> This seems to define a block cipher with no key, which is collision
> free but not one-way. Am I misunderstanding what you're proposing?

David Wagner wrote:
> You understood it perfectly. Good point. I didn't notice that problem.
> Harrumph.

There is only the most minor of problems here, namely that DAW mentioned a symmetric cipher. The problem goes away if you use asymmetric crypto. You want a cipher with no _deciphering_ key, as described in my paper. (op. cit.)
Re: building a true RNG
John S. Denker [EMAIL PROTECTED] writes:
> David Wagner [EMAIL PROTECTED] writes:
>> I don't know of any good cryptographic hash function that comes with
>> a proof that all outputs are possible.
>
> What about the scheme Pad -> Encipher -> Contract described at
> http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#sec-uniform-hash

It's not hard to understand why it isn't easy to solve this problem using symmetric primitives. Any solution must come with a proof that for any image, there is a preimage. With symmetric primitives, that proof usually constitutes a recipe for generating the preimage given the image. In this case, I don't see a proof that this proposal can generate any output. I tried to construct one myself but fell over on the padding issue. I suspect that if you were to solve that, you would also have shown that it was not preimage resistant.

If you allow a keyed hash function then it's much easier to specify the properties it must have to be useful for entropy distillation, but of course that gives you a chicken-and-egg problem. Maybe you can do something with some sort of idea of computable distributions to overcome the specification problem David Wagner outlines?
-- Paul Crowley [EMAIL PROTECTED] http://www.ciphergoth.org/
Re: building a true RNG
I meant to say, another example of a believed one-way function that is guaranteed to be able to produce any output is one based on the difficulty of discrete log: f(x) = g^x mod p, where g is a primitive root mod p, is bijective if the domain and range are 1..p-1, but finding preimages is the discrete log problem. Of course this doesn't compress. I don't know of any examples which compress and have collision resistance.
-- Paul Crowley [EMAIL PROTECTED] http://www.ciphergoth.org/
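A quick numeric check of the bijectivity claim, with toy parameters (p = 1019 and g = 2, which happens to be a primitive root mod 1019 -- the claim needs g to generate the whole multiplicative group):

    # Exhaustively verify that f(x) = g^x mod p permutes {1, ..., p-1}.
    p = 1019
    g = 2    # primitive root mod 1019

    images = {pow(g, x, p) for x in range(1, p)}
    assert images == set(range(1, p))
    print("f(x) = g^x mod p is a bijection on 1..%d" % (p - 1))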
Re: building a true RNG
1) There were some very interesting questions, such as:
 -- whether one can construct a hash function that generates all possible codes.
 -- ditto, generating them as uniformly as possible.
 -- whether off-the-shelf hash functions such as SHA-1 have such properties.
The answers are respectively yes, yes, and very probably. I wrote up a discussion of this, with examples, at http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#sec-uniform-hash

2) David W. suggested (off-list) that I clarify the relationship of entropy-based information-theoretic arguments to computational-feasibility arguments. I took some steps in this direction; see http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#sec-objectives
Re: building a true RNG
David Wagner [EMAIL PROTECTED] writes:
> I don't know of any good cryptographic hash function that comes with a
> proof that all outputs are possible. However, it might not be too hard
> to come up with plausible examples. For example, if we apply the
> Luby-Rackoff construction (i.e., 3 rounds of a Feistel cipher), with
> ideal hash functions in each round, does this have the desired
> properties? It might.

This seems to define a block cipher with no key, which is collision free but not one-way. Am I misunderstanding what you're proposing?
-- Paul Crowley [EMAIL PROTECTED] http://www.ciphergoth.org/
Re: building a true RNG
David Wagner [EMAIL PROTECTED] writes:
>> I don't know of any good cryptographic hash function that comes with
>> a proof that all outputs are possible. However, it might not be too
>> hard to come up with plausible examples. For example, if we apply the
>> Luby-Rackoff construction (i.e., 3 rounds of a Feistel cipher), with
>> ideal hash functions in each round, does this have the desired
>> properties? It might.
>
> This seems to define a block cipher with no key, which is collision
> free but not one-way. Am I misunderstanding what you're proposing?

You understood it perfectly. Good point. I didn't notice that problem. Harrumph. Thanks for catching my oversight!
RE: building a true RNG
On Tue, 30 Jul 2002, James A. Donald wrote:
> Randomness is of course indefinable. A random oracle is however
> definable. If SHA-1 is indistinguishable from a random oracle without
> prior knowledge of the input, then we would like to prove that for an
> attacker to make use of the loss of entropy that results from the fact
> that it is not a random oracle, the attacker would need to be able to
> distinguish SHA-1 from a random oracle without prior knowledge of the
> input.

The above sentence is unsound. You cannot take an assumption ("If SHA-1 is indistinguishable...") and then use its negation to prove something else ("...then we would like to prove that ... the attacker would need to be able to distinguish..."), unless you prove your something else for both the TRUE and FALSE values of the assumption. Euclid was all over that, and George Boole after him.

Now, for the TRUE value of your assumption, which I think you may have meant:

IF (A) SHA-1 is indistinguishable from a random oracle without prior knowledge of the input, AND (B) we can prove that in order to succeed in an attack, an attacker would need to distinguish SHA-1 from a random oracle without prior knowledge of the input, THEN (C) the attacker will not succeed in that attack.

But for the FALSE value of your assumption A, we get: IF (B) AND (~C) THEN (~A). And: IF (B) AND (~A) THEN (C) OR (~C). So if we take B as an axiom, then we did not prove anything about A unless given ~C, and we did not prove anything about C regardless of our assumptions about A.

Bear
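Bear's two implications can be checked mechanically by brute force over the eight truth assignments, taking (A and B) -> C as the premise; a small Python sketch:

    from itertools import product

    def premise(A, B, C):
        return (not (A and B)) or C      # (A and B) -> C

    rows = [r for r in product([False, True], repeat=3) if premise(*r)]

    # IF (B) AND (~C) THEN (~A): A is forced false.
    assert all(not A for A, B, C in rows if B and not C)

    # IF (B) AND (~A) THEN (C) OR (~C): C can still go either way.
    assert {C for A, B, C in rows if B and not A} == {True, False}

    print("both implications check out")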
Re: building a true RNG
David Wagner [EMAIL PROTECTED] writes:
> I once wrote a short note about the relevance of this to IPSec:
> http://www.cs.berkeley.edu/~daw/my-posts/using-prngs

There's another way to avoid this problem, which is to separate the nonce RNG and crypto RNG, so that an attacker seeing the nonce RNG output can't use it to attack the crypto RNG. This is done in PGP 5.x and the cryptlib RNG. OTOH some RNGs are used in exactly the opposite manner, generating alternate public and private random quantities, which makes it possible to use one to infer information about the other. Examples are generators used with SSL and ssh, which both alternate from public nonces to private session keys and back.

Peter.
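A minimal sketch of the separation Peter describes: two independently seeded hash-based generators, one for public nonces and one for secret keys, so published nonces carry no information about key material. (SHA-1 and the state-update labels here are arbitrary illustrative choices, not the PGP 5.x or cryptlib design.)

    import hashlib, os

    class HashGen:
        def __init__(self, seed):
            self.state = hashlib.sha1(seed).digest()
        def next(self):
            # Derive output and successor state along separate paths.
            out = hashlib.sha1(b"out" + self.state).digest()
            self.state = hashlib.sha1(b"step" + self.state).digest()
            return out

    nonce_rng = HashGen(os.urandom(20))   # outputs may be published
    key_rng = HashGen(os.urandom(20))     # outputs stay secret

    print(nonce_rng.next().hex())         # public nonce
    session_key = key_rng.next()[:16]     # private session key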
Re: building a true RNG
At 03:18 PM 7/29/2002 -0700, David Wagner wrote:
>> I don't even think anyone has analyzed the entropy preservation of a
>> theoretically perfect random oracle
>
> Well, I know this particular point wasn't central to your email, but
> I'm not sure I agree with you on this small point. I believe it should
> be more or less straightforward to analyze the entropy preservation of
> a random oracle (alas, so straightforward you probably won't find any
> paper on it in the literature).

Actually, it's covered very well (but briefly) in HAC (in section 2.1.6), and they refer to a seminal work by Flajolet and Odlyzko (Google finds references to it quite easily).

regards, Greg.
Greg Rose [EMAIL PROTECTED], Qualcomm Australia, http://people.qualcomm.com/ggr/
RE: building a true RNG
David Wagner said:
> The problem can really be divided into two parts: 1. Is our entropy
> crunching algorithm secure when used with a random oracle instead of
> SHA1? 2. Does SHA1 behave enough like a random oracle that the answer
> to question 1 is in any way relevant to the real world? I suspect that
> question 1 is not hard to answer in a formal, rigorous, provable way.
> It is only question 2 that is hard to answer. It is absolutely true
> that we have no proof for question 2. That said, we should keep in
> mind that none of our cryptographic algorithms (even 3DES) come with a
> proof of security. All we have to rest on is years of unsuccessful
> cryptanalysis.

But there's a big difference: the random oracle `assumption` is clearly not valid for SHA-1 (or any other specific hash function). Since this is trivial, it is less clear what cryptanalytic problem SHA1 was tested for. SHA-1 (and similar functions) were tested mainly for collision-resistance (and also for one-wayness). But clearly these properties are not sufficient for the proposed use (to extract entropy). So the question is again: what is an assumption which we can test SHA-1 against, and which is sufficient to make the `entropy crunching alg` secure? I found that when trying to explain and define hash functions and their properties, I didn't find a satisfactory definition for the `randomness` properties.

Best Regards, Amir Herzberg
See http://amir.herzberg.name/book.html for lectures and draft-chapters from book-in-progress, `Introduction to Cryptography, Secure Communication and Commerce`; feedback appreciated!
RE: building a true RNG
On 30 Jul 2002 at 17:02, Amir Herzberg wrote:
> I found that when trying to explain and define hash functions and
> their properties, I didn't find a satisfactory definition for the
> `randomness` properties.

Randomness is of course indefinable. A random oracle is however definable. If SHA-1 is indistinguishable from a random oracle without prior knowledge of the input, then we would like to prove that for an attacker to make use of the loss of entropy that results from the fact that it is not a random oracle, the attacker would need to be able to distinguish SHA-1 from a random oracle without prior knowledge of the input.

--digsig James A. Donald
Re: building a true RNG
Amir Herzberg wrote:
> But there's a big difference: the random oracle `assumption` is
> clearly not valid for SHA-1 (or any other specific hash function).

Well, the random oracle model has problems, but I think those problems are a bit more subtle than just an assumption that is true or false.

> So the question is again: what is an assumption which we can test
> SHA-1 against, and which is sufficient to make the `entropy crunching
> alg` secure?

Hmm; I thought I answered this before. Was it unclear? If so, please ask. In any case, here's a summary. In the standard model (without random oracles), there is *no* such assumption. There's no hope for finding such an assumption, if you want to build a general-purpose entropy cruncher that works for any distribution on the input samples. One can prove this. No matter what function you choose, there is an input distribution that makes this function inadequate.
Re: building a true RNG
On 2002-07-27, David Wagner uttered to [EMAIL PROTECTED]:
> In particular, there is no function f which doesn't waste entropy,
> unless |Y|=1. The proof is simple enough that you can probably find it
> with a moment's contemplation, but let me know if you want to see it.

I would be led to pigeonhole; still, this does seem like an eminently teachable moment to me. However, what we're working with in the case of a typical RNG isn't functions between finite buffer-fulls of data, but functions between infinite sets of entire bitstreams which need to be implemented within a finite memory constraint. Whatever the algorithm, it can have state. I'm thinking that is the reasoning which leads people to think that the problem is simple. In this framework one can very well gather statistics and try to normalize them based on what one's seen so far. (The minimalist example here is bias removal.) After that, what goes into the hash is (approximately) white, and even with minimal assumptions (i.e. the hash is a one-way permutation) our RNG would seem to approach ideality.

True, such an approach will always become a race in modelling the statistics. But still, I would think the generator has the advantage if sufficient margin is left for error. (That is, hash some constant times the number of bits containing the entropy you suspect to be there, based on the best model you've got.)
-- Sampo Syreeni, aka decoy, http://www.iki.fi/~decoy/front
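The "minimalist example" of bias removal Sampo mentions is presumably the classical von Neumann debiaser; a sketch, assuming independent, identically biased input bits:

    def von_neumann(bits):
        # Read bits two at a time: 01 -> 0, 10 -> 1, 00/11 -> discard.
        it = iter(bits)
        for a, b in zip(it, it):
            if a != b:
                yield a

    biased = [1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
    print(list(von_neumann(biased)))   # -> [0, 1]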
Re: building a true RNG
On 2002-07-28, Sampo Syreeni uttered to David Wagner: [Answering to my own mail. Sorry.]
> and discard every 1/(p(x)-1/256)'th sample with value x.

Actually the pedantic solution would be to put an arithmetic compressor/coder between the input and output, using the best model we've got. That still leaves model adaptation to be dealt with, but if we discard a sufficient number of output bits at start (estimable from the model), we *will* end up with (very nearly) flat statistics on the output. Asymptotic optimality and all that... (The qualification comes from limited precision arithmetic.)
-- Sampo Syreeni, aka decoy, http://www.iki.fi/~decoy/front
Re: building a true RNG
> An example: presume we take a simple first order statistical model. If
> our input is an 8-bit sample value from a noise source, we will build
> a 256 bin histogram. When we see an input value, we look its
> probability up in the model, and discard every 1/(p(x)-1/256)'th
> sample with value x. When this happens, the sample is just eaten and
> nothing appears in the output; otherwise we copy.

I understand what you're trying to say, but this will not give a general-purpose function that doesn't waste entropy regardless of the input distribution. This only works when the distribution on the input stream consists of independent, memoryless samples from some distribution on 8-bit values.
Re: building a true RNG
Barney Wolff wrote:
> This leads me to ask what may be a laughably naive question: Do we
> even know that the popular hash functions can actually generate all
> 2^N values of their outputs?

It seems very unlikely that they can generate all 2^N outputs (under current knowledge). However, they satisfy the next-best thing: their output appears to be indistinguishable from uniform to computationally-bounded observers, hence it's as good as if they could generate all 2^N outputs for most purposes.
Re: building a true RNG
Yes, but is there any lower bound known? 2^(N-1)? Or worse? Clearly a hash with N-bit output that can't generate all 2^N values can never actually have N bits of entropy. Depending on how many forbidden values there are, this may or may not be significant. Of course if the forbidden values cannot be found, I'd agree with the "as good as".

On Mon, Jul 29, 2002 at 03:39:32PM, David Wagner wrote:
>> Do we even know that the popular hash functions can actually generate
>> all 2^N values of their outputs?
>
> It seems very unlikely that they can generate all 2^N outputs (under
> current knowledge). However, they satisfy the next-best thing: their
> output appears to be indistinguishable from uniform to
> computationally-bounded observers, hence it's as good as if they could
> generate all 2^N outputs for most purposes.

-- Barney Wolff "I never met a computer I didn't like."
Re: building a true RNG
Oh dear. On re-reading your message I now suspect that what you asked is not what I originally thought you asked. I see two questions here:

Q1: If we cycle through all N-bit messages, are all 2^N output values possible?
Q2: If we cycle through all messages (possibly very long or very short), are all 2^N output values possible?

On first reading I thought you were asking Q1, but it now occurs to me you were probably asking Q2. I apologize profusely for any confusion I may have caused. Anyway, the answer to Q1 is likely to be No. I'd guess that the answer to Q2 is probably Yes, or close to it.

For the first scenario, if the hash behaves like a random oracle, then one would expect only 2^N * (1 - 1/e) of the possible outputs to be attainable. Of course, the forbidden outputs are not easy to find; the best way I know is by exhaustive search over all 2^N N-bit messages. For the second scenario, if the hash behaves like a random oracle, then one would expect all outputs to be attainable. No significant deviation from the random oracle model is known for SHA1 (well, with one exception that doesn't appear to be relevant here).

That said, this is not a proof, and my answers just represent my best guesses.
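The 1 - 1/e figure for Q1 is easy to reproduce empirically: model the hash as a random function from N-bit inputs to N-bit outputs and count how many outputs are hit (a toy N, not an actual hash):

    import math, random

    N = 16
    size = 1 << N
    # One random output per possible N-bit input.
    hits = {random.randrange(size) for _ in range(size)}
    print(len(hits) / size)      # ~0.632
    print(1 - 1 / math.e)        # 0.6321...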
Re: building a true RNG
Barney Wolff asked:
> Do we even know that the popular hash functions can actually generate
> all 2^N values of their outputs?

David Wagner replied:
> It seems very unlikely that they can generate all 2^N outputs (under
> current knowledge).

I was temporarily astonished, but he clarified as follows:
> Q2: If we cycle through all messages (possibly very long or very
> short), are all 2^N output values possible? ... I'd guess that the
> answer to Q2 is probably Yes, or close to it.

1) Consider the following function H0: Divide the input into chunks N bits long. Calculate the XOR of all the chunks, and use that as the output. This meets the definition of hash function, although it would not be a one-way hash function. And it would most certainly be capable of generating all 2^N possible outputs.

2) I can't prove that a standard hash function such as SHA1 generates all possible codes, but I consider it likely. It would be quite shocking if a strong hash function such as SHA1 generated fewer codes than a weak function such as H0.

3) For a one-way hash function, one should not expect a _constructive_ proof that it generates all possible codes; such a construction would violate the one-way property.

4) Here is a rough plausibility argument. Consider two hash functions H_1 and H_2 that are independent. We define

    H'(x) := H_1(x) XOR H_2(x)        (1)

which implies

    H_1(x) = H'(x) XOR H_2(x)         (2)

Now let's look at the truth table for equation (2), where the row-index is the H' code, the column-index is the H_2 code, and each entry represents the H_1 code required to uphold the equation:

         00   01   10   11
    00   00   01   10   11
    01   01   00   11   10
    10   10   11   00   01
    11   11   10   01   00

Now let's suppose H_2 is missing one code (say the 10 code) and H' is missing one code (say the 11 code). Then H_1 must be missing at least three codes! Otherwise there would be a way of combining a non-missing H_1 code with a non-missing H_2 code to create the missing H' code.

         00   01   10m  11
    00   00   01   10   11
    01   01   00   11   10
    10   10   11   00   01
    11m  11m  10m  01   00m

We can extend this argument by combining lots and lots of independent hash functions H_i. The combination has far fewer missing codes than any of the ingredients. So either you conclude that a) there is a conspiracy that prevents us from constructing independent hash functions to use as ingredients, or b) we can produce hash functions with very few missing codes.
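For reference, H0 is nearly a one-liner; every N-bit output is attainable because H0 maps any single chunk to itself (a sketch with N = 32):

    def h0(data, n_bytes=4):
        # XOR together consecutive n_bytes-sized chunks (zero-padded).
        acc = 0
        for i in range(0, len(data), n_bytes):
            chunk = data[i:i + n_bytes].ljust(n_bytes, b"\x00")
            acc ^= int.from_bytes(chunk, "big")
        return acc.to_bytes(n_bytes, "big")

    print(h0(b"exampleinput").hex())
    # A single-chunk input hashes to itself, so all outputs occur:
    assert h0(bytes.fromhex("deadbeef")) == bytes.fromhex("deadbeef")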
Re: building a true RNG
David Wagner wrote:
> Oh dear. On re-reading your message I now suspect that what you asked
> is not what I originally thought you asked. I see two questions here:
> Q1: If we cycle through all N-bit messages, are all 2^N output values
> possible? Q2: If we cycle through all messages (possibly very long or
> very short), are all 2^N output values possible? On first reading I
> thought you were asking Q1, but it now occurs to me you were probably
> asking Q2. I apologize profusely for any confusion I may have caused.
> Anyway, the answer to Q1 is likely to be No. I'd guess that the answer
> to Q2 is probably Yes, or close to it.

I think the interesting question is whether, for M-bit hash inputs, and an N-bit hash, with a lower bound Q on entropy per input batch, so M > Q > N, we can show, as I think Denker is claiming to have done, that the entropy of hash(M) must be > N - epsilon, for some epsilon small enough to ignore. That would imply not only that all 2^N values occur, but that they are reasonably evenly distributed.
Re: building a true RNG
> 3) For a one-way hash function, one should not expect a _constructive_
> proof that it generates all possible codes; such a construction would
> violate the one-way property.

Nitpick: the last statement does not seem quite right to me. I'm thinking of the notion of a one-way permutation. For instance, the RSA function f(x) = x^3 mod n is conjectured to be a one-way permutation, assuming n is an RSA modulus with unknown factorization and the RSA assumption holds. (I'm being a little fast and loose with the formal details, but I hope this conveys the idea.) That said, I'm not claiming that the RSA function would make a good cryptographic hash function. Cryptographic hash functions are expected to be a lot more than just one-way and collision-free.

I don't know of any good cryptographic hash function that comes with a proof that all outputs are possible. However, it might not be too hard to come up with plausible examples. For example, if we apply the Luby-Rackoff construction (i.e., 3 rounds of a Feistel cipher), with ideal hash functions in each round, does this have the desired properties? It might. On the gripping hand, I don't think this is a real issue in practice. SHA1 is probably good enough for all practical purposes that I can think of.
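A rough sketch of the keyless Luby-Rackoff idea, with SHA-1 standing in for the ideal round functions (a 320-bit block split into two 160-bit halves). By construction it is a permutation, so every output is attainable -- and, as Paul Crowley points out elsewhere in this thread, it is just as clearly invertible, hence not one-way:

    import hashlib

    def F(i, half):
        # Round function: hash the half together with the round number.
        return hashlib.sha1(bytes([i]) + half).digest()

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def feistel3(block):
        L, R = block[:20], block[20:]
        for i in range(3):
            L, R = R, xor(L, F(i, R))   # one Feistel round
        return L + R

    print(feistel3(b"A" * 40).hex())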
Re: building a true RNG
On Mon, 29 Jul 2002, David Wagner wrote:
>> DES, being extremely hardware friendly, can be (ab)used to make a
>> strong one-way hash. (E.g., raw input into both key and data maps
>> 56+64 -> uniformly distributed 64 bits.)
>
> However, when used in this way, DES is not an especially good hash
> function. For instance, it is easy to find collisions, to find
> pre-images, and so on.

Somewhat related to that, are there any block cipher-hash function methods that are actually secure? Every one I've ever read about seems to have been broken.

Regards, Jack
Re: building a true RNG
> 2) I can't prove that a standard hash function such as SHA1 generates
> all possible codes, but I consider it likely. It would be quite
> shocking if a strong hash function such as SHA1 generated fewer codes
> than a weak function such as H0.

I think you could do a probabilistic proof similar to the "DES is not a group" quasi-proof. To test a hash function h() whose range is S, let F be the set of balanced functions from S -> {0, 1}. (Balanced meaning that each f in F maps exactly half of S to 0 and half to 1.) If you can contrive to choose many members of F at random, and compose them with h for many arguments of h, you should be able to set confidence limits on how much of S is covered by h.
Re: building a true RNG
At 12:20 PM -0700 7/29/02, David Honig wrote:
> Whether there is a need for very high bandwidth RNGs was discussed on
> cypherpunks a few months ago, and no examples were found. (Unless
> you're using something like a one-time pad where you need a random bit
> for every cargo bit.) Keeping in mind that a commercial crypto server
> can often accumulate entropy during off-peak hours.

It's been discussed here some time back as well. If you believe your crypto primitives are infeasible to break, a crypto-based PRNG with a long enough random seed should be indistinguishable from a true, perfect RNG. If you are only confident that your crypto primitives are expensive to break, then using a true RNG for keys and nonces, rather than deriving them all from one PRNG, adds security.

This suggests a continuum of solutions: construct a crypto PRNG and periodically (once enough has accumulated) stir your entropy source into its state in some safe way. If you extract entropy slower than you put it in, you can expect the equivalent of a true RNG. If you extract entropy faster than you put it in, the system degrades gracefully, in the sense that someone who expends the effort to break the number generation scheme only gets to read messages since the last entropy update. The reason for batching entropy input is to prevent someone who has broken your system once from discovering each small entropy input by exhaustive search. (There was a nice paper pointing this out. If someone has the reference...)

Arnold Reinhold
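A minimal sketch of the batched stirring Arnold describes: entropy samples accumulate in a pool and are folded into the PRNG state only once the pool is large enough, so an attacker who has captured the state cannot brute-force each small input separately. (SHA-1 and the thresholds are arbitrary illustrative choices.)

    import hashlib, os

    class StirredPRNG:
        BATCH = 64                      # stir only after this many pooled bytes

        def __init__(self, seed):
            self.state = hashlib.sha1(seed).digest()
            self.pool = b""

        def add_entropy(self, sample):
            self.pool += sample         # accumulate; do not stir yet
            if len(self.pool) >= self.BATCH:
                self.state = hashlib.sha1(self.state + self.pool).digest()
                self.pool = b""

        def next(self):
            out = hashlib.sha1(b"O" + self.state).digest()
            self.state = hashlib.sha1(b"S" + self.state).digest()
            return out

    g = StirredPRNG(os.urandom(20))
    g.add_entropy(os.urandom(8))        # pooled, not yet mixed in
    print(g.next().hex())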
Re: building a true RNG
> To test a hash function h() whose range is S, let F be the set of
> balanced functions from S -> {0, 1}. (Balanced meaning that each f in
> F maps exactly half of S to 0 and half to 1.) If you can contrive to
> choose many members of F at random, and compose them with h for many
> arguments of h, you should be able to set confidence limits on how
> much of S is covered by h.

Can you elaborate? What would the computational complexity of this be? Estimating the size of {x : f(h(x))=1} to within relative error e requires O(1/e^2) evaluations of h, if I understand correctly. If we consider the difference between a hash function that can hit all of S, and another hash function that misses only one element of S, we'll need to resolve the size of these sets to within relative error 1/|S|. That suggests we'll need |S|^2 evaluations of h, which is infeasible for SHA1 and slower than the naive O(|S|) algorithm.
Re: building a true RNG
> The reason for batching entropy input is to prevent someone who has
> broken your system once from discovering each small entropy input by
> exhaustive search. (There was a nice paper pointing this out. If
> someone has the reference...)

I believe you are referring to the state compromise attacks described in the following paper:

J. Kelsey, B. Schneier, D. Wagner, C. Hall, "Cryptanalytic Attacks on Pseudorandom Number Generators", FSE '98. http://www.counterpane.com/pseudorandom_number.html

I once wrote a short note about the relevance of this to IPSec: http://www.cs.berkeley.edu/~daw/my-posts/using-prngs
Re: building a true RNG
David Wagner wrote:
> Amir Herzberg wrote:
>> So I ask: is there a definition of this `no wasted entropy` property,
>> which hash functions can be assumed to have (and tested for), and
>> which ensures the desired extraction of randomness?
>
> None that I know of. I'm not aware of much work in the crypto
> literature on this topic. Actually, there is not much hope for such a
> property. It is pretty easy to see that, if we make no assumptions on
> the entropy inputs other than they have sufficient entropy, then no
> single deterministic algorithm can ever be good at extracting
> randomness. If we fix any single extraction algorithm A, there exists
> a distribution D on the inputs which makes it give non-uniform output
> (for example, D might be the uniform distribution on the set {x : A(x)
> has its first ten bits zero}). This is a standard observation from the
> theory community's work on derandomization.

That's the kind of thing mathematicians like to say - but how much does it apply to the real world? For example, you can't produce your example set for SHA-1 or MD5, can you? If you are prepared to factor in cost, then the landscape changes, I would contend. Mathematicians generally assume uncountable resources. Even constructivists assume countable resources. However, we can assume _finite_ resources. Big difference.

Cheers, Ben.
-- http://www.apache-ssl.org/ben.html http://www.thebunker.net/ Available for contract work. "There is no limit to what a man can do or how far he can go if he doesn't mind who gets the credit." - Robert Woodruff
Re: building a true RNG
At 11:24 AM 7/25/02 -0400, John S. Denker wrote:
> And most particularly I do not care if the analog threshold of my
> soundcard shifts slightly (as a function of recent history,
> temperature, phase of the moon, or whatever).

A change in the analogue threshold of your digitizing step will change the variability (aka entropy) of your samples. The practical solution is wide engineering margins and monitoring of your raw input. (And, if you can (SHA users can't), measure the data just before it goes into any whiteners you use, for belts-and-suspenders assurance that you've compressed it sufficiently.)

> This is the central conceptual point of my paper. It is more important
> than any particular implementation. The point is that a Random Symbol
> Generator can be proved correct using fairly mild assumptions and
> premises.

Based on a detailed model of the noise, grounded in physics. Yep. Have you seen RNGs based on arrival times of network packets, or disk accesses, etc.? I'd extrapolate that you'd not trust them much given their distance from physics. (And I'd agree that a soundcard (or RF card) is preferred and 'free'.)

> The turning point of the argument is statistical: if I have enough
> entropy at the input of the hash function, and if the hash function
> doesn't waste entropy (by having unnecessarily many hash collisions)
> then the statistics takes over and covers a multitude of sins.

I don't think "collision" is the right word; you're not doing a search on hash values which might collide.

> For example, if I have 165 bits of entropy at the input of the hash
> function, the output will have 159.98 bits of entropy in a 160 bit
> word.

I don't understand this. If you have 165 bits of entropy in, you should be able to generate 160 binary symbols with a bit of entropy each. (You are conserving total entropy, only concentrating it by reducing the number of symbols.)

> You can shift the threshold all you want.

Shifting thresholds changes entropic content.

> You can add something to the input.

DC doesn't matter, capacitors are cheap :-)

> If the alleged threshold shift is so large as to decrease the
> variability of the raw data, then all bets are off... but that wasn't
> the question that David asked. The rhetorical question suggested that
> if the threshold shifted at all I would have a big problem, and I
> loudly assert that I don't. Specifically: If you give me any
> halfway-reasonable upper bound on the magnitude of the shift, I can
> design the generator to accommodate that, producing
> industrial-strength randomness despite such a shift.

This hinges on the definition of large and small... anyway, generous margins and monitoring work both for steel bridges and RNGs.
RE: building a true RNG
In several posts to the list, e.g. by Wagner and Denker, it is proposed to use a `strong crypto hash function` h(), e.g. SHA1, to extract entropy from a physical source, i.e.

    Source --> Digitizer --> good hash

Namely, if the sample from the digitizer (of the physical source) is x, the suggestion is to use h(x) as random data. I think this is a very common design in practice. But clearly this relies on some property of the hash function. For example Denker says:
> a) if the hash function happens to have a property I call "no wasted
> entropy" then... [this h(x) is `as good as random`]

Indeed if we adopt the `random oracle` methodology, i.e. analyze the system assuming h() is a truly random function, then we easily see that h(x) are truly random bits (assuming the number of bits in h(x) is significantly less than the entropy in x). But: the `random oracle` methodology helps us find vulnerabilities in protocols, but no specific hash function is really a random oracle... So I ask: is there a definition of this `no wasted entropy` property, which hash functions can be assumed to have (and tested for), and which ensures the desired extraction of randomness?

Regards, Amir Herzberg
See http://amir.herzberg.name/book.html for lectures and draft-chapters from book-in-progress, `Introduction to Cryptography, Secure Communication and Commerce`; feedback appreciated!
Re: building a true RNG
I wrote:
> a) if the hash function happens to have a property I call "no wasted
> entropy" then the whitening stage is superfluous (and you may decide
> to classify the hash as non-simple);

David Honig responded:
> Not wasting entropy does not mean that a function's output is white,
> ie uniformly distributed. E.g., a function which doubles every bit:
> 0 -> 00, 1 -> 11 preserves total entropy but does not produce
> uniformly distributed output. (It also reduces entropy/symbol.)

That's a red-herring tangent. I'm not talking about any old function that doesn't waste entropy; I'm talking about a !!hash!! function that doesn't waste entropy. The hash function has a hard constraint on the word-size of its output. If it starts doubling bits, or otherwise putting out redundancies, then it is wasting entropy. See the discussion of the BADHASH-2 function in the paper: http://www.monmouth.com/~jsd/turbid/

And remember: in addition to having a non-entropy-wasting hash function, we are also required to saturate its input. Then we can conclude that the output is white to a very high degree, as quantitatively discussed in the paper.

> The simple-hash's --lets call it a digest-- function is to increase
> the entropy/symbol by *reducing* the number of symbols while
> preserving *total* entropy.

Total entropy is preserved in the non-saturated regime. This is documented in the upper rows of the table: http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#tab-saturation In the saturated regime, some entropy is necessarily lost. This is documented in the lower rows of the table. This is only a small percentage, but it is mandatory. I don't consider this to be wasted entropy; I consider it entropy well spent. That is, these are necessary hash collisions, as opposed to unnecessary ones.

> A function like xor(bit N, bit N+1) which halves the number of bits
> can do this. While its output is whiter than its input, its output
> will not be perfectly white unless its input was.

See the discussion of BADHASH-1 in the paper.

> Two scenarios where SHA is contraindicated came up: 1. hardware
> 2. interrupt handlers. Sometimes a hardware RNG will feed raw data
> into a crypto-WEAK analogue of SHA, ie a LFSR which does the
> bit-reduction and mixing functions, but doesn't mix as well as SHA.
> (LFSRs are hardware-friendly.) Separating the bit-reduction from the
> mixing can be useful for analyzing what's really going on.

Well, I tend to agree that systems that separate the bit-reduction from the mixing are easier to analyze, in the sense that it is easier to find flaws in them. But that's because they're so flawed!
Re: building a true RNG
Amir Herzberg wrote:
> So I ask: is there a definition of this `no wasted entropy` property,
> which hash functions can be assumed to have (and tested for), and
> which ensures the desired extraction of randomness?

None that I know of. I'm not aware of much work in the crypto literature on this topic. Actually, there is not much hope for such a property. It is pretty easy to see that, if we make no assumptions on the entropy inputs other than they have sufficient entropy, then no single deterministic algorithm can ever be good at extracting randomness. If we fix any single extraction algorithm A, there exists a distribution D on the inputs which makes it give non-uniform output (for example, D might be the uniform distribution on the set {x : A(x) has its first ten bits zero}). This is a standard observation from the theory community's work on derandomization.

The only way out of this I can see is to assume we have a small seed of uniformly distributed randomness on the side. This is exactly the direction explored in the theory community in work on extractors, the leftover hashing lemma, and the like. However, from a cryptographic point of view, such an assumption is highly unreasonable (even worse than the random oracle assumption). In short: I know of no better way to analyze cryptographic randomness extraction than to use the random oracle model.
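David's counterexample distribution is easy to sample from by rejection, which makes the point concrete: each x drawn below has plenty of entropy, yet SHA-1 of such inputs is grossly non-uniform (the first ten output bits are always zero):

    import hashlib, os

    def sample_from_D():
        # Rejection-sample {x : SHA1(x) has its first ten bits zero};
        # about one random 16-byte string in 1024 qualifies.
        while True:
            x = os.urandom(16)
            h = hashlib.sha1(x).digest()
            if h[0] == 0 and (h[1] >> 6) == 0:
                return x

    x = sample_from_D()
    print(hashlib.sha1(x).hexdigest())   # always begins with "00"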
Re: building a true RNG
Amir Herzberg wrote:
> So I ask: is there a definition of this `no wasted entropy` property,
> which hash functions can be assumed to have (and tested for), and
> which ensures the desired extraction of randomness?

That's the right question. The answer I give in the paper is:

  A cryptologic hash function advertises that it is computationally
  infeasible for an adversary to unmix the hash-codes. A
  chosen-plaintext (chosen-input) attack will not discover inputs that
  produce hash collisions with any great probability. In contrast:
  What we are asking is not really very special. We merely ask that
  the hash-codes in the second column be well mixed. We ask that the
  data acquisition system will not accidentally produce an input
  pattern that unmixes the hash-codes.

We believe that anything that makes a good pretense of being a cryptologic hash function is good enough for our purposes, with a wide margin of safety. If it resists attack when the adversary can choose the inputs, it presumably resists attack when the adversary can't choose the inputs.
Re: building a true RNG
John S. Denker wrote:
> Amir Herzberg wrote:
>> So I ask: is there a definition of this `no wasted entropy` property,
>> which hash functions can be assumed to have (and tested for), and
>> which ensures the desired extraction of randomness?
>
> That's the right question. The answer I give in the paper is ...
> What we are asking is not really very special. We merely ask that the
> hash-codes in the second column be well mixed.

Alas, that's not a very precise definition. Actually, my intuition differs from yours. My intuition is that entropy collection requires fairly strong assumptions about the hash. For instance, collision-freedom isn't enough. One-wayness isn't enough. We need something stronger, and something that appears difficult to formalize in any precise, mathematically rigorous way.
Re: building a true RNG
----- Original Message -----
From: John S. Denker [EMAIL PROTECTED]
Sent: Saturday, July 27, 2002 10:52 AM
Subject: Re: building a true RNG

> I wrote:
>> a) if the hash function happens to have a property I call "no wasted
>> entropy" then the whitening stage is superfluous (and you may decide
>> to classify the hash as non-simple);

I disagree. The whitening stage can perform a very important step of allowing for stretching of the entropy. Ideally a true RNG would be fast enough to supply all our needs, but that typically doesn't happen. Instead we have to depend on stretching tricks, basically using a primitive in counter mode, to stretch the bits we have as far as is needed. This can be combined with the distillation stage itself by inserting a fixed string, copying the state, and finishing the calculation, but can conceptually be thought of as a separate stage.

> David Honig responded:
>> Not wasting entropy does not mean that a function's output is white,
>> ie uniformly distributed. E.g., a function which doubles every bit:
>> 0 -> 00, 1 -> 11 preserves total entropy but does not produce
>> uniformly distributed output. (It also reduces entropy/symbol.)
>
> That's a red-herring tangent. I'm not talking about any old function
> that doesn't waste entropy; I'm talking about a !!hash!! function that
> doesn't waste entropy.

A construct that simply does not exist; more on this later.

> And remember: in addition to having a non-entropy-wasting hash
> function, we are also required to saturate its input. Then we can
> conclude that the output is white to a very high degree, as
> quantitatively discussed in the paper.
>
>> The simple-hash's --lets call it a digest-- function is to increase
>> the entropy/symbol by *reducing* the number of symbols while
>> preserving *total* entropy.

I separated this because I agree with the statement completely; that is the one and only function of the distilling function, whether it is a hash, a cipher, or a miracle construct.

> Total entropy is preserved in the non-saturated regime. This is
> documented in upper rows of the table:
> http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#tab-saturation

I disagree. There are several ways of viewing the problem. Modelling the distillation function as a perfect entropy smoother over k bits (other than that, I will be assuming a worst case), we see the following happen: When the first bit is inserted, each of the k bits receives 1/k bit of entropy. When the second bit of entropy is inserted, one of those bits receives 1 bit of entropy; the entire system now has 1 + (k-1)/k bits of entropy. Already there is a small but measurable discrepancy. Each bit now has (2k-1)/k^2 bits of entropy. The third bit results in one of the k bits having entropy 1, giving 1 + ((2k-1)/k)(k-1)/k bits of entropy in total. The system does lose entropy, if slowly. It's also worth noting that this agrees with the presented analysis at 3 points -- 0 input bits give 0 entropy, 1 gives 1 bit of entropy, and infinity gives k bits -- but that for all useful lengths, having k bits of entropy can never occur.

> In the saturated regime, some entropy is necessarily lost. This is
> documented in the lower rows of the table. This is only a small
> percentage, but it is mandatory. I don't consider this to be wasted
> entropy; I consider it entropy well spent. That is, these are
> necessary hash collisions, as opposed to unnecessary ones.

These don't even have to be hash collisions; it is possible for the hash function to discard entropy simply through its behavior, as in my example where entropy starts being discarded beginning with the second bit. From a theoretic standpoint, it is entirely possible that a function exists that discards entropy beginning with the first bit.

>> A function like xor(bit N, bit N+1) which halves the number of bits
>> can do this. While its output is whiter than its input, its output
>> will not be perfectly white unless its input was.

This statement is true of all deterministic algorithms. In fact it is one of the points where all presented models agree: you cannot reach a completely entropic state until you reach infinite lengths for the input. If the input is purely entropic there is some grey area surrounding this, but for a hash function to be secure the grey area disappears.

>> Two scenarios where SHA is contraindicated came up: 1. hardware
>> 2. interrupt handlers. Sometimes a hardware RNG will feed raw data
>> into a crypto-WEAK analogue of SHA, ie a LFSR which does the
>> bit-reduction and mixing functions, but doesn't mix as well as SHA.
>> (LFSRs are hardware-friendly.) Separating the bit-reduction from the
>> mixing can be useful for analyzing what's really going on.
>
> Well, I tend to agree that systems that separate the bit-reduction
> from the mixing are easier to analyze, in the sense that it is easier
> to find flaws in them. But that's because they're so flawed!
Re: building a true RNG (was: Quantum Computing ...)
----- Original Message -----
From: [EMAIL PROTECTED]
Sent: Tuesday, July 23, 2002 1:59 PM
Subject: Re: building a true RNG (was: Quantum Computing ...)

> You cannot measure entropy retrospectively. You need to have a theory
> as to where the entropy is coming from, in order to reliably measure
> it. Thus hardware sources should be based on simple and well
> understood physical principles, such as Johnson noise or shot noise.
> Entropy is not quite a physical quantity -- rather it is on the
> slippery edge between being a physical thing and a philosophical
> thing. If you are not careful, you will slip into a deep epistemic bog
> and find yourself needing to ask "how do we know what is knowable, and
> what is the whichness of why?" To avoid such deep waters, know where
> your entropy is coming from.

Actually, the aura of mystery that surrounds entropy can be cleared if you think of it as the amount of information describing the state of a system that you do NOT know, because the output of the system only allows you to inspect part of its state (or nothing at all). For a perfect PRNG, which doesn't leak any incremental information about the internal state, the entropy equals the number of independent bits of its state (which is why I consider the depletion of the entropy pool a non-issue: if no information about the state is disclosed through the output stream, the entropy of the generator CANNOT be decreased). Macroscopic thermodynamic systems contain much larger amounts of entropy, in the region of 10^23 bits, as Avogadro's number comes into play. Whereas, in theory, even a perfect PRNG can be reverse engineered (e.g., hooking a debugger to its software and/or hardware), for a true RNG this is not possible, either because the number of states is just too large (thermal noise, see above) or because of quantum reasons (no hidden variables at all to dig out: for example, to the best of our knowledge there is simply no way of knowing exactly when a radioactive nucleus will decay). Otherwise, their black box functionality is essentially the same - by definition.

Estimating an _upper boundary_ to entropy by simply observing the output of a black box is possible, but under some conditions: you have to assume that the system is ergodic, i.e. that the statistics can be inferred from time averages (or, equivalently, that the system never gets locked into sequences of some subset of states). And even then, what you have is just an estimate: you could have a sequence of 1000 consecutive zeroes just by chance (if you are VERY, VERY unlucky, that is).

Enzo
Re: building a true RNG
David Honig helped focus the discussion by advocating the block diagram:

    Source --> Digitizer --> Simple hash --> Whitener (e.g., DES)

Let me slightly generalize this to:

    Source --> Digitizer --> hash --> Whitener (e.g., DES)

i.e. we defer the question of whether the hash is simple or not. I continue to claim that
a) if the hash function happens to have a property I call "no wasted entropy" then the whitening stage is superfluous (and you may decide to classify the hash as non-simple); otherwise
b) if the hash function does not have that property, this is a defective Random Symbol Generator, and
 b1) the whitener will _at best_ conceal, not remove, the defects, and
 b2) this is not the best way to conceal defects. Very definitely not.

To illustrate my point, I will accept David's example of a simple-hash function; he wrote:
> Parity is the ultimate hash.

Well, then, suppose that the raw data coming off my digitizer consists of an endless sequence of even-parity words. The words have lots of variability, lots of entropy, but the parity is always even. Then the output of the simple-hash is an endless sequence of zeros. I encrypt this with DES. Maybe triple-DES. It's not going to help. The generator is defective and doesn't even have satisfactory error-concealment.

I like my design a lot better:

    Source --> Digitizer --> good hash

where I have chosen SHA-1 as my hash function. Finally, since SHA-1 is remarkably computationally efficient, I don't understand the motivation to look for simpler hash functions, especially if they are believed to require whitening or other post-processing.

Thanks again for the questions. This is a good discussion.
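The even-parity failure mode is easy to exhibit (a toy sketch with 16-bit words, where the last bit is a parity bit):

    import random

    def parity(word):
        return bin(word).count("1") & 1

    def even_parity_word():
        w = random.getrandbits(15)       # 15 bits of real variability
        return (w << 1) | parity(w)      # parity bit makes the word even overall

    # Lots of entropy per word, yet the parity "hash" is constant:
    print([parity(even_parity_word()) for _ in range(10)])   # all zeros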
Re: building a true RNG (was: Quantum Computing ...)
At 08:39 AM 7/23/02 +0200, Eugen Leitl wrote:
> On Mon, 22 Jul 2002, David Honig wrote:
>> Yes, it is a joke. However, it is also a viable if low-bandwidth
>> entropy source. I disagree that you need to be able to model
>
> I've got a framegrabber with a 640x480 24 bit/pixel camera. It doesn't
> compress, is rather noisy, and since self-adjusting I get the maximum
> entropy at maximum darkness. Is there any point in compressing the
> video before running it through a cryptohash? How does e.g. SHA-1 fare
> with very sparse bitvectors?

Good question. I'd say the answer depends on the computational requirements of compression vs. SHA, assuming you're trying to save CPU cycles. You can measure the entropy of your camera staring at a wall and use that to estimate how much you need to digest. Then add a generous engineering margin, of course.

Here's the type of experiment I've done, which separates the bit-digestion from the crypto-hashing: Once you measure entropy in your raw data, you can put the data sample through various distillers. For instance, xor pairs of bits, reducing the sample size by 2. Measure the entropy of this. Keep doing this until your entropy-measure says you've maxed out. Then, to truly mess with Mr. Adversary, put it through a crypto function (which here needn't be a digest, e.g., a block cipher). The latter step achieves the 'mixing' (avalanche) which is built into SHA. For a software implementation SHA seems like a good choice, though you don't get to measure the entropy distillation before whitening.
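A crude sketch of that experiment, using a byte-level Shannon estimate in place of a real entropy test suite and a deliberately biased simulated source (each bit set with probability 1/4); the per-byte estimate climbs toward 8 bits as the xor-folding proceeds:

    import math, os
    from collections import Counter

    def byte_entropy(data):
        # First-order Shannon estimate, in bits per byte.
        n = len(data)
        return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

    def xor_fold(data):
        # XOR adjacent bit pairs, halving the number of bits.
        bits = [(b >> i) & 1 for b in data for i in range(8)]
        folded = [bits[i] ^ bits[i + 1] for i in range(0, len(bits) - 1, 2)]
        out = bytearray()
        for k in range(0, len(folded) - 7, 8):
            out.append(sum(bit << j for j, bit in enumerate(folded[k:k + 8])))
        return bytes(out)

    # Biased source: AND of two random bytes -> each bit is 1 w.p. 1/4.
    sample = bytes(a & b for a, b in zip(os.urandom(1 << 16), os.urandom(1 << 16)))
    while len(sample) >= 2048:
        print(len(sample), round(byte_entropy(sample), 2))
        sample = xor_fold(sample)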
Re: building a true RNG (was: Quantum Computing ...)
At 10:59 PM 7/22/02 -0700, [EMAIL PROTECTED] wrote:
> Entropy is not quite a physical quantity -- rather it is on the
> slippery edge between being a physical thing and a philosophical
> thing. If you are not careful, you will slip into a deep epistemic bog
> and find yourself needing to ask "how do we know what is knowable, and
> what is the whichness of why?" To avoid such deep waters, know where
> your entropy is coming from.

We agree on your substantive points re RNGs, I think, but you're interestingly wrong here. Entropy is a physical quantity; it even figures into chemistry. The physics-of-computation people (Bennett? Landauer? etc.) have written about thermodynamics and information. Modulo Chaitin-type mindgames about measuring it :-) Anyway we're cryptographers, not philosophers, so we should be safe..

Four-wheeling through the epistemological bog with Shannon as copilot
Re: building a true RNG (was: Quantum Computing ...)
Eugen Leitl wrote:
> Is there any point in compressing the video before running it through
> a cryptohash?

No. (Assuming you're talking about lossless compression.) In general, any invertible transformation neither adds nor subtracts entropy, and hence is extremely unlikely to make any difference to the performance of the hash function (assuming SHA-1 is cryptographically secure, which it is currently believed to be). Lossless compression is just one special kind of invertible transformation.
Re: building a true RNG (was: Quantum Computing ...)
----- Original Message -----
From: Eugen Leitl [EMAIL PROTECTED]
Subject: Re: building a true RNG (was: Quantum Computing ...)

> I've got a framegrabber with a 640x480 24 bit/pixel camera. It doesn't
> compress, is rather noisy, and since self-adjusting I get the maximum
> entropy at maximum darkness. Is there any point in compressing the
> video before running it through a cryptohash?

It will not serve a cryptographic use; however, if you can find a fast enough truly lossless compressor it can be very useful. Since I assume you'll be taking a picture purely in the dark, a usable compressor would be something like this (my abused pseudo-code, rendered here as runnable Python):

    import hashlib

    pool = hashlib.sha1()

    def on_pixel(pixel, x, y):
        # Accumulate only non-black pixels, tagged with their coordinates.
        if pixel != (0, 0, 0):
            pool.update(bytes(pixel) + x.to_bytes(2, "big") + y.to_bytes(2, "big"))

    def get_random():
        pool.update(b"\x01")
        # Finalize a copy so the pool itself keeps accumulating.
        return pool.copy().digest()

> How does e.g. SHA-1 fare with very sparse bitvectors?

It is believed to fare quite well, and considering that the line for proper entropy distillation is actually well below the line for cryptographic security, SHA-1 is likely to remain very good for this purpose. If you are more concerned about speed than maximum entropy containment (or require less than 128 bits of entropy) you might also consider MD5. If you are extremely concerned about this (and are willing to lose a few other desirable behaviors) it is possible to use a block cipher, basically in CBC mode, to accumulate entropy; this has the advantage that under some reduced assumptions it is possible to compute the maximal entropy of the state at a given time.

Joe
Re: building a true RNG
John S. Denker [EMAIL PROTECTED] writes:
>> Is there any point in compressing the video before running it through
>> a cryptohash?
>
> There might be a minor point, namely computational efficiency.

I can't believe any compression software could be as fast as just feeding the signal straight into SHA-1.
-- Paul Crowley [EMAIL PROTECTED] http://www.ciphergoth.org/ BiCon 2002 UK bisexual gathering: http://www.2002.bicon.org.uk/
Re: building a true RNG
On 24 Jul 2002, Paul Crowley wrote: I can't believe any compression software could be as fast as just feeding the signal straight into SHA-1.

I haven't tried this, but assuming I'm digitizing dark video and only get noise in the least significant bits, I can just mask out the constant (zero) ones and paste the rest together to distill the entropy at a very low computational cost before feeding it into a cryptohash.

As an aside on what constitutes the physical entropy of a system: it does indeed depend on the context of the measurement. A good source of information on entropy in all contexts is http://www.math.psu.edu/gunesch/entropy.html

- The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
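A minimal sketch of the masking idea: calibrate a mask of bit positions that ever vary across dark frames, then paste only those bits together before hashing. The calibration method and helper names here are mine, not Eugen's.

    import hashlib

    def varying_mask(samples):
        # calibrate: 1-bits mark positions that ever change across dark frames
        all_and = all_or = samples[0]
        for s in samples[1:]:
            all_and &= s
            all_or |= s
        return all_or & ~all_and

    def pack_varying_bits(sample, mask):
        # paste together only the bits under the mask
        out, n = 0, 0
        while mask:
            low = mask & -mask               # lowest set bit of the mask
            out |= (1 if sample & low else 0) << n
            n += 1
            mask ^= low
        return out, n

    # e.g. three 8-bit dark pixels where only the low two bits ever vary:
    samples = [0b10010010, 0b10010001, 0b10010011]
    mask = varying_mask(samples)             # -> 0b00000011
    packed = [pack_varying_bits(s, mask)[0] for s in samples]
    digest = hashlib.sha1(bytes(packed)).digest()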
Re: building a true RNG (was: Quantum Computing ...)
-- On 22 Jul 2002 at 15:39, David Honig wrote: You should be able to use any source which you know is not a PRNG as the entropy-source in a true RNG. You should be able to use entropy (and stat tests) to measure the source entropy after digitization.

You cannot measure entropy retrospectively. You need to have a theory as to where the entropy is coming from in order to reliably measure it. Thus hardware sources should be based on simple and well-understood physical principles, such as Johnson noise or shot noise.

Entropy is not quite a physical quantity -- rather it is on the slippery edge between being a physical thing and a philosophical thing. If you are not careful, you will slip into a deep epistemic bog and find yourself needing to ask how do we know what is knowable, and what is the whichness of why? To avoid such deep waters, know where your entropy is coming from.

--digsig James A. Donald 6YeGpsZR+nOTh/cGwvITnSR3TdzclVpR0+pr3YYQdkG SMGOwg3qIP0/FsfmA7GzZGN/XYAabuqcE9Z9eiuB 2CBUwRUngy0VcmaR93NvqduyZBKgppbTUy49tSdEn

- The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
Re: building a true RNG (was: Quantum Computing ...)
On Mon, 22 Jul 2002, David Honig wrote: Yes, it is a joke. However, it is also a viable if low-bandwidth entropy source. I disagree that you need to be able to model ...

I've got a framegrabber with a 640x480 24 bit/pixel camera. It doesn't compress, is rather noisy, and since it is self-adjusting I get the maximum entropy at maximum darkness. Is there any point in compressing the video before running it through a cryptohash? How does e.g. SHA-1 fare with very sparse bitvectors?

- The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
Re: building a true RNG (was: Quantum Computing ...)
John S. Denker [EMAIL PROTECTED] writes: Source --> Digitizer --> Simple hash --> Whitener (e.g., DES) OK, we have DES as an example of a whitener. -- Can somebody give me an example of a simple hash that performs irreversible compression of the required kind?

I can give you a number of examples: MD5, SHA-1, ...

-- Isn't the anti-collision property required of even the simplest hash? Isn't that tantamount to a very strong mixing property? If there's strong mixing in the simple hash function, why do we need more mixing in the later whitening step?

More mixing is never bad in an RNG. See RFC 1750.

-- What is meant by cryptologic strength? Strength against what kind of attack? If this means in particular the one-way property, why do I need it? I can understand why a !!pseudo!! random symbol generator needs the one-way property, to protect its internal state, but since my generator has no secret state to protect, why do I need any cryptologic properties other than mixing?

I think they probably meant cryptographic strength, but I don't know what was going through their minds. What do people mean by authentification? That's not even a real word, but I see it all the time. To me, I think people just don't know the right term to use so they just put down something that sounds right to them, regardless of its correctness.

-derek

-- Derek Atkins Computer and Internet Security Consultant [EMAIL PROTECTED] www.ihtfp.com

- The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
Re: building a true RNG
Eugen Leitl wrote: ... framegrabber with a 640x480 24 bit/pixel camera. It doesn't compress, is rather noisy, and since it is self-adjusting I get the maximum entropy at maximum darkness.

OK. Evidently it's dominated by thermal noise, not to be confused with the Poisson noise recently featured in another thread. Not a problem.

Is there any point in compressing the video before running it through a cryptohash?

There might be a minor point, namely computational efficiency. A well-chosen compressor might eliminate low-entropy bytes rather quickly. Make sure it's a lossless compressor, perhaps GIF or PNG ... as opposed to a perceptual coder (e.g. JPEG) that would presumably throw away some of the entropy. Calling SHA-1 on low-entropy bytes doesn't waste entropy, but it wastes CPU cycles.

How does e.g. SHA-1 fare with very sparse bitvectors?

1) In any good hash function, any input bit should have about as much effect on the output as any other input bit. SHA-1 has been analyzed by experts (of which I am not one :-) and I would imagine they checked this.

2) There are 5 one-bit shifts in the fivefold expansion, and lots of 5-bit shifts in the main loop, so it shouldn't matter that the sparse input bits are clustered in the bottom of the 32-bit words.

3) I performed an amateur kick-the-tires test, namely cobbling up some sparse input vectors, calling SHA-1, and applying standard statistical tests including Diehard and Maurer's universal statistical test. To nobody's surprise, the tests didn't detect anything.

Arnold Reinhold wrote: ... with a portable TV set and a video digitizer should be a good source of high bandwidth noise. In both cases you are just using the receivers as high gain amplifiers of the thermal noise at the antenna terminals.

Thermal noise is good. Antennas are bad -- just an invitation to be attacked that way. Get rid of the antenna. Keep the high-gain preamp. Better yet, do as Eugen has done: use a framegrabber !!without!! the portable TV set. No RF section at all. Plenty of entropy, lower cost, greater simplicity, and less vulnerability to attack. For that matter, an audio card (without microphone) produces more than enough entropy for most applications.

- The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
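A toy version of the kick-the-tires test described in point 3: hash sparse 512-bit vectors with SHA-1 and apply a crude monobit count. Diehard and Maurer's test are far more thorough; this only illustrates the setup.

    import hashlib, random

    random.seed(1)                       # repeatable amateur test
    ones = total = 0
    for _ in range(10000):
        v = bytearray(64)                # 512-bit input, almost all zero
        for _ in range(3):               # set just three random bits
            i = random.randrange(512)
            v[i // 8] |= 1 << (i % 8)
        digest = hashlib.sha1(bytes(v)).digest()
        ones += sum(bin(b).count("1") for b in digest)
        total += 160
    print(ones / total)                  # should sit very close to 0.5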
Re: building a true RNG (was: Quantum Computing ...)
On Mon, 22 Jul 2002, John S. Denker wrote: David Honig wrote yet another nice note: I'm not trying to be dense, but I'm totally not understanding the distinction here. The following block diagram is excellent for focussing the discussion (thanks): Source --> Digitizer --> Simple hash --> Whitener (e.g., DES) OK, we have DES as an example of a whitener. -- Can somebody give me an example of a simple hash that performs irreversible compression of the required kind?

Depends on the data and how much entropy you suppose it has, really. An irreversible compression function that I use when extracting entropy from text (for other purposes) is to keep a counter: each time you process a character, you add the character code to the counter, then multiply the counter by 2.4, rounding down. This is based on estimates of 1.33 bits of entropy per character in English text, and it requires an initialization vector (in this case an initialization value) twice as long as the character code, to prevent you from taking too many bits from the first few characters alone. (A runnable sketch of this counter scheme appears after this message.)

For something like a lava-lamp picture, your compression function might be first converting it into a 4-color image, editing out the constant parts (e.g., the lamp base and edges), compressing that using PNG format, and then taking a similar counter-based function of those bits. Using a time series of pictures of the same lava lamp, you'd have to adjust for lower entropy per byte of processed PNG (by using a lower factor), because it could be redundant with other frames.

-- Isn't the anti-collision property required of even the simplest hash? Isn't that tantamount to a very strong mixing property? If there's strong mixing in the simple hash function, why do we need more mixing in the later whitening step?

You are talking, specifically, about cryptographic hash functions. The diagram specifies a simple hash function. The distinction between cryptographic hashes and simple hashes is: a simple hash is supposed to produce evenly distributed output; a cryptographic hash is supposed to produce evenly distributed *and unpredictable* output. A simple hash plus a whitener is about what you're thinking of as a cryptographic hash function.

I assume digestion means the same as distillation?

Roughly. People talk of digestion of a datastream, or distillation of entropy, or irreversible compression, etc. It's roughly the same thing.

Gimme a break. In particular, gimme an example of a crypto algorithm that will fail if it is fed with a random-symbol generator that has only 159.98 bits in a 160 bit word.

That's one bit per 8k. I guess it just depends on which 8k comes through and how much your opponent can make of one bit.

I see no point in whitening the output of such a distiller.

So the adversary can't look back into your logic. A 'distiller' which produces quality entropy (after digesting an arbitrary number of bits) needn't be as opaque as a crypto-secure hash is.

I'm still needing an example of a distiller that has the weakness being alleged here. In particular, -- either it wastes entropy (due to excessive hash collisions), in which case it isn't a good distiller, and whitening it won't improve things (won't recover the lost entropy), or -- it doesn't waste entropy, in which case the output has entropy density of 159.98/160, in which case there is nothing to be gained by so-called whitening or any other post-processing.
I think you may be right about that -- whitening protects you from errors in an overly-simple distiller such as I described above, but if you've got a really fine-tuned one, it doesn't help much.

In particular, (proof by contradiction) consider the following scenario: suppose she captures 100 bits of output, and wants to use it to make some predictions about the next 60 bits of output. She uses the 100 bits to see back into the hypothetical simple-hash function, learn something about the input thereof, and then pushes that forward again through the simple-hash function to make the predictions. But this scenario violates the most basic requirements of the hash function, even the simplest of simple-hash functions.

Again, it violates the requirements of a cryptographic hash function, not a simple hash function.

Bear

- The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
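Bear's counter-based text distiller, rendered as a minimal Python sketch. The factor 2.4 (about 2^1.26) is as given in the post; the 16-bit counter width and the modular truncation are assumptions, since the post doesn't say how the counter is bounded.

    WIDTH = 16                       # two 8-bit character codes wide, per the post
    MASK = (1 << WIDTH) - 1

    def distill(text, iv=0xBEEF):    # iv value is illustrative, not from the post
        counter = iv
        for ch in text:
            # add the character code, multiply by 2.4, round down
            counter = int((counter + ord(ch)) * 2.4) & MASK
        return counter

    print(distill("the quick brown fox"))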
Re: building a true RNG (was: Quantum Computing ...)
On Tue, 23 Jul 2002, John S. Denker wrote: -- I am told (but don't understand) that there might exist a weaker hash that somehow does require whitening. This is the point of the conversation. Please address this point if you can.

Perhaps they were referring to something like what is done in the /dev/random driver, where inputs are mixed in using a simple polynomial scheme whose exact details (or name) escape me at the moment. This is basically because it's called during interrupts, and you might not want to be calling out to something expensive like SHA-1 right then. Then when someone reads from the device, the output is derived from the internal pool using SHA-1.

Regards, Jack

- The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
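For illustration only -- the actual /dev/random input mix of that era was roughly a twisted LFSR over a large pool, and its details differ -- here is a toy sketch of the pattern Jack describes: cheap GF(2)-style mixing at interrupt time, with SHA-1 applied only when output is read.

    import hashlib

    POOL_WORDS = 32
    pool = [0] * POOL_WORDS
    pos = 0

    def mix_in(word):
        # cheap mixing of a 32-bit word: rotate, xor with a feedback tap
        global pos
        word = ((word << 13) | (word >> 19)) & 0xFFFFFFFF
        pool[pos] ^= word ^ pool[(pos + 7) % POOL_WORDS]
        pos = (pos + 1) % POOL_WORDS

    def read_random():
        # the expensive hash runs only when output is requested
        data = b"".join(w.to_bytes(4, "big") for w in pool)
        return hashlib.sha1(data).digest()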
Re: building a true RNG (was: Quantum Computing ...)
David Honig wrote: The thread here has split into QM True Randomness and what do you need to build a true RNG...

Yup. Specifically, the executive summary of the principles of operation of my generator is: -- use SHA-1, which is believed to be resistant to collisions, even under chosen-input attack. -- use it under conditions where the adversary cannot choose the input. -- the rest is just physics and statistics.

Sure. There are many examples of this kind of generator, using physical sources from video'd lava lamps to radioactive decay (incl. semiconductor junctions, resistors, microphone, detuned FM radio cards).

For the humor-impaired, let me point out that the lava lamp is a joke. What it conspicuously lacks is a proof of correctness -- that is, a nonzero lower bound on the entropy rate of the raw data. The lava could turn out to have a not-very-complicated periodic pattern. Secondarily, the pattern changes so slowly that there must be rather strict upper bounds on the entropy rate, small out of all proportion to the cost of the contraption.

A detuned FM card is a bad idea, because it is just begging the opponent to sit next door with an FM transmitter. A microphone causes users to worry about privacy, and in any case doesn't add much beyond what you'd get with the same input circuitry open-circuited, i.e. everything except the microphone itself. Radioactive decay has a poor price/performance ratio, and isn't nearly as random as neophytes might think, when the data-acquisition hardware is taken into account.

2) Vetting a generator by trying to detect patterns in the output is like kicking the tires on a used car ... go ahead and do it if you want, but it is far from sufficient for establishing any reasonable standard of correctness.

You can't vet a RNG by looking at its output,

We agree.

which is likely whitened anyway,

Depending on what whitening means; see below.

but you can gain confidence by looking at its design

Yes!

and measuring the entropy in the raw-physical-source derived bitstream.

That's the point where I would like some more detail. If measuring means applying statistical tests, then I've never seen such measurements done in a way that is really convincing. Constructive examples would be welcome. Just saying Joe Schmoe applied all the tests he could think of and couldn't compress it more than XY% isn't going to convince me. I recommend _calculating_ the entropy from physics principles, rather than trying to measure the entropy using statistical tests. The calculation is based on a handful of macroscopic physical parameters, such as temperature, gain, and bandwidth.

If the raw source has < 1 bit/symbol (and it will),

Commonly but not necessarily < 1 bit/symbol. Depending on what you mean by symbol, a 24-bit audio card provides a low-cost counterexample.

it'd be nice if a later stage distilled this to near 1 bit/symbol, before whitening.

We need to be more specific about what the symbol alphabet is. If the symbols are ASCII characters, 1 bit per symbol is not nearly good enough. More importantly, I don't know what whitening means in this case. The output of a good distiller has virtually 100% entropy density, i.e. 8 bits per byte. I say virtually because perfection is impossible, but 159.98 bits in a 160 bit word ought to be good enough for most applications :-). I see no point in whitening the output of such a distiller. If whitening means encrypting the output of the distiller, I consider that just a more-complicated hash function ... just another few rounds.
Of course, no one outside the box will know, since you're whitening, but it yields resistance to (albeit difficult) attacks (e.g., your hash turns out to be attackable).

I assume that means know [that I'm using a distiller]. Well, in principle nobody outside the box knows _anything_ about the mechanism (unless they read the documentation). One random symbol stream looks a lot like another :-). Attackers can always check to see whether the generator is broken or not. But if it's not broken, all they can do (from outside) is measure the output-rate.

I also fail to see harm in measuring/monitoring entropy as the RNG operates.

Certainly there's no harm. It's like kicking the tires on the used car. It gives some people a warm fuzzy, but it's far from sufficient for establishing any reasonable level of confidence. I recommend monitoring the aforementioned macroscopic (non-statistical) physical parameters, both to detect gross hardware failure and to detect attempted jamming. But that's very different from the traditional (and hereby deprecated) procedure of measuring the entropy using statistical tests.

For lots more detail, see http://www.monmouth.com/~jsd/turbid/

- The Cryptography Mailing List Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]
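To make the calculate-don't-measure recommendation concrete, here is a back-of-envelope Johnson-noise calculation. The component values (50k source resistance, 20 kHz bandwidth, gain of 1000, 16-bit ADC) are illustrative assumptions, not figures from the turbid paper, which does this far more carefully.

    from math import log2, sqrt

    k = 1.380649e-23      # Boltzmann constant, J/K
    T = 300.0             # room temperature, K
    R = 50e3              # source resistance, ohms   (illustrative)
    B = 20e3              # analog bandwidth, Hz      (illustrative)

    v_rms = sqrt(4 * k * T * R * B)     # Johnson noise, ~4.1 uV here
    gain = 1000.0                       # preamp gain
    q = 2.0 / 2**16                     # 16-bit ADC step over a +/-1 V range

    # crude per-sample entropy estimate: how many ADC steps the
    # amplified noise spans
    bits = log2(gain * v_rms / q)
    print(v_rms, bits)                  # ~7 bits/sample with these numbers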
Re: building a true RNG (was: Quantum Computing ...)
At 04:24 PM 7/22/02 -0400, John S. Denker wrote: For the humor-impaired, let me point out that the lava lamp is a joke. What it conspicuously lacks is a proof of correctness -- that is, a nonzero lower bound on the entropy rate of the raw data.

Yes, it is a joke. However, it is also a viable if low-bandwidth entropy source. I disagree that you need to be able to model the lava (etc.) well enough to give a lower bound on the entropy. (Though it's nice if you have such a model.) You should be able to use any source which you know is not a PRNG as the entropy-source in a true RNG. You should be able to use entropy (and stat tests) to measure the source entropy after digitization.

The lava could turn out to have a not-very-complicated periodic pattern. Secondarily, the pattern changes so slowly that there must be rather strict upper bounds on the entropy rate, small out of all proportion to the cost of the contraption.

Yes, thus the humor.

A detuned FM card is a bad idea, because it is just begging the opponent to sit next door with an FM transmitter.

So work in a Faraday cage...

A microphone causes users to worry about privacy, and in any case doesn't add much beyond what you'd get with the same input circuitry open-circuited, i.e. everything except the microphone itself.

The advantage of a microphone, I suppose, is that you can hear any jamming attempts.

Radioactive decay has a poor price/performance ratio, and isn't nearly as random as neophytes might think, when the data-acquisition hardware is taken into account.

It's a joke/hack of the 'finesse' form.

which is likely whitened anyway,

Depending on what whitening means; see below.

You can imagine simple-hashing (irreversible compression) as distinct from whitening, which is related to cryptographic strength, avalanche, mixing, etc.: Source --> Digitizer --> Simple hash --> Whitener (e.g., DES). In this view, SHA combines the compression (aka digestion) function with the crypto-strength whitening.

That's the point where I would like some more detail. If measuring means applying statistical tests, then I've never seen such measurements done in a way that is really convincing. Constructive examples would be welcome.

You collect some representative raw data and run a number of entropy measurements on that sample. You find < 1 bit/baud. You run the data through an algorithm which produces fewer bits. You measure the entropy of the result. When successive (or 'stronger') runs measure at 1 b/bd, you have distilled entropy from that sample. To use this in crypto, you'd put it through a whitener --feed blocks to DES-- for belts-and-suspenders assurance. And because you don't want someone looking through your simple-hashing logic back to your source. Once you put it through DES, anything (e.g. the integers) appears random. That's why you measure before whitening, if possible. If you use SHA you can only measure the input entropy and make sure you hash enough bits down to produce your output stream, since utter crap coming in will look perfectly white coming out.

I recommend _calculating_ the entropy from physics principles, rather than trying to measure the entropy using statistical tests. The calculation is based on a handful of macroscopic physical parameters, such as temperature, gain, and bandwidth.

You measure because your model may have overlooked something.

The output of a good distiller has virtually 100% entropy density, i.e. 8 bits per byte.
I say virtually because perfection is impossible, but 159.98 bits in a 160 bit word ought to be good enough for most applications :-).

I agree with the first statement (by definition); I think in crypto you have to be dubious (paranoid?) of the second.

I see no point in whitening the output of such a distiller.

So the adversary can't look back into your logic. A 'distiller' which produces quality entropy (after digesting an arbitrary number of bits) needn't be as opaque as a crypto-secure hash is.

If whitening means encrypting the output of the distiller, I consider that just a more-complicated hash function ... just another few rounds.

That's a fine implementation. Others may wish to separate the phases (e.g., hardware to do simple-hashing can be high-speed on the entropy-source side, and feed slower crypto-hashing (whitening) logic on the other side). But using software to distill your entropy, it's easy enough to run a few more iters. And there are very few users of high-bandwidth entropy.

Of course, no one outside the box will know, since you're whitening, but it yields resistance to (albeit difficult) attacks (e.g., your hash turns out to be attackable).

I assume that means know [that I'm using a distiller]

No one outside the box can tell the difference between simply hashed 'noise' and crypto-hashed 'noise'. Or between those and a long-sequence PRNG. Now, if the adversary knows things about your simple-distiller, he may be able to use that.

If a
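The measure / distill / re-measure loop Honig describes needs an entropy estimator; a crude order-0 Shannon estimate (which, as Denker objects below, only upper-bounds the true rate) might look like this:

    import os
    from collections import Counter
    from math import log2

    def entropy_per_byte(data):
        # order-0 Shannon estimate: an upper bound on the true entropy rate
        n = len(data)
        return -sum(c / n * log2(c / n) for c in Counter(data).values())

    raw = bytes(b & 0x03 for b in os.urandom(4096))   # noisy low bits only
    print(entropy_per_byte(raw))                      # ~2 bits/byte
    print(entropy_per_byte(os.urandom(4096)))         # ~8 bits/byte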
Re: building a true RNG (was: Quantum Computing ...)
David Honig wrote yet another nice note: So work in a Faraday cage...

Tee, hee. Have you ever worked in a Faraday cage? Very expensive. Very inconvenient.

Depending on what whitening means; see below. You can imagine simple-hashing (irreversible compression) as distinct from whitening which is related to cryptographic strength, avalanche, mixing, etc.

I'm not trying to be dense, but I'm totally not understanding the distinction here. The following block diagram is excellent for focussing the discussion (thanks): Source --> Digitizer --> Simple hash --> Whitener (e.g., DES)

OK, we have DES as an example of a whitener.

-- Can somebody give me an example of a simple hash that performs irreversible compression of the required kind?

-- Isn't the anti-collision property required of even the simplest hash? Isn't that tantamount to a very strong mixing property? If there's strong mixing in the simple hash function, why do we need more mixing in the later whitening step?

-- What is meant by cryptologic strength? Strength against what kind of attack? If this means in particular the one-way property, why do I need it? I can understand why a !!pseudo!! random symbol generator needs the one-way property, to protect its internal state, but since my generator has no secret state to protect, why do I need any cryptologic properties other than mixing?

-- BTW note that the term avalanche is probably not helpful to this discussion, because it is usually defined (see e.g. Handbook of Applied Cryptography) in terms of single-bit flips. The anti-collision property of the hash function demands resistance to multi-bit flips.

In this view, SHA combines the compression (aka digestion) function with the crypto-strength whitening

If you want to think of it that way, then we come to the same final state. I assume digestion means the same as distillation?

You collect some representative raw data, and run a number of entropy measurements on that sample. You find < 1 bit/baud.

In particular you have found an upper bound. To paraphrase Dijkstra: testing can find an upper bound on the entropy density, but it can never find a lower bound.

You run the data through an algorithm which produces fewer bits. You measure the entropy of the result. When successive (or 'stronger') runs measure at 1 b/bd, you have distilled entropy from that sample.

No, you have just reached the limits of your chosen set of entropy measurements. You cannot convince a skeptic that a new test, discovered tomorrow, will not greatly lower your new (1 b/bd) upper bound.

To use this in crypto, you'd put it through a whitener --feed blocks to DES-- for belts-and-suspenders assurance. And because you don't want someone looking through your simple-hashing logic back to your source.

I say again that I don't need the one-way property. At each iteration, the input to the hash function is unrelated to all past and future inputs. We agree that the belt-and-suspenders approach is a standard way of achieving high reliability. It works in crypto and in many unrelated fields: http://www.monmouth.com/~jsd/how/htm/misc.html#sec-layers-safety If you want to XOR the output of the HRNG with some nice PRNG, that will give excellent error-concealment in the case of a gross failure of one or the other. But (minor point) I recommend we continue to call a belt a belt, and call a suspender a suspender. Call a HRNG a HRNG, and call a PRNG a PRNG. Adding a strong one-way function to my HRNG seems like a total waste of CPU cycles.
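The XOR belt-and-suspenders combiner mentioned above, sketched under the assumption that SHA-1 over a seed and counter stands in for the "nice PRNG" and os.urandom for the HRNG; a silent failure of either leg leaves the output no worse than the survivor.

    import hashlib, itertools, os

    def prng_bytes(seed):
        # deterministic stream: SHA-1 over seed || counter
        for i in itertools.count():
            yield from hashlib.sha1(seed + i.to_bytes(8, "big")).digest()

    def combined(hw_bytes, seed=b"independent PRNG seed"):
        # XOR combiner: output is no weaker than the stronger of the two legs
        return bytes(h ^ p for h, p in zip(hw_bytes, prng_bytes(seed)))

    out = combined(os.urandom(32))    # os.urandom stands in for the HRNG here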
Once you put it through DES, anything (e.g. the integers) appears random. That's why you measure before whitening, if possible.

We agree that measuring after whitening is pointless, given the current state of the art, namely that encryption algorithms are incomparably stronger than automatic measurement (pattern-finding) algorithms.

I recommend _calculating_ the entropy from physics principles, rather than trying to measure the entropy using statistical tests. The calculation is based on a handful of macroscopic physical parameters, such as temperature, gain, and bandwidth.

You measure because your model may have overlooked something.

Measure what? Measure how? If I run Diehard on my raw data it will tell me that the data has entropy density far less than 8 bits per byte -- but duh, I already knew that. Running standard compression algorithms (Lempel-Ziv or whatever) will give me an upper bound that is much, much higher than my calculated lower bound -- so even if I've overlooked something or made an error in the calculation, the measurement is not sensitive enough to catch it.

The output of a good distiller has virtually 100% entropy density, i.e. 8 bits per byte. I say virtually because perfection is impossible, but 159.98 bits in a 160 bit word ought to be good enough for most applications :-).

I agree with the first statement (by definition); I think in crypto you have to be dubious (paranoid?) of the second.
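Denker's point that output tests yield only upper bounds is easy to demonstrate: a fully deterministic stream, worth zero bits to anyone who knows the seed, is just as incompressible as real randomness. A quick sketch:

    import hashlib, os, zlib

    def counter_stream(seed, nblocks):
        # fully deterministic: zero entropy to anyone who knows the seed
        return b"".join(hashlib.sha1(seed + i.to_bytes(8, "big")).digest()
                        for i in range(nblocks))

    fake = counter_stream(b"fixed seed", 5000)          # ~100 KB
    real = os.urandom(len(fake))
    print(len(zlib.compress(fake, 9)) / len(fake))      # ~1.0: incompressible
    print(len(zlib.compress(real, 9)) / len(real))      # ~1.0: indistinguishable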