Re: combining entropy
On Tue, 28 Oct 2008, John Denker wrote: | Date: Tue, 28 Oct 2008 12:09:04 -0700 | From: John Denker <[EMAIL PROTECTED]> | To: "Leichter, Jerry" <[EMAIL PROTECTED]>, | Cryptography | Cc: IanG <[EMAIL PROTECTED]> | Subject: Re: combining entropy | | On 10/28/2008 09:43 AM, Leichter, Jerry wrote: | | > | We start with a group comprising N members (machines or | > | persons). Each of them, on demand, puts out a 160 bit | > | word, called a "member" word. We wish to combine these | > | to form a single word, the "group" word, also 160 bits | > | in length. | > This isn't enough. Somehow, you have to state that the values emitted | > on demand in any given round i (where a round consists of exactly one | > demand on all N member and produces a single output result) cannot | > receive any input from any other members. Otherwise, if N=2 and member | > 0 produces true random values that member 1 can see before it responds | > to the demand it received, then member 1 can cause the final result to | > be anything it likes. | | | Perhaps an example will make it clear where I am coming | from. Suppose I start with a deck of cards that has been | randomly shuffled. It can provide log2(52!) bits of | entropy. That's a little more than 225 bits. Now suppose | I have ten decks of cards all arranged alike. You could | set this up by shuffling one of them and then stacking | the others to match ... or by some more symmetric process. | In any case the result is symmetric w.r.t interchange of | decks. In this situation, I can choose any one of the | decks and obtain 225 bits of entropy. The funny thing | is that if I choose N of the decks, I still get only 225 | bits of entropy, not N*225 | The original question spoke of "trusted" sources of | entropy, and I answered accordingly. To the extent | that the sources are correlated, they were never eligible | to be considered trusted sources of entropy. To say | the same thing the other way around, to the extent | that each source can be trusted to provide a certain | amount of entropy, it must be to that extent independent | of the others. Rest of example omitted. I'm not sure of the point. Yes, there are plenty of ways for correlation to sneak in. As far as I can see, only the second piece I quoted is relevant, and it essentially gets to the point: The original problem isn't well posed. It makes no sense *both* to say the sources and trusted *and* to say that they may not deliver the expected entropy. If I know the entropy of all the sources, that inherently includes some notion of trust - call it source trust: I can trust them to have at least that much entropy. I have to have that trust, because there is no way to measure the (cryptographic) entropy. (And don't say I can analyze how the source is constructed, because then I'm left with the need to trust that what I analyzed is actually still physically there - maybe an attacker has replaced it!) Given such sources it's easy to *state* what it would mean for them to be independent: Just that if I consider the source produced by concatenating all the individual sources, its entropy is the sum of the entropies of the constituents. Of course, that's an entropy I can again measure - at least in the limit - in the information theoretical sense, but not in the cryptographic sense; another aspect of trust - call it independence trust - has to enter here. All that's fine, but how then are we supposed to construe a question about what happens if some of the sources fail to deliver their rated entropy? That means that source trust must be discarded. (Worse, as the original problem is posed, I must discard source trust for *some unknown subset of the sources*.) But given that, why should I assume that independence trust remains? Sure, I can make additional assumptions. If I'm concerned only about, say, physical failures of sources implemented as well-isolated modules, it might well be a reasonable thing to do. In fact, this is essentially the independent- failure model we use all the time in building reliable physical systems. Of course, as we know well, that model is completely untenable when the concern is hostile attack, not random failure. What do you replace it with? Consider the analogy with reliable distributed systems. People have basically only dealt with two models: 1. The fail-stop model. A failed module stops interacting. 2. The Byzantine model. Failed modules can do anything including cooperating by exchanging arbitrary information and doing infinite computation. The Byzantine model is bizarre sounding, but it's just a way of expressing a worst-case situation: Maybe the failed modules act randomly but just by bad luck they do the worst possible thing. We're trying to define something different here. Twenty-odd years ago, Mike Fischer at Yale proposed some ideas in this direction (where modules have access t
Re: combining entropy
On 10/28/2008 09:43 AM, Leichter, Jerry wrote: > | We start with a group comprising N members (machines or > | persons). Each of them, on demand, puts out a 160 bit > | word, called a "member" word. We wish to combine these > | to form a single word, the "group" word, also 160 bits > | in length. > This isn't enough. Somehow, you have to state that the values emitted > on demand in any given round i (where a round consists of exactly one > demand on all N member and produces a single output result) cannot > receive any input from any other members. Otherwise, if N=2 and member > 0 produces true random values that member 1 can see before it responds > to the demand it received, then member 1 can cause the final result to > be anything it likes. Perhaps an example will make it clear where I am coming from. Suppose I start with a deck of cards that has been randomly shuffled. It can provide log2(52!) bits of entropy. That's a little more than 225 bits. Now suppose I have ten decks of cards all arranged alike. You could set this up by shuffling one of them and then stacking the others to match ... or by some more symmetric process. In any case the result is symmetric w.r.t interchange of decks. In this situation, I can choose any one of the decks and obtain 225 bits of entropy. The funny thing is that if I choose N of the decks, I still get only 225 bits of entropy, not N*225. This can be summarized by saying that entropy is not an extensive quantity in this situation. The graph of entropy versus N goes like this: 225* * * * * 0* 0 1 2 3 4 5 (# of decks) The spooky aspect of this situation is the whack-a-mole aspect: You cannot decide in advance which one of the decks has entropy and which N-1 of them do not. That's the wrong question. The first deck we choose to look at has 225 bits of entropy, and only then can we say that the other N-1 decks have zero additional entropy. The original question spoke of "trusted" sources of entropy, and I answered accordingly. To the extent that the sources are correlated, they were never eligible to be considered trusted sources of entropy. To say the same thing the other way around, to the extent that each source can be trusted to provide a certain amount of entropy, it must be to that extent independent of the others. It is possible for a source to be partially dependent and partially independent. For example, if you take each of the ten aforementioned decks and "cut the deck" randomly and independently, that means the first deck we look at will provide 225 bits of entropy, and each one thereafter will provide 5.7 bits of additional entropy, since log2(52)=5.7. So in this situation, each deck can be /trusted/ to provide 5.7 bits of entropy. In this situation, requiring each deck to have "no input" from the other decks would be an overly strict requirement. We do not need full independence; we just need some independence, as quantified by the provable lower bound on the entropy. If you wanted, you could do a deeper analysis of this example, taking into account the fact that 5.7 is not the whole story. It is easy to use 5.7 bits as a valid and trustworthy lower bound, but under some conditions more entropy is available, and can be quantified by considering the _joint_ probability distribution and computing the entropy of that distribution. Meanwhile the fact remains that under a wide range of practical conditions, it makes sense to engineer a randomness generator based on provable lower bounds, since that is good enough to get the job done, and a deeper analysis would not be worth the trouble. http://www.av8n.com/turbid/paper/turbid.htm > If the issue is how to make sure you get out at least all the randomness > that was there, I'm going to ignore the "At least". It is very hard to get out more than you put in. On a less trivial note: The original question did not require getting out every last bit of available randomness. In situations where the sources might be partially independent and partially dependent, that would be a very hard challenge, and I do not wish to accept that challenge. Dealing with provable lower bounds on the entropy is more tractable, and sufficient for a wide range of practical purposes. - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: combining entropy
On Sat, 25 Oct 2008, John Denker wrote: | On 10/25/2008 04:40 AM, IanG gave us some additional information. | | Even so, it appears there is still some uncertainty as to | interpretation, i.e. some uncertainty as to the requirements | and objectives. | | I hereby propose a new scenario. It is detailed enough to | be amenable to formal analysis. The hope is that it will | satisfy the requirements and objectives ... or at least | promote a more precise discussion thereof. | | We start with a group comprising N members (machines or | persons). Each of them, on demand, puts out a 160 bit | word, called a "member" word. We wish to combine these | to form a single word, the "group" word, also 160 bits | in length. This isn't enough. Somehow, you have to state that the values emitted on demand in any given round i (where a round consists of exactly one demand on all N member and produces a single output result) cannot receive any input from any other members. Otherwise, if N=2 and member 0 produces true random values that member 1 can see before it responds to the demand it received, then member 1 can cause the final result to be anything it likes. This is an attack that must be considered because you already want to consider the case: | b) Some of [the members] are malicious. Their outputs may appear | random, but are in fact predictable by our adversary. Stating this requirement formally seems to be quite difficult. You can easily make it very strong - the members are to be modeled as probabilistic TM's with no input. Then, certainly, no one can see anyone else's value, since they can't see *anything*. But you really want to say something along the lines of "no malicious member can see the value output by any non-malicious member", which gets you into requiring an explicit failure model - which doesn't fit comfortably with the underlying problem. If the issue is how to make sure you get out at least all the randomness that was there, where the only failures are that some of your sources become predictable, the XOR is fine. But once you allow for more complicated failure/attack modes, it's really not clear what is going on and what the model should to be. -- Jerry - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]