Re: combining entropy

2008-10-28 Thread Leichter, Jerry
On Tue, 28 Oct 2008, John Denker wrote:

| Date: Tue, 28 Oct 2008 12:09:04 -0700
| From: John Denker <[EMAIL PROTECTED]>
| To: "Leichter, Jerry" <[EMAIL PROTECTED]>,
| Cryptography 
| Cc: IanG <[EMAIL PROTECTED]>
| Subject: Re: combining entropy
| 
| On 10/28/2008 09:43 AM, Leichter, Jerry wrote:
| 
| > | We start with a group comprising N members (machines or 
| > | persons).  Each of them, on demand, puts out a 160 bit 
| > | word, called a "member" word.  We wish to combine these 
| > | to form a single word, the "group" word, also 160 bits 
| > | in length.
| > This isn't enough.  Somehow, you have to state that the values emitted
| > on demand in any given round i (where a round consists of exactly one
| > demand on all N members and produces a single output result) cannot
| > receive any input from any other members.  Otherwise, if N=2 and member
| > 0 produces true random values that member 1 can see before it responds
| > to the demand it received, then member 1 can cause the final result to
| > be anything it likes.
| 
| 
| Perhaps an example will make it clear where I am coming
| from.  Suppose I start with a deck of cards that has been 
| randomly shuffled.  It can provide log2(52!) bits of 
| entropy.  That's a little more than 225 bits.  Now suppose 
| I have ten decks of cards all arranged alike.  You could 
| set this up by shuffling one of them and then stacking 
| the others to match ... or by some more symmetric process. 
| In any case the result is symmetric w.r.t. interchange of
| decks.  In this situation, I can choose any one of the 
| decks and obtain 225 bits of entropy.  The funny thing 
| is that if I choose N of the decks, I still get only 225 
| bits of entropy, not N*225.
| [...]
| The original question spoke of "trusted" sources of
| entropy, and I answered accordingly.  To the extent
| that the sources are correlated, they were never eligible 
| to be considered trusted sources of entropy.  To say 
| the same thing the other way around, to the extent
| that each source can be trusted to provide a certain
| amount of entropy, it must be to that extent independent 
| of the others.
Rest of example omitted.  I'm not sure of the point.  Yes, there are
plenty of ways for correlation to sneak in.

As far as I can see, only the second piece I quoted is relevant, and it
essentially gets to the point:  The original problem isn't well posed.
It makes no sense *both* to say the sources are trusted *and* to say
that they may not deliver the expected entropy.  If I know the entropy of
all the sources, that inherently includes some notion of trust - call
it source trust:  I can trust them to have at least that much entropy.
I have to have that trust, because there is no way to measure the
(cryptographic) entropy.  (And don't say I can analyze how the source
is constructed, because then I'm left with the need to trust that what
I analyzed is actually still physically there - maybe an attacker has
replaced it!)

Given such sources it's easy to *state* what it would mean for them to
be independent:  Just that if I consider the source produced by
concatenating all the individual sources, its entropy is the sum of the
entropies of the constituents.  Of course, that's an entropy I can,
again, measure - at least in the limit - in the information-theoretic
sense, but not in the cryptographic sense; another aspect of trust -
call it independence trust - has to enter here.
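
As a toy illustration of that additivity test (a sketch in Python; the
two one-bit distributions below are hypothetical, chosen only to make
the arithmetic visible):

    from itertools import product
    from math import log2

    def H(dist):
        # Shannon entropy, in bits, of a {outcome: probability} map.
        return -sum(p * log2(p) for p in dist.values() if p > 0)

    a = {'0': 0.5, '1': 0.5}   # one fair bit
    b = {'0': 0.5, '1': 0.5}   # another fair bit

    # Independent: joint probabilities are products of marginals, and
    # the entropy of the concatenation equals the sum of the parts.
    indep = {x + y: a[x] * b[y] for x, y in product(a, b)}
    print(H(indep), H(a) + H(b))   # 2.0 2.0

    # Fully dependent: the second source copies the first, and the
    # entropy of the concatenation falls short of the sum.
    dup = {'00': 0.5, '11': 0.5}
    print(H(dup), H(a) + H(b))     # 1.0 2.0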

All that's fine, but how then are we supposed to construe a question
about what happens if some of the sources fail to deliver their rated
entropy?  That means that source trust must be discarded.  (Worse, as
the original problem is posed, I must discard source trust for *some
unknown subset of the sources*.)  But given that, why should I assume
that independence trust remains?

Sure, I can make additional assumptions.  If I'm concerned only about,
say, physical failures of sources implemented as well-isolated modules,
it might well be a reasonable thing to do.  In fact, this is essentially
the independent-failure model we use all the time in building reliable
physical systems.  Of course, as we know well, that model is completely
untenable when the concern is hostile attack, not random failure.  What
do you replace it with?

Consider the analogy with reliable distributed systems.  People have
basically only dealt with two models:

1.  The fail-stop model.  A failed module stops interacting.
2.  The Byzantine model.  Failed modules can do anything,
    including cooperating by exchanging arbitrary
    information and doing infinite computation.

The Byzantine model is bizarre-sounding, but it's just a way of expressing
a worst-case situation:  Maybe the failed modules act randomly, but just by
bad luck they do the worst possible thing.

We're trying to define something different here.  Twenty-odd years ago,
Mike Fischer at Yale proposed some ideas in this direction (where
modules have access t

Re: combining entropy

2008-10-28 Thread John Denker
On 10/28/2008 09:43 AM, Leichter, Jerry wrote:

> | We start with a group comprising N members (machines or 
> | persons).  Each of them, on demand, puts out a 160 bit 
> | word, called a "member" word.  We wish to combine these 
> | to form a single word, the "group" word, also 160 bits 
> | in length.
> This isn't enough.  Somehow, you have to state that the values emitted
> on demand in any given round i (where a round consists of exactly one
> demand on all N members and produces a single output result) cannot
> receive any input from any other members.  Otherwise, if N=2 and member
> 0 produces true random values that member 1 can see before it responds
> to the demand it received, then member 1 can cause the final result to
> be anything it likes.


Perhaps an example will make it clear where I am coming
from.  Suppose I start with a deck of cards that has been 
randomly shuffled.  It can provide log2(52!) bits of 
entropy.  That's a little more than 225 bits.  Now suppose 
I have ten decks of cards all arranged alike.  You could 
set this up by shuffling one of them and then stacking 
the others to match ... or by some more symmetric process. 
In any case the result is symmetric w.r.t. interchange of
decks.  In this situation, I can choose any one of the 
decks and obtain 225 bits of entropy.  The funny thing 
is that if I choose N of the decks, I still get only 225 
bits of entropy, not N*225.
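
The arithmetic is easy to check (a sketch, using nothing beyond the
Python standard library):

    from math import factorial, log2

    one_deck = log2(factorial(52))
    print(one_deck)        # 225.58...: "a little more than 225 bits"

    # Stacking N decks to match adds no new arrangements; every deck is
    # determined by the first, so the total stays at log2(52!) for N >= 1.
    for n in range(6):
        print(n, 0.0 if n == 0 else round(one_deck, 2))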

This can be summarized by saying that entropy is not an
extensive quantity in this situation.  The graph of
entropy versus N goes like this:

 225    *   *   *   *   *

   0*
    0   1   2   3   4   5  (# of decks)

The spooky aspect of this situation is its whack-a-mole
character:  You cannot decide in advance which one of the
decks has entropy and which N-1 of them do not.  That's
the wrong question.  The first deck we choose to look 
at has 225 bits of entropy, and only then can we say 
that the other N-1 decks have zero additional entropy.

The original question spoke of "trusted" sources of
entropy, and I answered accordingly.  To the extent
that the sources are correlated, they were never eligible 
to be considered trusted sources of entropy.  To say 
the same thing the other way around, to the extent
that each source can be trusted to provide a certain
amount of entropy, it must be to that extent independent 
of the others.

It is possible for a source to be partially dependent
and partially independent.  For example, if you take
each of the ten aforementioned decks and "cut the deck"
randomly and independently, that means the first deck
we look at will provide 225 bits of entropy, and each
one thereafter will provide 5.7 bits of additional
entropy, since log2(52)=5.7.  So in this situation,
each deck can be /trusted/ to provide 5.7 bits of
entropy.
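
Putting numbers on that lower bound (a sketch; the counts come straight
from the example above):

    from math import factorial, log2

    first_deck = log2(factorial(52))   # ~225.58 bits: one full shuffle
    per_cut    = log2(52)              # ~5.70 bits: 52 equally likely cuts

    # Provable lower bound for N matched decks, each cut independently:
    # the first deck examined yields the full 225.58 bits, and each
    # further deck contributes only the 5.7 bits of its own cut.
    for n in range(1, 11):
        print(n, round(first_deck + (n - 1) * per_cut, 2))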

In this situation, requiring each deck to have "no
input" from the other decks would be an overly strict
requirement.  We do not need full independence;  we
just need some independence, as quantified by the
provable lower bound on the entropy.

If you wanted, you could do a deeper analysis of this 
example, taking into account the fact that 5.7 is not 
the whole story.  It is easy to use 5.7 bits as a valid 
and trustworthy lower bound, but under some conditions 
more entropy is available, and can be quantified by
considering the _joint_ probability distribution and
computing the entropy of that distribution.  Meanwhile
the fact remains that under a wide range of practical 
conditions, it makes sense to engineer a randomness 
generator based on provable lower bounds, since that 
is good enough to get the job done, and a deeper
analysis would not be worth the trouble.
  http://www.av8n.com/turbid/paper/turbid.htm
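
For what such a deeper analysis looks like in miniature (a sketch; the
partially dependent pair of one-bit sources below is hypothetical):

    from math import log2

    def H(dist):
        # Shannon entropy, in bits, of a {outcome: probability} map.
        return -sum(p * log2(p) for p in dist.values() if p > 0)

    # Two one-bit sources that agree 3/4 of the time: partially dependent.
    joint = {'00': 0.375, '11': 0.375, '01': 0.125, '10': 0.125}

    print(H(joint))                  # ~1.81 bits: exact joint entropy
    print(H({'0': 0.5, '1': 0.5}))   # 1.0 bit: the easy, provable lower
                                     # bound from the first source alone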


> If the issue is how to make sure you get out at least all the randomness
> that was there,

I'm going to ignore the "At least".  It is very hard to 
get out more than you put in.

On a less trivial note:  The original question did not
require getting out every last bit of available randomness.
In situations where the sources might be partially
independent and partially dependent, that would be a 
very hard challenge, and I do not wish to accept that 
challenge.

Dealing with provable lower bounds on the entropy is
more tractable, and sufficient for a wide range of
practical purposes.



Re: combining entropy

2008-10-28 Thread Leichter, Jerry
On Sat, 25 Oct 2008, John Denker wrote:

| On 10/25/2008 04:40 AM, IanG gave us some additional information.
| 
| Even so, it appears there is still some uncertainty as to
| interpretation, i.e. some uncertainty as to the requirements
| and objectives.
| 
| I hereby propose a new scenario.  It is detailed enough to
| be amenable to formal analysis.  The hope is that it will
| satisfy the requirements and objectives ... or at least
| promote a more precise discussion thereof.
| 
| We start with a group comprising N members (machines or 
| persons).  Each of them, on demand, puts out a 160 bit 
| word, called a "member" word.  We wish to combine these 
| to form a single word, the "group" word, also 160 bits 
| in length.
This isn't enough.  Somehow, you have to state that the values emitted
on demand in any given round i (where a round consists of exactly one
demand on all N members and produces a single output result) cannot
receive any input from any other members.  Otherwise, if N=2 and member
0 produces true random values that member 1 can see before it responds
to the demand it received, then member 1 can cause the final result to
be anything it likes.
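
Concretely, assuming the group word is formed by XORing the member
words (the combiner discussed below), the attack is a one-liner:

    import secrets

    target  = secrets.randbits(160)   # any value the attacker wants
    member0 = secrets.randbits(160)   # honest, truly random
    member1 = member0 ^ target        # malicious: computed after peeking
    group   = member0 ^ member1       # XOR combiner
    assert group == target            # group word is fully attacker-chosen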

This is an attack that must be considered because you already want to
consider the case:

|  b) Some of [the members] are malicious.  Their outputs may appear
|   random, but are in fact predictable by our adversary.

Stating this requirement formally seems to be quite difficult.  You can
easily make it very strong - the members are to be modeled as
probabilistic Turing machines with no input.  Then, certainly, no one can see
anyone else's value, since they can't see *anything*.  But you really
want to say something along the lines of "no malicious member can see
the value output by any non-malicious member", which gets you into
requiring an explicit failure model - which doesn't fit comfortably with
the underlying problem.

If the issue is how to make sure you get out at least all the randomness
that was there, where the only failures are that some of your sources
become predictable, the XOR is fine (see the sketch below).  But once
you allow for more complicated failure/attack modes, it's really not
clear what is going on and what the model should be.
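
A sketch of that benign case, assuming 160-bit member words as in the
problem statement:

    import secrets
    from functools import reduce

    def group_word(member_words):
        # XOR of all member words: unpredictable so long as at least one
        # member is truly random and independent of the rest.
        return reduce(lambda a, b: a ^ b, member_words, 0)

    members = [secrets.randbits(160)]   # one honest member
    members += [0] * 9                  # nine predictable "failures"
    print(hex(group_word(members)))     # still uniform over 160 bits
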
-- Jerry
