Hi Ted --

At 11:41 PM 8/14/99 -0400, you wrote: 
> 
>standard Mathematician's style --- encrypted by formulae 
>guaranteed to make it opaque to all but those who are trained in the
>peculiar style of Mathematics' papers. 
> ...
>someone tried to persuade me to use Maurer's test
>...
>too memory intensive and too CPU intensive

You are very wise to be skeptical of mathematical mumbo-jumbo.

You mentioned questions about efficiency, but I would like to call into
question whether the entropy estimate provided by Maurer's Universal
Statistical Test (MUST) would be suitable for our purposes, even if it
could be computed for free.
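For concreteness, here is a minimal Python sketch of the MUST statistic, following Maurer's description (the function and parameter names are mine).  It also shows where the cost you mentioned comes from: a table of 2^L entries, plus a log for every block in the stream.

```python
import math

def maurer_statistic(bits, L=6, Q=None):
    """Sketch of Maurer's Universal Statistical Test statistic f_n.

    bits: sequence of 0/1 values.
    L: block length in bits; the table below needs 2**L entries (memory cost).
    Q: number of initialization blocks (Maurer recommends 10 * 2**L).
    """
    if Q is None:
        Q = 10 * (1 << L)
    n_blocks = len(bits) // L
    K = n_blocks - Q                 # number of test blocks
    if K <= 0:
        raise ValueError("not enough data for the chosen L and Q")

    # last[v] = 1-based index of the most recent block whose value was v
    last = [0] * (1 << L)

    def block(i):                    # value of the i-th block (1-based)
        v = 0
        for b in bits[(i - 1) * L : i * L]:
            v = (v << 1) | b
        return v

    # Initialization phase: populate the table.
    for i in range(1, Q + 1):
        last[block(i)] = i

    # Test phase: sum log2 of the gap since each block value last appeared.
    total = 0.0
    for i in range(Q + 1, Q + K + 1):
        v = block(i)
        total += math.log2(i - last[v])
        last[v] = i

    return total / K                 # f_n, to be compared with the expected value for L
```

A stream of all-zero bits gives f_n = 0 (every gap is 1), which is the flavor of answer MUST gives for highly compressible input -- but as argued below, the guarantee attached to that number is weaker than the name suggests.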

Don't be fooled by the Universal name.  If you looked it up in a real-world
dictionary, you might conclude that Universal means all-purpose or
general-purpose.  But if you look it up in a mathematical dictionary, you
will find that a Universal probability distribution has the property that
if we compare it to some other distribution, it is not lower by more than
some constant factor.  Alas, the "constant" depends on which two
distributions are being compared, and there is no uniform bound on it!  Oooops!

In the language of entropy, a Universal entropy-estimator overestimates the
entropy by no more than a constant -- but beware, there is no uniform upper
bound on the constant.

To illustrate this point, I have invented Denker's Universal Statistical
Test (DUST) which I hereby disclose and place in the public domain:
According to DUST, the entropy of a string is equal to its length.  That's
it!  Now you may not *like* this test, and you may quite rightly decide
that it is not suitable for your purposes, but my point is that according
to the mathematical definitions, DUST is just as Universal as MUST.
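To make the joke fully explicit, here is DUST in Python, together with a demonstration of why the lack of a uniform bound matters (the numbers are illustrative, not part of any spec):

```python
def dust(s):
    """DUST: the claimed entropy of a string is simply its length."""
    return len(s)

# The trouble: a string of n identical characters has essentially zero
# entropy, yet DUST reports n.  The overestimate grows without bound as n
# grows, so "universal up to a constant" buys you nothing in practice.
print(dust("0" * 10))
print(dust("0" * 1000))
```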

There are profound theoretical reasons to believe it is impossible to
calculate a useful lower bound on the entropy of a string without knowing
how it was prepared.  There simply *cannot* be an all-purpose statistical test.

If you were to make the mistake of treating a Universal estimator as an
all-purpose estimator, and then applying it in a situation where the input
might (in whole or in part) be coming from an adversary, you would lay
yourself open to a chosen-seed attack (analogous to a chosen-plaintext attack).

On the other side of the same coin, if you *do* know something about how
the input was prepared, there obviously are things you can do to improve
your estimate of its entropy.  For example, in the early stages of a
hardware RNG, you could use two input channels, sending the
differential-mode signal to the next stage, and using the common-mode
signal only for error checking.  This is a good way to get rid of a certain
type of interference, and could be quite useful in the appropriate
circumstances.  Returning to the ugly side of the coin, you can see that a
small change in the way the inputs were prepared would make this
differencing scheme worthless, possibly leading to wild overestimates of
the entropy.
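Concretely, the two-channel scheme might look like the following sketch (the list-of-samples representation is my assumption, not a description of any particular hardware):

```python
def split_channels(a, b):
    """Split two sampled input channels into a differential-mode signal
    (sent on to the next stage) and a common-mode signal (used only for
    error checking)."""
    diff   = [x - y for x, y in zip(a, b)]
    common = [(x + y) / 2 for x, y in zip(a, b)]
    return diff, common

# Interference that hits both channels equally cancels in the
# differential-mode output but shows up in the common-mode output,
# where the error checker can watch for it.
```

Note how fragile this is: if an adversary (or a wiring change) correlates the two channels, the differential signal collapses while the estimator blithely carries on -- which is the overestimation hazard described above.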

BOTTOM LINE:  
 *) Incorporating an all-purpose entropy-estimator into /dev/random is
impossible.
 *) Incorporating something that *pretends* to be an all-purpose estimator
is a Really Bad Idea.
 *) The present design appears to be the only sound design:  whoever
provides the inputs is responsible for providing the estimate of the
entropy thereof.  If no estimate is provided, zero entropy is attributed.

Cheers --- jsd
