Re: building a true RNG

2002-08-03 Thread Bart Preneel



On 2 Aug 2002, Paul Crowley wrote:

> I meant to say, another example of a believed one-way function that is
> guaranteed to be able to produce any output is one based on the
> difficulty of discrete log:
>
>     f(x) = g^x mod p
>
> is bijective if the domain and range are 1..p-1, but finding preimages
> is the discrete log problem.  Of course this doesn't compress.  I
> don't know of any examples which compress and have collision resistance.

Choose a group of prime order.
Choose t random group elements g_i (1 <= i <= t) different from 1.
Write the input x as x_1 || x_2 || ... || x_t.
Prepend a 1 bit to each x_i, giving z_i.

    f(x) = g_1^z_1 . g_2^z_2 . ... . g_t^z_t
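
In Python, a toy instance of this construction (parameters far too small
to be secure, and the g_i fixed rather than verifiably random -- for
illustration only):

# Discrete-log-based compressing hash, toy parameters (NOT secure).
p = 1019                 # safe prime: p = 2*q + 1
q = (p - 1) // 2         # prime order of the subgroup of squares mod p
g = [pow(x, 2, p) for x in (3, 7, 11, 13)]   # subgroup elements != 1
t, CHUNK = len(g), 4     # t chunks of CHUNK bits each

def dl_hash(x_bits):
    # hash a t*CHUNK-bit string down to a single group element
    assert len(x_bits) == t * CHUNK
    h = 1
    for i in range(t):
        x_i = int(x_bits[i*CHUNK:(i+1)*CHUNK], 2)
        z_i = (1 << CHUNK) | x_i          # prepend a 1 bit to x_i
        h = (h * pow(g[i], z_i, p)) % p
    return h

print(dl_hash("0110101101001110"))        # 16 bits -> one element mod p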

For details:
M. Bellare, O. Goldreich, S. Goldwasser,
(Incremental cryptography: the case of hashing and signing, Crypto 94)
after earlier work by D. Chaum, E. van Heijst, B. Pfitzmann,
(Cryptographically strong undeniable signatures, unconditionally secure
for the signer, Crypto'91) and S. Brands.

--Bart





Re: building a true RNG

2002-08-02 Thread John S. Denker

David Wagner [EMAIL PROTECTED] writes:
> I don't know of any good cryptographic hash function
> that comes with a proof that all outputs are possible.

What about the scheme
    Pad --> Encipher --> Contract
described at
  http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#sec-uniform-hash

> if we apply the Luby-Rackoff construction
> (i.e., 3 rounds of a Feistel cipher), with
> ideal hash functions in each round, does this have
> the desired properties? It might.

Paul Crowley wrote:
> This seems to define a block cipher with no key, which is collision
> free but not one-way.  Am I misunderstanding what you're proposing?

David Wagner wrote:
> You understood it perfectly.  Good point.
> I didn't notice that problem.  Harrumph.

There is only the most minor of problems here, namely
that DAW mentioned a symmetric cipher.  The problem goes 
away if you use asymmetric crypto.  You want a cipher with
no _deciphering_ key, as described in my paper.  (op. cit.)
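
A toy sketch of Pad --> Encipher --> Contract with an asymmetric, keyless
encipher step -- cubing modulo n of unknown factorization, a conjectured
one-way permutation.  The parameters and the pad/contract steps here are
illustrative assumptions, not the construction from the paper:

# Toy Pad -> Encipher -> Contract pipeline (illustrative only).
# Encipher = x^3 mod n permutes Z_n* when gcd(3, (p-1)(q-1)) == 1;
# it is one-way only if nobody knows the factorization of n.
p, q = 1019, 1031          # toy primes, both 2 mod 3 (trivially factorable)
n = p * q

def rng_output(raw_sample: bytes) -> int:
    x = int.from_bytes(raw_sample, 'big') % n   # Pad: map sample into Z_n
    y = pow(x, 3, n)                            # Encipher: no deciphering key
    return y & 0xFF                             # Contract: keep fewer bits

print(rng_output(b'\x42\xab'))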




Re: building a true RNG

2002-08-02 Thread Paul Crowley

John S. Denker [EMAIL PROTECTED] writes:

> David Wagner [EMAIL PROTECTED] writes:
> > I don't know of any good cryptographic hash function
> > that comes with a proof that all outputs are possible.
>
> What about the scheme
>     Pad --> Encipher --> Contract
> described at
>   http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#sec-uniform-hash

It's not hard to understand why it isn't easy to solve this problem
using symmetric primitives.  Any solution must come with a proof that
for any image, there is a preimage.  With symmetric primitives, that
proof usually constitutes a recipe for generating the preimage given
the image.

In this case, I don't see a proof that this proposal can generate any
output.  I tried to construct one myself but fell over on the padding
issue.  I suspect that if you were to solve that you would also have
shown that it was not preimage resistant.

If you allow a keyed hash function then it's much easier to specify
the properties it must have to be useful for entropy distillation, but
of course that gives you a chicken-and-egg problem.  Maybe you can do
something with some sort of idea of computable distributions to
overcome the specification problem David Wagner outlines?
-- 
  __  Paul Crowley
\/ o\ [EMAIL PROTECTED]
/\__/ http://www.ciphergoth.org/




Re: building a true RNG

2002-08-02 Thread Paul Crowley

I meant to say, another example of a believed one-way function that is
guaranteed to be able to produce any output is one based on the
difficulty of discrete log:

    f(x) = g^x mod p

is bijective if the domain and range are 1..p-1, but finding preimages
is the discrete log problem.  Of course this doesn't compress.  I
don't know of any examples which compress and have collision resistance.
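
A quick numeric check of the bijectivity claim, on a toy prime (g must
generate the whole group; the code verifies that by brute force):

# Verify that x -> g^x mod p permutes {1, ..., p-1} for a generator g.
p, g = 1019, 2     # toy prime; 2 happens to generate Z_p* here
image = {pow(g, x, p) for x in range(1, p)}
print(image == set(range(1, p)))   # True: every output value is hit
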
-- 
  __  Paul Crowley
\/ o\ [EMAIL PROTECTED]
/\__/ http://www.ciphergoth.org/




Re: building a true RNG

2002-08-01 Thread John S. Denker

1) There were some very interesting questions, such as
  -- whether one can construct a hash function that
     generates all possible codes;
  -- ditto, generating them as uniformly as possible;
  -- whether off-the-shelf hash functions such as SHA-1
     have such properties.

The answers are respectively yes, yes, and very probably.

I wrote up a discussion of this, with examples, at
  http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#sec-uniform-hash

2) David W. suggested (off-list) that I clarify the relationship
of entropy-based information-theoretic arguments to computational-
feasibility arguments.  I took some steps in this direction; see
  http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#sec-objectives




Re: building a true RNG

2002-08-01 Thread Paul Crowley

David Wagner [EMAIL PROTECTED] writes:
> I don't know of any good cryptographic hash function that comes with
> a proof that all outputs are possible.  However, it might not be too
> hard to come up with plausible examples.  For example, if we apply the
> Luby-Rackoff construction (i.e., 3 rounds of a Feistel cipher), with
> ideal hash functions in each round, does this have the desired properties?
> It might.

This seems to define a block cipher with no key, which is collision
free but not one-way.  Am I misunderstanding what you're proposing?
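
For concreteness, a sketch of such a keyless 3-round Feistel, with SHA-1
standing in for the ideal round functions (an illustrative assumption).
Every round is invertible, so the whole map is a permutation -- every
output reachable, no collisions -- but inverting it is trivial, which is
exactly the problem:

# Keyless 3-round Feistel with SHA-1 round functions (illustrative).
import hashlib

W = 20  # half-block width in bytes (SHA-1 digest size)

def F(round_no: int, half: bytes) -> bytes:
    # round function: hash of (round number, half-block)
    return hashlib.sha1(bytes([round_no]) + half).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def feistel(block: bytes) -> bytes:
    L, R = block[:W], block[W:]
    for r in range(3):
        L, R = R, xor(L, F(r, R))
    return L + R

def feistel_inv(block: bytes) -> bytes:
    L, R = block[:W], block[W:]
    for r in reversed(range(3)):
        L, R = xor(R, F(r, L)), L
    return L + R

x = bytes(range(40))
assert feistel_inv(feistel(x)) == x   # a permutation: trivially invertible
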
-- 
  __  Paul Crowley
\/ o\ [EMAIL PROTECTED]
/\__/ http://www.ciphergoth.org/




Re: building a true RNG

2002-08-01 Thread David Wagner

> David Wagner [EMAIL PROTECTED] writes:
> > I don't know of any good cryptographic hash function that comes with
> > a proof that all outputs are possible.  However, it might not be too
> > hard to come up with plausible examples.  For example, if we apply the
> > Luby-Rackoff construction (i.e., 3 rounds of a Feistel cipher), with
> > ideal hash functions in each round, does this have the desired properties?
> > It might.
>
> This seems to define a block cipher with no key, which is collision
> free but not one-way.  Am I misunderstanding what you're proposing?

You understood it perfectly.  Good point.
I didn't notice that problem.  Harrumph.

Thanks for catching my oversight!




RE: building a true RNG

2002-07-31 Thread bear



On Tue, 30 Jul 2002, James A. Donald wrote:


> Randomness is of course indefinable.  A random oracle is however
> definable.
>
> If SHA-1 is indistinguishable from a random oracle without prior
> knowledge of the input, then we would like to prove that for an
> attacker to make use of the loss of entropy that results from the
> fact that it is not a random oracle, the attacker would need to
> be able to distinguish SHA-1 from a random oracle without prior
> knowledge of the input.


The above sentence is unsound.  You cannot take an assumption
(If SHA-1 is indistinguishable...) and then use its negation
to prove something else (...then we would like to prove that
...the attacker would need to be able to distinguish...),
unless you prove your something else for both the TRUE and FALSE
values of the assumption.

Euclid was all over that, and George Boole after him.

Now, for the TRUE value of your assumption, which I
think you may have meant:

   IF (A) SHA-1 is indistinguishable from a random oracle without
      prior knowledge of the input,
      AND
      (B) we can prove that in order to succeed in an attack, an
      attacker would need to distinguish SHA-1 from a random
      oracle without prior knowledge of the input,
   THEN
      (C) the attacker will not succeed in that attack.

But for the FALSE value of your assumption A, we get

   IF (B) AND (~C) THEN (~A)

And

   IF (B) AND (~A) THEN (C) OR (~C).



So if we take B as an axiom, then we did not prove
anything about A unless given ~C and we did not prove anything
about C regardless of our assumptions about A.
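
The propositional core of this can be checked mechanically; a brute-force
enumeration with (A and B) -> C as the only constraint (an illustrative
reduction of the argument):

# Enumerate truth assignments consistent with (A and B) -> C, then check
# what follows from B alone and from B together with ~C.
from itertools import product

models = [(A, B, C) for A, B, C in product([False, True], repeat=3)
          if (not (A and B)) or C]        # keep only models of (A&B)->C

# Given B and ~C, A is forced false in every surviving model:
print(all(not A for A, B, C in models if B and not C))   # True

# Given B and ~A, C is not determined either way:
print({C for A, B, C in models if B and not A})          # {False, True}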


Bear









Re: building a true RNG

2002-07-31 Thread Peter Gutmann

David Wagner [EMAIL PROTECTED] writes:

> I once wrote a short note about the relevance of this to IPSec:
> http://www.cs.berkeley.edu/~daw/my-posts/using-prngs

There's another way to avoid this problem, which is to separate the nonce RNG
and crypto RNG, so that an attacker seeing the nonce RNG output can't use it
to attack the crypto RNG.  This is done in PGP 5.x and the cryptlib RNG.  OTOH
some RNGs are used in exactly the opposite manner, generating alternate public
and private random quantities, which makes it possible to use one to infer
information about the other.  Examples are generators used with SSL and ssh,
which both alternate from public nonces to private session keys and back.
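
A minimal sketch of that separation; the hash-based generators and the
seeding are illustrative assumptions, not the actual PGP 5.x or cryptlib
designs:

# Two independent generators: leaking nonce output reveals nothing
# about the key generator's state.  Illustrative sketch only.
import hashlib, os

class HashGen:
    def __init__(self, seed: bytes):
        self.state = hashlib.sha1(seed).digest()
    def next(self) -> bytes:
        out = hashlib.sha1(b'out' + self.state).digest()
        self.state = hashlib.sha1(b'upd' + self.state).digest()
        return out

nonce_rng = HashGen(os.urandom(20))   # public quantities (IVs, nonces)
key_rng   = HashGen(os.urandom(20))   # private quantities (session keys)

iv = nonce_rng.next()                 # safe to publish
key = key_rng.next()                  # never leaves the box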

Peter.




Re: building a true RNG

2002-07-30 Thread Greg Rose

At 03:18 PM 7/29/2002 -0700, David Wagner wrote:
> > I don't even think anyone has analyzed the entropy preservation of a
> > theoretically perfect random oracle
>
> Well, I know this particular point wasn't central to your email, but
> I'm not sure I agree with you on this small point.  I believe it should
> be more or less straightforward to analyze the entropy preservation of
> a random oracle (alas, so straightforward you probably won't find any
> paper on it in the literature).

Actually, it's covered very well (but briefly) in HAC (in section 2.1.6) 
and they refer to a seminal work by Flajolet and Odlyzko (Google finds 
references to it quite easily).

regards,
Greg.



Greg Rose   INTERNET: [EMAIL PROTECTED]
Qualcomm Australia  VOICE:  +61-2-9817 4188   FAX: +61-2-9817 5199
Level 3, 230 Victoria Road,http://people.qualcomm.com/ggr/
Gladesville NSW 2111232B EC8F 44C6 C853 D68F  E107 E6BF CD2F 1081 A37C





RE: building a true RNG

2002-07-30 Thread Amir Herzberg

David Wagner said,

> The problem can really be divided into two parts:
>   1. Is our entropy crunching algorithm secure when used with
>      a random oracle instead of SHA1?
>   2. Does SHA1 behave enough like a random oracle that the answer
>      to question 1. is in any way relevant to the real world?
> I suspect that question 1. is not hard to answer in a formal,
> rigorous, provable way.  It is only question 2. that is hard
> to answer.  It is absolutely true that we have no proof for
> question 2.
>
> That said, we should keep in mind that none of our
> cryptographic algorithms (even 3DES) come with a proof of
> security.  All we have to rest on is years of unsuccessful
> cryptanalysis.

But there's a big difference: the random oracle `assumption` is clearly not
valid for SHA-1 (or any other specific hash function).  Since this is
trivial, it is less clear which cryptanalytic property SHA-1 has actually
been tested for.  SHA-1 (and similar functions) were tested mainly for
collision-resistance (and also for one-wayness).  But clearly these
properties are not sufficient for the proposed use (to extract entropy).

So the question is again: what is an assumption which we can test SHA-1
against, and which is sufficient to make the `entropy crunching alg` secure?


When trying to explain and define hash functions and their properties, I
could not find a satisfactory definition for the `randomness` properties.

Best Regards, Amir Herzberg
See http://amir.herzberg.name/book.html  for lectures and draft-chapters
from book-in-progress, `Introduction to Cryptography, Secure Communication
and Commerce`; feedback appreciated!



RE: building a true RNG

2002-07-30 Thread James A. Donald

--
On 30 Jul 2002 at 17:02, Amir Herzberg wrote:
> When trying to explain and define hash functions
> and their properties, I could not find a satisfactory definition
> for the `randomness` properties.

Randomness is of course indefinable.  A random oracle is however 
definable.  

If SHA-1 is indistinguishable from a random oracle without prior
knowledge of the input, then we would like to prove that for an
attacker to make use of the loss of entropy that results from the
fact that it is not a random oracle, the attacker would need to
be able to distinguish SHA-1 from a random oracle without prior
knowledge of the input.

--digsig
 James A. Donald
 6YeGpsZR+nOTh/cGwvITnSR3TdzclVpR0+pr3YYQdkG
 CxPM+cm8zcgy+aC2EA+wlmYH4DUaMzSLmaJFJN6v
 225C9EmZaK85VbOoLT5EpF24GeytUdtyW9T/FjXgw





Re: building a true RNG

2002-07-30 Thread David Wagner

Amir Herzberg wrote:
> But there's a big difference: the random oracle `assumption` is clearly not
> valid for SHA-1 (or any other specific hash function).

Well, the random oracle model has problems, but I think those problems
are a bit more subtle than just an assumption that is true or false.

> So the question is again: what is an assumption which we can test SHA-1
> against, and which is sufficient to make the `entropy crunching alg` secure?

Hmm; I thought I answered this before.  Was it unclear?  If so, please
ask.  In any case, here's a summary.  In the standard model (without
random oracles), there is *no* such assumption.  There's no hope for
finding such an assumption, if you want to build a general-purpose
entropy cruncher that works for any distribution on the input samples.
One can prove this.  No matter what function you choose, there is an
input distribution that makes this function inadequate.




Re: building a true RNG

2002-07-29 Thread Sampo Syreeni

On 2002-07-27, David Wagner uttered to [EMAIL PROTECTED]:

> In particular, there is no function f which doesn't waste entropy,
> unless |Y|=1.  The proof is simple enough that you can probably find it
> with a moment's contemplation, but let me know if you want to see it.

I would be led to the pigeonhole principle; still, this does seem like an
eminently teachable moment to me.

However, what we're working with in the case of a typical RNG isn't
functions between finite buffer-fulls of data, but functions between
infinite sets of entire bitstreams which need to be implemented within a
finite memory constraint. Whatever the algorithm, it can have state. I'm
thinking that is the reasoning which leads people to think that the
problem is simple. In this framework one can very well gather statistics
and try and normalize them based on what one's seen so far. (The
minimalist example here is bias removal.)  After that, what goes into the
hash is (approximately) white and even with minimal assumptions (i.e. the
hash is a one-way permutation) our RNG would seem to approach ideality.

True, such an approach will always become a race in modelling the
statistics. But still, I would think the generator has the advantage if
sufficient margin is left for error. (That is, hash some constant times
the number of bits containing the entropy you suspect to be there, based
on the best model you've got.)
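
For reference, the minimalist bias-removal example mentioned above looks
like this (von Neumann's trick, assuming independent input bits):

# von Neumann debiasing: from a stream of independent but biased bits,
# emit the first bit of each unequal pair; equal pairs are discarded.
# P(01) == P(10) regardless of the bias, so output bits are unbiased.
def debias(bits):
    it = iter(bits)
    for a in it:
        b = next(it, None)
        if b is None:
            return
        if a != b:
            yield a

biased = [1, 1, 0, 1, 1, 0, 0, 1, 1, 1]
print(list(debias(biased)))   # -> [0, 1, 0] for this input
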
-- 
Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2





Re: building a true RNG

2002-07-29 Thread Sampo Syreeni

On 2002-07-28, Sampo Syreeni uttered to David Wagner:

[Answering my own mail.  Sorry.]

> and discard every 1/(p(x)-1/256)'th sample with value x.

Actually the pedantic solution would be to put an arithmetic
compressor/coder between the input and output, using the best model we've
got. That still leaves model adaptation to be dealt with, but if we
discard a sufficient number of output bits at start (estimable from the
model), we *will* end up with (very nearly) flat statistics on the output.
Asymptotic optimality and all that... (The qualification comes from
limited precision arithmetic.)
-- 
Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2





Re: building a true RNG

2002-07-29 Thread David Wagner

> An example: presume we take a simple first order statistical model. If our
> input is an 8-bit sample value from a noise source, we will build a 256
> bin histogram. When we see an input value, we look its probability up in
> the model, and discard every 1/(p(x)-1/256)'th sample with value x. When
> this happens, the sample is just eaten and nothing appears in the output;
> otherwise we copy.

I understand what you're trying to say, but this will not give a
general-purpose function that doesn't waste entropy regardless of the
input distribution.  This only works when the distribution on the input
stream consists of independent, memoryless samples from some distribution
on 8-bit values.




Re: building a true RNG

2002-07-29 Thread David Wagner

Barney Wolff wrote:
> This leads me to ask what may be a laughably naive question:
> Do we even know that the popular hash functions can actually generate
> all 2^N values of their outputs?

It seems very unlikely that they can generate all 2^N outputs
(under current knowledge).  However, they satisfy the next-best
thing: their output appears to be indistinguishable from uniform to
computationally-bounded observers, hence it's as good as if they
could generate all 2^N outputs for most purposes.




Re: building a true RNG

2002-07-29 Thread Barney Wolff

Yes, but is there any lower bound known?  2^(N-1)?  Or worse?
Clearly a hash with N-bit output that can't generate all 2^N
values can never actually have N bits of entropy.  Depending on how
many forbidden values there are, this may or may not be significant.
Of course if the forbidden values cannot be found, I'd agree with
the "as good as".

On Mon, Jul 29, 2002 at 03:39:32PM +0000, David Wagner wrote:
> > Do we even know that the popular hash functions can actually generate
> > all 2^N values of their outputs?
>
> It seems very unlikely that they can generate all 2^N outputs
> (under current knowledge).  However, they satisfy the next-best
> thing: their output appears to be indistinguishable from uniform to
> computationally-bounded observers, hence it's as good as if they
> could generate all 2^N outputs for most purposes.

-- 
Barney Wolff
I never met a computer I didn't like.




Re: building a true RNG

2002-07-29 Thread David Wagner

Oh dear.  On re-reading your message I now suspect that what you asked
is not what I originally thought you asked.  I see two questions here:
  Q1: If we cycle through all N-bit messages, are all
  2^N output values possible?
  Q2: If we cycle through all messages (possibly very long
  or very short), are all 2^N output values possible?
On first reading I thought you were asking Q1, but it now occurs to me
you were probably asking Q2.  I apologize profusely for any confusion
I may have caused.

Anyway, the answer to Q1 is likely to be No.  I'd guess that the answer
to Q2 is probably Yes, or close to it.

For the first scenario, if the hash behaves like a random oracle,
then one would expect only 2^N * (1 - 1/e) of the possible outputs to
be attainable.  Of course, the forbidden outputs are not easy to find;
the best way I know is by exhaustive search over all 2^N N-bit messages.
For the second scenario, if the hash behaves like a random oracle, then
one would expect all outputs to be attainable.  No significant deviation
from the random oracle model is known for SHA1 (well, with one exception
that doesn't appear to be relevant here).  That said, this is not a proof,
and my answers just represent my best guesses.
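
The 1 - 1/e figure for the first scenario is easy to check by simulation,
modeling the hash as a random function on N-bit blocks for a small N
(1 - 1/e is about 0.632):

# Fraction of N-bit outputs hit by a random function on N-bit inputs.
# Expectation is 1 - (1 - 2^-N)^(2^N), which tends to 1 - 1/e ~ 0.632.
import random

N = 16
size = 1 << N
hit = {random.randrange(size) for _ in range(size)}  # random-function image
print(len(hit) / size)   # ~0.632 for any moderately large N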




Re: building a true RNG

2002-07-29 Thread John S. Denker

Barney Wolff asked:
> Do we even know that the popular hash functions can actually generate
> all 2^N values of their outputs?

David Wagner replied:
> It seems very unlikely that they can generate all 2^N outputs
> (under current knowledge).

I was temporarily astonished, but he clarified as follows:

>   Q2: If we cycle through all messages (possibly very long
>       or very short), are all 2^N output values possible?
> ...
> I'd guess that the answer
> to Q2 is probably Yes, or close to it.

1) Consider the following function H0:  Divide the input into
chunks N bits long.  Calculate the XOR of all the chunks, and
use that as the output.

This meets the definition of hash function, although it would
not be a one-way hash function.  And it would most certainly
be capable of generating all 2^N possible outputs.
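
A sketch of H0, plus the trivial check that it is surjective (the
one-chunk message y already hashes to y):

# H0: XOR all N-bit chunks of the input together (a hash, not one-way).
from functools import reduce

N = 8  # chunk width in bits

def h0(chunks):
    return reduce(lambda a, b: a ^ b, chunks, 0)

# Surjective: the single-chunk message [y] hashes to y.
print(all(h0([y]) == y for y in range(1 << N)))   # True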

2) I can't prove that a standard hash function such as SHA1
generates all possible codes, but I consider it likely.  It would 
be quite shocking if a strong hash function such as SHA1 generated
fewer codes than a weak function such as H0.

3) For a one-way hash function, one should not expect a _constructive_
proof that it generates all possible codes;  such a construction
would violate the one-way property.

4) Here is a rough plausibility argument.  Consider two hash
functions H_1 and H_2 that are independent.  We define
H'(x) := H_1(x) XOR H_2(x)  (1)
which implies
H_1(x) = H'(x) XOR H_2(x)   (2)

Now let's look at the truth table for equation (2), where
the row-index is the H' code, the column-index is the H_2
code, and each entry represents the H_1 code required
to uphold the equation:

       00   01   10   11
  00   00   01   10   11
  01   01   00   11   10
  10   10   11   00   01
  11   11   10   01   00

Now let's suppose H_2 is missing one code (say the 10 code) and
H' is missing one code (say the 11 code).  Then H_1 must be missing
at least three codes!  Otherwise there would be a way of combining
a non-missing H_1 code with a non-missing H_2 code to create the
missing H' code.

       00   01   10m  11
  00   00   01   10   11
  01   01   00   11   10
  10   10   11   00   01
  11m  11m  10m  01   00m

We can extend this argument by combining lots and lots of 
independent hash functions H_i.   The combination has far 
fewer missing codes than any of the ingredients.  So either
you conclude that 
  a) there is a conspiracy that prevents us from constructing
 independent hash functions to use as ingredients, or
  b) we can produce hash functions with very few missing codes.




Re: building a true RNG

2002-07-29 Thread Sandy Harris

David Wagner wrote:
> Oh dear.  On re-reading your message I now suspect that what you asked
> is not what I originally thought you asked.  I see two questions here:
>   Q1: If we cycle through all N-bit messages, are all
>       2^N output values possible?
>   Q2: If we cycle through all messages (possibly very long
>       or very short), are all 2^N output values possible?
> On first reading I thought you were asking Q1, but it now occurs to me
> you were probably asking Q2.  I apologize profusely for any confusion
> I may have caused.
>
> Anyway, the answer to Q1 is likely to be No.  I'd guess that the answer
> to Q2 is probably Yes, or close to it.

I think the interesting question is whether, for M-bit hash inputs,
and an N-bit hash, with a lower bound Q on entropy per input batch,
so M > Q > N, we can show, as I think Denker is claiming to have done,
that the entropy of hash(M) must be > N - epsilon, for some epsilon
small enough to ignore.

That would imply not only that all 2^N values occur, but that they
are reasonably evenly distributed.




Re: building a true RNG

2002-07-29 Thread David Wagner

> 3) For a one-way hash function, one should not expect a _constructive_
> proof that it generates all possible codes;  such a construction
> would violate the one-way property.

Nitpick: the last statement does not seem quite right to me.  I'm thinking
of the notion of a one-way permutation.  For instance, the RSA function
f(x) = x^3 mod n is conjectured to be a one-way permutation, assuming n
is an RSA modulus with unknown factorization and the RSA assumption holds.
(I'm being a little fast and loose with the formal details, but I hope
this conveys the idea.)
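
A toy check of the permutation claim (tiny primes with known factorization,
so nothing here is one-way -- illustration only):

# f(x) = x^3 mod n permutes Z_n* when gcd(3, phi(n)) == 1.
from math import gcd

p, q = 11, 17                        # toy primes; phi = 160, coprime to 3
n, phi = p * q, (p - 1) * (q - 1)
assert gcd(3, phi) == 1
units = [x for x in range(1, n) if gcd(x, n) == 1]
print({pow(x, 3, n) for x in units} == set(units))   # True: a permutation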

That said, I'm not claiming that the RSA function would make a good
cryptographic hash function.  Cryptographic hash functions are expected
to be a lot more than just one-way and collision-free.

I don't know of any good cryptographic hash function that comes with
a proof that all outputs are possible.  However, it might not be too
hard to come up with plausible examples.  For example, if we apply the
Luby-Rackoff construction (i.e., 3 rounds of a Feistel cipher), with
ideal hash functions in each round, does this have the desired properties?
It might.

On the gripping hand, I don't think this is a real issue in practice.
SHA1 is probably good enough for all practical purposes that I can
think of.




Re: building a true RNG

2002-07-29 Thread Jack Lloyd

On Mon, 29 Jul 2002, David Wagner wrote:

> > DES, being extremely hardware friendly, can be (ab)used to
> > make a strong one-way hash.  (E.g., raw input into both key and data maps
> > 56+64 -> uniformly distributed 64 bits.)
>
> However, when used in this way, DES is not an especially good hash function.
> For instance, it is easy to find collisions, to find pre-images, and so on.

Somewhat related to that, are there any block cipher-hash function methods
that are actually secure? Every one I've ever read about seems to have been
broken.

Regards,
  Jack





Re: building a true RNG

2002-07-29 Thread Matt Crawford

> 2) I can't prove that a standard hash function such as SHA1
> generates all possible codes, but I consider it likely.  It would
> be quite shocking if a strong hash function such as SHA1 generated
> fewer codes than a weak function such as H0.

I think you could do a probabilistic proof similar to the "DES is not
a group" quasi-proof.  To test a hash function h() whose range is S,
let F be the set of balanced functions from S -> {0, 1}.  (Balanced
meaning that each f in F maps exactly half of S to 0 and half to 1.)
If you can contrive to choose many members of F at random, and compose
them with h for many arguments of h, you should be able to set
confidence limits on how much of S is covered by h.





Re: building a true RNG

2002-07-29 Thread Arnold G. Reinhold

At 12:20 PM -0700 7/29/02, David Honig wrote:

> Whether there is a need for very high bandwidth RNGs was discussed
> on cypherpunks a few months ago, and no examples were found.
> (Unless you're using something like a one-time pad where you need
> a random bit for every cargo bit.)  Keeping in mind that
> a commercial crypto server can often accumulate entropy during
> off-peak hours.


It's been discussed here some time back as well.  If you believe your
crypto primitives are infeasible to break, a crypto-based PRNG with a
long enough random seed should be indistinguishable from a true,
perfect RNG.  If you are only confident that your crypto primitives
are expensive to break, then using a true RNG for keys and nonces,
rather than deriving them all from one PRNG, adds security.

This suggests a continuum of solutions: construct a crypto PRNG and
periodically (once enough has accumulated) stir your entropy source
into its state in some safe way.  If you extract entropy slower than
you put it in, you can expect the equivalent of a true RNG.  If you
extract entropy faster than you put it in, the system degrades
gracefully, in the sense that someone who expends the effort to break
the number generation scheme only gets to read messages since the
last entropy update.
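
A minimal sketch of that continuum; the SHA-1-based state update and the
batch threshold are illustrative assumptions:

# Crypto PRNG with batched entropy stirring (illustrative sketch).
import hashlib, os

class ReseedingPRNG:
    BATCH = 32                      # bytes of raw entropy per reseed batch

    def __init__(self, seed: bytes):
        self.state = hashlib.sha1(seed).digest()
        self.pool = b''

    def add_entropy(self, raw: bytes):
        self.pool += raw
        if len(self.pool) >= self.BATCH:    # stir only in big batches
            self.state = hashlib.sha1(self.state + self.pool).digest()
            self.pool = b''

    def next(self) -> bytes:
        out = hashlib.sha1(b'out' + self.state).digest()
        self.state = hashlib.sha1(b'upd' + self.state).digest()
        return out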

The reason for batching entropy input is to prevent someone who has
broken your system once from discovering each small entropy input by
exhaustive search.  (There was a nice paper pointing this out.  If
someone has the reference...)

Arnold Reinhold




Re: building a true RNG

2002-07-29 Thread David Wagner

> To test a hash function h() whose range is S,
> let F be the set of balanced functions from S -> {0, 1}.  (Balanced
> meaning that each f in F maps exactly half of S to 0 and half to 1.)
> If you can contrive to choose many members of F at random, and compose
> them with h for many arguments of h, you should be able to set
> confidence limits on how much of S is covered by h.

Can you elaborate?  What would the computational complexity of this be?

Estimating the size of {x : f(h(x))=1} to within relative error e
requires O(1/e^2) evaluations of h, if I understand correctly.  If we
consider the difference between a hash function that can hit all of
S, and another hash function that misses only one element of S, we'll
need to resolve the size of these sets to within relative error 1/|S|.
That suggests we'll need |S|^2 evaluations of h, which is infeasible
for SHA1 and slower than the naive O(|S|) algorithm.




Re: building a true RNG

2002-07-29 Thread David Wagner

> The reason for batching entropy input is to prevent someone who has
> broken your system once from discovering each small entropy input by
> exhaustive search.  (There was a nice paper pointing this out.  If
> someone has the reference...)

I believe you are referring to the state compromise attacks
described in the following paper:
  J. Kelsey, B. Schneier, D. Wagner, C. Hall,
  "Cryptanalytic Attacks on Pseudorandom Number Generators",
  FSE '98.  http://www.counterpane.com/pseudorandom_number.html
I once wrote a short note about the relevance of this to IPSec:
  http://www.cs.berkeley.edu/~daw/my-posts/using-prngs




Re: building a true RNG

2002-07-29 Thread Ben Laurie

David Wagner wrote:
> Amir Herzberg wrote:
> > So I ask: is there a definition of this `no wasted entropy` property, which
> > hash functions can be assumed to have (and tested for), and which ensures
> > the desired extraction of randomness?
>
> None that I know of.  I'm not aware of much work in the crypto literature
> on this topic.
>
> Actually, there is not much hope for such a property.  It is pretty easy
> to see that, if we make no assumptions on the entropy inputs other than
> they have sufficient entropy, then no single deterministic algorithm can
> ever be good at extracting randomness.  If we fix any single extraction
> algorithm A, there exists a distribution D on the inputs which makes it
> give non-uniform output (for example, D might be the uniform distribution
> on the set {x : A(x) has its first ten bits zero}).  This is a standard
> observation from the theory community's work on derandomization.

That's the kind of thing mathematicians like to say - but how much does 
it apply to the real world? For example, you can't produce your example 
set for SHA-1 or MD5, can you?

If you are prepared to factor in cost, then the landscape changes, I 
would contend. Mathematicians generally assume uncountable resources. 
Even constructivists assume countable resources. However, we can assume 
_finite_ resources. Big difference.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html   http://www.thebunker.net/

Available for contract work.

There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit. - Robert Woodruff





Re: building a true RNG

2002-07-27 Thread David Honig

At 11:24 AM 7/25/02 -0400, John S. Denker wrote:

> And most particularly I do not
> care if the analog threshold of my soundcard shifts slightly
> (as a function of recent history, temperature, phase of the
> moon, or whatever).

A change in the analogue threshold of your digitizing step
will change the variability (aka entropy) of your samples.

The practical solution is wide engineering margins and monitoring
your raw input.  (And, if you can (SHA users can't),
measure the data just before it goes into any whiteners you use,
for belts-and-suspenders assurance that you've compressed it sufficiently.)


> This is the central conceptual point of my paper.  It is
> more important than any particular implementation.  The point
> is that a Random Symbol Generator can be proved correct using
> fairly mild assumptions and premises.

Based on a detailed model of the noise, grounded in physics.
Yep.

Have you seen RNGs based on arrival times of network packets,
or disk accesses, etc.?  I'd extrapolate that you'd not trust
them much given their distance from physics.  (And I'd
agree that a soundcard (or RF card) is preferred and 'free')


> The turning point of the argument is statistical: if I have
> enough entropy at the input of the hash function, and if the
> hash function doesn't waste entropy (by having unnecessarily
> many hash collisions) then the statistics takes over and
> covers a multitude of sins.

I don't think collision is the right word; you're not doing
a search on hash values which might collide.


> For example, if I have 165 bits
> of entropy at the input of the hash function, the output will
> have 159.98 bits of entropy in a 160 bit word.

I don't understand this.  If you have 165 bits of entropy in, you should be
able to generate 160 binary symbols with a bit of entropy each.
(You are conserving total entropy, only concentrating it by reducing
the number of symbols.)
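
For what it's worth, 159.98 is what the random-function model predicts:
2^165 equally likely inputs into 2^160 bins gives Poisson(32) bin loads,
and the output Shannon entropy works out to about 159.98 bits.  A quick
numeric check under that model (an assumption about how the figure was
derived):

# Expected output entropy of a random function: 2^165 equally likely
# inputs -> 160-bit outputs, so bin loads k ~ Poisson(lam = 32).
# E[H_out] = 165 - (1/lam) * E[k * log2(k)].
import math

lam = 32.0                      # 2^165 inputs / 2^160 bins
p = math.exp(-lam)              # Poisson P(k=0)
e_klog2k = 0.0
for k in range(1, 200):         # mass beyond k=200 is negligible
    p = p * lam / k             # recurrence: P(k) = P(k-1) * lam / k
    e_klog2k += p * k * math.log2(k)
print(165 - e_klog2k / lam)     # -> ~159.98 bits in a 160-bit word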


> You can shift
> the threshold all you want.

Shifting thresholds changes entropic content.


> You can add something to the input.

DC doesn't matter, capacitors are cheap :-) 


> If the alleged threshold
> shift is so large as to decrease the variability of the raw
> data, then all bets are off... but that wasn't the question
> that David asked.  The rhetorical question suggested that if
> the threshold shifted at all I would have a big problem, and
> I loudly assert that I don't.  Specifically:  If you give me
> any halfway-reasonable upper bound on the magnitude of the shift,
> I can design the generator to accommodate that, producing
> industrial-strength randomness despite such a shift.

This hinges on the definition of large and small... anyway,
generous margins & monitoring work both for steel bridges and RNGs.








RE: building a true RNG

2002-07-27 Thread Amir Herzberg

In several posts to the list, e.g. by Wagner and Denker, it is proposed to
use a `strong crypto hash function` h(), e.g. SHA1, to extract entropy from
a physical source, i.e.

 + Source --> Digitizer --> good hash

Namely, if the sample from the digitizer (of the physical source) is x, the
suggestion is to use h(x) as random data.
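
In code the proposal is just the following; SHA-1 is the usual
instantiation, and the entropy claim about the sample is the part that
needs physical justification:

# h(x) as random data: hash a digitizer sample whose entropy content
# comfortably exceeds the output width (the crucial, physics-based claim).
import hashlib

def random_block(digitizer_sample: bytes) -> bytes:
    # the sample must carry well over 160 bits of source entropy
    return hashlib.sha1(digitizer_sample).digest()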

I think this is a very common design in practice. But clearly this relies on
some property of the hash function. For example Denker says:
  a) if the hash function happens to have a property I call no
 wasted entropy then... [this h(x) is `as good as random`].

Indeed if we adopt the `random oracle` methodology, i.e. analyze the system
assuming h() is a truly random function, then we easily see that h(x) are
truly random bits (assuming the number of bits in h(x) is significantly less
than the entropy in x).

But: the `random oracle` methodology helps us find vulnerabilities in
protocols, but no specific hash function is really a random oracle...

So I ask: is there a definition of this `no wasted entropy` property, which
hash functions can be assumed to have (and tested for), and which ensures
the desired extraction of randomness?

Regards, Amir Herzberg
See http://amir.herzberg.name/book.html  for lectures and draft-chapters
from book-in-progress, `Introduction to Cryptography, Secure Communication
and Commerce`; feedback appreciated!



Re: building a true RNG

2002-07-27 Thread John S. Denker

I wrote:
> a) if the hash function happens to have a property I call no
>    wasted entropy then the whitening stage is superfluous (and
>    you may decide to classify the hash as non-simple);

David Honig responded:
> Not wasting entropy does not mean that a function's output
> is white ie uniformly distributed.  E.g., a function
> which doubles every bit: 0->00, 1->11 preserves total
> entropy but does not produce uniformly distributed output.
> (It also reduces entropy/symbol.)

That's a red-herring tangent.  I'm not talking about any old
function that doesn't waste entropy;  I'm talking about a 
!!hash!! function that doesn't waste entropy.  The hash function
has a hard constraint on the word-size of its output.  If it
starts doubling bits, or otherwise putting out redundancies,
then it is wasting entropy.  See the discussion of the BADHASH-2
function in the paper.
  http://www.monmouth.com/~jsd/turbid/

And remember:  in addition to having a non-entropy-wasting hash 
function, we are also required to saturate its input.  Then we can 
conclude that the output is white to a very high degree, as 
quantitatively discussed in the paper.

> The simple-hash's -- let's call it a digest -- function is to increase
> the entropy/symbol
> by *reducing* the number of symbols while preserving *total* entropy.

Total entropy is preserved in the non-saturated regime.  This is
documented in upper rows of the table:
  http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#tab-saturation
In the saturated regime, some entropy is necessarily lost.  This is
documented in the lower rows of the table.  This is only a small 
percentage, but it is mandatory.  I don't consider this to be wasted 
entropy;  I consider it entropy well spent.  That is, these are 
necessary hash collisions, as opposed to unnecessary ones.

> A function like xor(bit N, bit N+1) which halves the number of bits
> can do this.  While its output is whiter than its input, its output
> will not be perfectly white unless its input was.

See the discussion of BADHASH-1 in the paper.

> Two scenarios where SHA is contraindicated came up:
> 1. hardware
> 2. interrupt handlers
>
> Sometimes a hardware RNG will feed raw data into a crypto-WEAK
> analogue of SHA, ie a LFSR which does the bit-reduction and mixing
> functions, but doesn't mix as well as SHA.  (LFSRs are hardware-friendly.)
> Separating the bit-reduction from the mixing can be useful for analyzing
> what's really going on.

Well, I tend to agree that systems that separate the bit-reduction
from the mixing are easier to analyze, in the sense that it is
easier to find flaws in them.  But that's because they're so flawed!




Re: building a true RNG

2002-07-27 Thread David Wagner

Amir Herzberg wrote:
> So I ask: is there a definition of this `no wasted entropy` property, which
> hash functions can be assumed to have (and tested for), and which ensures
> the desired extraction of randomness?

None that I know of.  I'm not aware of much work in the crypto literature
on this topic.

Actually, there is not much hope for such a property.  It is pretty easy
to see that, if we make no assumptions on the entropy inputs other than
they have sufficient entropy, then no single deterministic algorithm can
ever be good at extracting randomness.  If we fix any single extraction
algorithm A, there exists a distribution D on the inputs which makes it
give non-uniform output (for example, D might be the uniform distribution
on the set {x : A(x) has its first ten bits zero}).  This is a standard
observation from the theory community's work on derandomization.
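
Such a bad distribution is even easy to sample from by rejection -- here
with A = SHA-1, at about 1024 trials per accepted sample.  Each accepted
input still has plenty of entropy, yet A wastes ten output bits on it by
construction:

# Sample from D = uniform over {x : first ten bits of SHA1(x) are zero}.
import hashlib, os

def sample_D() -> bytes:
    while True:
        x = os.urandom(16)                       # high-entropy candidate
        h = hashlib.sha1(x).digest()
        if h[0] == 0 and (h[1] & 0xC0) == 0:     # top 10 bits all zero
            return x                             # ~1 in 1024 accepted

x = sample_D()
print(hashlib.sha1(x).hexdigest()[:4])   # always '00' then a digit 0-3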

The only way out of this I can see is to assume we have a small seed
of uniformly distributed randomness on the side.   This is exactly the
direction explored in the theory community in work on extractors, the
leftover hashing lemma, and the like.  However, from a cryptographic point
of view, such an assumption is highly unreasonable (even worse than the
random oracle assumption).

In short: I know of no better way to analyze cryptographic randomness
extraction than to use the random oracle model.




Re: building a true RNG

2002-07-27 Thread John S. Denker

Amir Herzberg wrote:
> So I ask: is there a definition of this `no wasted entropy` property, which
> hash functions can be assumed to have (and tested for), and which ensures
> the desired extraction of randomness?

That's the right question.

The answer I give in the paper is 

 A cryptologic hash function advertises that it is
 computationally infeasible for an adversary to unmix
 the hash-codes.

 A chosen-plaintext (chosen-input) attack will not
 discover inputs that produce hash collisions with
 any great probability.

 In contrast:

 What we are asking is not really very special. We
 merely ask that the hash-codes in the second
 column be well mixed. 

 We ask that the data acquisition system will not
 accidentally produce an input pattern that unmixes
 the hash-codes. 

We believe that anything that makes a good pretense of being 
a cryptologic hash function is good enough for our purposes,
with a wide margin of safety.   If it resists attack when the 
adversary can choose the inputs, it presumably resists attack 
when the adversary can't choose the inputs.




Re: building a true RNG

2002-07-27 Thread David Wagner

John S. Denker wrote:
> Amir Herzberg wrote:
> > So I ask: is there a definition of this `no wasted entropy` property, which
> > hash functions can be assumed to have (and tested for), and which ensures
> > the desired extraction of randomness?
>
> That's the right question.
>
> The answer I give in the paper is
>
>  What we are asking is not really very special. We
>  merely ask that the hash-codes in the second
>  column be well mixed.

Alas, that's not a very precise definition.

Actually, my intuition differs from yours.  My intuition is that
entropy collection requires fairly strong assumptions about the hash.
For instance, collision-freedom isn't enough.  One-wayness isn't enough.
We need something stronger, and something that appears difficult to
formalize in any precise, mathematically rigorous way.




Re: building a true RNG

2002-07-27 Thread Joseph Ashwood

- Original Message -
From: John S. Denker [EMAIL PROTECTED]
To: David Honig [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; 'Hannes R. Boehm' [EMAIL PROTECTED]; 'Ian
Hill' [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Saturday, July 27, 2002 10:52 AM
Subject: Re: building a true RNG


> I wrote:
> > a) if the hash function happens to have a property I call no
> >    wasted entropy then the whitening stage is superfluous (and
> >    you may decide to classify the hash as non-simple);

I disagree.  The whitening stage can perform a very important step of
allowing for stretching of the entropy.  Ideally a true RNG would be fast
enough to supply all our needs, but that typically doesn't happen.  Instead
we have to depend on stretching tricks, basically using a primitive in
counter mode, to stretch the bits we have as far as is needed.  This can be
combined with the distillation stage itself by inserting a fixed string,
copying the state, and finishing the calculation, but can conceptually be
thought of as a separate stage.

> David Honig responded:
> > Not wasting entropy does not mean that a function's output
> > is white ie uniformly distributed.  E.g., a function
> > which doubles every bit: 0->00, 1->11 preserves total
> > entropy but does not produce uniformly distributed output.
> > (It also reduces entropy/symbol.)
>
> That's a red-herring tangent.  I'm not talking about any old
> function that doesn't waste entropy;  I'm talking about a
> !!hash!! function that doesn't waste entropy.

A construct that simply does not exist, more on this later.

> And remember:  in addition to having a non-entropy-wasting hash
> function, we are also required to saturate its input.  Then we can
> conclude that the output is white to a very high degree, as
> quantitatively discussed in the paper.




> > The simple-hash's -- let's call it a digest -- function is to increase
> > the entropy/symbol
> > by *reducing* the number of symbols while preserving *total* entropy.

I separated this out because I agree with the statement completely; that is
the one and only function of the distilling function, whether it is a hash,
a cipher, or a miracle construct.


> Total entropy is preserved in the non-saturated regime.  This is
> documented in upper rows of the table:
>   http://www.monmouth.com/~jsd/turbid/paper/turbid.htm#tab-saturation

I disagree.  There are several ways of viewing the problem.  Model the
distillation function as a perfect entropy smoother over k bits (other than
that I will be assuming a worst case).  We see the following happen:
When the first bit is inserted, each of the k bits receives 1/k bit of
entropy.
When the second bit of entropy is inserted, one of those bits receives 1 bit
of entropy; the entire system now has 1 + (k-1)/k bits of entropy, and
already there is a small but measurable discrepancy.
Each bit now has (2k-1)/k^2 bits of entropy.
The third bit results in 1 of the k bits having entropy 1, giving 1 +
((2k-1)/k)((k-1)/k) bits of entropy in total.

The system loses entropy only slowly, but it does lose entropy.  It's also
worth noting that this agrees with the presented analysis at 3 points: 0
inserted bits give 0 entropy, 1 gives 1 bit of entropy, and infinitely many
give k bits; but for all useful lengths, having k bits of entropy can never
occur.

> In the saturated regime, some entropy is necessarily lost.  This is
> documented in the lower rows of the table.  This is only a small
> percentage, but it is mandatory.  I don't consider this to be wasted
> entropy;  I consider it entropy well spent.  That is, these are
> necessary hash collisions, as opposed to unnecessary ones.

These don't even have to be hash collisions, it is possible for the hash
function to discard entropy simply through its behavior, as in my example
where entropy starts being discarded beginning with the second bit. From a
theoretic standpoint, it is entirely possible that a function exists that
discards entropy beginning with the first bit.

> > A function like xor(bit N, bit N+1) which halves the number of bits
> > can do this.  While its output is whiter than its input, its output
> > will not be perfectly white unless its input was.

This statement is true of all deterministic algorithms.  In fact it is one
of the points where all presented models agree: you cannot reach a
completely entropic state until you reach infinite lengths for the input.
If the input is purely entropic there is some grey area surrounding this,
but for a hash function to be secure the grey area disappears.

> > Two scenarios where SHA is contraindicated came up:
> > 1. hardware
> > 2. interrupt handlers
> >
> > Sometimes a hardware RNG will feed raw data into a crypto-WEAK
> > analogue of SHA, ie a LFSR which does the bit-reduction and mixing
> > functions, but doesn't mix as well as SHA.  (LFSRs are hardware-friendly.)
> > Separating the bit-reduction from the mixing can be useful for
> > analyzing what's really going on.

> Well, I tend to agree that systems that separate the bit-reduction
> from the mixing are easier to analyze, in the sense that it is
> easier to find flaws in them.  But that's because they're so flawed!

Re: building a true RNG (was: Quantum Computing ...)

2002-07-26 Thread Enzo Michelangeli

- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, July 23, 2002 1:59 PM
Subject: Re: building a true RNG (was: Quantum Computing ...)


> You cannot measure entropy retrospectively.  You need to have a
> theory as to where the entropy is coming from, in order to
> reliably measure it.
>
> Thus hardware sources should be based on simple and well
> understood physical principles, such as Johnson noise or shot
> noise.
>
> Entropy is not quite a physical quantity -- rather it is on the
> slippery edge between being a physical thing and a philosophical
> thing. If you are not careful, you will slip into a deep epistemic
> bog and find yourself needing to ask how do we know what is
> knowable, and what is the whichness of why?
>
> To avoid such deep waters, know where your entropy is coming from.

Actually, the aura of mystery that surrounds entropy can be cleared if you
think of it as the amount of information describing the state of a system
that you do NOT know, because the output of the system only allows you to
inspect part of its state (or nothing at all).  For a perfect PRNG, which
doesn't leak any incremental information about the internal state, the
entropy equals the number of independent bits of its state (which is why I
consider the depletion of the entropy pool a non-issue: if no information
about the state is disclosed through the output stream, the entropy of the
generator CANNOT be decreased).  Macroscopic thermodynamic systems contain
much larger amounts of entropy, in the region of 10^23 bits, as Avogadro's
number comes into play.

Whereas, in theory, even a perfect PRNG can be reverse engineered (e.g.,
hooking a debugger to its software and/or hardware), for a true RNG this is
not possible, either because the number of states is just too large (thermal
noise, see above) or because of quantum reasons (no hidden variables at
all to dig out: for example, to the best of our knowledge there is simply no
way of knowing exactly when a radioactive nucleus will decay). Otherwise,
their black box functionality is essentially the same - by definition.

Estimating an _upper bound_ on entropy by simply observing the output of
a black box is possible, but under some conditions: you have to assume that
the system is ergodic, i.e. that the statistics can be inferred from
time averages (or, equivalently, that the system never gets locked into
sequences of some subset of states).  And even then, what you have is just
an estimate: you could have a sequence of 1000 consecutive zeroes just by
chance (if you are VERY, VERY unlucky, that is).
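
A sketch of such an estimate: the plug-in Shannon estimate over byte
frequencies.  It captures only zeroth-order statistics, so the correlations
it ignores can make the true entropy rate lower, never higher:

# Plug-in (empirical) Shannon entropy of a byte stream, in bits/byte.
import math
from collections import Counter

def entropy_estimate(data: bytes) -> float:
    counts = Counter(data)
    n = len(data)
    return -sum(c/n * math.log2(c/n) for c in counts.values())

print(entropy_estimate(bytes(range(256)) * 10))   # 8.0 bits/byte (flat)
print(entropy_estimate(b'\x00\x01' * 1280))       # 1.0 (two equal symbols)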

Enzo










Re: building a true RNG

2002-07-25 Thread John S. Denker

David Honig helped focus the discussion by advocating the
block diagram:

> Source --> Digitizer --> Simple hash --> Whitener (e.g., DES)

Let me slightly generalize this to:
! Source --> Digitizer --> hash --> Whitener (e.g., DES)

i.e. we defer the question of whether the hash is simple or not.

I continue to claim that
 a) if the hash function happens to have a property I call no 
wasted entropy then the whitening stage is superfluous (and
you may decide to classify the hash as non-simple);  otherwise
 b) if the hash function does not have that property, this
is a defective Random Symbol Generator and 
  b1) the whitener will _at best_ conceal, not remove the 
  defects, and
  b2) this is not the best way to conceal defects.  Very
  definitely not.

To illustrate my point, I will accept David's example of a
simple-hash function;  he wrote:
> Parity is the ultimate hash.

Well, then, suppose that the raw data coming off my digitizer
consists of an endless sequences of even-parity words.  The
words have lots of variability, lots of entropy, but the parity
is always even.  Then the output of the simple-hash is an endless 
sequence of zeros.  I encrypt this with DES.  Maybe triple-DES.  
It's not going to help.  The generator is defective and doesn't 
even have satisfactory error-concealment.
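
The failure is easy to demonstrate; a toy whitener stands in for DES
below, since any fixed transformation of a constant stream is still
constant, so the substitution doesn't change the point:

# Parity 'hash' of even-parity words: constant output, whitener or not.
words = [0b0011, 0b0101, 0b1111, 0b0110]     # varied data, all even parity

def parity(w: int) -> int:
    return bin(w).count('1') & 1

hashed = [parity(w) for w in words]
print(hashed)                                # [0, 0, 0, 0] forever

# a fixed deterministic 'whitener' maps a constant to a constant:
print([(h * 0x9E3779B9 + 0xBEEF) & 0xFFFF for h in hashed])  # all identical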

I like my design a lot better:

+ Source --> Digitizer --> good hash

where I have chosen SHA-1 as my hash function.

Finally, since SHA-1 is remarkably computationally efficient,
I don't understand the motivation to look for simpler hash
functions, especially if they are believed to require whitening
or other post-processing.

=

Thanks again for the questions.  This is a good discussion.




Re: building a true RNG (was: Quantum Computing ...)

2002-07-24 Thread David Honig

At 08:39 AM 7/23/02 +0200, Eugen Leitl wrote:
> On Mon, 22 Jul 2002, David Honig wrote:
>
> > Yes, it is a joke.  However, it is also a viable if low-bandwidth
> > entropy source.  I disagree that you need to be able to model
>
> I've got a framegrabber with a 640x480 24 bit/pixel camera. It doesn't
> compress, is rather noisy, and since self-adjusting I get the maximum
> entropy at maximum darkness.
>
> Is there any point in compressing the video before running it through a
> cryptohash? How does e.g. SHA-1 fare with very sparse bitvectors?

Good question.  I'd say the answer depends on the computational requirements
of compression vs. SHA, assuming you're trying to save CPU cycles.

You can measure the entropy of your camera staring at a wall and use that
to estimate how much you need to digest.  Then add a generous engineering
margin, of course.


Here's the type of experiment I've done, which separates the
bit-digestion from the crypto-hashing: 

Once you measure entropy in your raw data, you can put the data sample through
various distillers.  For instance, xor pairs of bits, reducing the
sample size by 2.  Measure the entropy of this.  Keep doing this
until your entropy-measure says you've maxxed out.
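
A sketch of that folding experiment (the zeroth-order frequency estimate
stands in, illustratively, for whatever entropy measure one actually uses):

# Iterative XOR-folding distiller: fold until the estimated entropy
# per symbol stops improving.  Illustrative sketch.
import math, random
from collections import Counter

def est_entropy(bits):                    # bits/symbol, 0th-order estimate
    c, n = Counter(bits), len(bits)
    return -sum(v/n * math.log2(v/n) for v in c.values())

def fold(bits):                           # xor adjacent pairs: halve size
    return [a ^ b for a, b in zip(bits[::2], bits[1::2])]

bits = [1 if random.random() < 0.7 else 0 for _ in range(1 << 16)]  # biased
while True:
    folded = fold(bits)
    if est_entropy(folded) <= est_entropy(bits) + 1e-6:
        break                             # maxed out; stop folding
    bits = folded
print(len(bits), est_entropy(bits))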

Then, to truly mess with Mr. Adversary, put it through a crypto function
(which here needn't be a digest, e.g., a block cipher).

The latter step achieves the 'mixing' (avalanche) which is built into 
SHA.  For a software implementation SHA seems like a good choice, though
you don't get to measure the entropy distillation before whitening.










Re: building a true RNG (was: Quantum Computing ...)

2002-07-24 Thread David Honig

At 10:59 PM 7/22/02 -0700, [EMAIL PROTECTED] wrote:

> Entropy is not quite a physical quantity -- rather it is on the
> slippery edge between being a physical thing and a philosophical
> thing. If you are not careful, you will slip into a deep epistemic
> bog and find yourself needing to ask how do we know what is
> knowable, and what is the whichness of why?
>
> To avoid such deep waters, know where your entropy is coming from.

We agree on your substantive points re RNGs, I think, but you're
interestingly wrong here.  Entropy is a physical quantity;
it even figures into chemistry.  The physics-of-computation people
(Bennett? Landauer? etc.) have written about thermodynamics & information.
Modulo Chaitin-type mindgames about measuring it :-)
Anyway we're cryptographers, not philosophers, so we should be safe...

Four-wheeling through the epistemological bog with Shannon as copilot







Re: building a true RNG (was: Quantum Computing ...)

2002-07-24 Thread David Wagner

Eugen Leitl wrote:
> Is there any point in compressing the video before running it through a
> cryptohash?

No.  (assuming you're talking about lossless compression)

In general, any invertible transformation neither adds nor subtracts
entropy, and hence is extremely unlikely to make any difference to the
performance of the hash function (assuming SHA-1 is cryptographically
secure, which it is currently believed to be).  Lossless compression
is just one special kind of invertible transformation.

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG (was: Quantum Computing ...)

2002-07-24 Thread Joseph Ashwood

- Original Message -
From: Eugen Leitl [EMAIL PROTECTED]
Subject: Re: building a true RNG (was: Quantum Computing ...)


 I've got a framegrabber with a 640x480 24 bit/pixel camera. It doesn't
 compress, is rather noisy, and since self-adjusting I get the maximum
 entropy at maximum darkness.

 Is there any point in compressing the video before running it through a
 cryptohash?

It will not serve a cryptographic use; however, if you can find a fast enough
truly lossless compressor it can be very useful. Since I assume you'll be
taking a picture purely in the dark, a usable compressor would be (sketched
in Python):

import hashlib

pool = hashlib.sha1()   # running entropy pool

def on_pixel(pixel, x, y):
    # fold in only the pixels that carry noise
    if pixel != (0, 0, 0):
        pool.update(bytes(pixel) + x.to_bytes(2, 'big') + y.to_bytes(2, 'big'))

def get_random():
    # fork the pool state so the pool itself keeps accumulating
    pool.update(b'\x01')
    return pool.copy().digest()

 How does e.g. SHA-1 fare with very sparse bitvectors?

It is believed to fare quite well, and considering that the bar for proper
entropy distillation is actually well below the bar for cryptographic
security, SHA-1 is likely to remain very good for this purpose. If you are
more concerned about speed than maximum entropy containment (or require less
than 128 bits of entropy) you might also consider MD5. If you are extremely
concerned about this (and are willing to lose a few other desirable
behaviors) it is possible to use a block cipher, basically in CBC mode, to
accumulate entropy; this has the advantage that under some reduced
assumptions it is possible to compute the maximal entropy of the state at a
given time, as sketched below.
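For instance, a minimal sketch of that block-cipher accumulator (assuming
the pycryptodome package; the fixed public key and 16-byte blocking are
illustrative choices, not part of the original proposal):

from Crypto.Cipher import AES

KEY = b'\x00' * 16                    # fixed, public key: the secrecy
cipher = AES.new(KEY, AES.MODE_ECB)   # comes from the entropy, not the key

state = b'\x00' * 16

def add_sample(block16):
    # one CBC-style step: state <- E_K(state XOR sample), 16-byte blocks
    global state
    x = bytes(a ^ b for a, b in zip(state, block16))
    state = cipher.encrypt(x)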
Joe



-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG

2002-07-24 Thread Paul Crowley

John S. Denker [EMAIL PROTECTED] writes:

  Is there any point in compressing the video before running it through a
  cryptohash? 
 
 There might be a minor point, namely computational efficiency.

I can't believe any compression software could be as fast as just
feeding the signal straight into SHA-1.
-- 
  __  Paul Crowley
\/ o\ [EMAIL PROTECTED] http://www.ciphergoth.org/
/\__/ BiCon 2002 UK bisexual gathering: http://www.2002.bicon.org.uk/

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG

2002-07-24 Thread Eugen Leitl

On 24 Jul 2002, Paul Crowley wrote:

 I can't believe any compression software could be as fast as just
 feeding the signal straight into SHA-1.

I haven't tried this, but assuming I'm digitizing dark video and only get
noise in the least significant bits, I can just mask out the constant
(zero) ones and paste the rest together to distill the entropy at very low
computational cost before feeding it into a cryptohash.
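Something like this, say (a sketch; keep_bits = 2 is an assumption about
how many low-order bits are actually noisy):

def pack_lsbs(samples, keep_bits=2):
    # keep only the noisy low-order bits of each sample and
    # concatenate them into a compact bytestring
    out, acc, nbits = bytearray(), 0, 0
    mask = (1 << keep_bits) - 1
    for s in samples:
        acc = (acc << keep_bits) | (s & mask)
        nbits += keep_bits
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
    return bytes(out)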

As an aside on what constitutes the physical entropy of a system: it does
indeed depend on the context of the measurement. A good source of
information on entropy in all contexts is

http://www.math.psu.edu/gunesch/entropy.html


-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG (was: Quantum Computing ...)

2002-07-23 Thread jamesd

--
On 22 Jul 2002 at 15:39, David Honig wrote:
 You should be able to use any source which you know is not a   
 PRNG as the entropy-source in a true RNG.  You should be able to 
 use entropy (and stat tests) to measure the source entropy after 
 digitization.

You cannot measure entropy retrospectively.  You need to have a   
theory as to where the entropy is coming from, in order to   
reliably measure it.

Thus hardware sources should be based on simple and well   
understood physical principles, such as Johnson noise or shot   
noise. 

Entropy is not quite a physical quantity -- rather it is on the  
slippery edge between being a physical thing and a philosophical  
thing. If you are not careful, you will slip into a deep epistemic 
bog and find yourself needing to ask how do we know what is 
knowable, and what is the whichness of why?

To avoid such deep waters, know where your entropy is coming from. 

--digsig
 James A. Donald
 6YeGpsZR+nOTh/cGwvITnSR3TdzclVpR0+pr3YYQdkG
 SMGOwg3qIP0/FsfmA7GzZGN/XYAabuqcE9Z9eiuB
 2CBUwRUngy0VcmaR93NvqduyZBKgppbTUy49tSdEn


-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG (was: Quantum Computing ...)

2002-07-23 Thread Eugen Leitl

On Mon, 22 Jul 2002, David Honig wrote:

 Yes, it is a joke.  However, it is also a viable if low-bandwidth
 entropy source.  I disagree that you need to be able to model

I've got a framegrabber with a 640x480 24 bit/pixel camera. It doesn't 
compress, is rather noisy, and since self-adjusting I get the maximum 
entropy at maximum darkness.

Is there any point in compressing the video before running it through a 
cryptohash? How does e.g. SHA-1 fare with very sparse bitvectors?


-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG (was: Quantum Computing ...)

2002-07-23 Thread Derek Atkins

John S. Denker [EMAIL PROTECTED] writes:

  Source -- Digitizer -- Simple hash -- Whitener (e.g., DES)
 
 OK, we have DES as an example of a whitener.  
 -- Can somebody give me an example of a simple hash 
 that performs irreversible compression of the required
 kind?

I can give you a number of examples:  MD5, SHA-1, 

 -- Isn't the anti-collision property required of even
 the simplest hash?  Isn't that tantamount to a very
 strong mixing property?  If there's strong mixing in
 the simple hash function, why do we need more mixing
 in the later whitening step?

More mixing is never bad in an RNG.  See RFC 1750.

 -- What is meant by cryptologic strength?  Strength
 against what kind of attack?  If this means in particular
 the one-way property, why do I need it?  I can understand
 why a !!pseudo!! random symbol generator needs the one-way
 property, to protect its internal state, but since my
 generator has no secret state to protect, why do I need
 any cryptologic properties other than mixing?

I think they probably meant cryptographic strength, but I
don't know what was going through their minds.  What
do people mean by authentification?  That's not even
a real word, but I see it all the time.  To me, I think
people just don't know the right term to use, so they
just put down something that sounds right to them, regardless
of its correctness.

-derek

-- 
   Derek Atkins
   Computer and Internet Security Consultant
   [EMAIL PROTECTED] www.ihtfp.com

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG

2002-07-23 Thread John S. Denker

Eugen Leitl wrote:
 
 ... framegrabber with a 640x480 24 bit/pixel camera. It doesn't
 compress, is rather noisy, and since self-adjusting I get the maximum
 entropy at maximum darkness.

OK.  Evidently it's dominated by thermal noise, not to
be confused with the Poisson noise recently featured
in another thread.  Not a problem.

 Is there any point in compressing the video before running it through a
 cryptohash? 

There might be a minor point, namely computational efficiency.
A well-chosen compressor might eliminate low-entropy bytes
rather quickly.  Make sure it's a lossless compressor, perhaps
GIF or PNG ... as opposed to a perceptual coder (e.g. JPEG) 
that would presumably throw away some of the entropy.  Calling 
SHA-1 on low-entropy bytes doesn't waste entropy, but wastes CPU
cycles.

 How does e.g. SHA-1 fare with very sparse bitvectors?

1) In any good hash function, any input bit should have
about as much effect on the output as any other input bit.
SHA-1 has been analyzed by experts (of which I am not one :-)
and I would imagine they checked this.

2) There are 5 one-bit shifts in the fivefold expansion, and
lots of 5-bit shifts in the main loop, so it shouldn't matter
that the sparse input bits are clustered in the bottom of the
32-bit words.

3) I performed an amateur kick-the-tires test, namely cobbling
up some sparse input vectors, calling SHA-1, and applying
standard statistical tests including Diehard and Maurer's
universal statistical test.  To nobody's surprise, the tests 
didn't detect anything.
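The flavor of that test, in Python (a toy monobit check only, nothing like
the full Diehard suite):

import hashlib
import random

random.seed(42)

def sparse_vector(nbytes=64, set_bits=3):
    # mostly-zero input with a few bits set at random positions
    v = bytearray(nbytes)
    for i in random.sample(range(nbytes * 8), set_bits):
        v[i // 8] |= 1 << (i % 8)
    return bytes(v)

ones = total = 0
for _ in range(10000):
    d = hashlib.sha1(sparse_vector()).digest()
    ones += sum(bin(b).count('1') for b in d)
    total += 8 * len(d)
print(ones / total)   # well-mixed output should hover near 0.5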


Arnold Reinhold wrote:
 
 ... with a portable TV set and a video digitizer 
 should be a good source of high bandwidth noise. In both cases you 
 are just using the receivers as high gain amplifiers of the thermal 
 noise at the antenna terminals.

Thermal noise is good.  Antennas are bad -- just an invitation
to be attacked that way.  Get rid of the antenna.  Keep the high
gain preamp.

Better yet, do as Eugen has done:  Use a framegrabber !!without!! 
the portable TV set.  No RF section at all.  Plenty of entropy,
lower cost, greater simplicity, and less vulnerability to attack.

For that matter, an audio card (without microphone) produces more
than enough entropy for most applications.

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG (was: Quantum Computing ...)

2002-07-23 Thread bear



On Mon, 22 Jul 2002, John S. Denker wrote:

David Honig wrote yet another nice note:

I'm not trying to be dense, but I'm totally not
understanding the distinction here.  The following
block diagram is excellent for focussing the discussion,
(thanks):

 Source -- Digitizer -- Simple hash -- Whitener (e.g., DES)

OK, we have DES as an example of a whitener.
-- Can somebody give me an example of a simple hash
that performs irreversible compression of the required
kind?

Depends on the data and how much entropy you suppose it
has, really.  An irreversible compression function that
I use when extracting entropy from text (for other
purposes) is to have a counter.  Each time you process
a character, you add the character code to the counter,
then multiply the counter by 2.4 rounding down.  This is
based on estimates of 1.33 bits of entropy per character
in English text, and requires an initialization vector
(in this case an initialization value) twice as long as
the character code to prevent you from taking too many
bits from the first few characters alone.
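In code, that extractor looks something like this (a sketch; the 16-bit
initialization value and 8-bit character codes are assumptions filled in
for illustration):

import math

def distill_text(text, factor=2.4, iv=0xA5C3):
    counter = iv            # twice the width of an 8-bit character code
    for ch in text:
        # add the character code, then multiply by 2.4 rounding down
        counter = math.floor((counter + ord(ch)) * factor)
    return counter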

For something like a lava-lamp picture, your compression
function might be first converting it into a 4-color image,
editing out the constant parts (eg, the lamp base and edges),
compressing that using PNG format, and then taking some
similarly counter-based function of those bits. Using a
time series of pictures of the same lava-lamp, you'd have
to adjust for lower entropy per byte of processed PNG (by
using a lower factor), because it could be redundant with
other frames.

-- Isn't the anti-collision property required of even
the simplest hash?  Isn't that tantamount to a very
strong mixing property?  If there's strong mixing in
the simple hash function, why do we need more mixing
in the later whitening step?

You are talking, specifically, about cryptographic hash
functions.  The diagram specifies a simple hash function.
The distinction between cryptographic hashes and simple
hashes is, a simple hash is supposed to produce evenly
distributed output.  A cryptographic hash is supposed to
produce evenly distributed *and unpredictable* output.
A simple hash, plus a whitener, is about what you're
thinking of for a cryptographic hash function.

I assume digestion means the same as distillation?

Roughly.  People talk of digestion of a datastream, or
distillation of entropy, or irreversible compression,
etc.  It's roughly the same thing.

Gimme a break.  In particular, gimme an example of a crypto
algorithm that will fail if it is fed with a random-symbol
generator that has only 159.98 bits in a 160 bit word.

That's one bit per 8k. I guess it just depends on which
8k comes through and how much your opponent can make of
one bit.



 I see no point in whitening the output of such a
 distiller.

 So the adversary can't look back into your logic.  A 'distiller'
 which produces quality entropy (after digesting an arbitrary
 number of bits) needn't be as opaque as a crypto-secure hash is.

I'm still needing an example of a distiller that has
the weakness being alleged here.  In particular,
 -- either it wastes entropy (due to excessive hash collisions)
in which case it isn't a good distiller, and whitening it won't
improve things (won't recover the lost entropy), or
 -- it doesn't waste entropy, in which case the output has entropy
density of 159.98/160, in which case there is nothing to be gained
by so-called whitening or any other post-processing.

I think you may be right about that -- whitening protects you
from errors in an overly-simple distiller such as I described
above, but if you've got a really fine-tuned one, it doesn't
help much.


In particular, (proof by contradiction) consider the following
scenario:  suppose she captures 100 bits of output, and wants
to use it to make some predictions about the next 60 bits of
output.  She uses the 100 bits to see back into the
hypothetical simple-hash function, learn something about the
input thereof, and then pushes that forward again through the
simple-hash function to make the predictions.  But this scenario
violates the most basic requirements of the hash function, even
the simplest of simple-hash functions.

Again, it violates the requirements of a cryptographic hash
function, not a simple hash function.


Bear


-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG (was: Quantum Computing ...)

2002-07-23 Thread Jack Lloyd

On Tue, 23 Jul 2002, John S. Denker wrote:

  -- I am told (but don't understand) that there might exist
 a weaker hash that somehow does require whitening.  This
 is the point of the conversation.  Please address this
 point if you can.

Perhaps they were referring to something like what is done in the
/dev/random driver, where inputs are mixed in using a simple polynomial
scheme whose exact details (or name) escape me at the moment. This is
basically because it's called during interrupts, and you might not want to
be calling out to something expensive like SHA-1 right then. Then when
someone reads from the device, the output is derived from the internal pool
using SHA-1.
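Something in this spirit (emphatically not the actual driver code -- just
a sketch of a cheap, interrupt-safe LFSR/CRC-style mix, with taps chosen
arbitrarily for illustration):

POOL_WORDS = 32
pool = [0] * POOL_WORDS
_pos = 0

def mix_in(word):
    # cheap mixing step: xor the new 32-bit word with a few taps,
    # rotate, and store; expensive hashing is deferred to read time
    global _pos
    _pos = (_pos - 1) % POOL_WORDS
    w = (word ^ pool[_pos] ^ pool[(_pos + 7) % POOL_WORDS]
              ^ pool[(_pos + 13) % POOL_WORDS]) & 0xFFFFFFFF
    pool[_pos] = ((w << 3) | (w >> 29)) & 0xFFFFFFFF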

Regards,
 Jack


-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]



Re: building a true RNG (was: Quantum Computing ...)

2002-07-22 Thread John S. Denker

David Honig wrote:

 The thread here has split into QM & True Randomness and
 what do you need to build a true RNG...

Yup.

 Specifically:  The executive summary of the
 principles of operation of my generator is:
  -- use SHA-1, which is believed to be resistant
 to collisions, even under chosen-input attack.
  -- use it under conditions where the adversary
 cannot choose the input.
  -- the rest is just physics and statistics.
 
 Sure.  There are many examples of this kind of generator,
 using physical sources from video'd lava lamps to radioactive decay
 (incl. semiconductor junctions, resistors, microphone,
 detuned FM radio cards).  

For the humor-impaired, let me point out that the lava 
lamp is a joke.  What it conspicuously lacks is a proof 
of correctness -- that is, a nonzero lower bound on the 
entropy rate of the raw data.  The lava could turn out to 
have a not-very-complicated periodic pattern.  Secondarily, 
the pattern changes so slowly that there must be rather strict 
upper bounds on the entropy rate, small out of all proportion 
to the cost of the contraption.

A detuned FM card is a bad idea, because it is just
begging the opponent to sit next door with an FM
transmitter.

A microphone causes users to worry about privacy, and
in any case doesn't add much beyond what you'd get with
the same input circuitry open-circuited, i.e. everything
except the microphone itself.

Radioactive decay has a poor price/performance ratio, and
isn't nearly as random as neophytes might think, when the
data-acquisition hardware is taken into account.

 2) Vetting a generator by trying to detect patterns
 in the output is like kicking the tires on a used car
 ... go ahead and do it if you want, but it is far from
 sufficient for establishing any reasonable standard of
 correctness.
 
 You can't vet a RNG by looking at its output, 

We agree.

 which is likely whitened anyway, 

Depending on what whitening means;  see below.

 but you can gain confidence by looking at its design 

Yes!

 and measuring the entropy in the raw-physical-source derived bitstream.

That's the point where I would like some more detail.
If measuring means applying statistical tests, then
I've never seen such measurements done in a way that is
really convincing.  Constructive examples would be welcome.

Just saying Joe Schmoe applied all the tests he could 
think of and couldn't compress it more than XY% isn't
going to convince me.

I recommend _calculating_ the entropy from physics principles,
rather than trying to measure the entropy using statistical
tests.  The calculation is based on a handful of macroscopic
physical parameters, such as temperature, gain, and bandwidth.
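For example, a back-of-envelope Johnson-noise calculation (the resistance,
bandwidth, gain, and ADC figures below are illustrative assumptions, and
the final log2 is only a cartoon of the real bound):

import math

k = 1.380649e-23     # Boltzmann constant, J/K
T = 300.0            # temperature, K
R = 50.0             # source resistance, ohms
B = 20e3             # bandwidth, Hz
gain_db = 60.0       # preamp gain

v_rms = math.sqrt(4 * k * T * R * B)    # Johnson noise, ~0.13 uV here
v_out = v_rms * 10 ** (gain_db / 20)    # after the preamp, ~0.13 mV

q = 1.0 / 2 ** 16                       # 16-bit ADC step, 1 V full scale
print(math.log2(v_out / q))             # rough bits per sample, ~3 here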

 If the raw source has < 1 bit/symbol (and it will), 

Commonly but not necessarily < 1 bit/symbol.
Depending on what you mean by symbol, a 24-bit 
audio card provides a low-cost counterexample.

 it'd be nice if a later stage
 distilled this to near 1 bit/symbol, before whitening.  

We need to be more specific about what the symbol
alphabet is.  If the symbols are ASCII characters,
1 bit per symbol is not nearly good enough.

More importantly, I don't know what whitening means in 
this case.

The output of a good distiller has virtually 100% entropy 
density, i.e. 8 bits per byte.  I say virtually because
perfection is impossible, but 159.98 bits in a 160 bit
word ought to be good enough for most applications :-).

I see no point in whitening the output of such a
distiller.

If whitening means encrypting the output of the distiller,
I consider that just a more-complicated hash function ...
just another few rounds.

 Of course, no one
 outside the box will know, since you're whitening, but it yields resistance
 to (albeit difficult) attacks (e.g., your hash turns out to be attackable).

I assume that means know [that I'm using a distiller]

Well, in principle nobody outside the box knows _anything_
about the mechanism (unless they read the documentation).
One random symbol stream looks a lot like another :-).  
Attackers can always check to see whether the generator 
is broken or not.  But if it's not broken, all they can 
do (from outside) is measure the output-rate.

 I also fail to see harm in measuring/monitoring entropy 
 as the RNG operates.

Certainly there's no harm.  It's like kicking the tires
on the used car.  It gives some people a warm fuzzy, but 
it's far from sufficient for establishing any reasonable 
level of confidence.

I recommend monitoring the aforementioned macroscopic
(non-statistical) physical parameters, both to detect
gross hardware failure and to detect attempted jamming.
But that's very different from the traditional (and
hereby deprecated) procedure of measuring the entropy
using statistical tests.

For lots more detail, see
  http://www.monmouth.com/~jsd/turbid/

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]

Re: building a true RNG (was: Quantum Computing ...)

2002-07-22 Thread David Honig

At 04:24 PM 7/22/02 -0400, John S. Denker wrote:

For the humor-impaired, let me point out that the lava 
lamp is a joke.  What it conspicuously lacks is a proof 
of correctness -- that is, a nonzero lower bound on the 
entropy rate of the raw data.  

Yes, it is a joke.  However, it is also a viable if low-bandwidth
entropy source.  I disagree that you need to be able to model
the lava (etc) well enough to give a lower bound on the entropy.
(Though its nice if you have such a model).  You should be able
to use any source which you know is not a PRNG as the entropy-source
in a true RNG.  You should be able to use entropy (and stat tests)
to measure the source entropy after digitization.

The lava could turn out to 
have a not-very-complicated periodic pattern.  Secondarily, 
the pattern changes so slowly that there must be rather strict 
upper bounds on the entropy rate, small out of all proportion 
to the cost of the contraption.

Yes, thus the humor.


A detuned FM card is a bad idea, because it is just
begging the opponent to sit next door with an FM
transmitter.

So work in a Faraday cage...


A microphone causes users to worry about privacy, and
in any case doesn't add much beyond what you'd get with
the same input circuitry open-circuited, i.e. everything
except the microphone itself.

The advantage of a microphone, I suppose, is that you can
hear any jamming attempts.


Radioactive decay has a poor price/performance ratio, and
isn't nearly as random as neophytes might think, when the
data-acquisition hardware is taken into account.

It's a joke/hack of the 'finesse' form.


 which is likely whitened anyway, 

Depending on what whitening means;  see below.

You can imagine simple-hashing (irreversible compression) 
as distinct from whitening, which is
related to cryptographic strength, avalanche, mixing, etc.

Source -- Digitizer -- Simple hash -- Whitener (e.g., DES)

In this view, SHA combines the compression (aka digestion) 
function with the crypto-strength whitening.
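Wiring the stages up concretely (a sketch only: xor-folding stands in for
the simple hash, and AES via pycryptodome stands in for the DES whitener;
both substitutions are assumptions):

from Crypto.Cipher import AES

whiten = AES.new(b'\x00' * 16, AES.MODE_ECB).encrypt   # fixed public key

def simple_fold(block):
    # irreversible compression: xor the two halves (64 bytes -> 32)
    return bytes(a ^ b for a, b in zip(block[:32], block[32:]))

def distill(raw):
    out = b''
    for i in range(0, len(raw) - 63, 64):
        folded = simple_fold(raw[i:i + 64])
        out += whiten(folded[:16]) + whiten(folded[16:])
    return out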


That's the point where I would like some more detail.
If measuring means applying statistical tests, then
I've never seen such measurements done in a way that is
really convincing.  Constructive examples would be welcome.

You collect some representative raw data, and run a number of 
entropy measurements on that sample.  You find < 1 bit/baud.
You run the data through an algorithm which produces fewer bits.
You measure the entropy of the result.  When successive (or
'stronger') runs measure at 1b/bd, you have distilled
entropy from that sample.  To use this in crypto, you'd
put it through a whitener --feed blocks to DES-- for
belts-and-suspenders assurance.  And because you don't
want someone looking through your simple-hashing logic
back to your source.

Once you put it through DES, anything (e.g., the integers) appears random.
That's why you measure before whitening, if possible.  If
you use SHA you can only measure the input entropy and make
sure you hash enough bits down to produce your output stream,
since utter crap coming in will look perfectly white coming out.

I recommend _calculating_ the entropy from physics principles,
rather than trying to measure the entropy using statistical
tests.  The calculation is based on a handful of macroscopic
physical parameters, such as temperature, gain, and bandwidth.

You measure because your model may have overlooked something.


The output of a good distiller has virtually 100% entropy 
density, i.e. 8 bits per byte.  I say virtually because
perfection is impossible, but 159.98 bits in a 160 bit
word ought to be good enough for most applications :-).

I agree with the first statement (by definition); I think in crypto you have
to be dubious (paranoid?) about the second.

I see no point in whitening the output of such a
distiller.

So the adversary can't look back into your logic.  A 'distiller'
which produces quality entropy (after digesting an arbitrary
number of bits) needn't be as opaque as a crypto-secure hash is.


If whitening means encrypting the output of the distiller,
I consider that just a more-complicated hash function ...
just another few rounds.

That's a fine implementation.  Others may wish to separate
the phases (e.g., hardware to do simple-hashing can be high-speed
on the entropy-source-side, and feed slower crypto-hashing (whitening) 
logic on the other side).  But using software to distill your entropy, it's
easy enough to run a few more iters.  And there are very few users of
high-bandwidth entropy.



 Of course, no one
 outside the box will know, since you're whitening, but it yields resistance
 to (albeit difficult) attacks (e.g., your hash turns out to be attackable).

I assume that means know [that I'm using a distiller]

No one outside the box can tell the difference between simply
hashed 'noise' and crypto-hashed 'noise'.  Or between those
and a long-sequence PRNG.

Now, if the adversary knows things about your simple-distiller, he
may be able to use that.  If a 

Re: building a true RNG (was: Quantum Computing ...)

2002-07-22 Thread John S. Denker

David Honig wrote yet another nice note:
 
 So work in a Faraday cage...

Tee, hee.  Have you ever worked in a Faraday cage?
Very expensive.  Very inconvenient.

 Depending on what whitening means;  see below.
 
 You can imagine simple-hashing (irreversible compression)
 as distinct from whitening which is
 related to cryptographic strength, avalanche, mixing, etc.

I'm not trying to be dense, but I'm totally not 
understanding the distinction here.  The following
block diagram is excellent for focussing the discussion,
(thanks):

 Source -- Digitizer -- Simple hash -- Whitener (e.g., DES)

OK, we have DES as an example of a whitener.  
-- Can somebody give me an example of a simple hash 
that performs irreversible compression of the required
kind?
-- Isn't the anti-collision property required of even
the simplest hash?  Isn't that tantamount to a very
strong mixing property?  If there's strong mixing in
the simple hash function, why do we need more mixing
in the later whitening step?
-- What is meant by cryptologic strength?  Strength
against what kind of attack?  If this means in particular
the one-way property, why do I need it?  I can understand
why a !!pseudo!! random symbol generator needs the one-way
property, to protect its internal state, but since my
generator has no secret state to protect, why do I need
any cryptologic properties other than mixing?
-- BTW note that the term avalanche is probably not
helpful to this discussion, because it is usually defined
(see e.g. Handbook of Applied Cryptography) in terms of
single-bit flips.  The anti-collision property of the hash
function demands resistance to multi-bit flips.

 In this view, SHA combines the compression (aka digestion)
 function with the crypto-strength whitening

If you want to think of it that way, then we come
to the same final state.

I assume digestion means the same as distillation?

 You collect some representative raw data, and run a number of
 entropy measurements on that sample.  You find < 1 bit/baud.

In particular you have found an upper bound.
To paraphrase Dijkstra:  testing can find an upper bound
on the entropy density, but it can never find a lower bound.

 You run the data through an algorithm which produces fewer bits.
 You measure the entropy of the result.  When successive (or
 'stronger') runs measure at 1b/bd, you have distilled
 entropy from that sample.  

No, you have just reached the limits of your chosen number
of entropy measurements.  You cannot convince a skeptic that
a new test, discovered tomorrow, will not greatly lower your
new (1b/bd) upper bound.

 To use this in crypto, you'd
 put it through a whitener --feed blocks to DES-- for
 belts-and-suspenders assurance.  And because you don't
 want someone looking through your simple-hashing logic
 back to your source.

I say again that I don't need the one-way property.  At
each iteration, the input to the hash function is unrelated
to all past and future inputs.

We agree that the belt-and-suspenders approach is a standard 
way of achieving high reliability.  It works in crypto and in
many unrelated fields
  http://www.monmouth.com/~jsd/how/htm/misc.html#sec-layers-safety

If you want to XOR the output of the HRNG with some nice
PRNG, that will give excellent error-concealment in the
case of a gross failure of one or the other.
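A one-liner sketch of that combiner (both input streams hypothetical):

def combined(hrng_bytes, prng_bytes):
    # XOR combiner: the result is no weaker than the stronger input
    return bytes(h ^ p for h, p in zip(hrng_bytes, prng_bytes))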

But (minor point) I recommend we continue to call a belt a belt, 
and call a suspender a suspender.  Call a HRNG a HRNG, and call
a PRNG a PRNG.  Adding a strong one-way function to my HRNG seems
like a total waste of CPU cycles.

 Once you put it through DES, anything (eg the integers) appears random.
 That's why you measure before whitening, if possible.  

We agree that measuring after whitening is pointless,
given the current state of the art, namely that encryption
algorithms are incomparably stronger than automatic measurement
(pattern-finding) algorithms.

 I recommend _calculating_ the entropy from physics principles,
 rather than trying to measure the entropy using statistical
 tests.  The calculation is based on a handful of macroscopic
 physical parameters, such as temperature, gain, and bandwidth.
 
 You measure because your model may have overlooked something.

Measure what?  Measure how?  If I run diehard on my raw
data it will tell me that the data has entropy density
far less than 8 bits per byte -- but duh, I already knew
that.  Running standard compression algorithms (Lempel-Ziv)
or whatever will give me an upper bound that is much,
much, higher than my calculated lower bound -- so even
if I've overlooked something or made an error in the calculation
the measurement is not sensitive enough to catch it.

 The output of a good distiller has virtually 100% entropy
 density, i.e. 8 bits per byte.  I say virtually because
 perfection is impossible, but 159.98 bits in a 160 bit
 word ought to be good enough for most applications :-).
 
 I agree with the first statement (by definition), I think in crypto you have
 to be