Re: building a true RNG

2002-07-29 Thread Sampo Syreeni

On 2002-07-27, David Wagner uttered to [EMAIL PROTECTED]:

> In particular, there is no function f which doesn't waste entropy,
> unless |Y|=1.  The proof is simple enough that you can probably find it
> with a moment's contemplation, but let me know if you want to see it.

I would be led to the pigeonhole principle; still, this does seem like an
eminently teachable moment to me.

However, what we're working with in the case of a typical RNG isn't
functions between finite bufferfuls of data, but functions between
infinite sets of entire bitstreams which need to be implemented within a
finite memory constraint. Whatever the algorithm, it can have state. I
think that is the reasoning which leads people to believe the problem is
simple. In this framework one can very well gather statistics and try to
normalize them based on what one has seen so far. (The minimalist example
here is bias removal.)  After that, what goes into the hash is
(approximately) white, and even with minimal assumptions (i.e. the hash
is a one-way permutation) our RNG would seem to approach ideality.

True, such an approach will always be a race in modelling the
statistics. But still, I would think the generator has the advantage if
sufficient margin is left for error. (That is, hash some constant
multiple of the number of bits needed to contain the entropy you suspect
is there, based on the best model you've got.)
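
For concreteness, a minimal sketch of that bias-removal step, in the form
of von Neumann's classic trick (function name and framing are mine; it
assumes independent input bits with a fixed, unknown bias):

  def von_neumann_debias(bits):
      """bits: iterable of 0/1 values.  Yields unbiased bits, assuming
      the input bits are independent with a fixed (unknown) bias."""
      it = iter(bits)
      for a in it:
          b = next(it, None)
          if b is None:
              return
          if a != b:
              yield a  # pair 01 -> emit 0, pair 10 -> emit 1; 00/11 dropped

The price is throughput: most input pairs are discarded, which is exactly
the sort of margin for error mentioned above.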
-- 
Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2





Re: building a true RNG

2002-07-29 Thread Sampo Syreeni

On 2002-07-28, Sampo Syreeni uttered to David Wagner:

[Replying to my own mail. Sorry.]

> and discard every 1/(p(x)-1/256)'th sample with value x.

Actually the pedantic solution would be to put an arithmetic
compressor/coder between the input and output, using the best model we've
got. That still leaves model adaptation to be dealt with, but if we
discard a sufficient number of output bits at start (estimable from the
model), we *will* end up with (very nearly) flat statistics on the output.
Asymptotic optimality and all that... (The qualification comes from
limited precision arithmetic.)
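
A rejection-sampling reading of the quoted discard scheme, as a sketch
(names mine; the histogram is taken as known in advance, setting aside
the adaptation problem, and a PRNG coin stands in for the deterministic
every-k'th-sample discard, which has the same marginal rate):

  import random

  def flatten(samples, hist):
      """samples: iterable of byte values; hist[x]: model probability of x.
      Keep x with probability min(hist)/hist[x]; the kept samples are then
      uniform over the 256 values, provided the input really consists of
      i.i.d. draws from hist (see Wagner's caveat in the next message)."""
      p_min = min(hist)
      return [x for x in samples if random.random() < p_min / hist[x]]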
-- 
Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2





Re: building a true RNG

2002-07-29 Thread David Wagner

> An example: presume we take a simple first order statistical model. If our
> input is an 8-bit sample value from a noise source, we will build a 256
> bin histogram. When we see an input value, we look its probability up in
> the model, and discard every 1/(p(x)-1/256)'th sample with value x. When
> this happens, the sample is just eaten and nothing appears in the output;
> otherwise we copy.

I understand what you're trying to say, but this will not give a
general-purpose function that doesn't waste entropy regardless of the
input distribution.  This only works when the distribution on the input
stream consists of independent, memoryless samples from some distribution
on 8-bit values.




1024-bit RSA key safety still unknown

2002-07-29 Thread Russell Nelson

Dan Bernstein has a response to the June 2002
Lenstra-Shamir-Tomlinson-Tromer paper (and, similarly, to Bruce
Schneier's comments) about his research into the cost of circuits for
integer factorization.

http://cr.yp.to/nfscircuit.html

-- 
-russ nelson  http://russnelson.com |  New Internet Acronym:
Crynwr sells support for free software  | PGPok |
521 Pleasant Valley Rd. | +1 315 268 1925 voice | IANAE
Potsdam, NY 13676-3213  | +1 315 268 9201 FAX   |  I Am Not An Economist




Re: building a true RNG

2002-07-29 Thread David Wagner

Barney Wolff wrote:
> This leads me to ask what may be a laughably naive question:
> Do we even know that the popular hash functions can actually generate
> all 2^N values of their outputs?

It seems very unlikely that they can generate all 2^N outputs
(under current knowledge).  However, they satisfy the next-best
thing: their output appears to be indistinguishable from uniform to
computationally-bounded observers, hence it's as good as if they
could generate all 2^N outputs for most purposes.




Re: building a true RNG

2002-07-29 Thread Barney Wolff

Yes, but is there any lower bound known?  2^(N-1)?  Or worse?
Clearly a hash with N-bit output that can't generate all 2^N
values can never actually have N bits of entropy.  Depending on how
many forbidden values there are, this may or may not be significant.
Of course, if the forbidden values cannot be found, I'd agree with
the "as good as".

On Mon, Jul 29, 2002 at 03:39:32PM +, David Wagner wrote:
> > Do we even know that the popular hash functions can actually generate
> > all 2^N values of their outputs?
>
> It seems very unlikely that they can generate all 2^N outputs
> (under current knowledge).  However, they satisfy the next-best
> thing: their output appears to be indistinguishable from uniform to
> computationally-bounded observers, hence it's as good as if they
> could generate all 2^N outputs for most purposes.

-- 
Barney Wolff
I never met a computer I didn't like.




Re: building a true RNG

2002-07-29 Thread David Wagner

Oh dear.  On re-reading your message I now suspect that what you asked
is not what I originally thought you asked.  I see two questions here:
  Q1: If we cycle through all N-bit messages, are all
  2^N output values possible?
  Q2: If we cycle through all messages (possibly very long
  or very short), are all 2^N output values possible?
On first reading I thought you were asking Q1, but it now occurs to me
you were probably asking Q2.  I apologize profusely for any confusion
I may have caused.

Anyway, the answer to Q1 is likely to be No.  I'd guess that the answer
to Q2 is probably Yes, or close to it.

For the first scenario, if the hash behaves like a random oracle, then
each of the 2^N outputs is missed with probability (1 - 2^-N)^(2^N),
which is about 1/e, so one would expect only 2^N * (1 - 1/e) of the
possible outputs to be attainable.  Of course, the forbidden outputs are
not easy to find; the best way I know is by exhaustive search over all
2^N N-bit messages.
For the second scenario, if the hash behaves like a random oracle, then
one would expect all outputs to be attainable.  No significant deviation
from the random oracle model is known for SHA1 (well, with one exception
that doesn't appear to be relevant here).  That said, this is not a proof,
and my answers just represent my best guesses.
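
A quick Monte Carlo illustration of the 1 - 1/e figure, using a random
function on a toy domain in place of an N-bit hash (this only checks the
balls-in-bins reasoning, not anything about SHA1):

  import random

  def coverage_fraction(n_bits, trials=20):
      """Average fraction of outputs hit by a random function on n_bits."""
      size = 1 << n_bits
      total = 0.0
      for _ in range(trials):
          hit = {random.randrange(size) for _ in range(size)}
          total += len(hit) / size
      return total / trials

  print(coverage_fraction(12))  # prints ~0.632, i.e. about 1 - 1/e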




Re: building a true RNG

2002-07-29 Thread John S. Denker

Barney Wolff asked:
> Do we even know that the popular hash functions can actually generate
> all 2^N values of their outputs?

David Wagner replied:
> It seems very unlikely that they can generate all 2^N outputs
> (under current knowledge).

I was temporarily astonished, but he clarified as follows:

>   Q2: If we cycle through all messages (possibly very long
>   or very short), are all 2^N output values possible?
> ...
> I'd guess that the answer
> to Q2 is probably Yes, or close to it.

1) Consider the following function H0:  Divide the input into
chunks N bits long.  Calculate the XOR of all the chunks, and
use that as the output.

This meets the definition of a hash function, although it would
not be a one-way hash function.  And it would most certainly
be capable of generating all 2^N possible outputs.
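
A sketch of H0 with N = 32 (zero-padding the final short chunk is my
choice; the description above leaves it open):

  def h0(data: bytes, n_bytes: int = 4) -> bytes:
      """XOR together all n_bytes-long chunks of data (zero-padded tail)."""
      acc = bytearray(n_bytes)
      for i in range(0, len(data), n_bytes):
          chunk = data[i:i + n_bytes].ljust(n_bytes, b"\x00")
          for j, b in enumerate(chunk):
              acc[j] ^= b
      return bytes(acc)

Surjectivity is immediate: on any single-chunk input, h0 is the identity.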

2) I can't prove that a standard hash function such as SHA1
generates all possible codes, but I consider it likely.  It would 
be quite shocking if a strong hash function such as SHA1 generated
fewer codes than a weak function such as H0.

3) For a one-way hash function, one should not expect a _constructive_
proof that it generates all possible codes; such a construction
would violate the one-way property.

4) Here is a rough plausibility argument.  Consider two hash
functions H_1 and H_2 that are independent.  We define
H'(x) := H_1(x) XOR H_2(x)  (1)
which implies
H_1(x) = H'(x) XOR H_2(x)   (2)

Now let's look at the truth table for equation (2), where
the row-index is the H' code, the column-index is the H_2
code, and each entry represents the H_1 code required
to uphold the equation:

 00   01   10   11
00   00   01   10   11
01   01   00   11   10
10   10   11   00   01
11   11   10   01   00

Now let's suppose H_2 is missing one code (say the 10 code) and
H' is missing one code (say the 11 code).  Then H_1 must be missing
at least three codes!  Otherwise there would be a way of combining
a non-missing H_1 code with a non-missing H_2 code to create the
missing H' code.

 00   01   10m  11
00   00   01   10   11
01   01   00   11   10
10   10   11   00   01
11m  11m  10m  01   00m

We can extend this argument by combining lots and lots of 
independent hash functions H_i.   The combination has far 
fewer missing codes than any of the ingredients.  So either
you conclude that 
  a) there is a conspiracy that prevents us from constructing
 independent hash functions to use as ingredients, or
  b) we can produce hash functions with very few missing codes.
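
The 2-bit example can be checked mechanically under the same heuristic
(every attainable H_1 value is assumed to co-occur, on some input, with
every attainable H_2 value; names are mine):

  codes = range(4)                    # all 2-bit codes: 00, 01, 10, 11
  h2_vals = {0b00, 0b01, 0b11}        # H_2 misses 10
  hprime_vals = {0b00, 0b01, 0b10}    # H' misses 11

  # An H_1 value v is only allowed if v XOR w stays inside the attainable
  # H' codes for every attainable H_2 value w.
  h1_allowed = {v for v in codes
                if all(v ^ w in hprime_vals for w in h2_vals)}
  print(h1_allowed)  # {1}: H_1 is forced to miss three of the four codes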




Re: building a true RNG

2002-07-29 Thread Sandy Harris

David Wagner wrote:
 
> Oh dear.  On re-reading your message I now suspect that what you asked
> is not what I originally thought you asked.  I see two questions here:
>   Q1: If we cycle through all N-bit messages, are all
>   2^N output values possible?
>   Q2: If we cycle through all messages (possibly very long
>   or very short), are all 2^N output values possible?
> On first reading I thought you were asking Q1, but it now occurs to me
> you were probably asking Q2.  I apologize profusely for any confusion
> I may have caused.
>
> Anyway, the answer to Q1 is likely to be No.  I'd guess that the answer
> to Q2 is probably Yes, or close to it.

I think the interesting question is whether, for M-bit hash inputs,
and an N-bit hash, with a lower bound Q on entropy per input batch,
so M > Q > N, we can show, as I think Denker is claiming to have done,
that the entropy of hash(M) must be > N - epsilon, for some epsilon
small enough to ignore.

That would imply not only that all 2^N values occur, but that they
are reasonably evenly distributed.




Re: building a true RNG

2002-07-29 Thread David Wagner

> 3) For a one-way hash function, one should not expect a _constructive_
> proof that it generates all possible codes; such a construction
> would violate the one-way property.

Nitpick: the last statement does not seem quite right to me.  I'm thinking
of the notion of a one-way permutation.  For instance, the RSA function
f(x) = x^3 mod n is conjectured to be a one-way permutation, assuming n
is an RSA modulus with unknown factorization and the RSA assumption holds.
(I'm being a little fast and loose with the formal details, but I hope
this conveys the idea.)
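
A toy check of the permutation half of that claim, with a tiny modulus
standing in for a real RSA modulus (the one-wayness, of course, only
makes sense at cryptographic sizes):

  n = 5 * 11                 # toy RSA-type modulus, illustrative only
  phi = (5 - 1) * (11 - 1)
  assert phi % 3 != 0        # gcd(3, phi) == 1, so cubing permutes Z_n

  image = {pow(x, 3, n) for x in range(n)}
  assert len(image) == n     # f(x) = x^3 mod n hits every residue once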

That said, I'm not claiming that the RSA function would make a good
cryptographic hash function.  Cryptographic hash functions are expected
to be a lot more than just one-way and collision-free.

I don't know of any good cryptographic hash function that comes with
a proof that all outputs are possible.  However, it might not be too
hard to come up with plausible examples.  For example, if we apply the
Luby-Rackoff construction (i.e., 3 rounds of a Feistel cipher), with
ideal hash functions in each round, does this have the desired properties?
It might.
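
For concreteness, a minimal sketch of that three-round Feistel shape,
with truncated SHA1 standing in for the ideal round functions (names and
parameters are mine; whether this has the desired properties is exactly
the open question, so treat it as structure only):

  import hashlib

  def round_f(i: int, half: bytes) -> bytes:
      """Round function i, modelled by SHA1 truncated to 10 bytes."""
      return hashlib.sha1(bytes([i]) + half).digest()[:10]

  def feistel3(block: bytes) -> bytes:
      """3-round Feistel permutation on a 20-byte (160-bit) block."""
      l, r = block[:10], block[10:]
      for i in range(3):
          l, r = r, bytes(a ^ b for a, b in zip(l, round_f(i, r)))
      return l + r

Since each round is invertible, feistel3 permutes 160-bit blocks, so every
output value is attained by construction; the question is what security
properties survive when it is used as a hash.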

On the gripping hand, I don't think this is a real issue in practice.
SHA1 is probably good enough for all practical purposes that I can
think of.




Re: building a true RNG

2002-07-29 Thread Jack Lloyd

On Mon, 29 Jul 2002, David Wagner wrote:

> > DES, being extremely hardware friendly, can be (ab)used to
> > make a strong one-way hash.  (E.g., raw input into both key and data maps
> > 56+64 -> uniformly distributed 64 bits.)
>
> However, when used in this way, DES is not an especially good hash function.
> For instance, it is easy to find collisions, to find pre-images, and so on.

Somewhat related to that: are there any block-cipher-based hash function
constructions that are actually secure? Every one I've ever read about
seems to have been broken.

Regards,
  Jack





Re: building a true RNG

2002-07-29 Thread Matt Crawford

> 2) I can't prove that a standard hash function such as SHA1
> generates all possible codes, but I consider it likely.  It would
> be quite shocking if a strong hash function such as SHA1 generated
> fewer codes than a weak function such as H0.

I think you could do a probabilistic proof similar to the "DES is not
a group" quasi-proof.  To test a hash function h() whose range is S,
let F be the set of balanced functions from S -> {0, 1}.  (Balanced
meaning that each f in F maps exactly half of S to 0 and half to 1.)
If you can contrive to choose many members of F at random, and compose
them with h for many arguments of h, you should be able to set
confidence limits on how much of S is covered by h.





Re: building a true RNG

2002-07-29 Thread Arnold G. Reinhold

At 12:20 PM -0700 7/29/02, David Honig wrote:

> Whether there is a need for very high bandwidth RNGs was discussed
> on cypherpunks a few months ago, and no examples were found.
> (Unless you're using something like a one-time pad where you need
> a random bit for every cargo bit.)  Keeping in mind that
> a commercial crypto server can often accumulate entropy during
> off-peak hours.


It's been discussed here some time back as well. If you believe your 
crypto primitives are infeasible to break, a crypto-based PRNG with a 
long enough random seed should be indistinguishable from a true, 
perfect RNG. If you are only confident that your crypto primitives 
are expensive to break, then using a true RNG for keys and nonces, 
rather than deriving them all from one PRNG, adds security.

This suggests a continuum of solutions: Construct a crypto PRNG and
periodically (once enough has accumulated) stir your entropy source
into its state in some safe way. If you extract entropy slower than
you put it in, you can expect the equivalent of a true RNG. If you
extract entropy faster than you put it in, the system degrades
gracefully, in the sense that someone who expends the effort to break
the number generation scheme only gets to read messages since the
last entropy update.
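
A minimal sketch of one point on that continuum, using SHA-256 (an
illustration of the shape, not a vetted design; pool length serves as a
crude stand-in for an entropy estimate, and real designs schedule their
pools far more carefully):

  import hashlib

  class BatchedPRNG:
      """Hash-based PRNG that stirs entropy into its state in batches."""
      def __init__(self, seed: bytes):
          self.state = hashlib.sha256(seed).digest()
          self.pool = b""
          self.counter = 0

      def add_entropy(self, sample: bytes, batch_bits: int = 128):
          self.pool += sample
          # Stir only once a full batch has accumulated (see below for why).
          if 8 * len(self.pool) >= batch_bits:
              self.state = hashlib.sha256(self.state + self.pool).digest()
              self.pool = b""

      def read(self, n: int) -> bytes:
          out = b""
          while len(out) < n:
              self.counter += 1
              out += hashlib.sha256(self.state +
                                    self.counter.to_bytes(8, "big")).digest()
          return out[:n]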

The reason for batching entropy input is to prevent someone who has
broken your system once from discovering each small entropy input by
exhaustive search.  (There was a nice paper pointing this out; if
someone has the reference...)

Arnold Reinhold




Re: building a true RNG

2002-07-29 Thread David Wagner

> To test a hash function h() whose range is S,
> let F be the set of balanced functions from S -> {0, 1}.  (Balanced
> meaning that each f in F maps exactly half of S to 0 and half to 1.)
> If you can contrive to choose many members of F at random, and compose
> them with h for many arguments of h, you should be able to set
> confidence limits on how much of S is covered by h.

Can you elaborate?  What would the computational complexity of this be?

Estimating the size of {x : f(h(x))=1} to within relative error e
requires O(1/e^2) evaluations of h, if I understand correctly.  If we
consider the difference between a hash function that can hit all of
S, and another hash function that misses only one element of S, we'll
need to resolve the size of these sets to within relative error 1/|S|.
That suggests we'll need |S|^2 evaluations of h, which is infeasible
for SHA1 and slower than the naive O(|S|) algorithm.




Re: building a true RNG

2002-07-29 Thread David Wagner

> The reason for batching entropy input is to prevent someone who has
> broken your system once from discovering each small entropy input by
> exhaustive search.  (There was a nice paper pointing this out; if
> someone has the reference...)

I believe you are referring to the state compromise attacks
described in the following paper:
  J. Kelsey, B. Schneier, D. Wagner, C. Hall,
  Cryptanalytic Attacks on Pseudorandom Number Generators,
  FSE'98.  http://www.counterpane.com/pseudorandom_number.html
I once wrote a short note about the relevance of this to IPSec:
  http://www.cs.berkeley.edu/~daw/my-posts/using-prngs




Re: building a true RNG

2002-07-29 Thread Ben Laurie

David Wagner wrote:
> Amir Herzberg wrote:
>
> > So I ask: is there a definition of this `no wasted entropy` property,
> > which hash functions can be assumed to have (and tested for), and which
> > ensures the desired extraction of randomness?
>
> None that I know of.  I'm not aware of much work in the crypto literature
> on this topic.
>
> Actually, there is not much hope for such a property.  It is pretty easy
> to see that, if we make no assumptions on the entropy inputs other than
> that they have sufficient entropy, then no single deterministic algorithm
> can ever be good at extracting randomness.  If we fix any single extraction
> algorithm A, there exists a distribution D on the inputs which makes it
> give non-uniform output (for example, D might be the uniform distribution
> on the set {x : A(x) has its first ten bits zero}).  This is a standard
> observation from the theory community's work on derandomization.

That's the kind of thing mathematicians like to say - but how much does 
it apply to the real world? For example, you can't produce your example 
set for SHA-1 or MD5, can you?

If you are prepared to factor in cost, then the landscape changes, I 
would contend. Mathematicians generally assume uncountable resources. 
Even constructivists assume countable resources. However, we can assume 
_finite_ resources. Big difference.
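
Both points can stand: for a toy extraction algorithm the bad set is
trivial to exhibit, while for SHA-1 or MD5 finding it is precisely the
infeasible search being pointed at. A toy example (mine):

  def A(x: bytes) -> int:
      """Toy extraction algorithm: XOR-fold the input down to one byte."""
      out = 0
      for b in x:
          out ^= b
      return out

  # D = uniform over {x : A(x) == 0}.  Over all 4-byte inputs that set
  # has 2^24 members (24 bits of entropy), yet A is constant on it.
  # Only a small slice is enumerated here, to keep the check fast.
  bad = [bytes([a, b, c, a ^ b ^ c])
         for a in range(8) for b in range(8) for c in range(8)]
  assert all(A(x) == 0 for x in bad)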

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html   http://www.thebunker.net/

Available for contract work.

There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit. - Robert Woodruff

