On 12/22/2010 06:57 AM, Kevin Chadwick wrote:
On Wed, 22 Dec 2010 05:08:56 -0600
Marsh Ray<[email protected]>  wrote:

Let's say I could sample the output of the RNG in every process and from
every network device in the system. As much as I wanted. How could I
tell the difference between "one prng per purpose" and "data-slicing one
prng with all consumers"?

There was a thread called "how to use /dev/srandom" where Theo sent
this, which may be relevant?

It's relevant, but not precisely the question I was trying to ask.

For those who don't want to go read the code, the algorithm on the very
back end is roughly this:

     (a) collect entropy until there is a big enough buffer
     (b) fold it into the srandom buffer, eventually

That is just like the past.

But the front end is different.  From the kernel side:

     (1) grab a srandom buffer and start an arc4 stream cipher on it
        (discarding the first bit, of course)
     (2) now the kernel starts taking data from this on every packet
        it sends, to modulate this, to modulate that, who knows.
     (3) lots of other subsystems get small chunks of random from the
        stream; deeply unpredictable when
     (4) on every interrupt, based on quality, the kernel injects
        something into (a)
     (5) re-seed the buffer as stated in (1) when needed

How is this different, except perhaps for the intermediate arc4 cipher? What does that add, other than crappiness? (RC4's output is known to be distinguishable from a truly random stream.)

Simultaneously, userland programs need random data:

     (i) libc does a sysctl to get a chunk from the rc4 buffer
     (ii) starts an arc4 buffer of its own, in that program
     (iii) feeds data to the program, and re-seeds the buffer when needed

The arc4 stream ciphers get new entropy when they need it.

Looking at lib/libc/crypt/arc4random.c, it would appear that happens once on startup or fork, and then again after roughly every 1.6 MB of random data consumed. Most processes probably never consume that much.

But the really
neat architecture here is that a single stream cipher is *unpredictably*
having entropy taken out of it, by hundreds of consumers.

How is that noticeably different from any other system where processes read from /dev/(u)random and kernel events mix in a high-resolution timer?

     In regular unix operating systems, there are only a few entropy
     consumers.  In OpenBSD there are hundreds and hundreds.

But a typical box doesn't have "hundreds and hundreds" of processes or unpredictable event sources. There are 300 or so references in the source tree, but most of them are in code that doesn't run on any given machine.

A special-purpose box (e.g. an IPsec VPN gateway) may have very few other than network events, which are known to an outside observer to a significant degree.

     The entire system is full of random number readers, at every level.
     That is why this works so well.

How do you know it works well? How is it observably different?

- Marsh
