On 12/22/2010 06:57 AM, Kevin Chadwick wrote:
On Wed, 22 Dec 2010 05:08:56 -0600
Marsh Ray <[email protected]> wrote:
Let's say I could sample the output of the RNG in every process and from
every network device in the system. As much as I wanted. How could I
tell the difference between "one prng per purpose" and "data-slicing one
prng with all consumers"?
There was a thread called "how to use /dev/srandom" where Theo sent
this, which may be relevant?
It's relevant, but not precisely the question I was trying to ask.
For those who don't want to go read the code, the algorithm on the very
back end is roughly this:
(a) collect entropy until there is a big enough buffer
(b) fold it into the srandom buffer, eventually
That is just like the past.
But the front end is different. From the kernel side:
(1) grab a srandom buffer and start an arc4 stream cipher on it
(discarding the initial keystream, of course)
(2) now the kernel starts taking data from this on every packet
it sends, to modulate this, to modulate that, who knows.
(3) lots of other subsystems get small chunks of random from the
stream; deeply unpredictable when
(4) on every interrupt, based on quality, the kernel injects
something into (a)
(5) re-seed the buffer as stated in (1) when needed
How is this different, except for perhaps the intermediate arc4 cipher?
What does that add, other than crappiness? (RC4 is known to be
distinguishable from a good random source.)
Simultaneously, userland programs need random data:
(i) libc does a sysctl to get a chunk from the rc4 buffer
(ii) starts an arc4 buffer of its own, in that program
(iii) feeds data to the program, and re-seeds the buffer when needed
The arc4 stream ciphers get new entropy when they need.
Looking at lib/libc/crypt/arc4random.c it would appear that happens once
on startup or fork and then again after about every 1.6MB of random data
consumed. Probably most processes will not consume that much.
But the really
neat architecture here is that a single stream cipher is *unpredictably*
having entropy taken out of it, by hundreds of consumers.
How is that noticeably different than any other system where processes
are reading from /dev/(u)random and kernel events are mixing in a
high-resolution timer?
In regular
unix operating systems, there are only a few entropy consumers. In
OpenBSD there are hundreds and hundreds.
But a typical box doesn't have "hundreds and hundreds" of processes or
unpredictable event sources. There are 300 or so references in the
source tree, but most of them are in code that doesn't run on any given
machine.
A special-purpose box (e.g. an IPsec VPN gateway) may have very few
event sources other than network events, and those are known to an
outside observer to a significant degree.
The entire system is full
of random number readers, at every level. That is why this works
so well.
How do you know it works well? How is it observably different?
- Marsh