On Wed, Sep 29, 2010, Theo de Raadt wrote to [email protected]:
> [Ted Unangst wrote:  -- Joachim]
> > On Wed, Sep 29, 2010 at 12:49 PM, Kevin Chadwick <[email protected]> 
> > wrote:
> > >> [Joachim Schipper wrote:  -- Joachim]
> > >> > And isn't srandom sometimes (very rarely!) appropriate? E.g. for
> > >> > generating encryption keys?
> > 
> > If arandom is somehow not appropriate for generating keys, it should
> > be fixed.  I'd be interested to hear more.
> 
> For those who don't want to go read the code, the algorithm on the very back
> end is roughly this:
> 
>     (a) collect entropy until there is a big enough buffer
>     (b) fold it into the srandom buffer, eventually
> 
> That is just like the past.
> 
> But the front end is different.  From the kernel side:
> 
>     (1) grab a srandom buffer and start an arc4 stream cipher on it
>        (discarding the first bit, of course)
>     (2) now the kernel starts taking data from this on every packet
>        it sends, to modulate this, to modulate that, who knows.
>     (3) lots of other subsystems get small chunks of random from the
>        stream; deeply unpredictable when
>     (4) on every interrupt, based on quality, the kernel injects something
>        into (a)
>     (5) re-seed the buffer as stated in (1) when needed
> 
> Simultaneously, userland programs need random data:
> 
>     (i) libc does a sysctl to get a chunk from the rc4 buffer
>     (ii) starts an arc4 buffer of its own, in that program
>     (iii) feeds data to the program, and re-seeds the buffer when needed
>          
> The arc4 stream ciphers get new entropy when they need it. But the really
> neat architecture here is that a single stream cipher is *unpredictably*
> having entropy taken out of it, by hundreds of consumers.  In regular
> unix operating systems, there are only a few entropy consumers.  In OpenBSD
> there are hundreds and hundreds.  The entire system is full of random number
> readers, at every level.  That is why this works so well.
> 
> > > I notice arandom doesn't pause. Is arandom always better or only when
> > > there's enough entropy?
> > 
> > It is more efficient.  There is almost always enough entropy for
> > arandom, and if there isn't, you would have a hard time detecting
> > that.
> 
> There is always enough.  The generator will keep moving, until it has fetched
> too much, or too much time has gone by.  Then it reseeds; though I think
> it fundamentally does not care if the srandom buffer it feeds from is full
> or not.
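
The two-level scheme described above (a kernel stream seeded from the
entropy pool, and a per-program stream seeded from the kernel stream,
each reseeding after too many bytes or too much time) can be sketched
roughly as follows. This is a toy model, not the OpenBSD code: the class
names, the seed length, the discard count and the limits are all
assumptions made for illustration, and os.urandom merely stands in for
the srandom buffer.

```python
import os
import time


class ARC4Stream:
    """Toy ARC4 keystream generator (illustration only, not for real use)."""

    def __init__(self, key: bytes, discard: int = 256):
        # Key-scheduling algorithm: permute 0..255 under control of the key.
        self.s = list(range(256))
        j = 0
        for i in range(256):
            j = (j + self.s[i] + key[i % len(key)]) % 256
            self.s[i], self.s[j] = self.s[j], self.s[i]
        self.i = self.j = 0
        # Throw away the early output; the discard count is an assumption.
        self.read(discard)

    def read(self, n: int) -> bytes:
        out = bytearray()
        for _ in range(n):
            self.i = (self.i + 1) % 256
            self.j = (self.j + self.s[self.i]) % 256
            self.s[self.i], self.s[self.j] = self.s[self.j], self.s[self.i]
            out.append(self.s[(self.s[self.i] + self.s[self.j]) % 256])
        return bytes(out)


class ReseedingGenerator:
    """Never blocks; reseeds after max_bytes of output or max_age seconds."""

    SEED_LEN = 16  # hypothetical seed size

    def __init__(self, seed_source, max_bytes=1 << 20, max_age=600.0):
        self.seed_source = seed_source  # callable: n -> n bytes of seed
        self.max_bytes = max_bytes      # hypothetical limit
        self.max_age = max_age          # hypothetical limit
        self.reseeds = 0
        self._reseed()

    def _reseed(self):
        self.stream = ARC4Stream(self.seed_source(self.SEED_LEN))
        self.bytes_out = 0
        self.born = time.monotonic()
        self.reseeds += 1

    def read(self, n: int) -> bytes:
        too_much = self.bytes_out + n > self.max_bytes
        too_old = time.monotonic() - self.born > self.max_age
        if too_much or too_old:
            self._reseed()
        self.bytes_out += n
        return self.stream.read(n)


# "Kernel" stream, seeded from a stand-in for the srandom buffer.
kernel = ReseedingGenerator(os.urandom)
# "Userland" stream, seeded from the kernel stream as in (i)-(iii).
userland = ReseedingGenerator(kernel.read)
data = userland.read(32)
```

Note that read() never waits on the seed source's quality; it only asks
for a fresh seed once a limit is hit, which matches the "it fundamentally
does not care" remark above as far as this model goes.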

My, how embarrassing. I could have figured that out.

Still, the man page does allow these misconceptions to persist. Perhaps
the following patch would improve matters?

- terminology: (A)RC4 is not a hash/message digest like MD5;
- clarify what srandom(4), urandom(4) and arandom(4) do and how they
  compare;
- "pauses while more of such data is collected" -> "pauses while more
  data is collected" (clear from the context and less awkward);
- clarify that arandom(4) is pretty much always the right choice.

This does not attempt to explain *why* arandom(4) continues to output
high-quality data even when the entropy pool runs low (the reasons,
(pseudorandom) "generator" and "simultaneous", *are* named); I couldn't
think of a way to express it concisely enough for a man page.

                Joachim

Index: random.4
===================================================================
RCS file: /usr/cvs/src/src/share/man/man4/random.4,v
retrieving revision 1.22
diff -u -p -r1.22 random.4
--- random.4    10 Oct 2008 20:13:29 -0000      1.22
+++ random.4    30 Sep 2010 19:21:33 -0000
@@ -42,31 +42,30 @@ The various
 devices produce random output data with different random qualities.
 Entropy data is collected from system activity (like disk and
 network device interrupts and such), and then run through various
-hash or message digest functions to generate the output.
+algorithms to generate the output.
 .Bl -hang -width /dev/srandomX
 .It Pa /dev/random
 This device is reserved for future support of hardware
 random generators.
 .It Pa /dev/srandom
-Strong random data.
-This device returns reliable random data.
+Data directly from the entropy pool, converted using MD5.
 If sufficient entropy is not currently available (i.e., the entropy
-pool quality starts to run low), the driver pauses while more of
-such data is collected.
-The entropy pool data is converted into output data using MD5.
+pool quality starts to run low), the driver pauses while more
+data is collected.
 .It Pa /dev/urandom
-Same as above, but does not guarantee the data to be strong.
-The entropy pool data is converted into output data using MD5.
+Data directly from the entropy pool, converted using MD5.
 When the entropy pool quality runs low, the driver will continue
 to output data.
 .It Pa /dev/arandom
-As required, entropy pool data re-seeds an ARC4 generator,
-which then generates high-quality pseudo-random output data.
+As required, entropy pool data re-seeds an ARC4 generator.
+This cryptographic generator is simultaneously used by many kernel
+subsystems and userland programs (as a seed for the
+.Xr arc4random 3 
+pool).
 .Pp
-The
-.Xr arc4random 3
-function in userland libraries seeds itself from this device,
-providing a second level of ARC4 hashed data.
+The entropy output by this device is as strong as the output of
+.Pa /dev/srandom ,
+but this device will never pause.
 .El
 .Sh FILES
 .Bl -tag -width /dev/srandom -compact
