On 12/21/2010 09:26 PM, Theo de Raadt wrote:
Wow.  You really are not reading the same code, are you.

Haha, yeah I have been reading all over the map. My comments are out-of-order too.

BTW, the nanotime in arc4_stir looks like it's redundant anyway since get_random_bytes calls extract_entropy which calls add_timer_randomness.

* An unauthenticated attacker may be able to sample an almost arbitrary
amount of output from your PRNG by making new IPsec connections.

Really?  Prove it.

http://tools.ietf.org/html/rfc2405
   The payload field, as defined in [ESP], is broken down according to
   the following diagram:
      +---------------+---------------+---------------+---------------+
      |                                                               |
      +                   Initialization Vector (IV)                  +
      |                                                               |
      +---------------+---------------+---------------+---------------+
      |                                                               |
      ~              Encrypted Payload (variable length)              ~
      |                                                               |
      +---------------------------------------------------------------+
       1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8

http://code.bsd64.org/cvsweb/openbsd/src/sys/arch/amd64/amd64/aesni.c?rev=1.17

Bring CBC oracle attack countermeasure from r1.32 of cryptosoft.c to
the hardware crypto accelerator land.  This fixes aes-ni, via xcrypt,
glxsb(4), hifn(4), safe(4) and ubsec(4) drivers.

Original commit message by angelos:

Don't keep the last blocksize-bytes of ciphertext for use as the next
plaintext's IV, in CBC mode. Use arc4random() to acquire fresh IVs per
message.

with and ok deraadt, ok markus, djm
@@ -343,7 +340,7 @@ aesni_encdec(struct cryptop *crp, struct
                if (crd->crd_flags & CRD_F_IV_EXPLICIT)
                        bcopy(crd->crd_iv, iv, ivlen);
                else
-                       bcopy(ses->ses_iv, iv, ivlen);
+                       arc4random_buf(iv, ivlen);

                /* Do we need to write the IV */
                if ((crd->crd_flags & CRD_F_IV_PRESENT) == 0) {

Just a hunch. Worth following up on IMHO.
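To spell out the hunch as a toy sketch (my own illustration, with random.Random standing in for the kernel generator and made-up packet framing): with per-message random IVs, every ESP packet hands a passive observer a fresh, unencrypted slice of PRNG output.

```python
import random

# Stand-in PRNG; the real concern is the kernel arc4random stream.
prng = random.Random(1234)

def send_packet(payload: bytes) -> bytes:
    # Per-message fresh IV, as in the aesni.c diff above: the IV is
    # drawn straight from the PRNG and transmitted unencrypted.
    iv = prng.getrandbits(128).to_bytes(16, "big")
    return iv + payload  # the observer sees iv in the clear

# A passive observer accumulates 16 bytes of PRNG output per packet.
observed = [send_packet(b"data")[:16] for _ in range(100)]
samples = b"".join(observed)
print(len(samples))  # 1600 bytes of raw generator output
```

So the question is only whether those samples help an attacker, not whether they can be collected.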

As I
understand it, each now sends 128 bits or so of output as plaintext over
the wire in the IV. :-)

Four days ago, if you were using a particular set of hardware drivers,
then yes.  But the software ipsec stack was fixed for this NINE YEARS
AGO.

What you just said was utterly careless.

*shrug*

* How much of nanotime() is truly unpredictable to an attacker sitting
directly on your local network cable? Maybe the bottom 10 bits, at best?

No.  The upper bits are 'more known'.  The lower bits are 'less known'.

Right, the unpredictable ones are near the bottom.

How did you come to your conclusion?

I wasn't drawing conclusions, and I tried to indicate that with hedging words and phrases like "may" and "I don't know...but" and punctuation like '?'.

Except we don't do what you suggest at all.

Excellent!

* Unless you persist entropy across reboots, you are starting from a
known state at boot.

We don't save entropy over boots. You are speaking of one specific way
of solving this, and we don't do that.  You can read the code in /etc/rc
and /etc/rc.shutdown.

At shutdown, we save a block of OUTPUT from the PRNG.
At boot, right after some extremely early system setup, we load that
block in as ENTROPY.

Let me be exact:  We do not save 'raw entropy'.

Oh, ok.

* One reason you would want to XOR entropy into the pool is so that any
nonrandom bits don't obliterate any randomness that they land on top of
(assuming they're independent). Better still, use something like a hash
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^
function (or the compression function from one).

Moments later, a hash function is indeed used.  It is called MD5.

I'd been reading in the thread something about XORing vs prepending nanotime() and wanted to pitch in and answer that question. I do see now that the random_bytes are coming from MD5. But this was more about where they were going:

XOR it?  Why?

Please provide a citation regarding the benefit of XOR'ing feed data
before passing it into MD5 for the purpose of PRNG folding.

This paper covers the topic pretty thoroughly:
  http://eprint.iacr.org/2010/264.pdf

It somehow manages to be both down-to-Earth and theoretical at the same time (if that makes any sense). It even has a section on "Applications to System RNGs"!

Obviously prepending vs clobbering vs XORing will each result in differing amounts of entropy flowing into the input side. But the output side of MD5 maxes out at 128 bits. So whether it makes a real difference in theory depends on the input widths involved and on what else is happening with that same data.
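To make the XOR point concrete, here's a toy demonstration (my own, not from rnd.c): XORing a fully known input into a uniformly random pool byte leaves it uniform, while letting the input clobber the byte destroys the randomness it landed on.

```python
from collections import Counter

# Toy pool: every possible value of one uniformly random pool byte.
pool_values = list(range(256))
attacker_input = 0x00  # worst case: fully known, zero-entropy input

# XOR mixing: pool ^ input is a permutation of the pool values,
# so the result is still uniform over 256 outcomes.
xored = Counter(p ^ attacker_input for p in pool_values)

# Clobbering: the input overwrites whatever randomness was there.
clobbered = Counter(attacker_input for _ in pool_values)

print(len(xored))      # 256 distinct outcomes survive
print(len(clobbered))  # 1 -- the randomness is gone
```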

Note,
this is the first stage PRNG, and that a second stage kernel-use PRNG
is built on top of that the first one, and that a third stage
per-process PRNG is built on top of that.

What does that mean for kernel code like that doing the IV generation? Does that code have its own third stage, or does it feed from the second stage? [NM - I've read the code now]
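Here's a toy sketch of how such a chain might hang together (the staging, names, and MD5-feedback keystream are my own guesswork for illustration, not OpenBSD's code): the first stage distills the pool, each later stage is seeded from the one before it, and consumers never touch the pool directly.

```python
import hashlib

def stage1_extract(pool: bytes) -> bytes:
    # First stage: hash the raw pool down to a fixed-size seed.
    return hashlib.md5(pool).digest()

class StreamStage:
    """Toy keystream generator standing in for an arc4random stage."""
    def __init__(self, seed: bytes):
        self.state = seed
    def bytes(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            self.state = hashlib.md5(self.state).digest()
            out += self.state
        return out[:n]

pool = b"interrupt timings, disk latencies, ..."
kernel_prng = StreamStage(stage1_extract(pool))     # second stage, kernel-wide
process_prng = StreamStage(kernel_prng.bytes(16))   # third stage, per process
print(len(process_prng.bytes(32)))  # 32
```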

* If you have to drop KB of output from RC4, you might be better off
with a hash function for stirring. Designers use hash functions because
their one-wayness is a critical property. RC4 wasn't designed to be a
one-way function any more than MD5 was designed to be used in CTR mode
as a stream cipher.

You don't know

Busted! Yeah I was just guessing.  ;-)

There's probably only one guy in the world who really knows what the precise intent was for RC4 and that's Mr. R himself.

Let's check the Google...(first PDF link is usually a good bet)
http://www.google.com/search?q=design+of+rc4

Oooo interesting... it's a Master's thesis from University of Waterloo:
"Design and Analysis of RC4-like Stream Ciphers"
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.70.4039&rep=rep1&type=pdf

This guy sounds like he might know. He even has a home page with a picture of himself on a mountain, like yours!
http://www.math.uwaterloo.ca/~memckagu/uwaterloo/Home.html

But there's no mention of one-way-ness being an explicit design goal for RC4. Since this is a "D & A" paper I'll take that as evidence of absence.

But look at this, this section seems eerily relevant:
3.5.1 Methods of Using IVs
There are, of course, numerous ways of combining two data strings to obtain
a third.

Combining strings! Yeah! Like a nanotime and um..random bytes.

Some of them lead to obvious relationships between the session and
secret keys. A common procedure used with RC4 is to concatenate the
IV with the secret key. Another is to XOR the two strings together.

So they mention concatenation, and there's XOR.

But what about this peculiar method of truncation used in arc4_stir, where the prepended result of nanotime() displaces random material from a fixed-sized buffer?

http://code.bsd64.org/cvsweb/openbsd/src/sys/dev/rnd.c?rev=1.104;content-type=text%2Fx-cvsweb-markup
static void arc4_stir(void) {
        u_int8_t buf[256];
        int len;
        nanotime((struct timespec *) buf);
        len = sizeof(buf) - sizeof(struct timespec);
        get_random_bytes(buf + sizeof (struct timespec), len);
        len += sizeof(struct timespec);

That method isn't mentioned. It seems to be nonstandard.

Of course, with 2048 bits to play with, what could possibly go wrong? (RC4, that's what. It's old and busted.)

Hmm, say, how big is timespec anyway? How many random bytes is it displacing?

I'm not really sure, but I wonder if it might kinda not be so great if a struct timespec ended up needing a different alignment than you get with uint8[]. But I'm just saying that because the C standards would call it 'undefined behavior' (they're so paranoid!). You kernel guys are of course intimately familiar with the platform compiler's behavior.
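For what it's worth, on a typical 64-bit platform struct timespec is 16 bytes (8-byte time_t plus 8-byte long), though the exact sizeof is ABI-dependent (it was 12 bytes on systems with a 32-bit time_t). Under the 16-byte assumption, the displacement works out like this:

```python
BUF_LEN = 256
# Assumed size for a 64-bit ABI; actual sizeof(struct timespec) varies.
SIZEOF_TIMESPEC = 16

# arc4_stir writes nanotime() at the front of the buffer, then fills
# only the remainder with get_random_bytes() -- the timespec displaces
# pool output rather than being mixed with it.
random_bytes = BUF_LEN - SIZEOF_TIMESPEC
print(random_bytes)  # 240 bytes of pool output, 16 bytes of clock
```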

Anyway, back to the UWaterloo D & A RC4:
The simple relationships require a robust key schedule to destroy any
information that could be gleaned about the secret key from the resulting
keystreams. Unfortunately, the key schedule used in RC4 leaks information
when these methods are used, resulting in attacks, described by Fluhrer et
al. [5] that expose the secret key.

Oh crap! Attacks that expose the secret key! And to think I didn't know...

A better method is to use a
cryptographic hash function in order to obscure these relationships.
(For example, concatenating the key with the IV and then hashing with
SHA-1 to produce a session key.)
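That suggestion is simple to sketch; this is my own illustration with made-up key material, not anything from the OpenBSD tree:

```python
import hashlib

def session_key(secret_key: bytes, iv: bytes) -> bytes:
    # The paper's method: hash key||IV so the per-message key bears
    # no simple algebraic relation to the long-term secret.
    return hashlib.sha1(secret_key + iv).digest()

k = b"long-term secret"
k1 = session_key(k, b"\x00" * 16)
k2 = session_key(k, b"\x00" * 15 + b"\x01")
# A one-bit IV change yields an unrelated session key, unlike plain
# concatenation or XOR, which leak structure through RC4's key schedule.
print(k1 != k2, len(k1))  # True 20
```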

And, in fact, get_random_bytes is implemented as a sequence of MD5s over the main entropy pool.

OK, well here's a spot in D & A paper that mentions one-way functions:
5.4.5 Other Modified RC4 Ciphers
Previously other authors have modified RC4 in an attempt to defeat the
various attacks against it. []
VMPC Stream Cipher
The VMPC stream cipher, described by Zoltak [27] is a straightforward
modification of RC4 that attempts to gain security by replacing the output
function and the j update.  The function used is the so called VMPC one
way function.

It's a variant of RC4 that's considered notable specifically because it adds one-way-ness to the key input of the cipher.

So I think I'll stick by my story that "RC4 wasn't designed to be a one-way function".

* Here's a good summary of attacks:
http://www.schneier.com/paper-prngs.pdf

Yes, a wonderful paper about designing a PRNG for ONE consumer.

Really? I thought it was a survey of attacks. I'll have to read it again.

If you had 1000 consumers of PRNG data, would you design a PRNG that way?

That is the question which I think OpenBSD asked and proceeded to solve.

Whatever you mean by "that way", if the generator satisfies the defined properties and gave a healthy margin all round, I would say "yes".

That said, some designs could be expected to parallelize better than others.

* This is a thoroughly researched area of crypto. Perhaps you might
consider using a standard design? (If for no other reason than to save
endless discussions?)

I disagree strongly.  The field of research is a total disaster because
everyone tries to use ONE PRNG PER PURPOSE.

We don't do that.  We data-slice one PRNG with ALL CONSUMERS.

I don't know what that means.

Let's say I could sample the output of the RNG in every process and from every network device in the system. As much as I wanted. How could I tell the difference between "one prng per purpose" and "data-slicing one prng with all consumers"?

Maybe performance?
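Here's my reading of what "data-slicing one PRNG with all consumers" might mean, as a toy sketch (the class and its MD5-feedback keystream are my invention, not the kernel's): every consumer takes the next slice of a single keystream under one mutex.

```python
import hashlib
import threading

class SharedPRNG:
    """One generator, all consumers: each request takes the next
    slice of a single keystream under a mutex."""
    def __init__(self, seed: bytes):
        self.state = seed
        self.lock = threading.Lock()
    def bytes(self, n: int) -> bytes:
        with self.lock:
            out = b""
            while len(out) < n:
                self.state = hashlib.md5(self.state).digest()
                out += self.state
            return out[:n]

g = SharedPRNG(b"seed")
iv = g.bytes(16)       # IPsec IV consumer
isn = g.bytes(4)       # TCP ISN consumer
pid_salt = g.bytes(2)  # PID randomization consumer
print(len(iv), len(isn), len(pid_salt))  # 16 4 2
```

If the stream itself has no distinguisher, slices of one stream should look no different from independent streams, which is why I suspect an outside observer couldn't tell the two designs apart.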

Looks to me like you have:
"entropy pool" of theoretically 64K bits max
"kernel arc4random" @ 1684 bits shared system-wide under single mutex
"per-process arc4random" @ 1684 bits per process

It currently costs 2064 MD5 block evaluations when arc4_stir seeds from get_random_bytes, mostly over the same data repeatedly. But 1684/128 = 13.2, so that's 156 times more expensive than necessary!
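Taking the figures above at face value, the arithmetic checks out:

```python
actual_blocks = 2064   # MD5 block evaluations per arc4_stir (figure above)
state_bits = 1684      # effective arc4 state size in bits (figure above)
md5_output_bits = 128

# Minimum MD5 outputs needed to cover the state, and the overshoot factor.
needed = round(state_bits / md5_output_bits, 1)
ratio = round(actual_blocks / needed)
print(needed, ratio)  # 13.2 156
```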

If you have that kind of capacity to spend, I can think of some better places to spend it.

Everyone else is using standard designs, and as a result they do not
provide enough output data, and therefore, they make decisions that
consider PRNG output EXPENSIVE, and then use it less.

Well that's not the fault of a good algorithm or a good analysis. (Not that all of them are good).

I remember, you guys were the first on the block to have decent TCP ISNs, randomized PIDs, and other cool stuff too. Other vendors make their own decisions. If they don't think it's worth 256 bytes per process overhead, that's their business, but they ought not complain about the expense of doing it the hard way either.

My desktop machine is generating and consuming PRNG in many many
layers of the kernel PER PACKET, heck, per keystroke, while other
operating systems continue to use sequential IP id's.

Yeah they suck.

Please go study what we have done before talking about it.

Oh come on. Don't you think that's putting the cart before the horse a little bit?

Actually, I've been looking at the code as I composed this reply.

Regards,

- Marsh
