Re: RFC: redesigning random(4)

2011-12-11 Thread Sandy Harris
On Thu, Sep 29, 2011 at 2:46 PM, Sandy Harris sandyinch...@gmail.com wrote:

 I have been thinking about how random(4) might be redesigned ...

 ... make the input
 pool use Skein (or another SHA-3 candidate) and the output pools a
 modified counter-mode AES.

I now actually have most of the code for that and a substantial
rationale document, both in a first draft sort of state.

I have worked out how to use a block cipher in a way that has
the hard-to-invert property and does not either lose state when
it rekeys or encrypt successive counter values with a small
Hamming difference. It is fairly complex.

 Currently the driver uses SHA-1 for all three. ...

Having looked at the block cipher method in some detail, I've now
concluded that it is better to just use a hash which is non-invertible
by design and does not make analysis more difficult.

I may eventually have code and rationale for that too, but almost
certainly not soon.


RFC: redesigning random(4)

2011-09-29 Thread Sandy Harris
I have been thinking about how random(4) might be redesigned and I now
have enough thoughts to put them on electrons and send them out. This
is all first-cut stuff, though I have thought about it some. It needs
comment, criticism and revision.

A few things I think we absolutely want to keep. At least the existing
input mixing code including its entropy estimation, the three-pool
structure, and the use (at least sometimes) of a hash for the mixing.
I think the large pool plus hashing is obviously superior to Yarrow,
where the pool is a single hash context. For our purposes, though
perhaps not elsewhere, it also seems better than Fortuna, which
complicates things with multiple input pools; we do not need those.

I would change some things fairly radically, though. Hence this
message and an invitation to discussion.

I would define all pools in terms of 64-bit objects, make the input
pool use Skein (or another SHA-3 candidate) and the output pools a
modified counter-mode AES.

Currently the driver uses SHA-1 for all three. Sometime next year, the
SHA-3 competition will declare a winner. Using that might make getting
various certifications easier, so we should probably do that when we
can. The simplest way to do that might be to keep the three-pool setup
and make them all use a new hash, but I'll argue for making the two
output pools use a block cipher instead.
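
Just to make the 64-bit-word layout concrete, here is a rough sketch;
the struct and field names and the exact sizes are only placeholders,
not part of the proposal:

#include <linux/types.h>

/*
 * Rough sketch only; names and sizes are placeholders.  The input
 * pool stays at 4096 bits (64 u64 words, as in the current driver)
 * and each output pool holds an expanded AES-128 key schedule,
 * 176 bytes = 22 u64 words.
 */
#define INPUT_POOL_WORDS   64
#define OUTPUT_POOL_WORDS  22

struct rand_pools {
    u64 input[INPUT_POOL_WORDS] ;          /* hashed with a SHA-3 finalist */
    u64 blocking[OUTPUT_POOL_WORDS] ;      /* feeds /dev/random  */
    u64 nonblocking[OUTPUT_POOL_WORDS] ;   /* feeds /dev/urandom */
};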

All five of the finalists in the SHA-3 competition use some variant
of a wide-pipe strategy. The internal state of the hash is larger
than the output size by a factor of two or more, and (in most, though
I think JH just takes raw bits from state) there is an output function
that crunches the final state into the output. It appears Ted was
ahead of his time with the trick of folding an SHA-1 output in half to
get 80 output bits. It also appears that Ted's trick is no longer
needed; the hashes do it for us now.

All the finalists have an -256 and an -512 mode. Either way, we can
have a get256() function that gets 256 hashed bits. Things like
Skein-512-256 can be used directly; that has 512-bit internal state
and 256-bit output. For another hash, we might have to use the version
with 512-bit output and do our own folding to 256. However, it is
straightforward to have a get256() function either way. That should be
the only output function from the primary pool, and its output should
never go outside the driver. It should be used only to initialise or
update the other pools.
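
For a hash that only offers a 512-bit output, the folding itself is
trivial. A minimal sketch (the function name is just illustrative):

#include <linux/types.h>

/*
 * Illustrative only: fold a 512-bit hash result down to 256 bits by
 * XORing the two halves together, the same idea as the current
 * driver's folding of SHA-1 output.
 */
static void fold_512_to_256(const u64 hash[8], u64 out[4])
{
    int i ;

    for (i = 0; i < 4; i++)
        out[i] = hash[i] ^ hash[i + 4] ;
}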

The get256() function should start by mixing in some new entropy. At
one point, Ted had an add_timer_randomness() call for this. I recall
some discussion about taking it out and I don't see it now. As I see
it, every extraction of main pool entropy should add something new
first. Perhaps add_timer_randomness() is enough, but we might look
around the kernel to see if there's more that could be got -- is the
process or file descriptor table random enough to be worth throwing
in, for example?

get256() should also mix at least a hash value's worth of bits back
into the pool at the end. Currently this is done in extract_buf() with
a call to mix_pool_bytes(). I'd do it rather differently.

I suggest a new mixing primitive, to be used for all pools. It would
actually work with structure members, but here is simple code to
explain it:

#include <linux/types.h>   /* for u64 */

#define SIZE 32    // must be even
#define DELTA 7    // must be odd, no common factor with SIZE

static u64 buffer[SIZE] ;
static u64 *p = buffer ;
static u64 *end = buffer + SIZE ;

// point q where p will be after SIZE/2 iterations
static u64 *q = buffer + (SIZE/2) + DELTA ;

void mix64(u64 *data)
{
    *p ^= *data ;
    // pseudo-Hadamard transform
    *p += *q ;
    *q += *p ;
    // increment pointers
    p += DELTA ;
    q += DELTA ;
    // wrap around if needed
    if(p >= end) p -= SIZE ;
    if(q >= end) q -= SIZE ;
}

This has some nice properties. It is only used for
internally-generated data, to mix hash or cipher outputs back into
pools. It may be more efficient than mix_pool_bytes(), which is
designed to handle external inputs and works one byte at a time. It is
somewhat non-linear because it mixes XOR with addition.

The starting point for q is chosen so that after SIZE/2 iterations,
every word in the pool is changed. In the example, with SIZE = 32 and
DELTA = 7, p runs through all words in 32 iterations. After 16
iterations, p has increased by (16*7)%32 = 16. For the second 16, it
starts at 16+7. That is just where q starts for the first 16, so
between them p and q update every word in the first 16 iterations.
This works for any values of SIZE and DELTA, provided SIZE is even and
the two have no common factors.
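
That stepping property is easy to confirm with a few lines of
throwaway user-space code, nothing to do with the driver itself:

#include <stdio.h>

/* Throwaway check: with DELTA coprime to SIZE, stepping by DELTA
 * mod SIZE visits every slot exactly once per SIZE iterations. */
#define SIZE  32
#define DELTA  7

int main(void)
{
    int seen[SIZE] = { 0 } ;
    int i, pos, missed = 0 ;

    for (i = 0, pos = 0; i < SIZE; i++, pos = (pos + DELTA) % SIZE)
        seen[pos] = 1 ;

    for (i = 0; i < SIZE; i++)
        if (!seen[i])
            missed++ ;

    printf("slots never visited: %d\n", missed) ;
    return 0 ;
}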

get256() should end with four calls to mix64(), mixing all its data back in.
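
In code, that tail end could be as simple as the following sketch (the
wrapper name is arbitrary):

/*
 * Sketch only: feed a 256-bit result back into the pool with four
 * mix64() calls, as the last step of get256().
 */
static void mix_back_256(u64 out[4])
{
    int i ;

    for (i = 0; i < 4; i++)
        mix64(&out[i]) ;
}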

I would make the two output pools AES-128 contexts, 22 64-bit words
each, and use a variant of counter mode to generate the outputs.
Counter mode has been rather extensively analyzed, notably in the
Schneier et al. Yarrow