Re: [PATCH] Kernel entropy rework

2020-01-24 Thread maya
On Sat, Dec 21, 2019 at 10:08:20PM +, Taylor R Campbell wrote:
> The attached patch set reworks the kernel entropy subsystem.

This patch conflicted (hard) due to a forced push and some changes.
I re-created it, http://coypu.sdf.org/0001-asdf.patch

Unfortunately it crashes at early boot, and did on my previous attempt
as well. This is on amd64.


Re: [PATCH] Kernel entropy rework

2019-12-26 Thread Taylor R Campbell
> Date: Thu, 26 Dec 2019 20:16:20 +
> From: m...@netbsd.org
> 
> On Sat, Dec 21, 2019 at 10:08:20PM +, Taylor R Campbell wrote:
> > - Replace SHA-1/LFSR entropy pool by Keccak sponge.
> 
> The peanut gallery would like to inquire if you have a secret BLAKE2
> version of this patch ready.

It's a good question, but I don't have a secret BLAKE2 version of this
patch ready.

The reason I chose Keccak is that there is a published well-understood
confidence-inspiring construction for alternately consuming inputs
(not necessarily uniformly distributed!) and generating outputs using
a permutation like Keccak -- namely a sponge duplex[1][2].  These are
exactly the operations we need for an entropy pool: enter samples,
extract key material.

(Our `enter' is `feed' from the paper; our `extract' is `fetch, then
forget', so that we provide key erasure (or `backtracking resistance'
or `forward secrecy') at every request.)
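The feed/fetch-then-forget pattern can be sketched as follows.  This is
a toy illustration of the control flow only: the permutation and the
rate/capacity parameters below are stand-ins of my own invention, not
Keccak-p[1600,24] or the actual kernel parameters.

```python
RATE = 16          # bytes of state exposed per call (toy parameter)
CAPACITY = 16      # hidden bytes that carry the security (toy parameter)

def toy_permutation(state: bytes) -> bytes:
    # Placeholder bijection on 32-byte states: rotate, then XOR in the
    # byte index.  Stands in for Keccak-p[1600,24] in this sketch.
    rotated = state[1:] + state[:1]
    return bytes(b ^ i for i, b in enumerate(rotated))

class DuplexPool:
    def __init__(self):
        self.state = bytes(RATE + CAPACITY)

    def enter(self, sample: bytes) -> None:
        # `feed': XOR the sample into the rate portion, permute.
        for start in range(0, len(sample), RATE):
            chunk = sample[start:start + RATE]
            rate = bytearray(self.state[:RATE])
            for i, b in enumerate(chunk):
                rate[i] ^= b
            self.state = toy_permutation(bytes(rate) + self.state[RATE:])

    def extract(self) -> bytes:
        # `fetch, then forget': output the rate, then overwrite it with
        # zeros and permute, so a later state compromise cannot be run
        # backwards to recover past outputs (key erasure).
        out = self.state[:RATE]
        self.state = toy_permutation(bytes(RATE) + self.state[RATE:])
        return out
```

The point of the `forget' step is that extract() destroys the rate it
just handed out before permuting, which is what provides the
backtracking resistance mentioned above.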

All of the logic is generic in terms of a permutation.  It currently
uses Keccak-p[1600,24], the same permutation as SHA-3 uses, but you
could drop in a different permutation if you wanted, like Gimli (which
I drafted for fun).  So what about BLAKE2?

Although BLAKE2 was derived from a permutation-based design, ChaCha,
the ChaCha permutation has various symmetries that have to be broken
by inputs beyond the adversary's control -- the constant words -- and
in BLAKE2 the fixed permutation was adapted into a keyed permutation,
i.e. a block cipher, also requiring a constant to break symmetries.
So neither one can just be dropped into the duplex construction
without analysis.

One could certainly cook up a scheme based on BLAKE2, and it might
provide better software performance -- in principle, anyway, if we
could do enough vectorization in the kernel! -- but:

(a) While BLAKE2 inspires confidence for what it does, there's no
_existing_ published well-understood confidence-inspiring
construction based on BLAKE2 -- or the BLAKE2 block cipher, or the
ChaCha permutation -- like a sponge duplex with the two operations
we need, entropy_enter and entropy_extract.  (Not that I know of,
anyway!)

(b) The performance of entropy_enter/entropy_extract is not critical:

- The input samples are usually low-volume -- even during high
  interrupt activity, the _cryptography_ operations are limited to
  when softints can get a word in edgewise; additional samples are
  just discarded until then, to prevent cryptography operations
  from adding to interrupt latency.

- The output is used only to draw a key for a PRNG, namely NIST
  Hash_DRBG with SHA-256 (formerly NIST CTR_DRBG with AES-128),
  which is what generates the data you read from /dev/urandom, and
  I'm not changing that algorithm at the moment.

So it'd be rather surprising if the cryptography in the entropy
pool itself turned out to be a bottleneck.
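The division of labor -- the pool supplies a key, a deterministic
generator expands it -- can be sketched as follows.  This is
illustrative only: a simplified hash ratchet of my own devising, not
NIST Hash_DRBG, which the kernel actually uses and which this does not
reproduce.

```python
import hashlib

class ToyHashGenerator:
    """Illustrative only: expands a pool-supplied key with SHA-256.

    Shows the shape of the pipeline -- reads of /dev/urandom come from
    a keyed deterministic generator, never from the pool directly, so
    pool cryptography runs only at (re)keying time.
    """
    def __init__(self, key: bytes):
        self.v = hashlib.sha256(b"init" + key).digest()

    def generate(self, nbytes: int) -> bytes:
        out = b""
        while len(out) < nbytes:
            out += hashlib.sha256(b"out" + self.v).digest()
            # Ratchet the internal state forward after each block.
            self.v = hashlib.sha256(b"next" + self.v).digest()
        return out[:nbytes]

# The key would come from entropy_extract; a fixed stand-in here.
gen = ToyHashGenerator(key=b"k" * 32)
data = gen.generate(64)
```

Since every urandom read costs one or two hash calls here but the pool
is touched only for the 32-byte key, the pool's own cryptography is off
the hot path -- which is the point of (b) above.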

Since the entropy pool is basically the single most security-critical
piece of infrastructure for any cryptography such as you need to
safely use the modern internet, and since it's not likely to be
performance-critical, I figured that it's less important that it be
potentially vectorizable to maximize throughput in software like
BLAKE2 -- and much more important that it be well-understood and
inspire confidence like the sponge duplex construction with Keccak
does.


1. Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche,
   `Sponge-Based Pseudo-Random Number Generators', in Stefan Mangard
   and François-Xavier Standaert, eds., Cryptographic Hardware and
   Embedded Systems -- CHES 2010, Springer LNCS 6225, pp. 33--47.
   https://link.springer.com/chapter/10.1007/978-3-642-15031-9_3
   https://keccak.team/files/SpongePRNG.pdf

2. Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche,
   `Duplexing the Sponge: Single-Pass Authenticated Encryption and
   Other Applications', in Ali Miri and Serge Vaudenay, eds., Selected
   Areas in Cryptography -- SAC 2011, Springer LNCS 7118, pp. 320--337.
   https://link.springer.com/chapter/10.1007/978-3-642-28496-0_19
   https://keccak.team/files/SpongeDuplex.pdf


Re: [PATCH] Kernel entropy rework

2019-12-26 Thread maya
On Sat, Dec 21, 2019 at 10:08:20PM +, Taylor R Campbell wrote:
> - Replace SHA-1/LFSR entropy pool by Keccak sponge.

The peanut gallery would like to inquire if you have a secret BLAKE2
version of this patch ready.


Re: [PATCH] Kernel entropy rework

2019-12-23 Thread Taylor R Campbell
> Date: Sun, 22 Dec 2019 01:27:58 +
> From: 
> 
> > On Dec 21, 2019, at 5:08 PM, Taylor R Campbell  wrote:
> >  - For (e.g.) keyboard interrupt and network packet timings, this
> >is zero, because an adversary can cause events to happen with
> >timing that leads to predictable samples entering the pool.
> 
> That seems overly pessimistic, depending on the timer resolution.
> If you have a CPU cycle timer, then it is perfectly reasonable to
> claim a bit or two of entropy, since an adversary doesn't have the
> ability to control the timing of those events to nanosecond
> accuracy, nor the ability to control internal processing delays
> (like memory cache misses) which introduce variability way in excess
> of a CPU cycle.

When I'm typing, there might be nonzero entropy in the cycle counts
between keystrokes on my 2.3 GHz laptop.  But I don't have a good
model, justified by physics, for the probability distribution on the
differences in cycle counts.  And I definitely don't have one that
works for all keyboards on all types of buses on all machines under
all typing styles by all adversaries, including my cat sleeping on my
keyboard.  Do you?

We can wave hands about cache misses, but adversaries often have a lot
of avenues for influencing the microarchitectural state of the CPU
via, e.g., network packets that trigger specific code paths.  So I'm
not willing to commit to higher entropy estimates for something that
was never designed to be unpredictable in the first place unless
there's a confidence-inspiring argument -- involving physics and
engineering of the hardware -- to justify it.

This may well be pessimistic.  It would be great if we could make
better promises to users -- but only if we can be honest in those
promises.  Right now we are trading honesty for expediency.  My hope
is that, for most users, everything else in this change and other
recent changes will render moot the motivation for that bargain in the
first place.


Of course, driver authors have also taken claims by various hardware
RNG vendors at face value:

- Sometimes they are at least backed up by design documents, like the
  Intel RDRAND/RDSEED.

- Sometimes they are backed up by external audits, like the VIA C3
  XSTORE RNG.

- Sometimes they are just opaque device registers or DMA commands and
  you have to do science on them yourself, like the Allwinner Sun8i
  Crypto Engine TRNG, which seems to have at most 1 bit of min-entropy
  for every 100 bits of data (and is currently disabled by default
  until I have a better idea of what to expect from it).
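For `doing science' on an opaque source, a crude first look is the
most-common-value min-entropy estimate: -log2 of the empirical mode's
frequency.  This sketch is mine, not from the patch, and it does not
replace the full battery of estimators in NIST SP 800-90B.

```python
import math
from collections import Counter

def min_entropy_per_sample(samples):
    """Most-common-value estimate of min-entropy, in bits per sample.

    Upper-bounds the real min-entropy from the empirical distribution;
    a heavily biased source scores close to zero.
    """
    counts = Counter(samples)
    p_max = max(counts.values()) / len(samples)
    return -math.log2(p_max)

# A heavily biased source: 99 zeros and a single one.
biased = [0] * 99 + [1]
print(min_entropy_per_sample(biased))   # ~0.0145 bits per sample
```

On that kind of evidence, crediting at most 1 bit per 100 bits of
output, as above for the Sun8i TRNG, is the honest reading.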

What can I honestly promise works?  Flip a coin 256 times and run
`echo hthhthhhthhh... >> /dev/random'; then the random seed
management of rndctl should take care of you for the foreseeable
future.  Of course, most users don't have the patience for that.
Fortunately, most users do not have Intel in their threat model; for
those users it is reasonable to rely on Intel RDRAND/RDSEED, for
instance.


Re: [PATCH] Kernel entropy rework

2019-12-22 Thread Paul.Koning



> On Dec 21, 2019, at 5:08 PM, Taylor R Campbell  wrote:
> 
> 
> 
> The attached patch set reworks the kernel entropy subsystem.
> 
> ...
>  - For (e.g.) keyboard interrupt and network packet timings, this
>is zero, because an adversary can cause events to happen with
>timing that leads to predictable samples entering the pool.

That seems overly pessimistic, depending on the timer resolution.  If you have 
a CPU cycle timer, then it is perfectly reasonable to claim a bit or two of 
entropy, since an adversary doesn't have the ability to control the timing of 
those events to nanosecond accuracy, nor the ability to control internal 
processing delays (like memory cache misses) which introduce variability way in 
excess of a CPU cycle.

paul


Re: [PATCH] Kernel entropy rework

2019-12-21 Thread Taylor R Campbell
I guess I should have included a single unified diff rather than just
a git format-patch set.  From a CVS tree, the attached
entropy-unified.patch can be applied with

cd /path/to/src && patch -p1 < /path/to/entropy-unified.patch

(And from a git tree, you can apply the patch set in the original
email with `git am < /path/to/entropy.git'.)


Some background, from a cryptographer, on the historical system of
entropy `estimation' and `depletion' which has been essentially
unchanged in Linux and NetBSD since it was introduced in the 1990s,
and which I'm proposing to change now to better reflect modern
cryptography:

https://research.nccgroup.com/2019/12/19/on-linuxs-random-number-generation/