Re: kaslr: better rng

2017-11-11 Thread Thor Lancelot Simon
On Sat, Nov 11, 2017 at 09:23:39PM +, Taylor R Campbell wrote:
> 
> Speaking of which, you should read 256 bits out of rdseed, not 64, and
> fall back to rdrand if rdseed is not available.

cpu_rng already has the code needed to do this -- best to use it, perhaps?

Thor


Re: kaslr: better rng

2017-11-11 Thread Taylor R Campbell
> Date: Sat, 11 Nov 2017 21:48:54 +0100
> From: Maxime Villard 
> 
> The hash of y against rdtsc and rdseed is the new secret, not just y. It's
> not a recurrent Yn+1 = H(Yn) construction from a single Y0 seed - in which
> case one leak in the chain would indeed compromise the randomness of the
> following generations. It's rather Yn+1 = H(Yn, rdtsc, rdseed), where even
> if Yn is entirely revealed, you still can't predict Yn+1 reliably.
> 
> But I understand that re-using potentially revealed information in
> the hash weakens the prng in probably non-negligible ways, and that
> splitting the output as you said is better.

A little more to the point, the kernel's cryptography should be secure
if _either_

(a) the original seed on disk is unpredictable, _or_
(b) the seed we get out of rdseed is unpredictable.

I don't expect more than a few bits of the tsc to be unpredictable,
and not every machine supports rdseed.  If we don't assume they're
unpredictable, then it reduces to Yn+1 = H(Yn) for some predictable H.

Speaking of which, you should read 256 bits out of rdseed, not 64, and
fall back to rdrand if rdseed is not available.

> > Rather, we should carve up the output of SHAKE256 into a new 32-byte secret
> > and an n-byte buffer of outputs for some convenient n:
> > 
> > nseed(32) || buffer(480) = SHAKE256(oseed(32) || whatever else)
> 
> The output size we use is 32 bytes, not 512. So I guess we should increase it
> to 64, keep the lower half secret (and pass it to the hash function), and use
> the upper half as a random (and potentially revealed) buffer?

Whatever's convenient.  It doesn't really matter for security.

> (Side note regarding "revealed": if 32byte-worth of
> randomly-generated kernel VAs were to be revealed, we would be
> having a much bigger problem in the first place.)

Yes, but it's easier to engineer this by using a generically secure
PRNG; then I don't even have to think about quantifying the impact of
downstream leaks.

> I would favor certified algorithms, not that my opinion matters lot in this
> regard.

I care about security, understandability, and maintainability, not
about certification, but as I understand it the reason we use
AES128-CTR-DRBG for /dev/random is to make certification easier.  tls@
can probably comment on this.

> > Also, where do you update the seed that is then passed to rnd_seed to
> > initialize the real kernel entropy pool?
> 
> I don't, and that's a problem. To update the seed we need to
> recompute its SHA1 signature, and I'm not implementing SHA1 in the
> prekern (yet?). Either we extend entropy-file to contain three
> separate seeds (bootloader / prekern / kernel), or we choose to use
> SHA3 by default in the kernel and have the prekern update the seed
> and recompute its signature dynamically.

Blahh!  I don't even know why we have the SHA1 hash there.

Can you just use the SHA1 in libkern (and the SHA3 that will with any
luck soon be in libkern), or are there constraints on the size of the
prekern that prevent you from doing so?


Re: kaslr: better rng

2017-11-11 Thread Maxime Villard

Le 11/11/2017 à 20:14, Taylor R Campbell a écrit :

If we _both_ reveal y = H(x) for some initial secret x, and then use y
as the new secret, we've just revealed the new secret. 


The hash of y against rdtsc and rdseed is the new secret, not just y. It's
not a recurrent Yn+1 = H(Yn) construction from a single Y0 seed - in which
case one leak in the chain would indeed compromise the randomness of the
following generations. It's rather Yn+1 = H(Yn, rdtsc, rdseed), where even
if Yn is entirely revealed, you still can't predict Yn+1 reliably.

But I understand that re-using potentially revealed information in the hash
weakens the prng in probably non-negligible ways, and that splitting the output
as you said is better.


Rather, we should carve up the output of SHAKE256 into a new 32-byte secret
and an n-byte buffer of outputs for some convenient n:

nseed(32) || buffer(480) = SHAKE256(oseed(32) || whatever else)


The output size we use is 32 bytes, not 512. So I guess we should increase it
to 64, keep the lower half secret (and pass it to the hash function), and use
the upper half as a random (and potentially revealed) buffer?

(Side note regarding "revealed": if 32byte-worth of randomly-generated kernel
VAs were to be revealed, we would be having a much bigger problem in the first
place.)


See  for further
discussion of this structure (and criticism of the obscenely
complicated NIST DRBGs in SP800-90A).

It may be simpler to use a Keccak duplex like the draft entropy pool
I've been sitting on for a while, which has some theoretically nicer
properties[1] though hasn't yet appeared in a NIST standard the way
SHAKE256 has.  Input from someone who is likely to care about
certification more than I do would be appreciated here.


I would favor certified algorithms, not that my opinion matters lot in this
regard.


Also, where do you update the seed that is then passed to rnd_seed to
initialize the real kernel entropy pool?


I don't, and that's a problem. To update the seed we need to recompute its
SHA1 signature, and I'm not implementing SHA1 in the prekern (yet?). Either
we extend entropy-file to contain three separate seeds (bootloader / prekern /
kernel), or we choose to use SHA3 by default in the kernel and have the
prekern update the seed and recompute its signature dynamically.


Finally, some minor nits:

- Don't call it `rndpool', because that has a specific meaning in
   NetBSD already, which is something else.  Just `struct prng' is
   fine, if confined to that file.


alright


- Define functions with foo(void), not foo(), to get prototype
   checking from the C compiler.

- Newline before function name in definitions; see share/misc/style
   for more details.

- No need for casts between void * and other types.

- Use unsigned, not u_int; u_int, u_intN_t, , are vestiges of older
   days.

- Avoid externs in .c files.  Include the right header file for the
   declaration of bootinfo.


These are common in the prekern and not particularly related to the prng, so
they will be addressed later. Also need to get rid of redef.h etc.

Maxime


Re: kaslr: better rng

2017-11-11 Thread Taylor R Campbell
> Date: Sat, 11 Nov 2017 15:48:12 +0100
> From: Maxime Villard 
> 
> It is based on the SHAKE256 hash function, which can produce a variable
> sized output. We use an area of 32 bytes, and regenerate it as many times
> as needed.
> 
> The first time, it is generated with:
> 
>   area = SHAKE256(entropy-file, rdseed, rdtsc)
> 
> When all of the bytes in the area have been consumed, it is regenerated
> this way:
> 
>   area = SHAKE256(area, rdseed, rdtsc)

If we _both_ reveal y = H(x) for some initial secret x, and then use y
as the new secret, we've just revealed the new secret.  Rather, we
should carve up the output of SHAKE256 into a new 32-byte secret and
an n-byte buffer of outputs for some convenient n:

nseed(32) || buffer(480) = SHAKE256(oseed(32) || whatever else)

See  for further
discussion of this structure (and criticism of the obscenely
complicated NIST DRBGs in SP800-90A).

It may be simpler to use a Keccak duplex like the draft entropy pool
I've been sitting on for a while, which has some theoretically nicer
properties[1] though hasn't yet appeared in a NIST standard the way
SHAKE256 has.  Input from someone who is likely to care about
certification more than I do would be appreciated here.

Also, where do you update the seed that is then passed to rnd_seed to
initialize the real kernel entropy pool?

Finally, some minor nits:

- Don't call it `rndpool', because that has a specific meaning in
  NetBSD already, which is something else.  Just `struct prng' is
  fine, if confined to that file.

- Define functions with foo(void), not foo(), to get prototype
  checking from the C compiler.

- Newline before function name in definitions; see share/misc/style
  for more details.

- No need for casts between void * and other types.

- Use unsigned, not u_int; u_int, u_intN_t, , are vestiges of older
  days.

- Avoid externs in .c files.  Include the right header file for the
  declaration of bootinfo.


[1] https://keccak.team/files/SpongePRNG.pdf


SHA-3

2017-11-11 Thread Taylor R Campbell
Back in 2014, I proposed to import a simple SHA-3 implementation into
libc and libkern.  There were two main objections:

1. SHA-3 wasn't finalized yet, so it might change.
2. OpenSSL might add an incompatible SHA-3 API like they did SHA-2.

The SHA-3 standard has since been finalized (with no changes), so
objection (1) is moot.

OpenSSL has not added a SHA-3 API like they did SHA-2, only an EVP
method for SHA-3, so objection (2) might be moot, but of course who
can say with OpenSSL.

That said, right now, I want this for kernel and boot loader use, so
I'm not even proposing to make the SHA-3 symbols public in libc yet.
However, since we may want to expose this, I've put it in src/common
and made sure there's build goo that works for libc -- just no new
installed header files or public symbols yet.  Later we can either
publish them or remove them easily.

Patch attached, derived from the portable C SHA-3 code I wrote at
.

Objections?
Index: common/lib/libc/Makefile.inc
===
RCS file: /cvsroot/src/common/lib/libc/Makefile.inc,v
retrieving revision 1.16
diff -p -u -r1.16 Makefile.inc
--- common/lib/libc/Makefile.inc10 Aug 2014 23:25:49 -  1.16
+++ common/lib/libc/Makefile.inc11 Nov 2017 18:16:14 -
@@ -4,7 +4,7 @@
 
 COMMON_DIR:=${.PARSEDIR}
 COMMON_CODEDIRS=atomic gen gmon inet md net stdlib string sys
-COMMON_CODEDIRS+=hash/sha1 hash/sha2 hash/rmd160 hash/murmurhash
+COMMON_CODEDIRS+=hash/sha1 hash/sha2 hash/sha3 hash/rmd160 hash/murmurhash
 
 .if defined(COMMON_MACHINE_ARCH) && !empty(COMMON_MACHINE_ARCH) && \
 exists(${COMMON_DIR}/arch/${COMMON_MACHINE_ARCH})
Index: common/lib/libc/hash/sha3/keccak.c
===
RCS file: common/lib/libc/hash/sha3/keccak.c
diff -N common/lib/libc/hash/sha3/keccak.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ common/lib/libc/hash/sha3/keccak.c  11 Nov 2017 18:16:14 -
@@ -0,0 +1,186 @@
+/*-
+ * Copyright (c) 2015 Taylor R. Campbell
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include 
+
+#if defined(_KERNEL) || defined(_STANDALONE)
+__KERNEL_RCSID(0, "$NetBSD$");
+
+#include 
+#else
+__RCSID("$NetBSD$");
+
+#include 
+#endif
+
+#include "keccak.h"
+
+#definesecret  /* can't use in variable-time operations, should zero */
+
+#defineFOR5(X, STMT) do
  \
+{\
+   (X) = 0; STMT;\
+   (X) = 1; STMT;\
+   (X) = 2; STMT;\
+   (X) = 3; STMT;\
+   (X) = 4; STMT;\
+} while (0)
+
+static inline secret uint64_t
+rol64(secret uint64_t v, unsigned c)
+{
+
+   return ((v << c) | (v >> (64 - c)));
+}
+
+static inline void
+keccakf1600_theta(secret uint64_t A[25])
+{
+   secret uint64_t C0, C1, C2, C3, C4;
+   unsigned y;
+
+   C0 = C1 = C2 = C3 = C4 = 0;
+   FOR5(y, {
+   C0 ^= A[0 + 5*y];
+   C1 ^= A[1 + 5*y];
+   C2 ^= A[2 + 5*y];
+   C3 ^= A[3 + 5*y];
+   C4 ^= A[4 + 5*y];
+   });
+   FOR5(y, {
+   A[0 + 5*y] ^= C4 ^ rol64(C1, 1);
+   A[1 + 5*y] ^= C0 ^ rol64(C2, 1);
+   A[2 + 5*y] ^= C1 ^ rol64(C3, 1);
+   A[3 + 5*y] ^= C2 ^ rol64(C4, 1);
+ 

Re: kaslr: better rng

2017-11-11 Thread Maxime Villard

Following a conversation with Taylor, I ended up with the following
implementation for the prekern [1] [2]. It uses a set of seeds that are
hashed together in rounds, and it doesn't use an additional file.

It is based on the SHAKE256 hash function, which can produce a variable
sized output. We use an area of 32 bytes, and regenerate it as many times
as needed.

The first time, it is generated with:

area = SHAKE256(entropy-file, rdseed, rdtsc)

When all of the bytes in the area have been consumed, it is regenerated
this way:

area = SHAKE256(area, rdseed, rdtsc)

The SHAKE/Keccak code is from Taylor, I just added prng_* wrappers.

rdseed and rdtsc each give a 8byte seed, and entropy-file gives a 512byte
one. We don't checksum the latter, because we would need SHA1, which I am
not implementing here.

Feel free to tell me if there's something obviously wrong in all of this;
I won't hide that PRNGs are not things I work on every day.

[1] http://m00nbsd.net/garbage/prekern/prng.c
[2] http://m00nbsd.net/garbage/prekern/prng.diff