Re: [PATCH] drivers/virt: vmgenid: add vm generation id driver

2020-10-17 Thread Colm MacCarthaigh




On 17 Oct 2020, at 6:24, Jason A. Donenfeld wrote:


There are a few design goals of notifying userspace: it should be
fast, because people who are using userspace RNGs are usually doing so
in the first place to completely avoid syscall overhead for whatever
high performance application they have - e.g. I recall conversations
with Colm about his TLS implementation needing to make random IVs
_really_ fast.


That’s our old friend TLS 1.1 in CBC mode, which needs a random 
explicit IV for every record sent. Speed is still a reason at the 
margins in cases like that, but getrandom() is really fast. A stickier 
problem is that getrandom() is not certified for use with every 
compliance standard, and those often dictate precise use of some NIST 
DRBG or NRBG construction. That keeps people proliferating user-space 
RNGs even when speed isn’t as important.
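
For reference, a minimal sketch of the getrandom() path for a
per-record CBC IV. The names fresh_record_iv and CBC_IV_LEN are made up
for illustration and do not come from any TLS library; a
compliance-driven deployment would put its certified DRBG here instead.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/random.h>
#include <sys/types.h>

#define CBC_IV_LEN 16 /* AES block size in bytes (illustrative name) */

/* Fill iv with CBC_IV_LEN bytes from the kernel RNG.
 * Returns 0 on success, -1 on failure. */
static int fresh_record_iv(uint8_t iv[CBC_IV_LEN])
{
    size_t got = 0;

    while (got < CBC_IV_LEN) {
        ssize_t n = getrandom(iv + got, CBC_IV_LEN - got, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal, retry */
            return -1;
        }
        got += (size_t)n;
    }
    return 0;
}
```

This costs one syscall per record, which is the overhead a user-space
RNG is trying to avoid.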



It should also happen as early as possible, with no
race or as minimal as possible race window, so that userspace doesn't
begin using old randomness and then switch over after the damage is
already done.


+1 to this, and I’d add that anyone making VM snapshots that they plan 
to restore from multiple times really needs to think this through top to 
bottom. The system would likely need to be put into some kind of 
quiescent state when the snapshot is taken.



So, anyway, here are a few options with some pros and cons for the
kernel notifying userspace that its RNG should reseed.

1. SIGRND - a new signal. Lol.

2. Userspace opens a file descriptor that it can epoll on. Pros are
that many notification mechanisms already use this. Cons are that this
requires a syscall and might be more racy than we want. Another con is
that this is a new thing for userspace programs to do.


A library like OpenSSL or BoringSSL also has to account for running 
inside a chroot, which also makes this hard.



Any thoughts on 4c? Is that utterly insane, or does that actually get
us somewhere close to what we want?


I still like 4c, and as a user-space crypto-person, and a VM person, 
it has a lot of appeal. Alex and Adrian’s replies get into some of 
the sufficiency challenge. But for user-space libraries like the *SSLs, 
the JVMs, and other runtimes where RNGs show up, it could plug in easily 
enough.


-
Colm



Re: [PATCH] drivers/virt: vmgenid: add vm generation id driver

2020-10-17 Thread Colm MacCarthaigh




On 16 Oct 2020, at 21:02, Jann Horn wrote:

On Sat, Oct 17, 2020 at 5:36 AM Willy Tarreau wrote:
But in userspace, we just need a simple counter. There's no need for
us to worry about anything else, like timestamps or whatever. If we
repeatedly fork a paused VM, the forked VMs will see the same counter
value, but that's totally fine, because the only thing that matters to
userspace is that the counter changes when the VM is forked.


For user-space, even a single bit would do. We added MADV_WIPEONFORK 
so that userspace libraries can detect fork()/clone() robustly, for the 
same reasons. It just wipes a page as the indicator, which is 
effectively a single-bit signal, and it works well. On the user-space 
side of this, I’m keen to find a solution like that that we can use 
fairly easily inside of portable libraries and applications. The “have 
I forked” checks do end up in hot paths, so it’s nice if they can be 
CPU cache friendly. Comparing a whole 128-bit value wouldn’t be my 
favorite.
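
A minimal sketch of that kind of fork check, assuming Linux 4.14 or
later where madvise() accepts MADV_WIPEONFORK. The fork_guard_* names
are invented for illustration and are not taken from any library.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18 /* Linux value, for older libc headers */
#endif

/* One page that the kernel zeroes in any child created by fork(). */
static volatile unsigned char *fork_guard;

/* Map and arm the guard page. Returns 0 on success, -1 on failure
 * (for example when the kernel does not support MADV_WIPEONFORK). */
static int fork_guard_init(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    if (madvise(p, len, MADV_WIPEONFORK) != 0) {
        munmap(p, len);
        return -1;
    }
    fork_guard = p;
    fork_guard[0] = 1; /* nonzero: still the process that seeded */
    return 0;
}

/* Hot-path check: a single byte load. Returns 1 if the page was wiped,
 * meaning this process is a fork of the one that armed the guard. */
static int fork_guard_tripped(void)
{
    return fork_guard[0] == 0;
}

/* Re-arm the guard in the current process after reseeding. */
static void fork_guard_rearm(void)
{
    fork_guard[0] = 1;
}
```

An RNG would call fork_guard_tripped() before producing output and, if
it returns 1, reseed and then call fork_guard_rearm().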



And actually, since the value is a cryptographically random 128-bit
value, I think that we should definitely use it to help reseed the
kernel's RNG, and keep it secret from userspace. That way, even if the
VM image is public, we can ensure that going forward, the kernel RNG
will return securely random data.


If the image is public, you need some extra new raw entropy from 
somewhere. The gen-id could be mixed in, that can’t do any harm as 
long as rigorous cryptographic mixing with the prior state is used, but 
if that’s all you do then the final state is still deterministic and 
non-secret. The kernel would need to use the change as a trigger to 
measure some entropy (e.g. interrupts and RDRAND, or whatever). Or just 
define the machine contract as “this has to be unique random data and 
if it’s not unique, or if it’s public, you’re toast”.



-
Colm





Re: [PATCH] drivers/virt: vmgenid: add vm generation id driver

2020-10-17 Thread Colm MacCarthaigh




On 16 Oct 2020, at 22:01, Jann Horn wrote:


On Sat, Oct 17, 2020 at 6:34 AM Colm MacCarthaigh wrote:
For user-space, even a single bit would do. We added MADV_WIPEONFORK
so that userspace libraries can detect fork()/clone() robustly, for the
same reasons. It just wipes a page as the indicator, which is
effectively a single-bit signal, and it works well. On the user-space
side of this, I’m keen to find a solution like that that we can use
fairly easily inside of portable libraries and applications. The “have
I forked” checks do end up in hot paths, so it’s nice if they can be
CPU cache friendly. Comparing a whole 128-bit value wouldn’t be my
favorite.


I'm pretty sure a single bit is not enough if you want to have a
single page, shared across the entire system, that stores the VM
forking state; you need a counter for that.


You’re right. WIPEONFORK is more like a single bit per use. If it’s 
something system-wide then a counter is better.
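
A rough sketch of why the counter works system-wide: no consumer can
clear a shared bit on behalf of the others, but each consumer can keep
its own copy of a shared counter and compare. The shared word here is
hypothetical (nothing in this thread defines such an interface), and
the vm_gen_* names are invented for illustration.

```c
#include <stdint.h>

/* Per-consumer view of a hypothetical system-wide generation counter,
 * e.g. a word in a read-only page that is bumped on every VM fork. */
struct vm_gen_watch {
    const volatile uint64_t *shared; /* counter shared by everyone */
    uint64_t seen;                   /* value at our last reseed */
};

static void vm_gen_watch_init(struct vm_gen_watch *w,
                              const volatile uint64_t *shared)
{
    w->shared = shared;
    w->seen = *shared;
}

/* Hot-path check: one load and one compare. Returns 1 if the counter
 * moved since this consumer last looked, and records the new value. */
static int vm_gen_changed(struct vm_gen_watch *w)
{
    uint64_t now = *w->shared;

    if (now == w->seen)
        return 0;
    w->seen = now;
    return 1;
}
```

Each library or process holds its own struct vm_gen_watch, so one
consumer noticing the change does not hide it from the others.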



So the RNG state after mixing in the new VM Generation ID would
contain 128 bits of secret entropy not known to anyone else, including
people with access to the VM image.

Now, 128 bits of cryptographically random data aren't _optimal_; I
think something on the order of 256 bits would be nicer from a
theoretical standpoint. But in practice I think we'll be good with the
128 bits we're getting (since the number of users who fork a VM image
is probably not going to be so large that worst-case collision
probabilities matter).


This reminds me of key/IV usage limits for AES encryption, where the 
same birthday bounds apply, and even though 256 bits would be better, we 
routinely make 128-bit birthday bounds work for massively scalable 
systems.
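
To put a number on that, here is a small sketch of the usual birthday
approximation, p ~= n(n-1)/2 / 2^bits, for n independent uniformly
random values. The function name is made up for illustration, and the
approximation only holds while the result is much smaller than 1.

```c
/* Approximate probability that at least two of n independent, uniformly
 * random `bits`-bit values collide: n*(n-1)/2 pairs, each colliding
 * with probability 2^-bits. */
static double birthday_collision_bound(double n, int bits)
{
    double p = n * (n - 1.0) / 2.0; /* number of distinct pairs */

    for (int i = 0; i < bits; i++)
        p /= 2.0; /* divide by 2^bits without needing libm */
    return p;
}
```

With bits = 128 and n = 2^32 forks of the same image, this gives
roughly 2^-65.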



The kernel would need to use the change as a trigger to
measure some entropy (e.g. interrupts and RDRAND, or whatever). Or just
define the machine contract as “this has to be unique random data and
if it’s not unique, or if it’s public, you’re toast”.


As far as I can tell from Microsoft's spec, that is a guarantee we're
already getting.


Neat.

-
Colm