Re: get_random_bytes returns bad randomness before seeding is complete
On Friday, June 2, 2017, at 16:59:56 CEST, Jason A. Donenfeld wrote: Hi Jason, > Alternatively, I'm open to other solutions people might come up with. One addition: there is an issue (I would call it a bug) in random.c before 4.8 where the nonblocking_pool is not reseeded during early boot even though entropy may be available [1]. That issue aggravates early boot time entropy issues for user and kernel land. I have not heard whether the patch will be accepted or rejected, so I am wondering how patches go into random.c at all. [1] https://patchwork.kernel.org/patch/9620431/ Ciao Stephan
Re: get_random_bytes returns bad randomness before seeding is complete
On Sunday, June 4, 2017, at 00:54:39 CEST, Jeffrey Walton wrote: Hi Jeffrey, > On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris wrote: > > ... > > Of course this will fail on systems with no high-res timer. Are there > > still some of those? It might take about 1000 times as long on a > > system that lacks the realtime library's nanosecond timer but has the > > POSIX standard microsecond timer, implying a delay time in the > > milliseconds. Would that be acceptable in those cases? > > A significant portion of the use cases should include mobile devices. > Device sales outnumbered desktop and server sales several years ago. > > Many devices are sensor rich. Even the low-end ones come with > accelerometers for gaming. A typical one has 3 or 4 sensors, and > higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them. > I think those devices are covered by kernels 4.8+. That kernel uses interrupts as the sole noise source for the first stage we are talking about here. I have not done any particular measurements with the latest kernels on mobile devices, but based on my experience with my LRNG assessment, I could well imagine that mobile devices have a fully seeded ChaCha20 DRNG before user space starts. Just to give an illustration: I have a Lenovo T540 which receives more than 256 interrupts before late_initcall. On all systems with a high-res timer, each interrupt will give more than one bit of entropy. Conversely, on my MacBook Pro 2015, at late_initcall the kernel has received fewer than 100 interrupts. In a KVM guest with very few devices, I also see some 100 interrupts before late_initcall. These measurements were taken with the same kernel and the same kernel config. Ciao Stephan
Re: get_random_bytes returns bad randomness before seeding is complete
On Sun, Jun 4, 2017 at 1:48 AM, Stephan Müller wrote: > On Friday, June 2, 2017, at 16:59:56 CEST, Jason A. Donenfeld wrote: > >> Alternatively, I'm open to other solutions people might come up with. > > How about stirring in some data from the Jitter RNG that we have in the kernel > already and that is used for the DRBG in case get_random_bytes has > insufficient entropy? Yes, two kernel developers said that this RNG is > useless, where in fact a lot of hardware and even crypto folks say that this > approach has merit. Almost anything has to be better than (1) silent failures, and (2) draining the little entropy available while the generators are starting and trying to become operational. The [negative] use case for (2) is systemd. See, for example, https://github.com/systemd/systemd/issues/4167. Jeff
Re: get_random_bytes returns bad randomness before seeding is complete
On Friday, June 2, 2017, at 16:59:56 CEST, Jason A. Donenfeld wrote: Hi Jason, > > Alternatively, I'm open to other solutions people might come up with. How about stirring in some data from the Jitter RNG that we have in the kernel already and that is used for the DRBG in case get_random_bytes has insufficient entropy? Yes, two kernel developers said that this RNG is useless, where in fact a lot of hardware and even crypto folks say that this approach has merit. In any case, it cannot destroy the (not present) entropy at boot time anyway. Thus, take some 32, 48 or 64 bytes from it right at the start of the kernel, and we should be better off (from the viewpoint of quite a few folks) or at least no worse off (from the viewpoint of the two developers here). As this RNG does not depend on any in-kernel facility, it is available at any time. PS: I could revive a patch adding this to random.c that I sent long ago if desired. Ciao Stephan
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
On 3 June 2017 at 18:54, Jeffrey Walton wrote: > On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris wrote: >> ... >> Of course this will fail on systems with no high-res timer. Are there >> still some of those? It might take about 1000 times as long on a >> system that lacks the realtime library's nanosecond timer but has the >> POSIX standard microsecond timer, implying a delay time in the >> milliseconds. Would that be acceptable in those cases? > > A significant portion of the use cases should include mobile devices. > Device sales outnumbered desktop and server sales several years ago. > > Many devices are sensor rich. Even the low-end ones come with > accelerometers for gaming. A typical one has 3 or 4 sensors, and > higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them. > > There's no want of entropy in many of the use cases. The thing > that is lacking seems to be taking advantage of it. > > Jeff Hardware random number generator support is also standard on even low-end mobile devices. The Linux kernel now knows to feed some of the entropy from those hardware random generators into the kernel CSPRNG when the driver is initialized, but that doesn't happen until fairly late in the kernel's boot process. The sensors present the same issue. They aren't available when the kernel starts needing entropy for features like SSP and KASLR or other early boot uses, unlike RDRAND/RDSEED on modern x86_64 CPUs. For userspace, Android's init system blocks until a certain amount of entropy is obtained from one for the kernel CSPRNG. It's possible for there to be no hwrandom, but I think that's very rare now, since the standard SoCs used everywhere have it available. The device vendor would probably need to go out of their way to break it. Android also regularly saves a persistent random seed and restores it on boot.
It also mixes in entropy from the hardware generator regularly, since the kernel didn't know how to do that before, just as it didn't know how to grab any initial entropy from the hardware generator. I don't think it's worth worrying too much about mobile. Slimmer embedded devices are the real issue: in many cases they probably don't even save/restore a seed, or they generate keys on first boot before a restored seed could help. At least, that is, if you're not focused on KASLR and other early probabilistic kernel exploit mitigations, where right now there's no way to get entropy in early boot unless the bootloader helps.
Re: get_random_bytes returns bad randomness before seeding is complete
On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris wrote: > ... > Of course this will fail on systems with no high-res timer. Are there > still some of those? It might take about 1000 times as long on a > system that lacks the realtime library's nanosecond timer but has the > POSIX standard microsecond timer, implying a delay time in the > milliseconds. Would that be acceptable in those cases? A significant portion of the use cases should include mobile devices. Device sales outnumbered desktop and server sales several years ago. Many devices are sensor rich. Even the low-end ones come with accelerometers for gaming. A typical one has 3 or 4 sensors, and higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them. There's no want of entropy in many of the use cases. The thing that is lacking seems to be taking advantage of it. Jeff
Re: get_random_bytes returns bad randomness before seeding is complete
Stephan's driver, the HAVEGE system, and several others purport to extract entropy from a series of timer calls. Probably the best analysis is in the Mcguire et al. paper at https://static.lwn.net/images/conf/rtlws11/random-hardware.pdf & the simplest code is in my user-space driver at https://github.com/sandy-harris/maxwell The only kernel-space code I know of is Stephan's. If the claim that such calls give entropy is accepted (which I think it should be), then if we get one bit per call, need 100 or so bits, and space the calls 100 ns apart, loading up a decent chunk of startup entropy takes about 10,000 ns, or 10 microseconds, which looks like an acceptable delay. Can we just do that very early in the boot process? Of course this will fail on systems with no high-res timer. Are there still some of those? It might take about 1000 times as long on a system that lacks the realtime library's nanosecond timer but has the POSIX standard microsecond timer, implying a delay time in the milliseconds. Would that be acceptable in those cases?
Re: get_random_bytes returns bad randomness before seeding is complete
On Sat, Jun 3, 2017 at 7:04 AM, Theodore Ts'o wrote: > has been pretty terrible? > This kind of "my shit doesn't stink, but yours does", is not > The reason why I keep harping on this is because I'm concerned about > an absolutist attitude towards technical design, where the good is the Moving past that, did you see the [PATCH RFC 0/3] series I posted yesterday? It would be helpful to have your feedback on that approach and implementation strategy. Since it seems you prefer cleaning things up individually, rather than the systemic rnginit solution I initially proposed, I moved forward with implementing an RFC version of that. I'm pretty sure that compromising so quickly and going with what I perceived you thought was best is a strong indication that there isn't an "absolutist attitude towards technical design". However, if you do somehow find evidence for that kind of claim in my [PATCH] set, please do bring it up, and I'll try to adjust to be more pleasing. > We're going to have to look at a representative sample of the call > sites to figure this out. The simple case is where the call site is > only run in response to a userspace system call. There, blocking > makes perfect sense. I'm just not sure there are many callers of > get_random_bytes() where this is the case. In the patch series I sent earlier, the reason I split things into wait_for_random_bytes, which just blocks until the pool is ready, and then the convenience combiner get_random_bytes_wait, which calls wait_for_random_bytes and then get_random_bytes, is that I was thinking there might be a few places where we can't actually sleep during the get_random_bytes call, due to in_interrupt() or whatever, but where there's some process-context path that's _always_ called before get_random_bytes, like a userspace configuration API or an ioctl, so we could simply put a call to wait_for_random_bytes there, and then be sure that all calls to get_random_bytes after that are safe.
I guess I'll see in practice whether this is actually a useful way of doing it, once I dig in and start modifying representative call sites. > When would a timeout be useful? If you are using get_random_bytes() > for security reasons, does the security reason go away after 15 > seconds? Or even 30 seconds? I was thinking that returning to userspace with -ETIMEDOUT or something might be more desirable in some odd situations (which ones?) than just waiting for a signal and responding with -EINTR/-ERESTARTSYS. That might turn out not to be true, in which case I guess I won't add that API, as you suggested. > Also, it is possible that we may have architectures, without > fine-grained clocks, where we don't initialize the rng until after > userspace has started running. So it's not clear adding a rnginit > section makes sense. Even if we put it as late as possible --- say, > after "late", what do we do if we don't have the CRNG fully > initialized after the last of the "late" drivers have been run? My idea was that it would eventually be inserted on the callback from add_random_ready_callback. You're right that this would not be okay for things like filesystems, but maybe it'd be appropriate for things like crypto/rng.c? Or perhaps the blocking API at configuration time would be better anyway for things like that. You seem wary of this approach, so I'm going to roll with your suggestions above and see how they work out. If it pans out, great; if not, maybe we'll revisit this down the road once I have a better picture of what the call sites are like. Jason
Re: get_random_bytes returns bad randomness before seeding is complete
On Sat, Jun 03, 2017 at 01:58:25AM +0200, Jason A. Donenfeld wrote: > Hi Ted, > > Based on the tone of your last email, before I respond to your > individual points May I gently point out that *your* tone that started this whole thread has been pretty terrible? Quoting your first message: "Yes, yes, you have arguments for why you're keeping this pathological, but you're still wrong, and this api is still a bug." This kind of "my shit doesn't stink, but yours does" is not particularly useful if you are trying to have a constructive discussion. If you think the /dev/urandom question wasn't appropriate to discuss in this thread, then why the *hell* did you bring it up *first*? The reason why I keep harping on this is because I'm concerned about an absolutist attitude towards technical design, where the good is the enemy of the perfect, and where bricking machines in the pursuit of cryptographic perfection is considered laudable. Yes, if you apply thermite to the CPU and the hard drives, the system will be secure. But it also won't be terribly useful. And to me, the phrase "you're still wrong" doesn't seem to admit any concessions towards acknowledging that there might be another side to the security versus usability debate. And I'd ***really*** rather not waste my time trying to work with someone who doesn't care about usability, because my focus is on practical engineering, which means that if no one uses the system, or can use the system, then I consider it a failure, even if it is 100% secure. And this is true no matter whether we're talking about /dev/urandom or some enhancement to get_random_bytes(). > Do you have any opinions on the issue of what the most flexible API > would be? There are a lot of varieties of wait_event. The default one > is uninterruptible, but as Stephan and Herbert saw when they were > working on this, it could be dangerous in certain > circumstances not to allow interruption. So probably we'd want an > interruptible variety.
But then maybe we want to support the other > wait_event_* modes too, like killable, timeout, hrtimeout, io, and so > on. There's a huge list. These are all implemented as macros in > wait.h, and I don't really want to have a different get_random_bytes_* > function that corresponds to _each_ of these wait_event_* functions, > since that'd be overwhelming. We're going to have to look at a representative sample of the call sites to figure this out. The simple case is where the call site is only run in response to a userspace system call. There, blocking makes perfect sense. I'm just not sure there are many callers of get_random_bytes() where this is the case. There are definitely quite a few call sites of get_random_bytes() where the caller is in the middle of an interrupt service routine, or is holding a spinlock. There, we are *not* allowed to block. So any kind of blocking interface isn't going to be allowed; the async callback is the only thing which is guaranteed to work, in fact. Mercifully, many of these are cases where prandom_u32() makes sense. > So I was thinking maybe a simple flag and a timeout param could work: When would a timeout be useful? If you are using get_random_bytes() for security reasons, does the security reason go away after 15 seconds? Or even 30 seconds? > > Or maybe we can then help figure out what percentage of the callsites > > can be fixed with a synchronous interface, and fix some number of them > > just to demonstrate that the synchronous interface does work well. > > I was planning on doing this to at least a couple callsites in my > upcoming patch series that adds the synchronous interface. It seems > like the right way to see if the API is good or not. This is absolutely the right approach. See my above comments about why there may not be that many places where a synchronous interface will work out. And I'm going to be rather surprised if there are places where having a timeout parameter will be helpful.
But I could be wrong; let's see if we can find cases where we (a) need secure bytes, (b) are allowed to block, and (c) where a timeout would make sense. > Right. In my initial email I proposed a distinction: > > - External module loading is usually in a different process context, > so multiple modules can be attempted to be loaded at the same time. In > this case, it is probably safe to block in request_module. > - Built-in modules are loaded ± linearly, from init/main.c, so these > really can't block each other. For this, I was thinking of introducing > an rnginit section, to add to the list of things like lateinit. The > rnginit drivers would be loaded _after_ the RNG has initialized, so > that they don't get blocked. ... except that we already have an order in which drivers are added, so moving a driver to the rnginit section might mean that some kernel module that depends on that driver being loaded also needs to be moved after rnginit. We already have the following
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
The only sensible & general solution for the initialisation problem that I have seen is John Denker's. http://www.av8n.com/computer/htm/secure-random.htm#sec-boot-image If I read that right, it would require only minor kernel changes & none to the API Ted & others are worrying about. It would be secure except against an enemy who can read your kernel image or interfere with your install process. Assuming permissions are set sensibly, that means an enemy who already has root & such an enemy has lots of much easier ways to break things, so we need not worry about that case. The difficulty is that it would require significant changes to installation scripts. Still, since it is a general solution to a real problem, it might be better to implement that rather than work on the other suggestions in the thread.
Re: get_random_bytes returns bad randomness before seeding is complete
Hi Ted, Based on the tone of your last email, before I respond to your individual points, I think it's worth noting that the intent of this thread is to get a sampling of opinions on the issue of get_random_bytes, so that I can write a patch that fixes this issue (or a series of issues) using some strategy that we agree is a good one. I've already implemented prototypes of a few different ideas, so that I can discuss them competently here (a personal thing, I tend to learn best by "doing"), and so with list feedback, I'll know in which direction I should go for refinement eventually leading to a [PATCH] submission. In case it wasn't evident from my last patch series for drivers/char/random.c, despite not being corporately backed, I really have fun working on this part of the kernel in my free time. With that said... On Fri, Jun 2, 2017 at 9:07 PM, Theodore Ts'o wrote: > It's not _my_ API, it's *our* API > much don't break backwards compatibility, > broken, and they complain, and you tell them to suck it, > If you are using the word *you*, and speaking as an outsider to the > community, then you can kvetch all you like. But you're an outsider, > take your complaints seriously. As I wrote earlier, if you want to bike-shed the technical aspects and political aspects of fixing /dev/urandom, we can do that in another thread. It's obviously a complex topic worthy of discussion, and I'd like that discussion to be technically worthy of the issues at hand, some of which you've raised here. So please, start a new topic for that, and I'll happily follow up, all with the intent of writing a patch series and running compatibility tests and whatnot; as I said before, I like working on this stuff. However, the focus of this thread is get_random_bytes, so please try to stay focused, lest this get derailed. So, you wrote: > And if your answer is just, "blah blah di blah blah", don't be > surprised if others respond to you in exactly the same way.
> Specifically, by saying to you (in your words), "I don't care". I really don't care about bikeshedding this with you here, now, in this thread. I care about solving the get_random_bytes situation. >> While we're on the topic of that, you might consider adding a simple >> synchronous interface. > > There's that word "you" again Yea, whatever, Ted. Can I rephrase? "As the maintainer of drivers/char/random.c, you might consider adding a synchronous interface so that the consumers of the file you maintain can benefit from it; alternatively, and perhaps easier for you, express your support for such an idea, and I'll gladly write the patch." Better? I ask you to grant me the benefit of the doubt that this is what I had in mind when I wrote "you" -- that I'm seeking approval of the idea before moving my code past prototype stage. You're the boss, so whether I write the patch or you write the patch or someone else writes the patch, it really _is_ you whose consideration matters the most. > This is open source --- want to send patches? It sounds like it's a > workable, good idea. Awesome, happy you like it. Yes, absolutely, I'll start cleaning things up and will send a series. Do you have any opinions on the issue of what the most flexible API would be? There are a lot of varieties of wait_event. The default one is uninterruptible, but as Stephan and Herbert saw when they were working on this, it could be dangerous in certain circumstances not to allow interruption. So probably we'd want an interruptible variety. But then maybe we want to support the other wait_event_* modes too, like killable, timeout, hrtimeout, io, and so on. There's a huge list. These are all implemented as macros in wait.h, and I don't really want to have a different get_random_bytes_* function that corresponds to _each_ of these wait_event_* functions, since that'd be overwhelming.
So I was thinking maybe a simple flag and a timeout param could work: int get_random_bytes_seeded(u8 *buf, size_t len, bool is_interruptible, unsigned long timeout); If timeout==0, the timeout is infinite. If is_interruptible is true, we use wait_event_interruptible_*, otherwise we use wait_event_*, and the ret value is the usual return value from wait_event_*. Does that seem like a good simplification that will cover most use cases? We could of course add ever more complexity and choice, but I was thinking this would cover most use cases, at least to begin with. What do you think? > ...or that I disagree with your prior point. I think you're being > lazy, and trying to make it someone else's problem and standing on the > sidelines and complaining, as opposed to trying to help solve the > problem. Um, no, what an offensive insinuation. I'm trying to solve the problem. I'd like your honest feedback, since you're the maintainer, before I start fixing this. > No, of course we can't audit all of the code, but it's probably a good > idea to take a random sample, and to analyze them, so we can get a > sense
Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 02, 2017 at 07:44:04PM +0200, Jason A. Donenfeld wrote: > On Fri, Jun 2, 2017 at 7:26 PM, Theodore Ts'o wrote: > > I tried making /dev/urandom block. > > So if you're a security focused individual who is kvetching > > And if we're breaking > > Yes yes, bla bla, predictable response. I don't care. Your API is > still broken. Excuses excuses. Yes, somebody needs to do the work in > the end, maybe that person can be me, maybe you, maybe somebody else. It's not _my_ API, it's *our* API --- that is, the Linux kernel community's. And part of the rules of this community is that we very much don't break backwards compatibility, unless there is a really good reason, where Linus gets to decide if it's a really good reason. So if you care a lot about this issue, then you need to do the work to make the change, and part of it is showing, to a high degree of certainty, that it won't break backwards compatibility. Because if you don't, and you flout community norms, and users get broken, and they complain, and you tell them to suck it, then Linus will pull out his patented clue stick, and tell you that you have in fact flouted community norms, correct you publicly, and then revert your change. If you are using the word *you*, and speaking as an outsider to the community, then you can kvetch all you like. But you're an outsider, and we don't have to listen to you. But if you want to make a positive difference here, and you're passionate about it --- this is what you would need to do. That being said, we're all volunteers, so if you don't want to bother, that's fine. But then don't be surprised if we don't take your complaints seriously. > While we're on the topic of that, you might consider adding a simple > synchronous interface.
There's that word "you" again > I realize that the get_blocking_random_bytes > attempt was aborted as soon as it began, because of issues of > cancelability, but you could just expose the usual array of wait, > wait_interruptible, wait_killable, etc, or just make that wait object > and condition non-static so others can use it as needed. Having to > wrap the current asynchronous API like this kludge is kind of a > downer: This is open source --- want to send patches? It sounds like it's a workable, good idea. > No, what it means is that the particularities of individual examples I > picked at random don't matter. Are we really going to take the time to > audit each and every callsite to see "do they need actually random > data? can this be called pre-userspace?" I mentioned this in my > initial email. As I said there, I think analyzing all the cases one by > one is fragile, and more will pop up, and that's not really the right > way to approach this. And furthermore, as alluded to above, even > fixing clearly-broken places means using that hard-to-use asynchronous > API, which adds even more potentially buggy TCB to these drivers and > all the rest. Not a good strategy. > > The fact that you took the time to actually respond to the > _particularities_ of each individual random example I picked could > indicate that you've missed this point. ...or that I disagree with your prior point. I think you're being lazy, and trying to make it someone else's problem and standing on the sidelines and complaining, as opposed to trying to help solve the problem. No, of course we can't audit all of the code, but it's probably a good idea to take a random sample, and to analyze them, so we can get a sense of what the issues are. And then maybe we can find a way to quickly identify a class of users that can be easily fixed by using prandom_u32() (for example).
Or maybe we can then help figure out what percentage of the callsites can be fixed with a synchronous interface, and fix some number of them just to demonstrate that the synchronous interface does work well. > Right, it was him and Stephan (CCd). They initially started by adding > get_blocking_random_bytes, but then replaced this with the > asynchronous one, because they realized it could block forever. As I > said above, though, I still think a blocking API would be useful, > perhaps just with more granularity for the way in which it blocks. It depends on where it's being used. If it's part of module load, especially if it's one that's done automatically, having something that blocks forever might not be all that useful. Especially if it blocks device drivers from being able to be initialized enough to actually supply entropy to the whole system. Or maybe (in the case of stack canaries), the answer is we should start with crappy random numbers, but then once the random number generator has been initialized, we can use the callback to get cryptographically secure random numbers, and then we need to figure out how to phase out use of the old crappy random numbers and substitute in the exclusive use of the good random numbers. Because saying that we'll just simply not allow any processes to start
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 2, 2017 at 10:41 AM, Daniel Micay wrote: > On Fri, 2017-06-02 at 17:53 +0200, Jason A. Donenfeld wrote: >> (Meanwhile...) >> >> In my own code, I'm currently playing with a workaround that looks >> like this: >>
>> --- a/src/main.c
>> +++ b/src/main.c
>>
>> +#include <linux/random.h>
>> +#include <linux/completion.h>
>> +
>> +struct rng_initializer {
>> +	struct completion done;
>> +	struct random_ready_callback cb;
>> +};
>> +static void rng_initialized_callback(struct random_ready_callback *cb)
>> +{
>> +	complete(&container_of(cb, struct rng_initializer, cb)->done);
>> +}
>> +
>>  static int __init mod_init(void)
>>  {
>>  	int ret;
>> +	struct rng_initializer rng = {
>> +		.done = COMPLETION_INITIALIZER(rng.done),
>> +		.cb = { .owner = THIS_MODULE, .func = rng_initialized_callback }
>> +	};
>> +
>> +	ret = add_random_ready_callback(&rng.cb);
>> +	if (!ret)
>> +		wait_for_completion(&rng.done);
>> +	else if (ret != -EALREADY)
>> +		return ret;
>>
>>  	do_things_with_get_random_bytes_maybe();
>> Depending on the situation, however, I could imagine >> wait_for_completion never returning, if it's blocking the very activity >> that contributes to the seed becoming available, if this is called >> from a compiled-in module, so I find this a bit sub-optimal... > One of the early uses is initializing the stack canary value for SSP in > very early boot. If that blocks, it's going to be blocking nearly > anything else from happening. > > On x86, that's only the initial canary since the per-task canaries end > up being used, but elsewhere at least without SMP disabled or changes to > GCC that's all there is, so the entropy matters. And just to note, building with GCC_PLUGIN_LATENT_ENTROPY, while it (correctly) doesn't credit entropy to the pool, should at least make the pool less deterministic between boots. -Kees -- Kees Cook Pixel Security
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 2, 2017 at 7:41 PM, Daniel Micay wrote: > One of the early uses is initializing the stack canary value for SSP in > very early boot. If that blocks, it's going to be blocking nearly > anything else from happening. > > On x86, that's only the initial canary since the per-task canaries end > up being used, but elsewhere at least without SMP disabled or changes to > GCC that's all there is so the entropy matters. If this is the case, then we simply need a function called get_random_bytes_but_potentially_crappy_ones_because_we_are_desperate_for_anything(), which would come with a weaker guarantee than get_random_bytes(), which the documentation says always returns cryptographically secure numbers.
Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 2, 2017 at 7:26 PM, Theodore Ts'o wrote: > I tried making /dev/urandom block. > So if you're a security focused individual who is kvetching > And if we're breaking Yes yes, bla bla, predictable response. I don't care. Your API is still broken. Excuses excuses. Yes, somebody needs to do the work in the end, maybe that person can be me, maybe you, maybe somebody else. Regardless, we can take this to a different thread if you'd like to bikeshed about that particular point for another five millennia. But that isn't the main topic of this thread, so I'm not going to get sucked into that black hole. > the time, we hadn't yet created the asynchronous interfaces that allow > kernel code to do the right thing, even if it was massively > inconvenient. While we're on the topic of that, you might consider adding a simple synchronous interface. I realize that the get_blocking_random_bytes attempt was aborted as soon as it began, because of issues of cancelability, but you could just expose the usual array of wait, wait_interruptible, wait_killable, etc, or just make that wait object and condition non-static so others can use it as needed. Having to wrap the current asynchronous API like this kludge is kind of a downer: https://git.zx2c4.com/WireGuard/diff/src/config.c?id=c77145a8248dfc49e307eae7d7edd5fdca8d5559 > At this point, I think we should be completely open to making it be a > config option, and if it looks like for common configs we're not > spamming dmesg, we could even make it be the default. We just also > need to give driver writers some easy-to-understand recipes to fix > their drivers to do the right thing. If we do that first, it's much > more likely they will actually fix their kernel code, instead of just > turning the warning off. That's a good point. In my initial email I looked down on the whack-a-mole approach, because nobody is ever going to analyze all of those cases. But...
if dmesgs are going wild for some people, we'll have upset users doing the hard work for us. So maybe it's worthwhile to turn that warning on by default.

> This is basically the exponential backoff which is used in ethernet
> networks, and the *real* answer is that they should be using
> prandom_u32().
> I think in practice most IBM customers will be safe because they tend

No, what it means is that the particularities of individual examples I picked at random don't matter. Are we really going to take the time to audit each and every call site to see "do they actually need random data? can this be called pre-userspace?" I mentioned this in my initial email. As I said there, I think analyzing all the cases one by one is fragile, and more will pop up, and that's not really the right way to approach this. Furthermore, as alluded to above, even fixing clearly broken places means using that hard-to-use asynchronous API, which adds even more potentially buggy TCB to these drivers and all the rest. Not a good strategy. That you took the time to respond to the _particularities_ of each individual random example I picked could indicate that you've missed this point.

>> And so on and so on and so on. If you grep the source as I did, you'll
>> find there's no shortage of head-scratchers. "Hmmm," you ask yourself,
>> "could this be called before userspace has ensured that /dev/urandom
>> is seeded? do we actually _need_ good randomness here, or does it not
>> matter?" I guess you could try to reason about each and every one of
>> them -- you might even have those same questions about the silly
>> examples I pasted above -- but that one-by-one methodology seems
>> excessively fragile and arduous.
>
> This is a fair point.

Glad we're on the same page about that.

> That developer was Herbert Xu, and he added it about two years ago.
> See commit 205a525c3342.

Right, it was him and Stephan (CCed).
They initially started by adding get_blocking_random_bytes, but then replaced it with the asynchronous one, because they realized it could block forever. As I said above, though, I still think a blocking API would be useful, perhaps just with more granularity in the way it blocks.

> So the problem with doing this by default, for all modules, is that on
> those platforms which don't have good hardware support for seeding the
> non-blocking pool quickly, if we defer all modules, we will also be
> deferring the means by which we get the entropy needed to seed the
> non-blocking pool. So for example, if you have a bunch of networking
> drivers that are using get_random_bytes() for exponential backoff,
> when they *should* be using prandom_u32(), if we don't fix *that* bug,
> simply deferring the module load will also defer the entropy input
> from the networking interrupts from the networking card.

"We shouldn't fix legitimate cryptographic bugs, because the rest of our codebase is too buggy to support anything reasonable." Sounds like it'd be worthwhile to begin fixing
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, 2017-06-02 at 17:53 +0200, Jason A. Donenfeld wrote:
> (Meanwhile...)
>
> In my own code, I'm currently playing with a workaround that looks
> like this:
>
> --- a/src/main.c
> +++ b/src/main.c
>
> +#include <linux/completion.h>
> +#include <linux/random.h>
>
> +struct rng_initializer {
> +	struct completion done;
> +	struct random_ready_callback cb;
> +};
> +static void rng_initialized_callback(struct random_ready_callback *cb)
> +{
> +	complete(&container_of(cb, struct rng_initializer, cb)->done);
> +}
> +
>  static int __init mod_init(void)
>  {
>  	int ret;
> +	struct rng_initializer rng = {
> +		.done = COMPLETION_INITIALIZER(rng.done),
> +		.cb = { .owner = THIS_MODULE, .func = rng_initialized_callback }
> +	};
> +
> +	ret = add_random_ready_callback(&rng.cb);
> +	if (!ret)
> +		wait_for_completion(&rng.done);
> +	else if (ret != -EALREADY)
> +		return ret;
>
>  	do_things_with_get_random_bytes_maybe();
>
> Depending on the situation, however, I could imagine
> wait_for_completion never returning, if it's blocking the very
> activity that contributes to the seed becoming available -- say, if
> this is called from a compiled-in module -- so I find this a bit
> sub-optimal...

One of the early uses is initializing the stack canary value for SSP in very early boot. If that blocks, it's going to block nearly everything else from happening. On x86, that's only the initial canary, since the per-task canaries end up being used; but elsewhere, at least without SMP disabled or changes to GCC, that's all there is, so the entropy matters.
Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 02, 2017 at 04:59:56PM +0200, Jason A. Donenfeld wrote:
> /dev/urandom will return bad randomness before it's seeded, rather than
> blocking, and despite years and years of discussion and bike shedding,
> this behavior hasn't changed, and the vast majority of cryptographers
> and security experts remain displeased. It _should_ block until it's
> been seeded for the first time, just like the new getrandom() syscall,
> and this important bug fix (which is not a real api break) should be
> backported to all stable kernels. (Userspace doesn't even have a
> reliable way of determining whether or not it's been seeded!) Yes,
> yes, you have arguments for why you're keeping this pathological, but
> you're still wrong, and this api is still a bug. Forcing userspace
> architectures to work around your bug, as your arguments usually go,
> is disquieting.

I tried making /dev/urandom block. The zero day kernel testing within a few hours told us that Ubuntu LTS at the time and OpenWrt would fail to boot. And when Python made a change to call getrandom(2) with the default blocking semantics instead of using /dev/urandom, some distributions using systemd stopped booting. So if you're a security focused individual who is kvetching that we're asking to force userspace to change to fix "a bug", you need to understand that making the change you're suggesting will *also* require making changes to userspace. And if you want to go try to convince Linus Torvalds that it's OK to make a backwards-incompatible change that breaks Ubuntu LTS and OpenWrt by causing them to fail to boot --- be my guest. And if we're breaking distributions so that they fail to boot, when their use of /dev/urandom is not security critical, I suspect Linus is not going to be impressed that you are breaking those users for no benefit.
(For example, Python's use of getrandom(2) was to prevent denial of service attacks when Python is used as a fast-CGI web server script, and it was utterly pointless from a security perspective to block when a Python script was creating on-demand systemd unit files, where the DoS threat was completely irrelevant.)

> As the printk implies, it's possible for get_random_bytes() to be
> called before it's seeded. Why was that helpful printk put behind an
> ifdef? I suspect because people would see the lines in their dmesg and
> write emails like this one to the list.

The #ifdef and the printk were there from the very beginning:

commit 392a546dc8368d1745f9891ef3f8f7c380de8650
Author: Theodore Ts'o
Date:   Sun Nov 3 18:24:08 2013 -0500

    random: add debugging code to detect early use of get_random_bytes()

    Signed-off-by: "Theodore Ts'o"

It was four years ago, so I don't remember the exact circumstances, but if I recall, the issue was not wanting to spam the dmesg. Also, at the time, we hadn't yet created the asynchronous interfaces that allow kernel code to do the right thing, even if it was massively inconvenient.

At this point, I think we should be completely open to making it be a config option, and if it looks like for common configs we're not spamming dmesg, we could even make it be the default. We just also need to give driver writers some easy-to-understand recipes to fix their drivers to do the right thing. If we do that first, it's much more likely they will actually fix their kernel code, instead of just turning the warning off.

> drivers/net/ieee802154/at86rf230.c:
>
> static int at86rf230_probe(struct spi_device *spi)
> {
> 	...
> 	get_random_bytes(csma_seed, ARRAY_SIZE(csma_seed));
>
> No clue what this driver is for or if it actually needs good
> randomness, but using get_random_bytes in a hardware probe function
> can happen prior to userspace's ability to seed.
What, you mean as a computer scientist you didn't immediately understand that csma refers to CSMA/CA --- "Carrier-sense multiple access with collision avoidance"? For shame! :-)

This is basically the exponential backoff which is used in ethernet networks, and the *real* answer is that they should be using prandom_u32().

> arch/s390/crypto/prng.c seems to call get_random_bytes on module load,
> which can happen pre-userspace, for seeding some kind of RNG of its
> own.

It appears to be accessing a hardware cryptographic processor function, and seems to do its own entropy gathering, relying on stack garbage as well as using get_random_bytes(). By default it is defined as a module, so I suspect in most situations it's called post-userspace, but agreed that it is not guaranteed to be the case. I think in practice most IBM customers will be safe, because they tend to use distro kernels that will almost certainly be building it as a module, but your point is well taken.

> And so on and so on and so on. If you grep the source as I did, you'll
> find there's no shortage of head-scratchers. "Hmmm," you ask yourself,
> "could this be called before userspace has ensured that /dev/urandom
> is
Re: get_random_bytes returns bad randomness before seeding is complete
Further investigations: if the whack-a-mole approach is desirable, perhaps many of those get_random_bytes calls should be converted to get_blocking_random_bytes. In that case, this commit, which removed this helpful API, should be reverted:

commit c2719503f5e1e6213d716bb078bdad01e28ebcbf
Author: Herbert Xu
Date:   Tue Jun 9 18:19:42 2015 +0800

    random: Remove kernel blocking API

    This patch removes the kernel blocking API as it has been completely
    replaced by the callback API.

    Signed-off-by: Herbert Xu

diff --git a/drivers/char/random.c b/drivers/char/random.c
index a1576ed1d88e..d0da5d852d41 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1265,18 +1265,6 @@ void get_random_bytes(void *buf, int nbytes)
 EXPORT_SYMBOL(get_random_bytes);
 
 /*
- * Equivalent function to get_random_bytes with the difference that this
- * function blocks the request until the nonblocking_pool is initialized.
- */
-void get_blocking_random_bytes(void *buf, int nbytes)
-{
-	if (unlikely(nonblocking_pool.initialized == 0))
-		wait_event(urandom_init_wait, nonblocking_pool.initialized);
-	extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
-}
-EXPORT_SYMBOL(get_blocking_random_bytes);
-
-/*
Re: get_random_bytes returns bad randomness before seeding is complete
(Meanwhile...)

In my own code, I'm currently playing with a workaround that looks like this:

--- a/src/main.c
+++ b/src/main.c

+#include <linux/completion.h>
+#include <linux/random.h>

+struct rng_initializer {
+	struct completion done;
+	struct random_ready_callback cb;
+};
+static void rng_initialized_callback(struct random_ready_callback *cb)
+{
+	complete(&container_of(cb, struct rng_initializer, cb)->done);
+}
+
 static int __init mod_init(void)
 {
 	int ret;
+	struct rng_initializer rng = {
+		.done = COMPLETION_INITIALIZER(rng.done),
+		.cb = { .owner = THIS_MODULE, .func = rng_initialized_callback }
+	};
+
+	ret = add_random_ready_callback(&rng.cb);
+	if (!ret)
+		wait_for_completion(&rng.done);
+	else if (ret != -EALREADY)
+		return ret;

 	do_things_with_get_random_bytes_maybe();

Depending on the situation, however, I could imagine wait_for_completion never returning, if it's blocking the very activity that contributes to the seed becoming available -- say, if this is called from a compiled-in module -- so I find this a bit sub-optimal...
get_random_bytes returns bad randomness before seeding is complete
Hi folks,

This email is about an issue with get_random_bytes(), the CSPRNG used inside the kernel for generating keys and nonces and whatnot. However, I will begin with an aside: /dev/urandom will return bad randomness before it's seeded, rather than blocking, and despite years and years of discussion and bike shedding, this behavior hasn't changed, and the vast majority of cryptographers and security experts remain displeased. It _should_ block until it's been seeded for the first time, just like the new getrandom() syscall, and this important bug fix (which is not a real api break) should be backported to all stable kernels. (Userspace doesn't even have a reliable way of determining whether or not it's been seeded!) Yes, yes, you have arguments for why you're keeping this pathological, but you're still wrong, and this api is still a bug. Forcing userspace architectures to work around your bug, as your arguments usually go, is disquieting.

Anyway, that's not what this email is about, but given that as a backdrop, here's a different-but-related, and perhaps more petulant, issue...

The problem this email intends to address is this:

void get_random_bytes(void *buf, int nbytes)
{
#if DEBUG_RANDOM_BOOT > 0
	if (!crng_ready())
		printk(KERN_NOTICE "random: %pF get_random_bytes called "
		       "with crng_init = %d\n", (void *) _RET_IP_, crng_init);
#endif
	...
	extract_crng(buf);

Or, on older kernels:

void get_random_bytes(void *buf, int nbytes)
{
#if DEBUG_RANDOM_BOOT > 0
	if (unlikely(nonblocking_pool.initialized == 0))
		printk(KERN_NOTICE "random: %pF get_random_bytes called "
		       "with %d bits of entropy available\n",
		       (void *) _RET_IP_, nonblocking_pool.entropy_total);
#endif
	trace_get_random_bytes(nbytes, _RET_IP_);
	extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
}

As the printk implies, it's possible for get_random_bytes() to be called before it's seeded. Why was that helpful printk put behind an ifdef?
I suspect because people would see the lines in their dmesg and write emails like this one to the list.

I anticipate an argument coming from the never-block /dev/urandom cartel that goes something along the lines of, "get_random_bytes() is only called in paths initiated by a syscall; therefore, userspace must ensure /dev/urandom, which is the same pool as get_random_bytes, has been properly seeded before making any syscalls that might lead to get_random_bytes() being called." If you've already given up on the general initiative of trying to urge them to make /dev/urandom block until seeded, then this might seem like a reasonable argument. If /dev/urandom is broken, it doesn't matter if get_random_bytes() is broken too, if the required work-around for one is the same as for the other. But the premise is flawed: get_random_bytes() can be called before any syscalls are made, before userspace even has an opportunity to ensure /dev/urandom is seeded. That's what that printk in the ifdef is all about. Bad news bears.

Grepping through the source tree for get_random_bytes, I just opened a few files at random to see what the deal was. Here's one:

drivers/net/ieee802154/at86rf230.c:

static int at86rf230_probe(struct spi_device *spi)
{
	...
	get_random_bytes(csma_seed, ARRAY_SIZE(csma_seed));

No clue what this driver is for or if it actually needs good randomness, but using get_random_bytes in a hardware probe function can happen prior to userspace's ability to seed.

arch/s390/crypto/prng.c seems to call get_random_bytes on module load, which can happen pre-userspace, for seeding some kind of RNG of its own.

net/ipv6/addrconf.c:

static void __ipv6_regen_rndid(struct inet6_dev *idev)
{
regen:
	get_random_bytes(idev->rndid, sizeof(idev->rndid));

This is in the networking stack, when bringing up an interface and assigning randomized v6 addresses.
While you might argue that userspace is required for networking, remember that the kernel actually has the ability to initiate networking all on its own, before userspace; the kernel even has its own DHCP client! Yowza. In fact, on that note:

net/ipv4/ipconfig.c:

static int __init ic_open_devs(void)
{
	...
	get_random_bytes(&d->xid, sizeof(__be32));

And so on and so on and so on. If you grep the source as I did, you'll find there's no shortage of head-scratchers. "Hmmm," you ask yourself, "could this be called before userspace has ensured that /dev/urandom is seeded? do we actually _need_ good randomness here, or does it not matter?" I guess you could try to reason about each and every one of them -- you might even have those same questions about the silly examples I pasted above -- but that one-by-one methodology seems excessively fragile and arduous.

There must have been a developer who thought about this at least once before, because random.c does have a callback notifier mechanism for drivers to learn