Re: get_random_bytes returns bad randomness before seeding is complete
On Friday, June 2, 2017, at 16:59:56 CEST, Jason A. Donenfeld wrote: Hi Jason, > Alternatively, I'm open to other solutions people might come up with. One addition: there is an issue (I would call it a bug) in random.c before 4.8 where the nonblocking_pool is not reseeded during early boot even though entropy may be available [1]. That issue aggravates early boot time entropy issues for user and kernel land. I have not heard whether the patch will be accepted or rejected, so I am wondering how patches go into random.c at all. [1] https://patchwork.kernel.org/patch/9620431/ Ciao Stephan
Re: get_random_bytes returns bad randomness before seeding is complete
On Sunday, June 4, 2017, at 00:54:39 CEST, Jeffrey Walton wrote: Hi Jeffrey, > On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris wrote: > > ... > > Of course this will fail on systems with no high-res timer. Are there > > still some of those? It might take about 1000 times as long on a > > system that lacks the realtime library's nanosecond timer but has the > > POSIX standard microsecond timer, implying a delay time in the > > milliseconds. Would that be acceptable in those cases? > > A significant portion of the use cases should include mobile devices. > Device sales outnumbered desktop and server sales several years ago. > > Many devices are sensor rich. Even the low-end ones come with > accelerometers for gaming. A typical one has 3 or 4 sensors, and > higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them. > I think those devices are covered by kernels 4.8+. That kernel uses interrupts as the sole noise source for the first stage we are talking about here. I have not done any particular measurements with the latest kernels on mobile devices, but based on my experience with my LRNG assessment, I could well imagine that mobile devices have a fully seeded ChaCha20 DRNG before user space starts. Just to give an illustration: I have a Lenovo T540 which receives more than 256 interrupts before late_initcall. On all systems with a high-res timer, each interrupt will give more than one bit of entropy. Conversely, on my MacBook Pro 2015, at late_initcall the kernel has received fewer than 100 interrupts. In a KVM guest with very few devices, I also see some 100 interrupts before late_initcall. These measurements were taken with the same kernel and the same kernel config. Ciao Stephan
Re: get_random_bytes returns bad randomness before seeding is complete
On Sun, Jun 4, 2017 at 1:48 AM, Stephan Müller wrote: > On Friday, June 2, 2017, at 16:59:56 CEST, Jason A. Donenfeld wrote: > >> Alternatively, I'm open to other solutions people might come up with. > > How about stirring in some data from the Jitter RNG that we have in the kernel > already and that is used for the DRBG in case get_random_bytes has > insufficient entropy? Yes, two kernel developers said that this RNG is > useless, where in fact a lot of hardware and even crypto folks say that this > approach has merit. Almost anything has to be better than (1) silent failures, and (2) draining the little entropy available while the generators are starting and trying to become operational. The [negative] use case for (2) is systemd. See, for example, https://github.com/systemd/systemd/issues/4167. Jeff
Re: get_random_bytes returns bad randomness before seeding is complete
On Friday, June 2, 2017, at 16:59:56 CEST, Jason A. Donenfeld wrote: Hi Jason, > > Alternatively, I'm open to other solutions people might come up with. How about stirring in some data from the Jitter RNG that we have in the kernel already and that is used for the DRBG in case get_random_bytes has insufficient entropy? Yes, two kernel developers said that this RNG is useless, where in fact a lot of hardware and even crypto folks say that this approach has merit. In any case, it cannot destroy the (not present) entropy at boot time anyway. Thus, take some 32, 48 or 64 bytes from it right at the start of the kernel, and we should be better off (from the viewpoint of quite a few folks) or at least no worse off (from the viewpoint of the two developers here). As this RNG does not depend on any in-kernel facility, it is available at any time. PS: I could revive a patch adding this to random.c that I sent long ago if desired. Ciao Stephan
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
On 3 June 2017 at 18:54, Jeffrey Walton wrote: > On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris wrote: >> ... >> Of course this will fail on systems with no high-res timer. Are there >> still some of those? It might take about 1000 times as long on a >> system that lacks the realtime library's nanosecond timer but has the >> POSIX standard microsecond timer, implying a delay time in the >> milliseconds. Would that be acceptable in those cases? > > A significant portion of the use cases should include mobile devices. > Device sales outnumbered desktop and server sales several years ago. > > Many devices are sensor rich. Even the low-end ones come with > accelerometers for gaming. A typical one has 3 or 4 sensors, and > higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them. > > There's no want of entropy in many of the use cases. The thing > that is lacking seems to be taking advantage of it. > > Jeff Hardware random number generator support is also standard on even low-end mobile devices. The Linux kernel now knows to feed some of the entropy from those hardware random generators into the kernel CSPRNG when the driver is initialized, but that doesn't happen until fairly late in the kernel's boot process. The sensors present the same issue. They aren't available when the kernel starts needing entropy for features like SSP and KASLR or other early boot uses, unlike RDRAND/RDSEED on modern x86_64 CPUs. For userspace, Android's init system blocks until a certain amount of entropy is obtained from one for the kernel CSPRNG. It's possible for there to be no hwrandom, but I think that's very rare now, since the standard SoCs used everywhere have it available. The device vendor would probably need to go out of their way to break it. Android also regularly saves a persistent random seed and restores it on boot.
It also mixes in entropy from the hardware generator regularly, since the kernel didn't know how to do that before, just as it didn't know how to grab any initial entropy from the hardware generator. I don't think it's worth worrying too much about mobile. Slimmer embedded devices are the real issue: in many cases they probably don't even save/restore a seed, or they generate keys on first boot before a restored seed could help. At least, that is, if you're not focused on KASLR and other early probabilistic kernel exploit mitigations, where right now there's no way to get entropy in early boot unless the bootloader helps.
Re: get_random_bytes returns bad randomness before seeding is complete
On Sat, Jun 3, 2017 at 5:45 PM, Sandy Harris wrote: > ... > Of course this will fail on systems with no high-res timer. Are there > still some of those? It might take about 1000 times as long on a > system that lacks the realtime library's nanosecond timer but has the > POSIX standard microsecond timer, implying a delay time in the > milliseconds. Would that be acceptable in those cases? A significant portion of the use cases should include mobile devices. Device sales outnumbered desktop and server sales several years ago. Many devices are sensor rich. Even the low-end ones come with accelerometers for gaming. A typical one has 3 or 4 sensors, and higher-end ones have 7 or 8 sensors. An Evo 4G has 7 of them. There's no want of entropy in many of the use cases. The thing that is lacking seems to be taking advantage of it. Jeff
Re: get_random_bytes returns bad randomness before seeding is complete
Stephan's driver, the HAVEGE system, and several others purport to extract entropy from a series of timer calls. Probably the best analysis is in the Mcguire et al. paper at https://static.lwn.net/images/conf/rtlws11/random-hardware.pdf & the simplest code is in my user-space driver at https://github.com/sandy-harris/maxwell The only kernel-space code I know of is Stephan's. If the claim that such calls give entropy is accepted (which I think it should be), then if we get one bit per call, need 100 or so bits, and space the calls 100 ns apart, loading up a decent chunk of startup entropy takes about 10,000 ns, or 10 microseconds, which looks like an acceptable delay. Can we just do that very early in the boot process? Of course this will fail on systems with no high-res timer. Are there still some of those? It might take about 1000 times as long on a system that lacks the realtime library's nanosecond timer but has the POSIX standard microsecond timer, implying a delay time in the milliseconds. Would that be acceptable in those cases?
Re: get_random_bytes returns bad randomness before seeding is complete
On Sat, Jun 3, 2017 at 7:04 AM, Theodore Ts'o wrote: > has been pretty terrible? > This kind of "my shit doesn't stink, but yours does", is not > The reason why I keep harping on this is because I'm concerned about > an absolutist attitude towards technical design, where the good is the Moving past that, did you see the [PATCH RFC 0/3] series I posted yesterday? It would be helpful to have your feedback on that approach and implementation strategy. Since it seems you prefer cleaning things up individually, rather than the systemic rnginit solution I initially proposed, I moved forward with implementing an RFC version of that. I'm pretty sure that compromising so quickly and going with what I perceived you thought was best is a strong indication that there isn't an "absolutist attitude towards technical design". However, if you do somehow find evidence for that kind of claim in my [PATCH] set, please do bring it up, and I'll try to adjust to be more pleasing. > We're going to have to look at a representative sample of the call > sites to figure this out. The simple case is where the call site is > only run in response to a userspace system call. There, blocking > makes perfect sense. I'm just not sure there are many callers of > get_random_bytes() where this is the case. In the patch series I sent earlier, the reason I split things into wait_for_random_bytes, which just blocks until the pool is ready, and then the convenience combiner get_random_bytes_wait, which calls wait_for_random_bytes and then get_random_bytes, is that I was thinking there might be a few places where we can't actually sleep during the get_random_bytes call, due to in_interrupt() or whatever, but where there's some process-context path that's _always_ called before get_random_bytes, like a userspace configuration API or an ioctl, so we could simply put a call to wait_for_random_bytes there, and then be sure that all calls to get_random_bytes after that are safe.
I guess I'll see in practice whether this is actually a useful way of doing it, once I dig in and start modifying representative call sites. > When would a timeout be useful? If you are using get_random_bytes() > for security reasons, does the security reason go away after 15 > seconds? Or even 30 seconds? I was thinking that returning to userspace with -ETIMEDOUT or something might be more desirable in some odd situations (which ones?) than just waiting for a signal and responding with -EINTR/-ERESTARTSYS. That might turn out not to be true, in which case I guess I won't add that API, as you suggested. > Also, it is possible that we may have architectures, without > fine-grained clocks, where we don't initialize the rng until after > userspace has started running. So it's not clear adding a rnginit > section makes sense. Even if we put it as late as possible --- say, > after "late", what do we do if we don't have the CRNG fully > initialized after the last of the "late" drivers have been run? My idea was that it would eventually be inserted on the callback from add_random_ready_callback. You're right that this would not be okay for things like filesystems, but maybe it'd be appropriate for things like crypto/rng.c? Or perhaps the blocking API at configuration time would be better anyway for things like that. You seem wary of this approach, so I'm going to roll with your suggestions above and see how they work out. If it pans out, great; if not, maybe we'll revisit this down the road once I have a better picture of what the call sites are like. Jason
Re: get_random_bytes returns bad randomness before seeding is complete
On Sat, Jun 03, 2017 at 01:58:25AM +0200, Jason A. Donenfeld wrote: > Hi Ted, > > Based on the tone of your last email, before I respond to your > individual points May I gently point out that *your* tone that started this whole thread has been pretty terrible? Quoting your first message: "Yes, yes, you have arguments for why you're keeping this pathological, but you're still wrong, and this api is still a bug." This kind of "my shit doesn't stink, but yours does" is not particularly useful if you are trying to have a constructive discussion. If you think the /dev/urandom question wasn't appropriate to discuss in this thread, then why the *hell* did you bring it up *first*? The reason why I keep harping on this is because I'm concerned about an absolutist attitude towards technical design, where the good is the enemy of the perfect, and where bricking machines in the pursuit of cryptographic perfection is considered laudable. Yes, if you apply thermite to the CPU and the hard drives, the system will be secure. But it also won't be terribly useful. And to me, the phrase "you're still wrong" doesn't seem to admit any concessions towards acknowledging that there might be another side to the security versus usability debate. And I'd ***really*** rather not waste my time trying to work with someone who doesn't care about usability, because my focus is on practical engineering, which means that if no one uses the system, or can use the system, then I consider it a failure, even if it is 100% secure. And this is true no matter whether we're talking about /dev/urandom or some enhancement to get_random_bytes(). > Do you have any opinions on the issue of what the most flexible API > would be? There are a lot of varieties of wait_event. The default one > is uninterruptible, but as Stephan and Herbert saw when they were > working on this, it could be dangerous in certain > circumstances not to allow interruption. So probably we'd want an > interruptible variety.
But then maybe we want to support the other > wait_event_* modes too, like killable, timeout, hrtimeout, io, and so > on. There's a huge list. These are all implemented as macros in > wait.h, and I don't really want to have a different get_random_bytes_* > function that corresponds to _each_ of these wait_event_* functions, > since that'd be overwhelming. We're going to have to look at a representative sample of the call sites to figure this out. The simple case is where the call site is only run in response to a userspace system call. There, blocking makes perfect sense. I'm just not sure there are many callers of get_random_bytes() where this is the case. There are definitely quite a few call sites of get_random_bytes() where the caller is in the middle of an interrupt service routine, or is holding a spinlock. There, we are *not* allowed to block. So any kind of blocking interface isn't going to be allowed; the async callback is the only thing which is guaranteed to work, in fact. Mercifully, many of these are cases where prandom_u32() makes sense. > So I was thinking maybe a simple flag and a timeout param could work: When would a timeout be useful? If you are using get_random_bytes() for security reasons, does the security reason go away after 15 seconds? Or even 30 seconds? > > Or maybe we can then help figure out what percentage of the callsites > > can be fixed with a synchronous interface, and fix some number of them > > just to demonstrate that the synchronous interface does work well. > > I was planning on doing this to at least a couple callsites in my > upcoming patch series that adds the synchronous interface. It seems > like the right way to see if the API is good or not. This is absolutely the right approach. See my above comments about why there may not be that many places where a synchronous interface will work out. And I'm going to be rather surprised if there are places where having a timeout parameter will be helpful.
But I could be wrong; let's see if we can find cases where we (a) need secure bytes, (b) are allowed to block, and (c) where a timeout would make sense. > Right. In my initial email I proposed a distinction: > > - External module loading is usually in a different process context, > so multiple modules can be attempted to be loaded at the same time. In > this case, it is probably safe to block in request_module. > - Built-in modules are loaded ± linearly, from init/main.c, so these > really can't block each other. For this, I was thinking of introducing > an rnginit section, to add to the list of things like lateinit. The > rnginit drivers would be loaded _after_ the RNG has initialized, so > that they don't get blocked. ... except that we already have an order in which drivers are added, so moving a driver to the rnginit section might mean that some kernel module that depends on that driver being loaded also needs to be moved after rnginit. We already have the following
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
The only sensible & general solution for the initialisation problem that I have seen is John Denker's. http://www.av8n.com/computer/htm/secure-random.htm#sec-boot-image If I read that right, it would require only minor kernel changes & none to the API Ted & others are worrying about. It would be secure except against an enemy who can read your kernel image or interfere with your install process. Assuming permissions are set sensibly, that means an enemy who already has root & such an enemy has lots of much easier ways to break things, so we need not worry about that case. The difficulty is that it would require significant changes to installation scripts. Still, since it is a general solution to a real problem, it might be better to implement that rather than work on the other suggestions in the thread.
Re: get_random_bytes returns bad randomness before seeding is complete
Hi Ted, Based on the tone of your last email, before I respond to your individual points, I think it's worth noting that the intent of this thread is to get a sampling of opinions on the issue of get_random_bytes, so that I can write a patch that fixes this issue (or a series of issues) using some strategy that we agree is a good one. I've already implemented prototypes of a few different ideas, so that I can discuss them competently here (a personal thing, I tend to learn best by "doing"), and so with list feedback, I'll know in which direction I should go for refinement eventually leading to a [PATCH] submission. In case it wasn't evident from my last patch series for drivers/char/random.c, despite not being corporately backed, I really have fun working on this part of the kernel in my free time. With that said... On Fri, Jun 2, 2017 at 9:07 PM, Theodore Ts'o wrote: > It's not _my_ API, it's *our* API > much don't break backwards compatibility, > broken, and they complain, and you tell them to suck it, > If you are using the word *you*, and speaking as an outsider to the > community, then you can kvetch all you like. But you're an outsider, > take your complaints seriously. As I wrote earlier, if you want to bike-shed the technical aspects and political aspects of fixing /dev/urandom, we can do that in another thread. It's obviously a complex topic worthy of discussion, and I'd like that discussion to be technically worthy of the issues at hand, some of which you've raised here. So please, start a new topic for that, and I'll happily follow up, all with the intent of writing a patch series and running compatibility tests and whatnot; as I said before, I like working on this stuff. However, the focus of this thread is get_random_bytes, so please try to stay focused, lest this get derailed. So, you wrote: > And if your answer is just, "blah blah di blah blah", don't be > surprised if others respond to you in exactly the same way.
> Specifically, by saying to you (in your words), "I don't care". I really don't care about bikeshedding this with you here, now, in this thread. I care about solving the get_random_bytes situation. >> While we're on the topic of that, you might consider adding a simple >> synchronous interface. > > There's that word "you" again Yea, whatever, Ted. Can I rephrase? "As the maintainer of drivers/char/random.c, you might consider adding a synchronous interface so that the consumers of the file you maintain can benefit from it; alternatively, and perhaps easier for you, express your support for such an idea, and I'll gladly write the patch." Better? I ask you to grant me the benefit of the doubt that this is what I had in mind when I wrote "you" -- that I'm seeking approval of the idea before moving my code past prototype stage. You're the boss, so whether I write the patch or you write the patch or someone else writes the patch, it really _is_ you whose consideration matters the most. > This is open source --- want to send patches? It sounds like it's a > workable, good idea. Awesome, happy you like it. Yes, absolutely, I'll start cleaning things up and will send a series. Do you have any opinions on the issue of what the most flexible API would be? There are a lot of varieties of wait_event. The default one is uninterruptible, but as Stephan and Herbert saw when they were working on this, it could be dangerous in certain circumstances not to allow interruption. So probably we'd want an interruptible variety. But then maybe we want to support the other wait_event_* modes too, like killable, timeout, hrtimeout, io, and so on. There's a huge list. These are all implemented as macros in wait.h, and I don't really want to have a different get_random_bytes_* function that corresponds to _each_ of these wait_event_* functions, since that'd be overwhelming.
So I was thinking maybe a simple flag and a timeout param could work: int get_random_bytes_seeded(u8 *buf, size_t len, bool is_interruptible, unsigned long timeout); If timeout==0, the timeout is infinite. If is_interruptible is true, we use wait_event_interruptible_*, otherwise we use wait_event_*, and the ret value is the usual return value from wait_event_*. Does that seem like a good simplification that will cover most use cases? We could of course add ever more complexity and choice, but I was thinking this would cover most use cases, at least to begin with. What do you think? > ...or that I disagree with your prior point. I think you're being > lazy, and trying to make it someone else's problem and standing on the > sidelines and complaining, as opposed to trying to help solve the > problem. Um, no, what an offensive insinuation. I'm trying to solve the problem. I'd like your honest feedback, since you're the maintainer, before I start fixing this. > No, of course we can't audit all of the code, but it's probably a good > idea to take a random sample, and to analyze them, so we can get a > sense
Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 02, 2017 at 07:44:04PM +0200, Jason A. Donenfeld wrote: > On Fri, Jun 2, 2017 at 7:26 PM, Theodore Ts'o wrote: > > I tried making /dev/urandom block. > > So if you're a security focused individual who is kvetching > > And if we're breaking > > Yes yes, bla bla, predictable response. I don't care. Your API is > still broken. Excuses excuses. Yes, somebody needs to do the work in > the end, maybe that person can be me, maybe you, maybe somebody else. It's not _my_ API, it's *our* API --- that is, the Linux kernel community's. And part of the rules of this community is that we very much don't break backwards compatibility, unless there is a really good reason, where Linus gets to decide if it's a really good reason. So if you care a lot about this issue, then you need to do the work to make the change, and part of it is showing, to a high degree of certainty, that it won't break backwards compatibility. Because if you don't, and you flout community norms, and users get broken, and they complain, and you tell them to suck it, then Linus will pull out his patented clue stick, and tell you that you have in fact flouted community norms, correct you publicly, and then revert your change. If you are using the word *you*, and speaking as an outsider to the community, then you can kvetch all you like. But you're an outsider, and we don't have to listen to you. But if you want to make a positive difference here, and you're passionate about it --- this is what you would need to do. That being said, we're all volunteers, so if you don't want to bother, that's fine. But then don't be surprised if we don't take your complaints seriously. > While we're on the topic of that, you might consider adding a simple > synchronous interface.
There's that word "you" again > I realize that the get_blocking_random_bytes > attempt was aborted as soon as it began, because of issues of > cancelability, but you could just expose the usual array of wait, > wait_interruptible, wait_killable, etc, or just make that wait object > and condition non-static so others can use it as needed. Having to > wrap the current asynchronous API like this kludge is kind of a > downer: This is open source --- want to send patches? It sounds like it's a workable, good idea. > No, what it means is that the particularities of individual examples I > picked at random don't matter. Are we really going to take the time to > audit each and every callsite to see "do they need actually random > data? can this be called pre-userspace?" I mentioned this in my > initial email. As I said there, I think analyzing all the cases one by > one is fragile, and more will pop up, and that's not really the right > way to approach this. And furthermore, as alluded to above, even > fixing clearly-broken places means using that hard-to-use asynchronous > API, which adds even more potentially buggy TCB to these drivers and > all the rest. Not a good strategy. > > The fact that you took the time to actually respond to the > _particularities_ of each individual random example I picked could > indicate that you've missed this point. ...or that I disagree with your prior point. I think you're being lazy, and trying to make it someone else's problem and standing on the sidelines and complaining, as opposed to trying to help solve the problem. No, of course we can't audit all of the code, but it's probably a good idea to take a random sample, and to analyze them, so we can get a sense of what the issues are. And then maybe we can find a way to quickly identify a class of users that can be easily fixed by using prandom_u32() (for example).
Or maybe we can then help figure out what percentage of the callsites can be fixed with a synchronous interface, and fix some number of them just to demonstrate that the synchronous interface does work well. > Right, it was him and Stephan (CCd). They initially started by adding > get_blocking_random_bytes, but then replaced this with the > asynchronous one, because they realized it could block forever. As I > said above, though, I still think a blocking API would be useful, > perhaps just with more granularity for the way in which it blocks. It depends on where it's being used. If it's part of module load, especially if it's one that's done automatically, having something that blocks forever might not be all that useful. Especially if it blocks device drivers from being able to be initialized enough to actually supply entropy to the whole system. Or maybe (in the case of stack canaries), the answer is we should start with crappy random numbers, but then once the random number generator has been initialized, we can use the callback to get cryptographically secure random numbers, and then we need to figure out how to phase out use of the old crappy random numbers and substitute in the exclusive use of the good random numbers. Because saying that we'll just simply not allow any processes to start
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 2, 2017 at 10:41 AM, Daniel Micay wrote: > On Fri, 2017-06-02 at 17:53 +0200, Jason A. Donenfeld wrote: >> (Meanwhile...) >> >> In my own code, I'm currently playing with a workaround that looks >> like this: >>
>> --- a/src/main.c
>> +++ b/src/main.c
>>
>> +#include <linux/random.h>
>> +#include <linux/completion.h>
>> +
>> +struct rng_initializer {
>> +	struct completion done;
>> +	struct random_ready_callback cb;
>> +};
>> +static void rng_initialized_callback(struct random_ready_callback *cb)
>> +{
>> +	complete(&container_of(cb, struct rng_initializer, cb)->done);
>> +}
>> +
>>  static int __init mod_init(void)
>>  {
>>  	int ret;
>> +	struct rng_initializer rng = {
>> +		.done = COMPLETION_INITIALIZER(rng.done),
>> +		.cb = { .owner = THIS_MODULE, .func = rng_initialized_callback }
>> +	};
>> +
>> +	ret = add_random_ready_callback(&rng.cb);
>> +	if (!ret)
>> +		wait_for_completion(&rng.done);
>> +	else if (ret != -EALREADY)
>> +		return ret;
>>
>>  	do_things_with_get_random_bytes_maybe();
>> Depending on the situation, however, I could imagine >> wait_for_completion never returning, if it's blocking the very activity >> that contributes to the seed becoming available, if this is called >> from a compiled-in module, so I find this a bit sub-optimal... > One of the early uses is initializing the stack canary value for SSP in > very early boot. If that blocks, it's going to be blocking nearly > anything else from happening. > > On x86, that's only the initial canary since the per-task canaries end > up being used, but elsewhere at least without SMP disabled or changes to > GCC that's all there is, so the entropy matters. And just to note, building with GCC_PLUGIN_LATENT_ENTROPY, while it (correctly) doesn't credit entropy to the pool, should at least make the pool less deterministic between boots. -Kees -- Kees Cook Pixel Security
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 2, 2017 at 7:41 PM, Daniel Micay wrote: > One of the early uses is initializing the stack canary value for SSP in > very early boot. If that blocks, it's going to be blocking nearly > anything else from happening. > > On x86, that's only the initial canary since the per-task canaries end > up being used, but elsewhere at least without SMP disabled or changes to > GCC that's all there is so the entropy matters. If this is the case, then we simply need a function called get_random_bytes_but_potentially_crappy_ones_because_we_are_desperate_for_anything(), which would come with a weaker guarantee than get_random_bytes(), which the documentation says always returns cryptographically secure numbers.
Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 2, 2017 at 7:26 PM, Theodore Ts'o wrote: > I tried making /dev/urandom block. > So if you're a security focused individual who is kvetching > And if we're breaking Yes yes, bla bla, predictable response. I don't care. Your API is still broken. Excuses excuses. Yes, somebody needs to do the work in the end, maybe that person can be me, maybe you, maybe somebody else. Regardless, we can take this to a different thread if you'd like to bikeshed about that particular point for another five millennia. But that isn't the main topic of this thread, so I'm not going to get sucked into that black hole. > the time, we hadn't yet created the asynchronous interfaces that allow > kernel code to do the right thing, even if it was massively > inconvenient. While we're on the topic of that, you might consider adding a simple synchronous interface. I realize that the get_blocking_random_bytes attempt was aborted as soon as it began, because of issues of cancelability, but you could just expose the usual array of wait, wait_interruptible, wait_killable, etc, or just make that wait object and condition non-static so others can use it as needed. Having to wrap the current asynchronous API like this kludge is kind of a downer: https://git.zx2c4.com/WireGuard/diff/src/config.c?id=c77145a8248dfc49e307eae7d7edd5fdca8d5559 > At this point, I think we should be completely open to making it be a > config option, and if it looks like for common configs we're not > spamming dmesg, we could even make it be the default. We just also > need to give driver writers some easy-to-understand recipes to fix > their drivers to do the right thing. If we do that first, it's much > more likely they will actually fix their kernel code, instead of just > turning the warning off. That's a good point. In my initial email I looked down on the whack-a-mole approach, because nobody is ever going to analyze all of those cases. But...
if dmesgs are going wild for some people, we'll have upset users doing the hard work for us. So maybe it's worthwhile to turn that warning on by default.

> This is basically the exponential backoff which is used in ethernet
> networks, and the *real* answer is that they should be using
> prandom_u32().
> I think in practice most IBM customers will be safe because they tend

No, what it means is that the particularities of individual examples I picked at random don't matter. Are we really going to take the time to audit each and every call site to see "do they actually need random data? can this be called pre-userspace?" I mentioned this in my initial email. As I said there, I think analyzing all the cases one by one is fragile, and more will pop up, and that's not really the right way to approach this. Furthermore, as alluded to above, even fixing clearly broken places means using that hard-to-use asynchronous API, which adds even more potentially buggy TCB to these drivers and all the rest. Not a good strategy. That you took the time to respond to the _particularities_ of each individual random example I picked could indicate that you've missed this point.

>> And so on and so on and so on. If you grep the source as I did, you'll
>> find there's no shortage of head-scratchers. "Hmmm," you ask yourself,
>> "could this be called before userspace has ensured that /dev/urandom
>> is seeded? do we actually _need_ good randomness here, or does it not
>> matter?" I guess you could try to reason about each and every one of
>> them -- you might even have those same questions about the silly
>> examples I pasted above -- but that one-by-one methodology seems
>> excessively fragile and arduous.
>
> This is a fair point.

Glad we're on the same page about that.

> That developer was Herbert Xu, and he added it about two years ago.
> See commit 205a525c3342.

Right, it was him and Stephan (CCed).
They initially started by adding get_blocking_random_bytes, but then replaced it with the asynchronous one, because they realized it could block forever. As I said above, though, I still think a blocking API would be useful, perhaps just with more granularity in the way it blocks.

> So the problem with doing this by default, for all modules, is that on
> those platforms which don't have good hardware support for seeding the
> non-blocking pool quickly, if we defer all modules, we will also be
> deferring the means by which we get the entropy needed to seed the
> non-blocking pool. So for example, if you have a bunch of networking
> drivers that are using get_random_bytes() for exponential backoff,
> when they *should* be using prandom_u32(), if we don't fix *that* bug,
> simply deferring the module load will also defer the entropy input
> from the networking interrupts from the networking card.

"We shouldn't fix legitimate cryptographic bugs, because the rest of our codebase is too buggy to support anything reasonable." Sounds like it'd be worthwhile to begin fixing
Re: [kernel-hardening] Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, 2017-06-02 at 17:53 +0200, Jason A. Donenfeld wrote:
> (Meanwhile...)
>
> In my own code, I'm currently playing with a workaround that looks
> like this:
>
> --- a/src/main.c
> +++ b/src/main.c
>
> +#include <linux/completion.h>
> +#include <linux/random.h>
>
> +struct rng_initializer {
> +	struct completion done;
> +	struct random_ready_callback cb;
> +};
> +static void rng_initialized_callback(struct random_ready_callback *cb)
> +{
> +	complete(&container_of(cb, struct rng_initializer, cb)->done);
> +}
> +
>  static int __init mod_init(void)
>  {
>  	int ret;
> +	struct rng_initializer rng = {
> +		.done = COMPLETION_INITIALIZER(rng.done),
> +		.cb = { .owner = THIS_MODULE, .func = rng_initialized_callback }
> +	};
> +
> +	ret = add_random_ready_callback(&rng.cb);
> +	if (!ret)
> +		wait_for_completion(&rng.done);
> +	else if (ret != -EALREADY)
> +		return ret;
>
>  	do_things_with_get_random_bytes_maybe();
>
> Depending on the situation, however, I could imagine
> wait_for_completion never returning, if it's blocking the very
> activity that contributes to the seed becoming available -- say, if
> this is called from a compiled-in module -- so I find this a bit
> sub-optimal...

One of the early uses is initializing the stack canary value for SSP in very early boot. If that blocks, it's going to block nearly everything else from happening. On x86, that's only the initial canary, since the per-task canaries end up being used; but elsewhere, at least without SMP disabled or changes to GCC, that's all there is, so the entropy matters.
Re: get_random_bytes returns bad randomness before seeding is complete
On Fri, Jun 02, 2017 at 04:59:56PM +0200, Jason A. Donenfeld wrote:
> /dev/urandom will return bad randomness before it's seeded, rather than
> blocking, and despite years and years of discussion and bike shedding,
> this behavior hasn't changed, and the vast majority of cryptographers
> and security experts remain displeased. It _should_ block until it's
> been seeded for the first time, just like the new getrandom() syscall,
> and this important bug fix (which is not a real api break) should be
> backported to all stable kernels. (Userspace doesn't even have a
> reliable way of determining whether or not it's been seeded!) Yes,
> yes, you have arguments for why you're keeping this pathological, but
> you're still wrong, and this api is still a bug. Forcing userspace
> architectures to work around your bug, as your arguments usually go,
> is disquieting.

I tried making /dev/urandom block. The zero day kernel testing within a few hours told us that Ubuntu LTS at the time and OpenWrt would fail to boot. And when Python made a change to call getrandom(2) with the default blocking semantics instead of using /dev/urandom, some distributions using systemd stopped booting. So if you're a security focused individual who is kvetching that we're asking to force userspace to change to fix "a bug", you need to understand that making the change you're suggesting will *also* require making changes to userspace. And if you want to go try to convince Linus Torvalds that it's OK to make a backwards-incompatible change that breaks Ubuntu LTS and OpenWrt by causing them to fail to boot --- be my guest. And if we're breaking distributions so that they fail to boot, when their use of /dev/urandom is not security critical, I suspect Linus is not going to be impressed that you are breaking those users for no benefit.
(For example, Python's use of getrandom(2) was to prevent denial of service attacks when Python is used as a fast-CGI web server script, and it was utterly pointless from a security perspective to block when a Python script was creating on-demand systemd unit files, where the DoS threat was completely irrelevant.)

> As the printk implies, it's possible for get_random_bytes() to be
> called before it's seeded. Why was that helpful printk put behind an
> ifdef? I suspect because people would see the lines in their dmesg and
> write emails like this one to the list.

The #ifdef and the printk were there from the very beginning:

commit 392a546dc8368d1745f9891ef3f8f7c380de8650
Author: Theodore Ts'o
Date:   Sun Nov 3 18:24:08 2013 -0500

    random: add debugging code to detect early use of get_random_bytes()

    Signed-off-by: "Theodore Ts'o"

It was four years ago, so I don't remember the exact circumstances, but if I recall, the issue was not wanting to spam the dmesg. Also, at the time, we hadn't yet created the asynchronous interfaces that allow kernel code to do the right thing, even if it was massively inconvenient.

At this point, I think we should be completely open to making it be a config option, and if it looks like for common configs we're not spamming dmesg, we could even make it be the default. We just also need to give driver writers some easy-to-understand recipes to fix their drivers to do the right thing. If we do that first, it's much more likely they will actually fix their kernel code, instead of just turning the warning off.

> drivers/net/ieee802154/at86rf230.c:
>
> static int at86rf230_probe(struct spi_device *spi)
> {
> 	...
> 	get_random_bytes(csma_seed, ARRAY_SIZE(csma_seed));
>
> No clue what this driver is for or if it actually needs good
> randomness, but using get_random_bytes in a hardware probe function
> can happen prior to userspace's ability to seed.
What, you mean as a computer scientist you didn't immediately understand that csma refers to CSMA/CA --- "Carrier-sense multiple access with collision avoidance"? For shame! :-)

This is basically the exponential backoff which is used in ethernet networks, and the *real* answer is that they should be using prandom_u32().

> arch/s390/crypto/prng.c seems to call get_random_bytes on module load,
> which can happen pre-userspace, for seeding some kind of RNG of its
> own.

It appears to be accessing a hardware cryptographic processor function, and seems to do its own entropy gathering, relying on stack garbage as well as using get_random_bytes(). By default it is defined as a module, so I suspect in most situations it's called post-userspace, but agreed that it is not guaranteed to be the case. I think in practice most IBM customers will be safe, because they tend to use distro kernels that will almost certainly be building it as a module, but your point is well taken.

> And so on and so on and so on. If you grep the source as I did, you'll
> find there's no shortage of head-scratchers. "Hmmm," you ask yourself,
> "could this be called before userspace has ensured that /dev/urandom
> is
Re: get_random_bytes returns bad randomness before seeding is complete
Further investigations: if the whack-a-mole approach is desirable, perhaps many of those get_random_bytes calls should be converted to get_blocking_random_bytes. In that case, this commit, which removed this helpful API, should be reverted:

commit c2719503f5e1e6213d716bb078bdad01e28ebcbf
Author: Herbert Xu
Date:   Tue Jun 9 18:19:42 2015 +0800

    random: Remove kernel blocking API

    This patch removes the kernel blocking API as it has been completely
    replaced by the callback API.

    Signed-off-by: Herbert Xu

diff --git a/drivers/char/random.c b/drivers/char/random.c
index a1576ed1d88e..d0da5d852d41 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1265,18 +1265,6 @@ void get_random_bytes(void *buf, int nbytes)
 EXPORT_SYMBOL(get_random_bytes);
 
 /*
- * Equivalent function to get_random_bytes with the difference that this
- * function blocks the request until the nonblocking_pool is initialized.
- */
-void get_blocking_random_bytes(void *buf, int nbytes)
-{
-	if (unlikely(nonblocking_pool.initialized == 0))
-		wait_event(urandom_init_wait, nonblocking_pool.initialized);
-	extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
-}
-EXPORT_SYMBOL(get_blocking_random_bytes);
-
-/*
Re: get_random_bytes returns bad randomness before seeding is complete
(Meanwhile...)

In my own code, I'm currently playing with a workaround that looks like this:

--- a/src/main.c
+++ b/src/main.c

+#include <linux/completion.h>
+#include <linux/random.h>

+struct rng_initializer {
+	struct completion done;
+	struct random_ready_callback cb;
+};
+static void rng_initialized_callback(struct random_ready_callback *cb)
+{
+	complete(&container_of(cb, struct rng_initializer, cb)->done);
+}
+
 static int __init mod_init(void)
 {
 	int ret;
+	struct rng_initializer rng = {
+		.done = COMPLETION_INITIALIZER(rng.done),
+		.cb = { .owner = THIS_MODULE, .func = rng_initialized_callback }
+	};
+
+	ret = add_random_ready_callback(&rng.cb);
+	if (!ret)
+		wait_for_completion(&rng.done);
+	else if (ret != -EALREADY)
+		return ret;

 	do_things_with_get_random_bytes_maybe();

Depending on the situation, however, I could imagine wait_for_completion never returning, if it's blocking the very activity that contributes to the seed becoming available -- say, if this is called from a compiled-in module -- so I find this a bit sub-optimal...
get_random_bytes returns bad randomness before seeding is complete
Hi folks,

This email is about an issue with get_random_bytes(), the CSPRNG used inside the kernel for generating keys and nonces and whatnot. However, I will begin with an aside: /dev/urandom will return bad randomness before it's seeded, rather than blocking, and despite years and years of discussion and bike shedding, this behavior hasn't changed, and the vast majority of cryptographers and security experts remain displeased. It _should_ block until it's been seeded for the first time, just like the new getrandom() syscall, and this important bug fix (which is not a real api break) should be backported to all stable kernels. (Userspace doesn't even have a reliable way of determining whether or not it's been seeded!) Yes, yes, you have arguments for why you're keeping this pathological, but you're still wrong, and this api is still a bug. Forcing userspace architectures to work around your bug, as your arguments usually go, is disquieting.

Anyway, that's not what this email is about, but given that as a backdrop, here's a different-but-related, and perhaps more petulant, issue...

The problem this email intends to address is this:

void get_random_bytes(void *buf, int nbytes)
{
#if DEBUG_RANDOM_BOOT > 0
	if (!crng_ready())
		printk(KERN_NOTICE "random: %pF get_random_bytes called "
		       "with crng_init = %d\n", (void *) _RET_IP_, crng_init);
#endif
	...
	extract_crng(buf);

Or, on older kernels:

void get_random_bytes(void *buf, int nbytes)
{
#if DEBUG_RANDOM_BOOT > 0
	if (unlikely(nonblocking_pool.initialized == 0))
		printk(KERN_NOTICE "random: %pF get_random_bytes called "
		       "with %d bits of entropy available\n",
		       (void *) _RET_IP_, nonblocking_pool.entropy_total);
#endif
	trace_get_random_bytes(nbytes, _RET_IP_);
	extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
}

As the printk implies, it's possible for get_random_bytes() to be called before it's seeded. Why was that helpful printk put behind an ifdef?
I suspect because people would see the lines in their dmesg and write emails like this one to the list.

I anticipate an argument coming from the never-block /dev/urandom cartel that goes something along the lines of, "get_random_bytes() is only called in paths initiated by a syscall; therefore, userspace must ensure /dev/urandom, which is the same pool as get_random_bytes, has been properly seeded before making any syscalls that might lead to get_random_bytes() being called." If you've already given up on the general initiative of trying to urge them to make /dev/urandom block until seeded, then this might seem like a reasonable argument. If /dev/urandom is broken, it doesn't matter if get_random_bytes() is broken too, if the required work-around for one is the same as for the other. But the premise is flawed: get_random_bytes() can be called before any syscalls are made, before userspace even has an opportunity to ensure /dev/urandom is seeded. That's what that printk in the ifdef is all about. Bad news bears.

Grepping through the source tree for get_random_bytes, I just opened a few files at random to see what the deal was. Here's one:

drivers/net/ieee802154/at86rf230.c:

static int at86rf230_probe(struct spi_device *spi)
{
	...
	get_random_bytes(csma_seed, ARRAY_SIZE(csma_seed));

No clue what this driver is for or if it actually needs good randomness, but using get_random_bytes in a hardware probe function can happen prior to userspace's ability to seed.

arch/s390/crypto/prng.c seems to call get_random_bytes on module load, which can happen pre-userspace, for seeding some kind of RNG of its own.

net/ipv6/addrconf.c:

static void __ipv6_regen_rndid(struct inet6_dev *idev)
{
regen:
	get_random_bytes(idev->rndid, sizeof(idev->rndid));

This is in the networking stack, when bringing up an interface and assigning randomized v6 addresses.
While you might argue that userspace is required for networking, remember that the kernel actually has the ability to initiate networking all on its own, before userspace; the kernel even has its own DHCP client! Yowza. In fact, on that note:

net/ipv4/ipconfig.c:

static int __init ic_open_devs(void)
{
	...
	get_random_bytes(&d->xid, sizeof(__be32));

And so on and so on and so on. If you grep the source as I did, you'll find there's no shortage of head-scratchers. "Hmmm," you ask yourself, "could this be called before userspace has ensured that /dev/urandom is seeded? do we actually _need_ good randomness here, or does it not matter?" I guess you could try to reason about each and every one of them -- you might even have those same questions about the silly examples I pasted above -- but that one-by-one methodology seems excessively fragile and arduous.

There must have been a developer who thought about this at least once before, because random.c does have a callback notifier mechanism for drivers to learn