Re: Linux messages full of `random: get_random_u32 called from`
On Fri, 2018-05-18 at 19:22 -0400, Theodore Y. Ts'o wrote:
> On Fri, May 18, 2018 at 10:56:18PM +, Trent Piepho wrote:
> >
> > Let's look at what we're doing after this fix:
> > Want non-cryptographic random data for UUID, ask kernel for it.
> > Kernel has non-cryptographic random data, won't give it to us.
> > Wait one second for cryptographic random data, which we didn't need.
> > Give up and create our own random data, which is non-cryptographic and
> > even worse than what the kernel could have given us from the start.
> >
> > util-linux falls back to rand() seeded with the pid, uid, tv_sec, and
> > tv_usec from gettimeofday(). Pretty bad on an embedded system with no
> > RTC and worse than what the kernel in crng_init 1 state can give us.
>
> So what util-linux's libuuid could do is fall back to using
> /dev/urandom instead. Whether or not you retry for a second before
> you fall back to /dev/urandom really depends on how important the
> second U in UUID ("unique") is to you. If you use lower quality
> randomness, you can potentially risk getting non-unique UUID's.

Does it really matter how long one waits? The fact that there is a fallback that can be used would seem to provide a guarantee of randomness/uniqueness only as good as that fallback. And here is the fallback:
https://github.com/karelzak/util-linux/blob/master/lib/randutils.c#L64

It doesn't seem all that great. Can we say that the kernel, e.g. urandom, can always provide random data at least as good as the above without blocking? If the kernel is always as good or better, then what's the point of having the inferior fallback?

> If you don't worry leaking your computer's identity and the time when
> the UUID was generated, the application could also use the time-based
> UUID's. There are privacy implications for doing so, it's not
> something we can do automatically (or at least I can't recommend it).
> Also, if you don't have the clock sequence file and/or you don't have
> a writable root, you might need some randomness anyway to protect
> against non-monotonically increasing system time.

libuuid will still ask for random data to initialize its clock file:
https://github.com/karelzak/util-linux/blob/master/libuuid/src/gen_uuid.c#L281

> > It would seem to be a fact that there will be users of
> > non-cryptographic random data in early boot. What is the best practice
> > for that? To fall back to each user trying "to find randomly-looking
> > things on an 1990s Unix." That doesn't seem good to me. But what's
> > the better way?
>
> We could add a new flag to getrandom(2), but application authors can
> just as easily fall back to using /dev/urandom. The real concern I

I wouldn't say just as easily. It's a more complex code path, it is documented across multiple man pages, and it requires file system access that getrandom() doesn't. But it's certainly readily achievable, so maybe that's good enough. I think a flag to getrandom would result in fewer mistakes in userspace code.

> have is application authors that actually *really* need cryptographic
> randomness, but they're too lazy to figure out a way to defer key
> generation until the last possible moment.

Would it be safe to say that the randutils code in util-linux would be better off falling back to /dev/urandom instead of what it does? If authors that really need cryptographic data use random_get_bytes() or uuid_generate(), they'll get code that automatically falls back to gettimeofday(). And probably not even know it.

I get your concern about lazy authors using an API that isn't appropriate for their use case. But we have this API already, in util-linux and code copied/inspired by it, and it seems there are use cases where it is appropriate. If we make it better(*), then does the risk of it being used where it shouldn't go up?

(*) Better: use the best available random data that can be provided without blocking.

> There are other things we can do to add support in the bootloader to
> read an entropy state file and inject it into the kernel alongside the
> initrd and boot command line. But that doesn't completely solve the
> problem; you still have to deal with the "fresh from the factory,

This is problematic on a number of embedded platforms. The bootloader might have no writable persistent storage to read/write this entropy from. This requires drivers for the storage hardware, the ability to deal with the storage being in an inconsistent state, and security of the storage. Assuming hardware for writable storage even exists.

So if I want u-boot to read/write an encrypted and authenticated flash file system, there is a lot of code to put in the bootloader! And now we have to worry about that being exploited. Maybe this means the bootloader needs an encryption key that it didn't previously need access to.

Some systems have a limit on bootloader size and RAM. Cyclone 5 is 64kB, which pretty much requires a two stage bootloader. Arria 10 has 256kB and boots in a single stage, but bootloader features are quite limited.

On imx23, it's possible to boot directly into Linux with no bootloader at all. The CPU's ROM can initialize the hardware enough to run Linux just from info in the mxs boot image format.
Re: Linux messages full of `random: get_random_u32 called from`
On Fri, May 18, 2018 at 10:56:18PM +, Trent Piepho wrote: > > I feel like "fix" might overstate the result a bit. > > This ends up taking a full second to make each UUID. Having gone to > great effort to make an iMX25 complete userspace startup in 250 ms, a > full second, per UUID, in early startup is pretty appalling. > > Let's look at what we're doing after this fix: > Want non-cryptographic random data for UUID, ask kernel for it. > Kernel has non-cryptographic random data, won't give it to us. > Wait one second for cryptographic random data, which we didn't need. > Give up and create our own random data, which is non-cryptographic and > even worse than what the kernel could have given us from the start. > > util-linux falls back to rand() seeded with the pid, uid, tv_sec, and > tv_usec from gettimeofday(). Pretty bad on an embedded system with no > RTC and worse than what the kernel in crng_init 1 state can give us. So what util-linux's libuuid could do is fall back to using /dev/urandom instead. Whether or not you retry for a second before you fall back to /dev/urandom really depends on how important the second U in UUID ("unique") is to you. If you use lower quality randomness, you can potentially risk getting non-unique UUID's. If you don't worry leaking your computer's identity and the time when the UUID was generated, the application could also use the time-based UUID's. There are privacy implications for doing so, it's not something we can do automatically (or at least I can't recommend it). Also, if you don't have the clock sequence file and/or you don't have a writable root, you might need some randomness anyway to protect against non-monotonically increasing system time. > It would seem to be a fact that there will be users of non- > cryptographic random data in early boot. What is the best practice for > that? To fall back to each user trying "to find randomly-looking > things on an 1990s Unix." That doesn't seem good to me. 
> But what's the better way?

We could add a new flag to getrandom(2), but application authors can just as easily fall back to using /dev/urandom. The real concern I have is application authors that actually *really* need cryptographic randomness, but they're too lazy to figure out a way to defer key generation until the last possible moment.

There are other things we can do to add support in the bootloader to read an entropy state file and inject it into the kernel alongside the initrd and boot command line. But that doesn't completely solve the problem; you still have to deal with the "fresh from the factory, first time out of box" experience. And if you have trusted random number generation hardware, and are reasonably certain you don't have to worry about a state-sponsored agency intercepting hardware shipments and gimmicking your hardware, that can be a solution as well.

So there are things we can do to improve some of the scenarios. Unfortunately, there is no silver bullet that will address all of them.

- Ted
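The fallback discussed in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the util-linux code: `best_effort_random` is a hypothetical helper name, and the sketch assumes a libc that exposes `getrandom()` through `<sys/random.h>`. It asks for fully seeded output without blocking and, if that is not available, reads `/dev/urandom`, which returns whatever the kernel can provide at that moment.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of util-linux): fill buf with len bytes
 * without ever blocking. Returns 0 on success, -1 on failure. */
static int best_effort_random(void *buf, size_t len)
{
	unsigned char *p = buf;
	size_t done = 0;
	int fd;

	/* First choice: fully seeded CRNG output, but refuse to wait for it. */
	while (done < len) {
		ssize_t n = getrandom(p + done, len - done, GRND_NONBLOCK);

		if (n > 0)
			done += (size_t)n;
		else if (n < 0 && errno == EINTR)
			continue;
		else
			break;	/* EAGAIN (not seeded yet), ENOSYS (old kernel), ... */
	}
	if (done == len)
		return 0;

	/* Fallback: /dev/urandom does not wait for the CRNG to be seeded. */
	fd = open("/dev/urandom", O_RDONLY);
	if (fd < 0)
		return -1;
	while (done < len) {
		ssize_t n = read(fd, p + done, len - done);

		if (n > 0)
			done += (size_t)n;
		else if (n < 0 && errno == EINTR)
			continue;
		else
			break;
	}
	close(fd);
	return done == len ? 0 : -1;
}
```

A caller that needs 16 bytes for a random UUID would pass a 16-byte buffer and treat a return of -1 as "no kernel randomness at all", which is the only case where a homegrown source would still be needed.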
Re: Linux messages full of `random: get_random_u32 called from`
On Thu, 2018-05-17 at 22:32 -0400, Theodore Y. Ts'o wrote:
> On Fri, May 18, 2018 at 01:27:03AM +, Trent Piepho wrote:
> > I've hit this on an embedded system. mke2fs hangs trying to format a
> > persistent writable filesystem, which is where the random seed to
> > initialize the kernel entropy pool would be stored, because it wants 16
> > bytes of non-cryptographic random data for a filesystem UUID, and
> > util-linux libuuid calls getrandom(16, 0) - no GRND_RANDOM flag - and
> > this hangs for over four minutes.
>
> This is fixed in util-linux 2.32. It ships with the following commits:

I feel like "fix" might overstate the result a bit.

This ends up taking a full second to make each UUID. Having gone to great effort to make an iMX25 complete userspace startup in 250 ms, a full second, per UUID, in early startup is pretty appalling.

Let's look at what we're doing after this fix:
Want non-cryptographic random data for UUID, ask kernel for it.
Kernel has non-cryptographic random data, won't give it to us.
Wait one second for cryptographic random data, which we didn't need.
Give up and create our own random data, which is non-cryptographic and even worse than what the kernel could have given us from the start.

util-linux falls back to rand() seeded with the pid, uid, tv_sec, and tv_usec from gettimeofday(). Pretty bad on an embedded system with no RTC and worse than what the kernel in crng_init 1 state can give us.

What took microseconds now takes a second. We have lower quality random data than we had before. Seems like two steps backward.

Can't we do better?

How about adding a flag to getrandom() that allows the kernel to return low-quality data if high-quality data would require blocking?

It would seem to be a fact that there will be users of non-cryptographic random data in early boot. What is the best practice for that? To fall back to each user trying "to find randomly-looking things on an 1990s Unix." That doesn't seem good to me. But what's the better way?
Re: Linux messages full of `random: get_random_u32 called from`
On Fri, May 18, 2018 at 01:27:03AM +, Trent Piepho wrote: > > I've hit this on an embedded system. mke2fs hangs trying to format a > persistent writable filesystem, which is where the random seed to > initialize the kernel entropy pool would be stored, because it wants 16 > bytes of non-cryptographic random data for a filesystem UUID, and util- > linux libuuid calls getrandom(16, 0) - no GRND_RANDOM flag - and this > hangs for over four minutes. This is fixed in util-linux 2.32. It ships with the following commits: commit edc1c90cb972fdca1f66be5a8e2b0706bd2a4949 Author: Karel Zak Date: Tue Mar 20 14:17:24 2018 +0100 lib/randutils: don't break on EAGAIN, use usleep() The current code uses lose_counter to make more attempts to read random numbers. It seems better to wait a moment between attempts to avoid busy loop (we do the same in all-io.h). The worst case is 1 second delay for all random_get_bytes() on systems with uninitialized entropy pool -- for example you call sfdisk (MBR Id or GPT UUIDs) on very first boot, etc. In this case it will use libc rand() as a fallback solution. Note that we do not use random numbers for security sensitive things like keys or so. It's used for random based UUIDs etc. Addresses: https://github.com/karelzak/util-linux/pull/603 Signed-off-by: Karel Zak commit a9cf659e0508c1f56813a7d74c64f67bbc962538 Author: Carlo Caione Date: Mon Mar 19 10:31:07 2018 + lib/randutils: Do not block on getrandom() In Endless we have hit a problem when using 'sfdisk' on the really first boot to automatically expand the rootfs partition. On this platform 'sfdisk' is blocking on getrandom() because not enough random bytes are available. This is an ARM platform without a hwrng. We fix this passing GRND_NONBLOCK to getrandom(). 'sfdisk' will use the best entropy it has available and fallback only as necessary. 
Signed-off-by: Carlo Caione

Interestingly, these commits in util-linux landed *before* the patches to address CVE-2018-1108 appeared in the kernel in April 2018. This was because libuuid was already blocking on a handful of embedded systems even before we made this change in Linux's random driver. (The change just made the problem more likely to be visible on a larger number of systems; but it was always there.)

- Ted
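The behaviour the two commits describe (non-blocking getrandom(), a short sleep between attempts, and a bounded worst case of about one second) can be sketched as below. This is a minimal illustration, not the util-linux source: `getrandom_bounded`, `READ_ATTEMPTS` and `READ_DELAY_US` are names and values assumed here so that 8 attempts of 125 ms add up to the one-second worst case mentioned in the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_ATTEMPTS 8		/* assumed value, for illustration only */
#define READ_DELAY_US 125000	/* assumed value: 8 * 125 ms = 1 s worst case */

/* Hypothetical helper: returns how many bytes getrandom() delivered.
 * A short count means the caller must fill the rest from its fallback. */
static size_t getrandom_bounded(void *buf, size_t len)
{
	unsigned char *p = buf;
	size_t done = 0;
	int attempts_left = READ_ATTEMPTS;

	while (done < len) {
		ssize_t n = getrandom(p + done, len - done, GRND_NONBLOCK);

		if (n > 0) {
			done += (size_t)n;
		} else if (n < 0 && (errno == EAGAIN || errno == EINTR) &&
			   attempts_left-- > 0) {
			usleep(READ_DELAY_US);	/* wait instead of busy-looping */
		} else {
			break;	/* out of attempts, or a hard error such as ENOSYS */
		}
	}
	return done;
}
```

On a system whose pool is already initialized the first call succeeds and no sleep happens; the full delay is only paid when the kernel keeps answering EAGAIN.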
Re: Linux messages full of `random: get_random_u32 called from`
Since I wasn't on this thread from the start, the only way I could find to reply was to a message in mbox format on patchwork, and this one seemed the best.

On Fri, 2018-04-27 at 16:10 -0400, Theodore Tso wrote:
>
> This is why ultimately, we do need to attack this problem from both
> ends, which means teaching userspace programs to only request
> cryptographic-grade randomness when it is really needed --- and most
> of the time, if the user has not logged in yet, you probably don't
> need cryptographic-grade randomness

I've hit this on an embedded system. mke2fs hangs trying to format a persistent writable filesystem, which is where the random seed to initialize the kernel entropy pool would be stored, because it wants 16 bytes of non-cryptographic random data for a filesystem UUID, and util-linux libuuid calls getrandom(16, 0) - no GRND_RANDOM flag - and this hangs for over four minutes.

Some things I've seen here don't work in the embedded world.

The user will not log in. No one logs in. There are not even user accounts with a valid password that could log in.

The storage comes pre-written with a static image from the manufacturer or is programmed from a static image via JTAG or some other out of band step. It cannot be different from device to device when it first boots. No saved entropy.

The bootloader gets entropy from writable storage to give to the kernel? Can't do that. The bootloader has no access to writable storage.

I understand that if someone wants cryptographic-grade randomness early in boot when that just isn't available and isn't going to be available, then that isn't going to happen and lying to the consumer about the randomness of the data isn't the answer.

But I just want UUIDs for a filesystem. And the systemd machine ID for the journal file. It seems the util-linux authors thought, apparently incorrectly, that getrandom() without GRND_RANDOM was a good way to get it.

What is the right way? The fact that so many userspace consumers get it wrong might be a sign that this is lacking or at least very non-obvious.

I want random data and I want it now. It's ok if it's low entropy. This seems to be a very real, and unavoidable, thing in early boot. And crng_init == 1 seems to be the intended way to do this.

What's the way to get random data of crng_init==1 quality without blocking?
Re: Linux messages full of `random: get_random_u32 called from`
On Wed, May 2, 2018 at 5:25 PM, Theodore Y. Ts'o wrote:
> On Wed, May 02, 2018 at 10:49:34AM -0700, Laura Abbott wrote:
>>
>> It is a Fedora patch we're carrying
>> https://src.fedoraproject.org/rpms/libgcrypt/blob/master/f/libgcrypt-1.6.2-fips-ctor.patch#_23
>> so yes, it is a Fedora specific use case.
>> From talking to the libgcrypt team, this is a FIPS mode requirement
>> to run power on self test at the library constructor and the self
>> test of libgcrypt ends up requiring a fully seeded RNG. Citation
>> is in section 9.10 of
>> https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Module-Validation-Program/documents/fips140-2/FIPS1402IG.pdf
>
> Forgive me if this is a stupid question, but does Fedora need FIPS
> compliance? Or is this something which is only required for RHEL?
>
> ("Here's to FIPS: the cause of, and solution to, all of Life's
> problems." :-)
>

One of the advantages of carrying such things in Fedora is we find these problems before RHEL does and hopefully there is a solution in place before they ever even see it.

From the rawhide end, I just switched virtio-rng from a module to built-in; this works around the issue for lots of users, but not all. GCE is still impacted, and a user came to complain about it already last night. The same goes for any other virt platform without virtio-rng, and for some hardware. Most hardware installs don't have dracut-fips so they will boot, eventually.

Justin
Re: Linux messages full of `random: get_random_u32 called from`
On Wed 2018-05-02 18:25:22, Theodore Y. Ts'o wrote:
> On Wed, May 02, 2018 at 10:49:34AM -0700, Laura Abbott wrote:
> >
> > It is a Fedora patch we're carrying
> > https://src.fedoraproject.org/rpms/libgcrypt/blob/master/f/libgcrypt-1.6.2-fips-ctor.patch#_23
> > so yes, it is a Fedora specific use case.
> > From talking to the libgcrypt team, this is a FIPS mode requirement
> > to run power on self test at the library constructor and the self
> > test of libgcrypt ends up requiring a fully seeded RNG. Citation
> > is in section 9.10 of
> > https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Module-Validation-Program/documents/fips140-2/FIPS1402IG.pdf
>
> Forgive me if this is a stupid question, but does Fedora need FIPS
> compliance? Or is this something which is only required for RHEL?

If RHEL needs it, Fedora needs it, too -- as Fedora is a beta test for RHEL.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux messages full of `random: get_random_u32 called from`
On Wed, May 02, 2018 at 10:49:34AM -0700, Laura Abbott wrote:
>
> It is a Fedora patch we're carrying
> https://src.fedoraproject.org/rpms/libgcrypt/blob/master/f/libgcrypt-1.6.2-fips-ctor.patch#_23
> so yes, it is a Fedora specific use case.
> From talking to the libgcrypt team, this is a FIPS mode requirement
> to run power on self test at the library constructor and the self
> test of libgcrypt ends up requiring a fully seeded RNG. Citation
> is in section 9.10 of
> https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Module-Validation-Program/documents/fips140-2/FIPS1402IG.pdf

Forgive me if this is a stupid question, but does Fedora need FIPS compliance? Or is this something which is only required for RHEL?

("Here's to FIPS: the cause of, and solution to, all of Life's problems." :-)

- Ted
Re: Linux messages full of `random: get_random_u32 called from`
On 05/02/2018 09:26 AM, Theodore Y. Ts'o wrote:
> On Wed, May 02, 2018 at 07:09:11AM -0500, Justin Forbes wrote:
> > Yes, Fedora libgcrypt is carrying a patch which makes it particularly
> > painful for us, we have reached out to the libgcrypt maintainer to
> > follow up on that end. But as I said before, even without that code
> > path (no dracut-fips) we are seeing some instances of 4 minute boots.
> > This is not really a workable user experience. And are you sure that
> > every cloud platform and VM platform offers, makes it possible to
> > config virtio-rng?
>
> Unfortunately, the answer is no. Google Compute Engine, alas, does
> not currently support virtio-rng. With my Google hat on, I can't
> comment on future product features. With my upstream developer hat
> on, I'll give you three guesses what I have been advocating and
> pushing for internally, and the first two don't count. :-)
>
> That being said, I just booted a Debian 9 (Stable, aka Stretch)
> standard kernel, and then installed 4.17-rc3 (which has the
> CVE-2018-1108 patches). The crng_init=2 message doesn't appear
> immediately, and it does appear quite a bit later compared to the
> standard 4.9.0-6-amd64 Debian 9 kernel. However, the lack of a fully
> initialized random pool doesn't prevent the standard Debian 9 image
> from booting:
>
> May 2 15:33:42 localhost kernel: [0.00] Linux version 4.17.0-rc3-xfstests (tytso@cwcc) (gcc version 7.3.0 (Debian 7.3.0-16)) #169 SMP Wed May 2 11:28:17 EDT 2018
> May 2 15:33:42 localhost kernel: [1.456883] random: fast init done
> May 2 15:33:46 rng-testing systemd[1]: Startup finished in 3.202s (kernel) + 5.963s (userspace) = 9.166s.
> May 2 15:33:46 rng-testing google-accounts: INFO Starting Google Accounts daemon.
> May 2 15:44:39 rng-testing kernel: [ 661.436664] random: crng init done
>
> So it really does appear to be something going on with Fedora's
> userspace; can you help try to track down what it is?
>
> Thanks,
>
> - Ted

It is a Fedora patch we're carrying
https://src.fedoraproject.org/rpms/libgcrypt/blob/master/f/libgcrypt-1.6.2-fips-ctor.patch#_23
so yes, it is a Fedora specific use case.

From talking to the libgcrypt team, this is a FIPS mode requirement to run power on self test at the library constructor and the self test of libgcrypt ends up requiring a fully seeded RNG. Citation is in section 9.10 of
https://csrc.nist.gov/CSRC/media/Projects/Cryptographic-Module-Validation-Program/documents/fips140-2/FIPS1402IG.pdf

The response was this _could_ be fixed in libgcrypt but it needs to be done carefully to ensure nothing actually gets broken. So in the meantime we're stuck with userspace getting blocked whenever some program decides to use libgcrypt too early.

Thanks,
Laura
Re: Linux messages full of `random: get_random_u32 called from`
On Wed, May 02, 2018 at 07:09:11AM -0500, Justin Forbes wrote:
> Yes, Fedora libgcrypt is carrying a patch which makes it particularly
> painful for us, we have reached out to the libgcrypt maintainer to
> follow up on that end. But as I said before, even without that code
> path (no dracut-fips) we are seeing some instances of 4 minute boots.
> This is not really a workable user experience. And are you sure that
> every cloud platform and VM platform offers, makes it possible to
> config virtio-rng?

Unfortunately, the answer is no. Google Compute Engine, alas, does not currently support virtio-rng. With my Google hat on, I can't comment on future product features. With my upstream developer hat on, I'll give you three guesses what I have been advocating and pushing for internally, and the first two don't count. :-)

That being said, I just booted a Debian 9 (Stable, aka Stretch) standard kernel, and then installed 4.17-rc3 (which has the CVE-2018-1108 patches). The crng_init=2 message doesn't appear immediately, and it does appear quite a bit later compared to the standard 4.9.0-6-amd64 Debian 9 kernel. However, the lack of a fully initialized random pool doesn't prevent the standard Debian 9 image from booting:

May 2 15:33:42 localhost kernel: [0.00] Linux version 4.17.0-rc3-xfstests (tytso@cwcc) (gcc version 7.3.0 (Debian 7.3.0-16)) #169 SMP Wed May 2 11:28:17 EDT 2018
May 2 15:33:42 localhost kernel: [1.456883] random: fast init done
May 2 15:33:46 rng-testing systemd[1]: Startup finished in 3.202s (kernel) + 5.963s (userspace) = 9.166s.
May 2 15:33:46 rng-testing google-accounts: INFO Starting Google Accounts daemon.
May 2 15:44:39 rng-testing kernel: [ 661.436664] random: crng init done

So it really does appear to be something going on with Fedora's userspace; can you help try to track down what it is?

Thanks,

- Ted
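To put a number on "quite a bit later", the bracketed uptime stamps in the two `random:` lines of the log excerpt can be subtracted. The sketch below is an illustrative helper only (the names `kmsg_uptime` and `crng_gap` are made up here); it assumes the syslog-style layout shown in the excerpt, where the first `[` on the line opens the uptime field.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the "[seconds.micros]" uptime stamp from a
 * syslog-style kernel line. Returns 1 and stores the value on success. */
static int kmsg_uptime(const char *line, double *secs)
{
	const char *lb = strchr(line, '[');

	/* %lf skips the padding spaces that follow '[' on some lines. */
	return lb != NULL && sscanf(lb + 1, "%lf", secs) == 1;
}

/* Seconds of uptime between two kernel log lines, or -1.0 if either line
 * cannot be parsed. */
static double crng_gap(const char *fast_line, const char *done_line)
{
	double fast, done;

	if (!kmsg_uptime(fast_line, &fast) || !kmsg_uptime(done_line, &done))
		return -1.0;
	return done - fast;
}
```

Fed the "fast init done" and "crng init done" lines from the excerpt, this yields 661.436664 - 1.456883, roughly 660 seconds, so the pool was only reported fully initialized about eleven minutes after userspace startup had finished.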
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, May 1, 2018 at 7:02 PM, Theodore Y. Ts'o wrote: > On Tue, May 01, 2018 at 05:35:56PM -0500, Justin Forbes wrote: >> >> I have not reproduced in GCE myself. We did get some confirmation >> that removing dracut-fips does make the problem less dire (but I >> wouldn't call a 4 minute boot a win, but booting in 4 minutes is >> better than not booting at all). Specifically systemd calls libgcrypt >> before it even opens the log with fips there, and this is before >> virtio-rng modules could even load. Right now though, we are looking >> at pretty much any possible options as the majority of people are >> calling for me to backout the patches completely from rawhide. > > FWIW, Debian Testing is using systemd 238, and from what I can tell > it's calling libgcrypt and it has the same (as near as I can tell) > totally pointless hmac nonsense, and it's not a problem that I can > see. Of course, Debian and Fedora may have a different set of > patches > Yes, Fedora libgcrypt is carrying a patch which makes it particularly painful for us, we have reached out to the libgcrypt maintainer to follow up on that end. But as I said before, even without that code path (no dracut-fips) we are seeing some instances of 4 minute boots. This is not really a workable user experience. And are you sure that every cloud platform and VM platform offers, makes it possible to config virtio-rng? Justin
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, May 01, 2018 at 08:56:04PM -0400, Theodore Y. Ts'o wrote: > On Tue, May 01, 2018 at 05:43:17PM -0700, Sultan Alsawaf wrote: > > > > I've attached what I think is a reasonable stopgap solution until this is > > actually fixed. If you're willing to revert the CVE-2018-1108 patches > > completely, then I don't think you'll mind using this patch in the meantime. > > I would put it slightly differently; reverting the CVE-2018-1108 > patches is less dangerous than what you are proposing in your attached > patch. > > Again, I think the right answer is to fix userspace to not require > cryptographic grade entropy during early system startup, and for > people to *think* about what they are doing. I've looked at the > systemd's use of hmac in journal-authenticate, and as near as I can > tell, there isn't any kind of explanation about why it was necessary, > or what threat it was trying to protect against. > > - Ted Why is /dev/urandom so much more dangerous than /dev/random? The more I search, the more I see that many sources consider /dev/urandom to be cryptographically secure... and since I hold down a single key on the keyboard to make my computer boot without any kernel workarounds, I'm sure the NSA would eventually notice my predictable behavior and get their hands on my Richard Stallman photos. Fixing all the "broken" userspace instances of entropy usage during early system startup is a tall order. What about barebone machines used as remote servers? I feel like just "fixing userspace" isn't going to cover all of the usecases that the CVE-2018-1108 patches broke. Sultan
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, May 01, 2018 at 05:43:17PM -0700, Sultan Alsawaf wrote: > > I've attached what I think is a reasonable stopgap solution until this is > actually fixed. If you're willing to revert the CVE-2018-1108 patches > completely, then I don't think you'll mind using this patch in the meantime. I would put it slightly differently; reverting the CVE-2018-1108 patches is less dangerous than what you are proposing in your attached patch. Again, I think the right answer is to fix userspace to not require cryptographic grade entropy during early system startup, and for people to *think* about what they are doing. I've looked at the systemd's use of hmac in journal-authenticate, and as near as I can tell, there isn't any kind of explanation about why it was necessary, or what threat it was trying to protect against. - Ted
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, May 01, 2018 at 05:35:56PM -0500, Justin Forbes wrote:
>
> I have not reproduced in GCE myself. We did get some confirmation
> that removing dracut-fips does make the problem less dire (but I
> wouldn't call a 4 minute boot a win, but booting in 4 minutes is
> better than not booting at all). Specifically systemd calls libgcrypt
> before it even opens the log with fips there, and this is before
> virtio-rng modules could even load. Right now though, we are looking
> at pretty much any possible options as the majority of people are
> calling for me to backout the patches completely from rawhide.

I've attached what I think is a reasonable stopgap solution until this is actually fixed. If you're willing to revert the CVE-2018-1108 patches completely, then I don't think you'll mind using this patch in the meantime.

Sultan

From 5be2efdde744d3c55db3df81c0493fc67dc35620 Mon Sep 17 00:00:00 2001
From: Sultan Alsawaf
Date: Tue, 1 May 2018 17:36:17 -0700
Subject: [PATCH] random: use urandom instead of random for now and speed up crng init

With the fixes for CVE-2018-1108, /dev/random now requires user-provided entropy on quite a few machines lacking high levels of boot entropy in order to complete its initialization. This causes issues on environments where userspace depends on /dev/random in order to finish booting completely (i.e., userspace will remain stuck, unable to boot, waiting for entropy more-or-less indefinitely until the user provides it via something like keystrokes or mouse movements).

As a temporary workaround, redirect /dev/random to /dev/urandom instead, and speed up the initialization process by slightly relaxing the threshold for interrupts to go towards adding one bit of entropy credit (only until initialization is complete).

Signed-off-by: Sultan Alsawaf
---
 drivers/char/mem.c    | 3 ++-
 drivers/char/random.c | 9 ++++++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index ffeb60d3434c..cc9507f01c79 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -870,7 +870,8 @@ static const struct memdev {
 #endif
 	 [5] = { "zero", 0666, &zero_fops, 0 },
 	 [7] = { "full", 0666, &full_fops, 0 },
-	 [8] = { "random", 0666, &random_fops, 0 },
+	 /* Redirect /dev/random to /dev/urandom until /dev/random is fixed */
+	 [8] = { "random", 0666, &urandom_fops, 0 },
 	 [9] = { "urandom", 0666, &urandom_fops, 0 },
 #ifdef CONFIG_PRINTK
 	[11] = { "kmsg", 0644, &kmsg_fops, 0 },
diff --git a/drivers/char/random.c b/drivers/char/random.c
index d9e38523b383..bce3b43cdd3b 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1200,9 +1200,12 @@ void add_interrupt_randomness(int irq)
 		return;
 	}
 
-	if ((fast_pool->count < 64) &&
-	    !time_after(now, fast_pool->last + HZ))
-		return;
+	if (fast_pool->count < 64) {
+		unsigned long timeout = crng_ready() ? HZ : HZ / 4;
+
+		if (!time_after(now, fast_pool->last + timeout))
+			return;
+	}
 
 	r = &input_pool;
 	if (!spin_trylock(&r->lock))
-- 
2.14.1
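The effect of the random.c hunk is easier to see in isolation. The following is a userspace model of just that condition, written for illustration: `should_flush_fast_pool` and its parameters are hypothetical names, not kernel API, and the jiffies comparison is spelled out in the wraparound-safe form that the kernel's `time_after()` macro uses.

```c
#include <stdbool.h>

/* Userspace model of the patched check in add_interrupt_randomness().
 * Names are illustrative. Returns true when the per-CPU fast pool would be
 * mixed into the input pool now, false when the function returns early. */
static bool should_flush_fast_pool(unsigned int count, unsigned long now,
				   unsigned long last, unsigned long hz,
				   bool crng_is_ready)
{
	if (count < 64) {
		/* Patched behaviour: only wait HZ/4 ticks until the CRNG is ready. */
		unsigned long timeout = crng_is_ready ? hz : hz / 4;

		/* Same test as !time_after(now, last + timeout). */
		if ((long)((last + timeout) - now) >= 0)
			return false;
	}
	return true;
}
```

With hz = 1000 and last = 0, a call at now = 300 with only 10 events collected flushes the pool while the CRNG is still uninitialized (the timeout is 250 ticks) but not once it is ready (the timeout is back to 1000 ticks). Once 64 events have accumulated the pool is flushed regardless of the timeout, as before the patch.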
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, May 01, 2018 at 05:35:56PM -0500, Justin Forbes wrote: > > I have not reproduced in GCE myself. We did get some confirmation > that removing dracut-fips does make the problem less dire (but I > wouldn't call a 4 minute boot a win, but booting in 4 minutes is > better than not booting at all). Specifically systemd calls libgcrypt > before it even opens the log with fips there, and this is before > virtio-rng modules could even load. Right now though, we are looking > at pretty much any possible options as the majority of people are > calling for me to backout the patches completely from rawhide. FWIW, Debian Testing is using systemd 238, and from what I can tell it's calling libgcrypt and it has the same (as near as I can tell) totally pointless hmac nonsense, and it's not a problem that I can see. Of course, Debian and Fedora may have a different set of patches - Ted
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, May 1, 2018 at 7:55 AM, Theodore Y. Ts'o wrote: > On Tue, May 01, 2018 at 06:52:47AM -0500, Justin Forbes wrote: >> >> We have also had reports that Fedora users are seeing this on Google >> Compute Engine. > > Can you reproduce this yourself? If so, could you confirm that > removing the dracut-fips package makes the problem go away for you? > I have not reproduced in GCE myself. We did get some confirmation that removing dracut-fips does make the problem less dire (but I wouldn't call a 4 minute boot a win, but booting in 4 minutes is better than not booting at all). Specifically systemd calls libgcrypt before it even opens the log with fips there, and this is before virtio-rng modules could even load. Right now though, we are looking at pretty much any possible options as the majority of people are calling for me to backout the patches completely from rawhide.
Re: Linux messages full of `random: get_random_u32 called from`
On Mon 2018-04-30 12:11:43, Theodore Y. Ts'o wrote:
> On Sun, Apr 29, 2018 at 09:34:45PM -0700, Sultan Alsawaf wrote:
> >
> > What about abusing high-resolution timers to get entropy? Since hrtimers can't make guarantees down to the nanosecond, there's always a skew between the requested expiry time and the actual expiry time.
> >
> > Please see the attached patch and let me know just how horrible it is.
>
> So think about exactly where the possible causes of the skew might be coming from. Look very closely at the software implementation. The important thing here is to not get hung up on the software abstraction, but to look at the *implementation*. (And if it's an implementation in architecture specific code, we need to look at all architectures.)
>
> This applies on the hardware level as well, but that gets harder because there are many possible hardware implementations in use out there. Remember that on many systems there may be only a single clock crystal, and all other hardware timers may be derived from that clock using frequency dividers. (At least for everything on the mainboard.)

On "many" systems? No, sorry, computers usually do not behave like this (CMOS RTC has a separate clock, for example). I'm pretty sure that none of the machines on which problems were reported has this problem.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, May 01, 2018 at 06:52:47AM -0500, Justin Forbes wrote:
>
> We have also had reports that Fedora users are seeing this on Google Compute Engine.

Can you reproduce this yourself? If so, could you confirm that removing the dracut-fips package makes the problem go away for you?

Thanks,

- Ted
Re: Linux messages full of `random: get_random_u32 called from`
On Mon, Apr 30, 2018 at 4:12 PM, Jeremy Cline wrote:
> On 04/29/2018 06:05 PM, Theodore Y. Ts'o wrote:
>> On Sun, Apr 29, 2018 at 01:20:33PM -0700, Sultan Alsawaf wrote:
>>> On Sun, Apr 29, 2018 at 08:41:01PM +0200, Pavel Machek wrote:
>>>> Umm. No. https://www.youtube.com/watch?v=xneBjc8z0DE
>>>
>>> Okay, but /dev/urandom isn't a solution to this problem because it isn't usable until crng init is complete, so it suffers from the same init lag as /dev/random.
>>
>> It's more accurate to say that using /dev/urandom is no worse than before (from a few years ago). There are, alas, plenty of distributions and user space application programmers that basically got lazy using /dev/urandom, and assumed that there would be plenty of entropy during early system startup.
>>
>> When they switched over to getrandom(2), the most egregious examples of this caused pain (and they got fixed), but due to a bug in drivers/char/random.c, if getrandom(2) was called after the entropy pool was "half initialized", it would not block, but proceed.
>>
>> Is that exploitable? Well, Jann and I didn't find an _obvious_ way to exploit the shortcoming, which is why this wasn't treated like an emergency situation ala the embarrassing situation we had five years ago[1].
>>
>> [1] https://factorable.net/paper.html
>>
>> However, it was enough to make us uncomfortable, which is why I pushed the changes that I did. At least on the devices we had at hand, using the distributions that we typically use, the impact seemed minimal. Unfortunately, there is no way to know for sure without rolling out the change and seeing who screams. In the ideal world, software would not require cryptographic randomness immediately after boot, before the user logs in. And ***really***, as in [1], software should not be generating long-term public keys that are essential to the security of the box a few seconds immediately after the device is first unboxed and plugged in.
>>
>> What would be useful is if people gave reports that listed exactly what laptop and distributions they are using. Just "a high spec x86 laptop" isn't terribly useful, because *my* brand-new Dell XPS 13 running Debian testing is working just fine. The year, model, make, and CPU type plus what distribution (and distro version number) you are running is useful, so I can assess how widespread the unhappiness is going to be, and what mitigation steps make sense.
>
> Fedora has started seeing some bug reports on this for Fedora 27[0] and I've asked reporters to include their hardware details.
>
> [0] https://bugzilla.redhat.com/show_bug.cgi?id=1572944
>

We have also had reports that Fedora users are seeing this on Google Compute Engine.

Justin
Re: Linux messages full of `random: get_random_u32 called from`
On 04/29/2018 06:05 PM, Theodore Y. Ts'o wrote:
> On Sun, Apr 29, 2018 at 01:20:33PM -0700, Sultan Alsawaf wrote:
>> On Sun, Apr 29, 2018 at 08:41:01PM +0200, Pavel Machek wrote:
>>> Umm. No. https://www.youtube.com/watch?v=xneBjc8z0DE
>>
>> Okay, but /dev/urandom isn't a solution to this problem because it isn't usable until crng init is complete, so it suffers from the same init lag as /dev/random.
>
> It's more accurate to say that using /dev/urandom is no worse than before (from a few years ago). There are, alas, plenty of distributions and user space application programmers that basically got lazy using /dev/urandom, and assumed that there would be plenty of entropy during early system startup.
>
> When they switched over to getrandom(2), the most egregious examples of this caused pain (and they got fixed), but due to a bug in drivers/char/random.c, if getrandom(2) was called after the entropy pool was "half initialized", it would not block, but proceed.
>
> Is that exploitable? Well, Jann and I didn't find an _obvious_ way to exploit the shortcoming, which is why this wasn't treated like an emergency situation ala the embarrassing situation we had five years ago[1].
>
> [1] https://factorable.net/paper.html
>
> However, it was enough to make us uncomfortable, which is why I pushed the changes that I did. At least on the devices we had at hand, using the distributions that we typically use, the impact seemed minimal. Unfortunately, there is no way to know for sure without rolling out the change and seeing who screams. In the ideal world, software would not require cryptographic randomness immediately after boot, before the user logs in. And ***really***, as in [1], software should not be generating long-term public keys that are essential to the security of the box a few seconds immediately after the device is first unboxed and plugged in.
>
> What would be useful is if people gave reports that listed exactly what laptop and distributions they are using. Just "a high spec x86 laptop" isn't terribly useful, because *my* brand-new Dell XPS 13 running Debian testing is working just fine. The year, model, make, and CPU type plus what distribution (and distro version number) you are running is useful, so I can assess how widespread the unhappiness is going to be, and what mitigation steps make sense.

Fedora has started seeing some bug reports on this for Fedora 27[0] and I've asked reporters to include their hardware details.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1572944

Regards,
Jeremy
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 09:34:45PM -0700, Sultan Alsawaf wrote:
>
> What about abusing high-resolution timers to get entropy? Since hrtimers can't make guarantees down to the nanosecond, there's always a skew between the requested expiry time and the actual expiry time.
>
> Please see the attached patch and let me know just how horrible it is.

So think about exactly where the possible causes of the skew might be coming from. Look very closely at the software implementation. The important thing here is to not get hung up on the software abstraction, but to look at the *implementation*. (And if it's an implementation in architecture specific code, we need to look at all architectures.)

This applies on the hardware level as well, but that gets harder because there are many possible hardware implementations in use out there. Remember that on many systems there may be only a single clock crystal, and all other hardware timers may be derived from that clock using frequency dividers. (At least for everything on the mainboard.)

- Ted
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 08:11:07PM -0400, Theodore Y. Ts'o wrote:
>
> What your patch does is assume that there is a full bit of uncertainty that can be obtained from the information gathered from each interrupt. I *might* be willing to assume that to be valid on x86 systems that have a high resolution cycle counter. But on ARM platforms, especially during system bootup when the user isn't typing anything and SSD's and flash storage tend to have very predictable timing patterns? Not a bet I'd be willing to take. Even with a cycle counter, there's a reason why we assumed that we need to mix in timing results from 64 interrupts or one second's worth before we would give a single bit's worth of entropy credit.
>
> - Ted

What about abusing high-resolution timers to get entropy? Since hrtimers can't make guarantees down to the nanosecond, there's always a skew between the requested expiry time and the actual expiry time.

Please see the attached patch and let me know just how horrible it is.

Sultan

From b0d21c38558c661531d4cb46816fbb36b874a169 Mon Sep 17 00:00:00 2001
From: Sultan Alsawaf
Date: Sun, 29 Apr 2018 21:28:08 -0700
Subject: [PATCH] random: use high-res timers to generate entropy until crng init is done

---
 drivers/char/random.c | 47 +++
 1 file changed, 47 insertions(+)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d9e38523b383..af2d60bbcec3 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -286,6 +286,7 @@
 #define OUTPUT_POOL_WORDS (1 << (OUTPUT_POOL_SHIFT-5))
 #define SEC_XFER_SIZE 512
 #define EXTRACT_SIZE 10
+#define ENTROPY_GEN_INTVL_NS (1 * NSEC_PER_MSEC)

 #define LONGS(x) (((x) + sizeof(unsigned long) - 1)/sizeof(unsigned long))

@@ -408,6 +409,8 @@ static struct fasync_struct *fasync;
 static DEFINE_SPINLOCK(random_ready_list_lock);
 static LIST_HEAD(random_ready_list);

+static struct hrtimer entropy_gen_hrtimer;
+
 struct crng_state {
 	__u32 state[16];
 	unsigned long init_time;
@@ -2287,3 +2290,47 @@ void add_hwgenerator_randomness(const char *buffer, size_t count,
 	credit_entropy_bits(poolp, entropy);
 }
 EXPORT_SYMBOL_GPL(add_hwgenerator_randomness);
+
+/*
+ * Generate entropy on init using high-res timers. Although high-res timers
+ * provide nanosecond precision, they don't actually honor requests to the
+ * nanosecond. The skew between the expected time difference in nanoseconds and
+ * the actual time difference can be used as a way to generate entropy on boot
+ * for machines that lack sufficient boot-time entropy.
+ */
+static enum hrtimer_restart entropy_timer_cb(struct hrtimer *timer)
+{
+	static u64 prev_ns;
+	u64 curr_ns, delta;
+
+	if (crng_ready())
+		return HRTIMER_NORESTART;
+
+	curr_ns = ktime_get_mono_fast_ns();
+	delta = curr_ns - prev_ns;
+
+	add_interrupt_randomness(delta);
+
+	/* Use the hrtimer skew to make the next interval more unpredictable */
+	if (likely(prev_ns))
+		hrtimer_add_expires_ns(timer, delta);
+	else
+		hrtimer_add_expires_ns(timer, ENTROPY_GEN_INTVL_NS);
+
+	prev_ns = curr_ns;
+	return HRTIMER_RESTART;
+}
+
+static int entropy_gen_hrtimer_init(void)
+{
+	if (!IS_ENABLED(CONFIG_HIGH_RES_TIMERS))
+		return 0;
+
+	hrtimer_init(&entropy_gen_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+
+	entropy_gen_hrtimer.function = entropy_timer_cb;
+	hrtimer_start(&entropy_gen_hrtimer, ns_to_ktime(ENTROPY_GEN_INTVL_NS),
+		      HRTIMER_MODE_REL);
+	return 0;
+}
+core_initcall(entropy_gen_hrtimer_init);
--
2.14.1
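For anyone who wants to look at the kind of skew the patch description relies on, here is a minimal userspace sketch. It is not kernel code and not part of the patch: it requests a relative sleep on CLOCK_MONOTONIC and reports by how many nanoseconds the wakeup overshot the request. The helper names (ts_to_ns, model_sleep_skew_ns) are made up for illustration. Seeing a non-zero skew only shows that the timer is imprecise; it says nothing about whether the skew is unpredictable to an attacker, which is the question raised in the replies.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Sleep for requested_ns on CLOCK_MONOTONIC and return the overshoot
 * (actual elapsed time minus requested time) in nanoseconds.
 * Returns -1 if a clock call fails or the sleep is interrupted.
 */
int64_t model_sleep_skew_ns(int64_t requested_ns)
{
	struct timespec req = {
		.tv_sec = requested_ns / 1000000000LL,
		.tv_nsec = requested_ns % 1000000000LL,
	};
	struct timespec before, after;

	if (clock_gettime(CLOCK_MONOTONIC, &before) != 0)
		return -1;
	/* clock_nanosleep() returns 0 on success, an error number otherwise. */
	if (clock_nanosleep(CLOCK_MONOTONIC, 0, &req, NULL) != 0)
		return -1;
	if (clock_gettime(CLOCK_MONOTONIC, &after) != 0)
		return -1;

	return (ts_to_ns(&after) - ts_to_ns(&before)) - requested_ns;
}
```

Calling model_sleep_skew_ns(1000000) in a loop and printing the results gives a rough feel for how regular (or not) the overshoot is on a given machine.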
Re: Linux messages full of `random: get_random_u32 called from`
On 04/29/2018 03:05 PM, Theodore Y. Ts'o wrote:
> What would be useful is if people gave reports that listed exactly what laptop and distributions they are using. Just "a high spec x86 laptop" isn't terribly useful, because *my* brand-new Dell XPS 13 running Debian testing is working just fine. The year, model, make, and CPU type plus what distribution (and distro version number) you are running is useful, so I can assess how widespread the unhappiness is going to be, and what mitigation steps make sense.

I'm pretty sure Fedora is hitting this in our VMs. I just spent some time debugging an issue of a boot delay with someone from the infrastructure team where it would take upwards of 2 minutes to boot. If someone holds down a key, it boots in 4 seconds. There's a qemu reproducer at https://bugzilla.redhat.com/show_bug.cgi?id=1572916#c3

I suggested a cat on the keyboard as a workaround.

Independently, we also got a report of a boot hang in GCE with 4.16.4 whereas 4.16.3 works, which corresponds to the previous report of a stable regression. This was just via IRC so I didn't have time to dig into this.

Thanks,
Laura
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 07:07:29PM -0400, Dave Jones wrote:
> > Why do we continue to print this stuff out when crng_init=1 though ?
>
> answering my own question, I think.. This is a tristate, and we need it to be >1 to be quiet, which doesn't happen until..
>
> > [ 165.806247] random: crng init done
>
> this point.

Right. What happens is that we divert the first 64 bits of entropy credits directly into the crng state, without initializing the input_pool. So when we hit crng_init=1, the crng has only 64 bits of entropy (conservatively speaking); furthermore, since we aren't doing catastrophic reseeding, if something is continuously reading from /dev/urandom or get_random_bytes() during that time, then the attacker could determine which one of the 32 states the entropy pool was in when the entropy count was 5, and then 5 bits later, poll the output of the pool again, and guess which of the 32 states the pool was in, etc., and effectively keep up with the entropy as it trickles in.

This is the reasoning behind catastrophic reseeding; we wait until we have 128 bits of entropy in the input pool, and then we reseed the pool all at once.

Why do we have the crng_init=1 state? Because it provides some basic protection for super-early users of the entropy pool. It's essentially a bandaid, and we could improve the time to get fully initialized by about 33% if we left the pool totally uninitialized and only focused on filling the input pool. But given that on many distributions, ssh still insists on initializing long-term public keys at first boot from /dev/urandom, instead of *waiting* until the first time someone attempts to ssh into the box, or waiting until getrandom(2) doesn't block --- without hanging the boot --- we have the crng_init=1 hack essentially as a palliative.

I view this as working around broken user space. But userspace has been broken for a long time, and users tend to blame the kernel, not userspace.

- Ted
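The arithmetic behind the "32 states, then 5 bits later" argument can be sketched as follows. This is only a toy model, not anything from drivers/char/random.c, and the function name model_trickle_guesses is hypothetical. It counts the worst-case number of guesses for an attacker who can observe output after every step_bits of new entropy and so brute-forces each small chunk on its own.

```c
#include <stdint.h>

/*
 * Worst-case guesses when entropy trickles in step_bits at a time and the
 * attacker can test each chunk separately: the per-chunk costs add up
 * instead of multiplying. Returns 0 for an unsupported step size.
 */
uint64_t model_trickle_guesses(unsigned int total_bits, unsigned int step_bits)
{
	uint64_t guesses = 0;

	if (step_bits == 0 || step_bits > 63)
		return 0;

	while (total_bits > 0) {
		unsigned int chunk = total_bits < step_bits ? total_bits : step_bits;

		guesses += (uint64_t)1 << chunk;	/* 2^chunk states to try */
		total_bits -= chunk;
	}
	return guesses;
}
```

With 128 bits arriving 5 bits at a time this comes to 25 * 32 + 8 = 808 guesses, whereas mixing the same 128 bits in all at once leaves the attacker with 2^128 possibilities. That gap is the motivation for waiting and reseeding in one step.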
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 03:49:28PM -0700, Sultan Alsawaf wrote:
> On Mon, Apr 30, 2018 at 12:43:48AM +0200, Jason A. Donenfeld wrote:
> > > - if ((fast_pool->count < 64) &&
> > > - !time_after(now, fast_pool->last + HZ))
> > > - return;
> > > -
> >
> > I suspect you still want the rate-limiting in place. But if you _do_ want to cheat like this, you could instead just modify the condition to only relax the rate limiting when !crng_init().
>
> Good idea. Attached a new patch that's less intrusive. It still fixes my issue, of course.

What your patch does is assume that there is a full bit of uncertainty that can be obtained from the information gathered from each interrupt. I *might* be willing to assume that to be valid on x86 systems that have a high resolution cycle counter. But on ARM platforms, especially during system bootup when the user isn't typing anything and SSD's and flash storage tend to have very predictable timing patterns? Not a bet I'd be willing to take. Even with a cycle counter, there's a reason why we assumed that we need to mix in timing results from 64 interrupts or one second's worth before we would give a single bit's worth of entropy credit.

- Ted
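The "64 interrupts or one second's worth" rule that the quoted hunk implements can be modelled in userspace like this. It is a sketch only: MODEL_HZ is an assumed tick rate (real kernels use CONFIG_HZ), and the model_* names are invented for illustration. The wrap-safe comparison uses the same signed-difference trick as the kernel's time_after().

```c
#include <stdbool.h>

#define MODEL_HZ 250UL	/* assumed ticks per second, for illustration only */

/* Wrap-safe "a is later than b" for free-running tick counters. */
bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Returns true when the per-CPU fast pool would be flushed into the input
 * pool: either 64 interrupts have been mixed in, or more than one second
 * (MODEL_HZ ticks) has passed since the last flush. Otherwise the caller
 * returns early and no entropy is credited yet.
 */
bool model_should_flush(unsigned int count, unsigned long now,
			unsigned long last)
{
	if (count < 64 && !model_time_after(now, last + MODEL_HZ))
		return false;
	return true;
}
```

Removing the early return, as the first patch in this thread does, makes model_should_flush() true on every interrupt, which is the same as treating each interrupt as worth a full credit.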
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 07:02:02PM -0400, Dave Jones wrote:
> On Tue, Apr 24, 2018 at 09:56:21AM -0400, Theodore Y. Ts'o wrote:
>
> > Can you tell me a bit about your system? What distribution, what hardware is present in your system (what architecture, what peripherals are attached, etc.)?
> >
> > There's a reason why we made this --- we were declaring the random number pool to be fully initialized before it really was, and that was a potential security concern. It's not as bad as the weakness discovered by Nadia Heninger in 2012. (See https://factorable.net for more details.) However, this is not one of those things where we like to fool around.
> >
> > So I want to understand if this is an issue with a particular hardware configuration, or whether it's just a badly designed Linux init system or embedded setup, or something else. After all, you wouldn't want the NSA spying on all of your network traffic, would you? :-)
>
> Why do we continue to print this stuff out when crng_init=1 though ?

Answering my own question, I think. This is a tristate, and we need it to be >1 to be quiet, which doesn't happen until

> [ 165.806247] random: crng init done

this point.

Dave
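The tristate can be written down as a small model. The enum and function names below are illustrative and are not the kernel's identifiers; the mapping of states to the "fast init done" and "crng init done" log lines is taken from the dmesg output posted in this thread.

```c
#include <stdbool.h>

/* Illustrative names for the three crng_init values seen in the logs. */
enum model_crng_state {
	MODEL_CRNG_EMPTY = 0,	/* nothing credited yet */
	MODEL_CRNG_FAST  = 1,	/* after "random: fast init done" */
	MODEL_CRNG_READY = 2,	/* after "random: crng init done" */
};

/* Fully seeded only once the state is greater than 1. */
bool model_crng_ready(int crng_init)
{
	return crng_init > 1;
}

/* The "called from ... with crng_init=N" warning fires until then. */
bool model_warns_unseeded(int crng_init)
{
	return !model_crng_ready(crng_init);
}
```

So a state of 1 still warns, which matches the crng_init=1 lines that keep appearing after "fast init done".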
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, Apr 24, 2018 at 09:56:21AM -0400, Theodore Y. Ts'o wrote:

> Can you tell me a bit about your system? What distribution, what hardware is present in your system (what architecture, what peripherals are attached, etc.)?
>
> There's a reason why we made this --- we were declaring the random number pool to be fully initialized before it really was, and that was a potential security concern. It's not as bad as the weakness discovered by Nadia Heninger in 2012. (See https://factorable.net for more details.) However, this is not one of those things where we like to fool around.
>
> So I want to understand if this is an issue with a particular hardware configuration, or whether it's just a badly designed Linux init system or embedded setup, or something else. After all, you wouldn't want the NSA spying on all of your network traffic, would you? :-)

Why do we continue to print this stuff out when crng_init=1 though ?

(This from debian stable, on a pretty basic atom box, but similar dmesg's on everything else I've put 4.17-rc on so far)

[0.00] random: get_random_bytes called from start_kernel+0x96/0x519 with crng_init=0
[0.00] random: get_random_u64 called from __kmem_cache_create+0x39/0x450 with crng_init=0
[0.00] random: get_random_u64 called from cache_random_seq_create+0x76/0x120 with crng_init=0
[0.151401] calling initialize_ptr_random+0x0/0x36 @ 1
[0.151527] initcall initialize_ptr_random+0x0/0x36 returned 0 after 0 usecs
[0.294661] calling prandom_init+0x0/0xbd @ 1
[0.294763] initcall prandom_init+0x0/0xbd returned 0 after 0 usecs
[1.430529] _warn_unseeded_randomness: 165 callbacks suppressed
[1.430540] random: get_random_u64 called from __kmem_cache_create+0x39/0x450 with crng_init=0
[1.430860] random: get_random_u64 called from cache_random_seq_create+0x76/0x120 with crng_init=0
[1.452240] random: get_random_u64 called from copy_process.part.67+0x1ae/0x1e60 with crng_init=0
[2.954901] _warn_unseeded_randomness: 54 callbacks suppressed
[2.954910] random: get_random_u64 called from __kmem_cache_create+0x39/0x450 with crng_init=0
[2.955185] random: get_random_u64 called from cache_random_seq_create+0x76/0x120 with crng_init=0
[2.957701] random: get_random_u64 called from __kmem_cache_create+0x39/0x450 with crng_init=0
[6.017364] _warn_unseeded_randomness: 88 callbacks suppressed
[6.017373] random: get_random_u64 called from __kmem_cache_create+0x39/0x450 with crng_init=0
[6.042652] random: get_random_u64 called from cache_random_seq_create+0x76/0x120 with crng_init=0
[6.060333] random: get_random_u64 called from __kmem_cache_create+0x39/0x450 with crng_init=0
[6.951978] calling prandom_reseed+0x0/0x2a @ 1
[6.960627] initcall prandom_reseed+0x0/0x2a returned 0 after 105 usecs
[7.371745] _warn_unseeded_randomness: 37 callbacks suppressed
[7.371759] random: get_random_u64 called from arch_pick_mmap_layout+0x64/0x130 with crng_init=0
[7.395926] random: get_random_u64 called from load_elf_binary+0x4ae/0x1720 with crng_init=0
[7.411549] random: get_random_u32 called from arch_align_stack+0x37/0x50 with crng_init=0
[7.553379] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[7.563210] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[7.571498] random: systemd-udevd: uninitialized urandom read (16 bytes read)
[8.449679] _warn_unseeded_randomness: 154 callbacks suppressed
[8.449691] random: get_random_u64 called from copy_process.part.67+0x1ae/0x1e60 with crng_init=0
[8.483097] random: get_random_u64 called from arch_pick_mmap_layout+0x64/0x130 with crng_init=0
[8.497999] random: get_random_u64 called from load_elf_binary+0x4ae/0x1720 with crng_init=0
[9.353904] random: fast init done
[9.770384] _warn_unseeded_randomness: 187 callbacks suppressed
[9.770398] random: get_random_u32 called from bucket_table_alloc+0x84/0x1b0 with crng_init=1
[9.791514] random: get_random_u32 called from new_slab+0x174/0x680 with crng_init=1
[9.834909] random: get_random_u64 called from copy_process.part.67+0x1ae/0x1e60 with crng_init=1
[ 10.802200] _warn_unseeded_randomness: 168 callbacks suppressed
[ 10.802214] random: get_random_u64 called from arch_pick_mmap_layout+0x64/0x130 with crng_init=1
[ 10.802276] random: get_random_u64 called from load_elf_binary+0x4ae/0x1720 with crng_init=1
[ 10.802289] random: get_random_u32 called from arch_align_stack+0x37/0x50 with crng_init=1
[ 11.821109] _warn_unseeded_randomness: 160 callbacks suppressed
[ 11.821122] random: get_random_u64 called from copy_process.part.67+0x1ae/0x1e60 with crng_init=1
[ 11.863770] random: get_random_u32 called from bucket_table_alloc+0x84/0x1b0 with crng_init=1
[ 11.869384] random: get_random_u32 called from new_slab+0x174/0x680 with crng_init=1
[ 12.843237] _warn_unseeded_rando
Re: Linux messages full of `random: get_random_u32 called from`
On Mon, Apr 30, 2018 at 12:43:48AM +0200, Jason A. Donenfeld wrote:
> > - if ((fast_pool->count < 64) &&
> > - !time_after(now, fast_pool->last + HZ))
> > - return;
> > -
>
> I suspect you still want the rate-limiting in place. But if you _do_ want to cheat like this, you could instead just modify the condition to only relax the rate limiting when !crng_init().

Good idea. Attached a new patch that's less intrusive. It still fixes my issue, of course.

Sultan

From 6870b0383b88438d842599aa8608a260e6fb0ed2 Mon Sep 17 00:00:00 2001
From: Sultan Alsawaf
Date: Sun, 29 Apr 2018 15:44:27 -0700
Subject: [PATCH] random: don't ratelimit add_interrupt_randomness() until crng is ready

---
 drivers/char/random.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 38729baed6ee..8c00c008e797 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1201,7 +1201,7 @@ void add_interrupt_randomness(int irq, int irq_flags)
 	}

 	if ((fast_pool->count < 64) &&
-	    !time_after(now, fast_pool->last + HZ))
+	    !time_after(now, fast_pool->last + HZ) && crng_ready())
 		return;

 	r = &input_pool;
--
2.14.1
Re: Linux messages full of `random: get_random_u32 called from`
> - if ((fast_pool->count < 64) &&
> - !time_after(now, fast_pool->last + HZ))
> - return;
> -

I suspect you still want the rate-limiting in place. But if you _do_ want to cheat like this, you could instead just modify the condition to only relax the rate limiting when !crng_init().
Re: Linux messages full of `random: get_random_u32 called from`
Hi!

> What would be useful is if people gave reports that listed exactly what laptop and distributions they are using. Just "a high spec x86 laptop" isn't terribly useful, because *my* brand-new Dell XPS 13 running Debian testing is working just fine. The year, model, make, and CPU type plus what distribution (and distro version number) you are running is useful, so I can assess how widespread the unhappiness is going to be, and what mitigation steps make sense.

Thinkpad X60,

model name : Genuine Intel(R) CPU T2400 @ 1.83GHz

pavel@amd:~$ cat /etc/debian_version
8.10

I already posted some dmesg snippets, but the system boots. On _this_ boot, it was ok, and I do not see anything:

pavel@amd:/data/l/linux-next-32$ dmesg | grep urandom
pavel@amd:/data/l/linux-next-32$

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 06:05:19PM -0400, Theodore Y. Ts'o wrote:
> It's more accurate to say that using /dev/urandom is no worse than before (from a few years ago). There are, alas, plenty of distributions and user space application programmers that basically got lazy using /dev/urandom, and assumed that there would be plenty of entropy during early system startup.
>
> When they switched over to getrandom(2), the most egregious examples of this caused pain (and they got fixed), but due to a bug in drivers/char/random.c, if getrandom(2) was called after the entropy pool was "half initialized", it would not block, but proceed.
>
> Is that exploitable? Well, Jann and I didn't find an _obvious_ way to exploit the shortcoming, which is why this wasn't treated like an emergency situation ala the embarrassing situation we had five years ago[1].
>
> [1] https://factorable.net/paper.html
>
> However, it was enough to make us uncomfortable, which is why I pushed the changes that I did. At least on the devices we had at hand, using the distributions that we typically use, the impact seemed minimal. Unfortunately, there is no way to know for sure without rolling out the change and seeing who screams. In the ideal world, software would not require cryptographic randomness immediately after boot, before the user logs in. And ***really***, as in [1], software should not be generating long-term public keys that are essential to the security of the box a few seconds immediately after the device is first unboxed and plugged in.
>
> What would be useful is if people gave reports that listed exactly what laptop and distributions they are using. Just "a high spec x86 laptop" isn't terribly useful, because *my* brand-new Dell XPS 13 running Debian testing is working just fine. The year, model, make, and CPU type plus what distribution (and distro version number) you are running is useful, so I can assess how widespread the unhappiness is going to be, and what mitigation steps make sense.
>
>
> What mitigation steps can be taken?
>
> If you believe in security-through-complexity (the cache architecture of x86 is *so* complicated no one can understand it, so Jitterentropy / Haveged *must* be secure), or security-through-secrecy (the cache architecture of x86 is only available to internal architects inside Intel, so Jitterentropy / Haveged *must* be secure, never mind that the Intel CPU architects who were asked about it were "nervous"), then wiring up CONFIG_JITTERENTROPY or using haveged might be one approach.
>
> If you believe that Intel hasn't backdoored RDRAND, then installing rng-tools and running rngd with --enable-drng will enable RDRAND. That seems to be popular with various defense contractors, perhaps on the assumption that if it _was_ backdoored (no one knows for sure), it was probably with the connivance or request of the US government, who doesn't need to worry about spying on itself.
>
> Or you can use some kind of open hardware design RNG, such as ChaosKey[2] from Altus Metrum. But that requires using specially ordered hardware plugged into a USB slot, and it's probably not a mass solution.
>
> [2] https://altusmetrum.org/ChaosKey/
>
>
> Personally, I prefer fixing the software to simply not require cryptographic grade entropy before the user has logged in. Because it's better than the alternatives.
>
> - Ted
>

The attached patch fixes my crng init woes. With it, crng init completes 0.86 seconds into boot, but I can't help but feel like a solution this obvious would just expose my Richard Stallman photo collection to prying eyes at the NSA.

Thoughts on the patch?

Sultan

From 597b0f2b3c986f853bf1d30a7fb9d76869e47fe8 Mon Sep 17 00:00:00 2001
From: Sultan Alsawaf
Date: Sun, 29 Apr 2018 15:22:59 -0700
Subject: [PATCH] random: remove ratelimiting from add_interrupt_randomness()

---
 drivers/char/random.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 38729baed6ee..5b38277b104a 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -574,7 +574,6 @@ static void mix_pool_bytes(struct entropy_store *r, const void *in,
 struct fast_pool {
 	__u32 pool[4];
-	unsigned long last;
 	unsigned short reg_idx;
 	unsigned char count;
 };
@@ -1195,20 +1194,14 @@ void add_interrupt_randomness(int irq, int irq_flags)
 		    crng_fast_load((char *) fast_pool->pool,
 				   sizeof(fast_pool->pool))) {
 			fast_pool->count = 0;
-			fast_pool->last = now;
 		}
 		return;
 	}

-	if ((fast_pool->count < 64) &&
-	    !time_after(now, fast_pool->last + HZ))
-		return;
-
 	r = &input_pool;
 	if (!spin_trylock(&r->lock))
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 01:20:33PM -0700, Sultan Alsawaf wrote:
> On Sun, Apr 29, 2018 at 08:41:01PM +0200, Pavel Machek wrote:
> > Umm. No. https://www.youtube.com/watch?v=xneBjc8z0DE
>
> Okay, but /dev/urandom isn't a solution to this problem because it isn't usable until crng init is complete, so it suffers from the same init lag as /dev/random.

It's more accurate to say that using /dev/urandom is no worse than before (from a few years ago). There are, alas, plenty of distributions and user space application programmers that basically got lazy using /dev/urandom, and assumed that there would be plenty of entropy during early system startup.

When they switched over to getrandom(2), the most egregious examples of this caused pain (and they got fixed), but due to a bug in drivers/char/random.c, if getrandom(2) was called after the entropy pool was "half initialized", it would not block, but proceed.

Is that exploitable? Well, Jann and I didn't find an _obvious_ way to exploit the shortcoming, which is why this wasn't treated like an emergency situation ala the embarrassing situation we had five years ago[1].

[1] https://factorable.net/paper.html

However, it was enough to make us uncomfortable, which is why I pushed the changes that I did. At least on the devices we had at hand, using the distributions that we typically use, the impact seemed minimal. Unfortunately, there is no way to know for sure without rolling out the change and seeing who screams. In the ideal world, software would not require cryptographic randomness immediately after boot, before the user logs in. And ***really***, as in [1], software should not be generating long-term public keys that are essential to the security of the box a few seconds immediately after the device is first unboxed and plugged in.

What would be useful is if people gave reports that listed exactly what laptop and distributions they are using. Just "a high spec x86 laptop" isn't terribly useful, because *my* brand-new Dell XPS 13 running Debian testing is working just fine. The year, model, make, and CPU type plus what distribution (and distro version number) you are running is useful, so I can assess how widespread the unhappiness is going to be, and what mitigation steps make sense.


What mitigation steps can be taken?

If you believe in security-through-complexity (the cache architecture of x86 is *so* complicated no one can understand it, so Jitterentropy / Haveged *must* be secure), or security-through-secrecy (the cache architecture of x86 is only available to internal architects inside Intel, so Jitterentropy / Haveged *must* be secure, never mind that the Intel CPU architects who were asked about it were "nervous"), then wiring up CONFIG_JITTERENTROPY or using haveged might be one approach.

If you believe that Intel hasn't backdoored RDRAND, then installing rng-tools and running rngd with --enable-drng will enable RDRAND. That seems to be popular with various defense contractors, perhaps on the assumption that if it _was_ backdoored (no one knows for sure), it was probably with the connivance or request of the US government, who doesn't need to worry about spying on itself.

Or you can use some kind of open hardware design RNG, such as ChaosKey[2] from Altus Metrum. But that requires using specially ordered hardware plugged into a USB slot, and it's probably not a mass solution.

[2] https://altusmetrum.org/ChaosKey/


Personally, I prefer fixing the software to simply not require cryptographic grade entropy before the user has logged in. Because it's better than the alternatives.

- Ted
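One way userspace can follow the "don't require cryptographic entropy this early" advice is to ask without blocking and defer the work when the pool is not ready. The sketch below assumes a glibc that provides getrandom() in <sys/random.h>; the wrapper name model_try_random and its return convention are invented for illustration and are not taken from any project mentioned in this thread.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Try to fill buf with len random bytes without ever blocking.
 * Returns 1 when buf was filled, 0 when the kernel pool is not yet
 * initialized (the caller should defer, e.g. postpone key generation),
 * and -1 on any other error.
 */
int model_try_random(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = getrandom(p, len, GRND_NONBLOCK);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, just retry */
			if (errno == EAGAIN)
				return 0;	/* pool not initialized yet */
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 1;
}
```

A daemon could call this at startup and, on a 0 return, carry on with its non-cryptographic initialization and retry when the key is first needed, rather than stalling the boot.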
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 11:18:55PM +0200, Pavel Machek wrote:
> So -- I'm pretty sure systemd and friends should be using
> /dev/urandom. Maybe gpg wants to use /dev/random. _Maybe_.
>
> [2.948192] random: systemd: uninitialized urandom read (16 bytes read)
> [2.953526] systemd[1]: systemd 215 running in system mode. (+PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ -SECCOMP -APPARMOR)
> [2.980278] systemd[1]: Detected architecture 'x86'.
> [3.115072] usb 5-2: New USB device found, idVendor=0483, idProduct=2016, bcdDevice= 0.01
> [3.119633] usb 5-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
> [3.124147] usb 5-2: Product: Biometric Coprocessor
> [3.128621] usb 5-2: Manufacturer: STMicroelectronics
> [3.163839] systemd[1]: Failed to insert module 'ipv6'
> [3.181266] systemd[1]: Set hostname to .
> [3.267243] random: systemd-sysv-ge: uninitialized urandom read (16 bytes read)
> [3.669590] random: systemd-sysv-ge: uninitialized urandom read (16 bytes read)
> [3.696242] random: systemd: uninitialized urandom read (16 bytes read)
> [3.700066] random: systemd: uninitialized urandom read (16 bytes read)
> [3.703716] random: systemd: uninitialized urandom read (16 bytes read)
>
> Anyway, urandom should need to be seeded once, and then provide random
> data forever... which is not impression I get from the dmesg output
> above. Boot clearly proceeds... somehow. So now I'm confused.

Hmm... Well, the attached patch (which redirects /dev/random to
/dev/urandom) didn't fix my boot issue, so I'm at a loss as well.

Sultan

From 15f54e2756866956d8713fdec92b54c6c69eb1bb Mon Sep 17 00:00:00 2001
From: Sultan Alsawaf
Date: Sun, 29 Apr 2018 12:53:44 -0700
Subject: [PATCH] char: mem: Link /dev/random to /dev/urandom

---
 drivers/char/mem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index ffeb60d3434c..0cd22e6100ad 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -870,7 +870,7 @@ static const struct memdev {
 #endif
 	[5] = { "zero", 0666, &zero_fops, 0 },
 	[7] = { "full", 0666, &full_fops, 0 },
-	[8] = { "random", 0666, &random_fops, 0 },
+	[8] = { "random", 0666, &urandom_fops, 0 },
 	[9] = { "urandom", 0666, &urandom_fops, 0 },
 #ifdef CONFIG_PRINTK
 	[11] = { "kmsg", 0644, &kmsg_fops, 0 },
--
2.14.1
Re: Linux messages full of `random: get_random_u32 called from`
On Sun 2018-04-29 13:20:33, Sultan Alsawaf wrote:
> On Sun, Apr 29, 2018 at 08:41:01PM +0200, Pavel Machek wrote:
> > Umm. No. https://www.youtube.com/watch?v=xneBjc8z0DE
>
> Okay, but /dev/urandom isn't a solution to this problem because it
> isn't usable until crng init is complete, so it suffers from the same
> init lag as /dev/random.

So -- I'm pretty sure systemd and friends should be using
/dev/urandom. Maybe gpg wants to use /dev/random. _Maybe_.

[2.948192] random: systemd: uninitialized urandom read (16 bytes read)
[2.953526] systemd[1]: systemd 215 running in system mode. (+PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ -SECCOMP -APPARMOR)
[2.980278] systemd[1]: Detected architecture 'x86'.
[3.115072] usb 5-2: New USB device found, idVendor=0483, idProduct=2016, bcdDevice= 0.01
[3.119633] usb 5-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[3.124147] usb 5-2: Product: Biometric Coprocessor
[3.128621] usb 5-2: Manufacturer: STMicroelectronics
[3.163839] systemd[1]: Failed to insert module 'ipv6'
[3.181266] systemd[1]: Set hostname to .
[3.267243] random: systemd-sysv-ge: uninitialized urandom read (16 bytes read)
[3.669590] random: systemd-sysv-ge: uninitialized urandom read (16 bytes read)
[3.696242] random: systemd: uninitialized urandom read (16 bytes read)
[3.700066] random: systemd: uninitialized urandom read (16 bytes read)
[3.703716] random: systemd: uninitialized urandom read (16 bytes read)

Anyway, urandom should need to be seeded once, and then provide random
data forever... which is not the impression I get from the dmesg output
above. Boot clearly proceeds... somehow. So now I'm confused.

Best regards,
								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
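
For userspace that only needs non-cryptographic bytes during early boot
(UUIDs, hash seeds and the like), the approach discussed in this thread
can be sketched roughly as follows: ask getrandom(2) with GRND_NONBLOCK
first, and if the CRNG is not initialized yet, read /dev/urandom, which
does not block. This is only a minimal sketch under those assumptions,
not what systemd or util-linux actually ship; the helper name
fill_random_noblock() and its return convention are hypothetical. The
fallback read is the kind of access that produces the "uninitialized
urandom read" warnings shown above.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any real library): fill buf with len
 * bytes without ever blocking on crng init.
 *
 * Returns  1 if all bytes came from getrandom(2), i.e. an initialized CRNG,
 *          0 if the /dev/urandom fallback was used (possibly unseeded data),
 *         -1 on error.
 */
int fill_random_noblock(void *buf, size_t len)
{
	unsigned char *p = buf;
	size_t done = 0;
	int fd;

	/* First choice: getrandom(2), but refuse to wait for crng init. */
	while (done < len) {
		ssize_t n = getrandom(p + done, len - done, GRND_NONBLOCK);

		if (n > 0) {
			done += (size_t)n;
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;
		break;	/* EAGAIN (not seeded yet), ENOSYS (old kernel), ... */
	}
	if (done == len)
		return 1;

	/* Fallback: /dev/urandom never blocks, even before crng init. */
	fd = open("/dev/urandom", O_RDONLY);
	if (fd < 0)
		return -1;
	while (done < len) {
		ssize_t n = read(fd, p + done, len - done);

		if (n > 0) {
			done += (size_t)n;
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;
		close(fd);
		return -1;
	}
	close(fd);
	return 0;
}
```

A caller that really needs cryptographic randomness should treat a
return value of 0 as "not good enough yet" and defer key generation,
rather than use the bytes.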
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 08:41:01PM +0200, Pavel Machek wrote:
> Umm. No. https://www.youtube.com/watch?v=xneBjc8z0DE

Okay, but /dev/urandom isn't a solution to this problem because it
isn't usable until crng init is complete, so it suffers from the same
init lag as /dev/random.

Sultan
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 11:30:57AM -0700, Sultan Alsawaf wrote:
>
> Mind you, this laptop has a 45W CPU, so power savings were definitely
> not considered in its design. Do you have any machines that can
> provide enough boot entropy to satisfy crng init without requiring
> user-provided entropy?

My 2018 Dell XPS 13 laptop, running "egrep '(random|EXT4)' /var/log/kern.log":

Apr 24 17:05:01 cwcc kernel: [0.00] random: get_random_bytes called from start_kernel+0x83/0x500 with crng_init=0
Apr 24 17:05:01 cwcc kernel: [1.363383] random: fast init done
Apr 24 17:05:01 cwcc kernel: [3.567432] random: lvm: uninitialized urandom read (4 bytes read)
Apr 24 17:05:01 cwcc kernel: [3.593132] random: lvm: uninitialized urandom read (4 bytes read)
Apr 24 17:05:01 cwcc kernel: [7.584838] random: cryptsetup: uninitialized urandom read (2 bytes read)
Apr 24 17:05:01 cwcc kernel: [7.600685] random: cryptsetup: uninitialized urandom read (2 bytes read)
Apr 24 17:05:01 cwcc kernel: [7.803194] random: cryptsetup: uninitialized urandom read (2 bytes read)
Apr 24 17:05:01 cwcc kernel: [7.831050] random: lvm: uninitialized urandom read (4 bytes read)
Apr 24 17:05:01 cwcc kernel: [7.851884] random: lvm: uninitialized urandom read (4 bytes read)
Apr 24 17:05:01 cwcc kernel: [7.875382] random: lvm: uninitialized urandom read (2 bytes read)
Apr 24 17:05:01 cwcc kernel: [8.162552] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
Apr 24 17:05:01 cwcc kernel: [8.646497] random: crng init done

						- Ted
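
When comparing reports like the one above across machines, the number
that matters is the timestamp on the "random: crng init done" line. A
small sketch of pulling it out of a dmesg or kern.log line is below. The
function name crng_init_done_time() is hypothetical, and the sketch
assumes the bracketed kernel timestamp is the first '[' on the line, as
it is in the logs quoted in this thread.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if `line` is a kernel log line reporting
 * "random: crng init done", store its timestamp (seconds since boot)
 * in *secs and return 1. Otherwise return 0 and leave *secs untouched.
 */
int crng_init_done_time(const char *line, double *secs)
{
	const char *msg = strstr(line, "random: crng init done");
	const char *ts = strchr(line, '[');
	double t;

	/* The timestamp must exist and must precede the message text. */
	if (!msg || !ts || ts > msg)
		return 0;
	/* Accepts both "[8.646497]" and "[   90.811633]" style stamps. */
	if (sscanf(ts, "[%lf]", &t) != 1)
		return 0;
	*secs = t;
	return 1;
}
```

Feeding each line of the log through this and printing the one match
gives a single figure to include in a bug report alongside the laptop
model and distribution.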
Re: Linux messages full of `random: get_random_u32 called from`
On Sun 2018-04-29 10:05:41, Sultan Alsawaf wrote:
> On Sun, Apr 29, 2018 at 04:32:05PM +0200, Pavel Machek wrote:
> > Hi!
> >
> > > This is why ultimately, we do need to attack this problem from both
> > > ends, which means teaching userspace programs to only request
> > > cryptographic-grade randomness when it is really needed --- and most
> > > of the time, if the user has not logged in yet, you probably don't
> > > need cryptographic-grade randomness
> >
> > IOW moving them from /dev/random to /dev/urandom?
>
> /dev/urandom isn't cryptographically secure, so that's not an
> option.

Umm. No. https://www.youtube.com/watch?v=xneBjc8z0DE

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux messages full of `random: get_random_u32 called from`
I'd also like to add that my high-spec x86 laptop exhibits the same
issue as my Edgar Chromebook.

Here's my dmesg: https://hastebin.com/dofejolobi.go

The most interesting line:

[ 90.811633] random: crng init done

I waited 90 seconds after boot to provide entropy myself, at which
point crng init completed. In other words, crng init only completed
because I provided the entropy by smashing the keyboard. I could've
waited longer and crng init wouldn't have completed without my input.

Mind you, this laptop has a 45W CPU, so power savings were definitely
not considered in its design. Do you have any machines that can
provide enough boot entropy to satisfy crng init without requiring
user-provided entropy?

Sultan
Re: Linux messages full of `random: get_random_u32 called from`
On Sun, Apr 29, 2018 at 04:32:05PM +0200, Pavel Machek wrote:
> Hi!
>
> > This is why ultimately, we do need to attack this problem from both
> > ends, which means teaching userspace programs to only request
> > cryptographic-grade randomness when it is really needed --- and most
> > of the time, if the user has not logged in yet, you probably don't
> > need cryptographic-grade randomness
>
> IOW moving them from /dev/random to /dev/urandom?
>
> 								Pavel
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

/dev/urandom isn't cryptographically secure, so that's not an option.
Re: Linux messages full of `random: get_random_u32 called from`
Hi!

> This is why ultimately, we do need to attack this problem from both
> ends, which means teaching userspace programs to only request
> cryptographic-grade randomness when it is really needed --- and most
> of the time, if the user has not logged in yet, you probably don't
> need cryptographic-grade randomness

IOW moving them from /dev/random to /dev/urandom?

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux messages full of `random: get_random_u32 called from`
Hi!

On Thu 2018-04-26 19:56:30, Theodore Y. Ts'o wrote:
> On Thu, Apr 26, 2018 at 01:22:02PM -0700, Sultan Alsawaf wrote:
> >
> > Also, regardless of what's hanging on CRNG init, CRNG should be able
> > to init on its own in a timely manner without the need for
> > user-provided entropy. Userspace was working fine before the recent
> > CRNG kernel changes, so I don't think this is a userspace bug.
>
> The CRNG changes were needed because we were erroneously saying that
> the entropy pool was securely initialized before it really was. Saying
> that CRNG should be able to init on its own is much like saying, "Ted
> should be able to fly wherever he wants in his own personal Gulfstream
> V." It would certainly be _nice_ if I could afford my personal jet.
> I certainly wish I were that rich. But the problem is that dollars
> (or Euro's) are like entropy, they don't just magically drop out of
> the sky.
>
> If there isn't user-provided entropy, and the hardware isn't providing
> sufficient entropy, where did you think the kernel is supposed to get
> the entropy from? Should it dial 1-800-TRUST-NSA?

Yes, we could dial 1-800-TRUST-NSA.

Then nicely ask them to provide us some unbackdoored randomness. Then
we'd ignore whatever they say, but would collect randomness from
timing and noise on the telephone line.

> The other approach would be to compile the kernel with
> CONFIG_HW_RANDOM_TPM and to modify drivers/char/tpm/tpm-chip.c to
> initialize chip->hwrng.quality = 500. We've historically made this
> something that the system administrator must set via sysfs. This is
> because we wanted system administrators to explicitly say that they
> trust the hardware manufacturer: (a) that they haven't been paid by
> your choice of the Chinese MSS or the US NSA to introduce a backdoor,
> and (b) that they are competent to actually implement a _secure_
> hardware random number generator. Sadly, this has not always been the
> case.

Well, we could actively start accessing a suitable device (SD card?
HDD? CMOS RTC?) when we detect entropy is low. Yes, that would eat
power, but that would be better than a machine that hangs at boot.

We could also access the hwrng, then collect entropy from the
timing. TPM is a slow chip...

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux messages full of `random: get_random_u32 called from`
Hi!

> On 25.04.2018 at 09:41, Theodore Y. Ts'o wrote:
> > Does this help on your system?
>
> Thank you, after figuring out how to apply the paste, yes it helped on
> my Lenovo X60.
>
> > commit 4e00b339e264802851aff8e73cde7d24b57b18ce
> > Author: Theodore Ts'o
> > Date: Wed Apr 25 01:12:32 2018 -0400
> >
> >     random: rate limit unseeded randomness warnings
>
> On systems without sufficient boot randomness, no point spamming dmesg.
>
> I guess this is a problem with old hardware?

Ok, I see it too, thinkpad x60. But... this machine has a spinning
harddrive and an independent RTC; there really should be enough
randomness... Could we exploit either of them as a randomness source
when we run out of entropy?

								Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux messages full of `random: get_random_u32 called from`
> On Thu, Apr 26, 2018 at 10:20:44PM -0700, Sultan Alsawaf wrote:
>> I noted at least 20,000 mmc interrupts before I intervened in the
>> boot process to provide entropy myself. That's just for mmc, so I'm
>> sure there were even more interrupts elsewhere. Is 20k+ interrupts
>> really not sufficient?
> How did you determine that there were 20,000 mmc interrupts before you
> had logged in? Did you have access to the machine w/o having access
> to the login prompt?
>
> I can send a patch (see attached) that will spew large amounts of logs
> as each interrupt comes in and the entropy pool is getting initialized.
> That's how I test things on QEMU, and Jann did something similar on a
> (physical) test machine, so I'm pretty confident that if you were
> getting interrupts, it would result in them contributing into the
> pool.
>
> You will need a serial console, or build a kernel with a much larger
> dmesg buffer, since if you really are getting that many interrupts it
> will cause a lot of log spew.
>> There are lots of other sources of entropy available as well, like
>> the ever-changing CPU frequencies reported by any recent Intel chip
>> (i.e., they report precision down to 1 kHz).
> That's something we could look at, but the problem is if there is some
> systemd unit during early boot that blocks waiting for the entropy
> pool to be initialized, the system will come to a dead halt, and even
> the CPU frequency shifts will probably not move much --- just as there
> weren't any interrupts while some system startup on the boot path
> wedges the system startup waiting for entropy.
>
> This is why ultimately, we do need to attack this problem from both
> ends, which means teaching userspace programs to only request
> cryptographic-grade randomness when it is really needed --- and most
> of the time, if the user has not logged in yet, you probably don't
> need cryptographic-grade randomness
>
> 					- Ted
>
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index cd888d4ee605..69bd29f039e7 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -916,6 +916,10 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
>  		__u32	key[8];
>  	} buf;
>
> +	if (crng == &primary_crng)
> +		pr_notice("random: crng_reseed primary from %px\n", r);
> +	else
> +		pr_notice("random: crng_reseed crng %px from %px\n", crng, r);
>  	if (r) {
>  		num = extract_entropy(r, &buf, 32, 16, 0);
>  		if (num == 0)
> @@ -1241,6 +1245,10 @@ void add_interrupt_randomness(int irq, int irq_flags)
>  	fast_pool->pool[2] ^= ip;
>  	fast_pool->pool[3] ^= (sizeof(ip) > 4) ? ip >> 32 :
>  		get_reg(fast_pool, regs);
> +	if (crng_init < 2)
> +		pr_notice("random: add_interrupt(cycles=0x%08llx, now=%ld, "
> +			  "irq=%d, ip=0x%08lx)\n",
> +			  cycles, now, irq, _RET_IP_);
>
>  	fast_mix(fast_pool);
>  	add_interrupt_bench(cycles);
> @@ -1282,6 +1290,9 @@ void add_interrupt_randomness(int irq, int irq_flags)
>
>  	/* award one bit for the contents of the fast pool */
>  	credit_entropy_bits(r, credit + 1);
> +	if (crng_init < 2)
> +		pr_notice("random: batched into pool in stage %d, bits now %d",
> +			  crng_init, ENTROPY_BITS(r));
>  }
>  EXPORT_SYMBOL_GPL(add_interrupt_randomness);

I dumped the contents of /proc/interrupts to dmesg using the attached
patch I threw together, and then waited a sufficient amount of time
before introducing entropy myself in order to ensure that the
interrupt readings were not contaminated by user-provided interrupts.

Here is the interesting snippet from my dmesg:

[ 30.689076] /proc/interrupts dump:
|          CPU0       CPU1       CPU2       CPU3
   0:         6          0          0          0   IO-APIC    2-edge      timer
   8:         0          0          1          0   IO-APIC    8-edge      rtc0
   9:         0        533          0          0   IO-APIC    9-fasteoi   acpi
  10:         0          0          0          0   IO-APIC   10-edge      tpm0
  29:         0          0          0          0   IO-APIC   29-fasteoi   intel_sst_driver
  36:       203          0          0          0   IO-APIC   36-fasteoi   808622C1:04
  37:         0        264          0          0   IO-APIC   37-fasteoi   808622C1:05
  42:         0          0          0          0   IO-APIC   42-fasteoi   dw:dmac-1
  43:         0          0          0          0   IO-APIC   43-fasteoi   dw:dmac-1
  45:         0          0          0      11402   IO-APIC   45-fasteoi   mmc0
 168:         0          0          1          0   c
Re: Linux messages full of `random: get_random_u32 called from`
On Thu, Apr 26, 2018 at 10:20:44PM -0700, Sultan Alsawaf wrote:
>
> I noted at least 20,000 mmc interrupts before I intervened in the
> boot process to provide entropy myself. That's just for mmc, so I'm
> sure there were even more interrupts elsewhere. Is 20k+ interrupts
> really not sufficient?

How did you determine that there were 20,000 mmc interrupts before you
had logged in? Did you have access to the machine w/o having access
to the login prompt?

I can send a patch (see attached) that will spew large amounts of logs
as each interrupt comes in and the entropy pool is getting initialized.
That's how I test things on QEMU, and Jann did something similar on a
(physical) test machine, so I'm pretty confident that if you were
getting interrupts, it would result in them contributing into the
pool.

You will need a serial console, or build a kernel with a much larger
dmesg buffer, since if you really are getting that many interrupts it
will cause a lot of log spew.

> There are lots of other sources of entropy available as well, like
> the ever-changing CPU frequencies reported by any recent Intel chip
> (i.e., they report precision down to 1 kHz).

That's something we could look at, but the problem is if there is some
systemd unit during early boot that blocks waiting for the entropy
pool to be initialized, the system will come to a dead halt, and even
the CPU frequency shifts will probably not move much --- just as there
weren't any interrupts while some system startup on the boot path
wedges the system startup waiting for entropy.

This is why ultimately, we do need to attack this problem from both
ends, which means teaching userspace programs to only request
cryptographic-grade randomness when it is really needed --- and most
of the time, if the user has not logged in yet, you probably don't
need cryptographic-grade randomness

					- Ted

diff --git a/drivers/char/random.c b/drivers/char/random.c
index cd888d4ee605..69bd29f039e7 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -916,6 +916,10 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 		__u32	key[8];
 	} buf;
 
+	if (crng == &primary_crng)
+		pr_notice("random: crng_reseed primary from %px\n", r);
+	else
+		pr_notice("random: crng_reseed crng %px from %px\n", crng, r);
 	if (r) {
 		num = extract_entropy(r, &buf, 32, 16, 0);
 		if (num == 0)
@@ -1241,6 +1245,10 @@ void add_interrupt_randomness(int irq, int irq_flags)
 	fast_pool->pool[2] ^= ip;
 	fast_pool->pool[3] ^= (sizeof(ip) > 4) ? ip >> 32 :
 		get_reg(fast_pool, regs);
+	if (crng_init < 2)
+		pr_notice("random: add_interrupt(cycles=0x%08llx, now=%ld, "
+			  "irq=%d, ip=0x%08lx)\n",
+			  cycles, now, irq, _RET_IP_);
 
 	fast_mix(fast_pool);
 	add_interrupt_bench(cycles);
@@ -1282,6 +1290,9 @@ void add_interrupt_randomness(int irq, int irq_flags)
 
 	/* award one bit for the contents of the fast pool */
 	credit_entropy_bits(r, credit + 1);
+	if (crng_init < 2)
+		pr_notice("random: batched into pool in stage %d, bits now %d",
+			  crng_init, ENTROPY_BITS(r));
 }
 EXPORT_SYMBOL_GPL(add_interrupt_randomness);
Re: Linux messages full of `random: get_random_u32 called from`
On Fri, Apr 27, 2018 at 05:38:52PM +0200, Jason A. Donenfeld wrote:
>
> Please correct me if I'm wrong, but my present understanding of this
> is that crng readiness used to be broken, meaning people would have a
> seeded rng without it actually being seeded. You fixed this bug, and
> now people are discovering that they don't have crng readiness during
> a late stage of their init, which is breaking all sorts of entirely
> reasonable and widely deployed userspaces.

I'd say the problem is a combination of some classes of x86 hardware
devices (so far I've mainly seen repurposed chromebooks and VM's that
don't have virtio-rng enabled) combined with some distributions that
could make themselves more amenable to platforms with minimal amounts
of entropy available to them during system startup.

> Sultan mentioned that his machine actually does trigger large
> quantities of interrupts. Is it possible that the entropy gathering
> algorithm has some issues, and Sultan's report points to a real bug
> here? Considering the crng readiness state hasn't been working until
> your recent fix, I suspect the actual entropy gathering code probably
> hasn't prompted too many bug reports, until now that is.

It's not clear when his machine is triggering the "large quantity of
interrupts". Is it during the system startup, or after he's logged
into the machine? I suspect what is going on is the Chromebook has
been engineered so that when it's idle, it doesn't issue any
interrupts at all --- which is a good thing from a power management
perspective. So if nothing is actually _querying_ the SD Card reader,
it's not generating any interrupts. This is a feature, and not a bug.

That being said, a laptop which sends some number of interrupts as it
receives, say, WiFi packets, and a system which automatically starts
looking for suitable access points as soon as the machine is started,
gives us timing events which are not easily available to an analyst
sitting in Fort Meade, Maryland. In practice, that seems to be much
more of the rule and not the exception.

However, as laptops try to become much more sparing with interrupts to
save power, then we either have to (a) be willing to trust hardware
random number generators available to the laptop, and/or (b) change
userspace to *wait* until after the user has logged in to try to
obtain cryptographic-grade randomness.

If you think there is an alternative besides those two, I'm all
ears...

					- Ted
Re: Linux messages full of `random: get_random_u32 called from`
Hi Ted,

Please correct me if I'm wrong, but my present understanding of this
is that crng readiness used to be broken, meaning people would have a
seeded rng without it actually being seeded. You fixed this bug, and
now people are discovering that they don't have crng readiness during
a late stage of their init, which is breaking all sorts of entirely
reasonable and widely deployed userspaces.

You could argue that those userspaces were "only designed for machines
that have enough [by what measure?] boot time entropy", but obviously
they didn't have that in mind. And now here we have an example of an
ordinary x86 machine -- not some weird embedded device -- hitting these
issues. I'd suspect that the problem here isn't one that we can
exclusively punt onto userspace.

Sultan mentioned that his machine actually does trigger large
quantities of interrupts. Is it possible that the entropy gathering
algorithm has some issues, and Sultan's report points to a real bug
here? Considering the crng readiness state hasn't been working until
your recent fix, I suspect the actual entropy gathering code probably
hasn't prompted too many bug reports, until now that is.

Jason
Re: Linux messages full of `random: get_random_u32 called from`
> The CRNG changes were needed because we were erroneously saying that
> the entropy pool was securely initialized before it really was. Saying
> that CRNG should be able to init on its own is much like saying, "Ted
> should be able to fly wherever he wants in his own personal Gulfstream
> V." It would certainly be _nice_ if I could afford my personal jet.
> I certainly wish I were that rich. But the problem is that dollars
> (or Euro's) are like entropy, they don't just magically drop out of
> the sky.
>
> If there isn't user-provided entropy, and the hardware isn't providing
> sufficient entropy, where did you think the kernel is supposed to get
> the entropy from? Should it dial 1-800-TRUST-NSA?
>
> From the dmesg log, you have a Chromebook Acer 14. I'm guessing the
> problem is that Chromebooks have hardware that tries *very* hard not
> to issue interrupts, since that helps with power savings. The
> following from your dmesg is very interesting:
>
> [0.526786] tpm tpm0: [Firmware Bug]: TPM interrupt not working, polling instead
>
> I suspect this isn't a firmware bug; it's the hardware working as
> intended / working as designed, for power savings reasons.
>
> So there are two ways to fix this that I can see. One is to try to
> adjust userspace so that it allows the boot to proceed. As there is
> more activity, the disk completion interrupts, the user typing their
> username/password into the login screen, etc., there will be timing
> events which can be used to harvest entropy.
>
> The other approach would be to compile the kernel with
> CONFIG_HW_RANDOM_TPM and to modify drivers/char/tpm/tpm-chip.c to
> initialize chip->hwrng.quality = 500. We've historically made this
> something that the system administrator must set via sysfs. This is
> because we wanted system administrators to explicitly say that they
> trust the hardware manufacturer: (a) that they haven't been paid by
> your choice of the Chinese MSS or the US NSA to introduce a backdoor,
> and (b) that they are competent to actually implement a _secure_
> hardware random number generator. Sadly, this has not always been the
> case. Please see:
>
> https://www.chromium.org/chromium-os/tpm_firmware_update
>
> And note that your Edgar Chromebook is on the list of devices that
> have a TPM with the buggy firmware. Fortunately this particular TPM
> bug only affects RSA prime generation, so as far as I know there is no
> _known_ vulnerability in your TPM's hardware random number generator.
> But we want it to be _your_ responsibility to decide you are willing
> to trust it. I certainly don't want to be legally liable --- or even
> have the moral responsibility --- of guaranteeing that every single
> TPM out there is bug-free(tm).
>
> 					- Ted

Why don't we tell users that they need to smash their keyboards to
make their computers boot then? And if they question it, we can tell
them that it certainly would be _nice_ to not have to smash their
keyboards to make their computers boot, but alas, a part of me has a
feeling that users would not take kindly to that :)

I noted at least 20,000 mmc interrupts before I intervened in the boot
process to provide entropy myself. That's just for mmc, so I'm sure
there were even more interrupts elsewhere. Is 20k+ interrupts really
not sufficient?

There are lots of other sources of entropy available as well, like the
ever-changing CPU frequencies reported by any recent Intel chip (i.e.,
they report precision down to 1 kHz). Why are we so limited to h/w
interrupts?

Sultan
Re: Linux messages full of `random: get_random_u32 called from`
On Thu, Apr 26, 2018 at 10:47:49PM +0200, Christian Brauner wrote:
>
> We have observed a similar problem with libvirt. As soon as entropy is
> provided the boot finishes, otherwise it hangs for a long time.
> This is not happening with v4.17-rc1 afaict.

For libvirt there is at least an easy workaround. Make sure the guest
kernel has CONFIG_HW_RANDOM_VIRTIO enabled, and then make sure qemu is
started with the options:

    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-pci,rng=rng0

Cheers,

					- Ted
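
To confirm from inside the guest that the workaround above took effect,
one option is to look at which hardware RNG the kernel's hw_random core
has selected. The sketch below reads the sysfs attribute
/sys/class/misc/hw_random/rng_current. The helper names are
hypothetical, and it is an assumption here that an unbound core reports
the literal string "none" and that a virtio device shows up with a name
along the lines of "virtio_rng.0".

```c
#include <stdio.h>
#include <string.h>

/* sysfs attribute exposed by the kernel's hw_random core */
#define HWRNG_CURRENT "/sys/class/misc/hw_random/rng_current"

/*
 * Hypothetical helper: copy the first line of `path` into `out`
 * (without the trailing newline). Returns 0 on success, -1 on error.
 */
int read_first_line(const char *path, char *out, size_t outlen)
{
	FILE *f;

	if (outlen < 2)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(out, (int)outlen, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}

/*
 * Hypothetical helper: returns 1 and fills `name` if a hardware RNG is
 * currently bound, 0 otherwise (attribute missing, empty, or "none",
 * the latter being an assumption about what an unbound core reports).
 */
int guest_has_hwrng(char *name, size_t len)
{
	if (read_first_line(HWRNG_CURRENT, name, len) != 0)
		return 0;
	return name[0] != '\0' && strcmp(name, "none") != 0;
}
```

If guest_has_hwrng() returns 0 in a guest started with the options
above, the likely culprits are a guest kernel built without
CONFIG_HW_RANDOM_VIRTIO or a qemu command line that never got the
-device option.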
Re: Linux messages full of `random: get_random_u32 called from`
On Thu, Apr 26, 2018 at 01:22:02PM -0700, Sultan Alsawaf wrote:
>
> Also, regardless of what's hanging on CRNG init, CRNG should be able
> to init on its own in a timely manner without the need for
> user-provided entropy. Userspace was working fine before the recent
> CRNG kernel changes, so I don't think this is a userspace bug.

The CRNG changes were needed because we were erroneously saying that
the entropy pool was securely initialized before it really was. Saying
that CRNG should be able to init on its own is much like saying, "Ted
should be able to fly wherever he wants in his own personal Gulfstream
V." It would certainly be _nice_ if I could afford my personal jet.
I certainly wish I were that rich. But the problem is that dollars
(or Euro's) are like entropy, they don't just magically drop out of
the sky.

If there isn't user-provided entropy, and the hardware isn't providing
sufficient entropy, where did you think the kernel is supposed to get
the entropy from? Should it dial 1-800-TRUST-NSA?

From the dmesg log, you have a Chromebook Acer 14. I'm guessing the
problem is that Chromebooks have hardware that tries *very* hard not
to issue interrupts, since that helps with power savings. The
following from your dmesg is very interesting:

[0.526786] tpm tpm0: [Firmware Bug]: TPM interrupt not working, polling instead

I suspect this isn't a firmware bug; it's the hardware working as
intended / working as designed, for power savings reasons.

So there are two ways to fix this that I can see. One is to try to
adjust userspace so that it allows the boot to proceed. As there is
more activity, the disk completion interrupts, the user typing their
username/password into the login screen, etc., there will be timing
events which can be used to harvest entropy.

The other approach would be to compile the kernel with
CONFIG_HW_RANDOM_TPM and to modify drivers/char/tpm/tpm-chip.c to
initialize chip->hwrng.quality = 500. We've historically made this
something that the system administrator must set via sysfs. This is
because we wanted system administrators to explicitly say that they
trust the hardware manufacturer: (a) that they haven't been paid by
your choice of the Chinese MSS or the US NSA to introduce a backdoor,
and (b) that they are competent to actually implement a _secure_
hardware random number generator. Sadly, this has not always been the
case. Please see:

https://www.chromium.org/chromium-os/tpm_firmware_update

And note that your Edgar Chromebook is on the list of devices that
have a TPM with the buggy firmware. Fortunately this particular TPM
bug only affects RSA prime generation, so as far as I know there is no
_known_ vulnerability in your TPM's hardware random number generator.
But we want it to be _your_ responsibility to decide you are willing
to trust it. I certainly don't want to be legally liable --- or even
have the moral responsibility --- of guaranteeing that every single
TPM out there is bug-free(tm).

					- Ted
Re: Linux messages full of `random: get_random_u32 called from`
On Thu, Apr 26, 2018 at 01:22:02PM -0700, Sultan Alsawaf wrote:
> > Hmm, it looks like the multiuser startup is getting blocked on snapd:
> >
> > 29.060s snapd.service
> >
> > graphical.target @1min 32.145s
> > └─multi-user.target @1min 32.145s
> >   └─hddtemp.service @6.512s +28ms
> >     └─network-online.target @6.508s
> >       └─NetworkManager-wait-online.service @2.428s +4.079s
> >         └─NetworkManager.service @2.016s +404ms
> >           └─dbus.service @1.869s
> >             └─basic.target @1.824s
> >               └─sockets.target @1.824s
> >                 └─snapd.socket @1.821s +1ms
> >                   └─sysinit.target @1.812s
> >                     └─apparmor.service @587ms +1.224s
> >                       └─local-fs.target @585ms
> >                         └─local-fs-pre.target @585ms
> >                           └─keyboard-setup.service @235ms +346ms
> >                             └─systemd-journald.socket @226ms
> >                               └─system.slice @225ms
> >                                 └─-.slice @220ms
> >
> > This appears to be some kind of new package management system for
> > Ubuntu:
> >
> > Description-en: Tool to interact with Ubuntu Core Snappy.
> >  Install, configure, refresh and remove snap packages. Snaps are
> >  'universal' packages that work across many different Linux systems,
> >  enabling secure distribution of the latest apps and utilities for
> >  cloud, servers, desktops and the internet of things.
> >
> > Why the Ubuntu package believes it needs to be fully started before
> > the login screen can display is unclear to me. It might be worth
> > using systemctl to disable snapd.service and see if that makes things
> > work better for you.
> >
> > 					- Ted
>
> I removed snapd completely which did nothing.
>
> Here are new logs:
> systemd-analyze blame: https://hastebin.com/edehikuyeb.css
> systemd-analyze critical-chain: https://hastebin.com/vedufafema.pl
> dmesg: https://hastebin.com/zuwuwoxadu.vbs
>
> I should also note that leaving the system untouched does not result
> in it booting: I must provide a source of entropy, otherwise it just
> stays stuck. In both of the dmesgs I've given, I

We have observed a similar problem with libvirt. As soon as entropy is
provided the boot finishes, otherwise it hangs for a long time.
This is not happening with v4.17-rc1 afaict.

Christian

> manually provided entropy to the system after about 5 minutes of
> waiting.
>
> Also, regardless of what's hanging on CRNG init, CRNG should be able
> to init on its own in a timely manner without the need for
> user-provided entropy. Userspace was working fine before the recent
> CRNG kernel changes, so I don't think this is a userspace bug.
>
> -Sultan
>
Re: Linux messages full of `random: get_random_u32 called from`
> Hmm, it looks like the multiuser startup is getting blocked on snapd:
>
>   29.060s snapd.service
>
> graphical.target @1min 32.145s
> └─multi-user.target @1min 32.145s
>   └─hddtemp.service @6.512s +28ms
>     └─network-online.target @6.508s
>       └─NetworkManager-wait-online.service @2.428s +4.079s
>         └─NetworkManager.service @2.016s +404ms
>           └─dbus.service @1.869s
>             └─basic.target @1.824s
>               └─sockets.target @1.824s
>                 └─snapd.socket @1.821s +1ms
>                   └─sysinit.target @1.812s
>                     └─apparmor.service @587ms +1.224s
>                       └─local-fs.target @585ms
>                         └─local-fs-pre.target @585ms
>                           └─keyboard-setup.service @235ms +346ms
>                             └─systemd-journald.socket @226ms
>                               └─system.slice @225ms
>                                 └─-.slice @220ms
>
> This appears to be some kind of new package management system for
> Ubuntu:
>
> Description-en: Tool to interact with Ubuntu Core Snappy.
>  Install, configure, refresh and remove snap packages. Snaps are
>  'universal' packages that work across many different Linux systems,
>  enabling secure distribution of the latest apps and utilities for
>  cloud, servers, desktops and the internet of things.
>
> Why the Ubuntu package believes it needs to be fully started before
> the login screen can display is unclear to me. It might be worth
> using systemctl to disable snapd.service and see if that makes things
> work better for you.
>
> - Ted

I removed snapd completely which did nothing.

Here are new logs:
systemd-analyze blame: https://hastebin.com/edehikuyeb.css
systemd-analyze critical-chain: https://hastebin.com/vedufafema.pl
dmesg: https://hastebin.com/zuwuwoxadu.vbs

I should also note that leaving the system untouched does not result in it
booting: I must provide a source of entropy, otherwise it just stays stuck.
In both of the dmesgs I've given, I manually provided entropy to the system
after about 5 minutes of waiting.

Also, regardless of what's hanging on CRNG init, CRNG should be able to init
on its own in a timely manner without the need for user-provided entropy.
Userspace was working fine before the recent CRNG kernel changes, so I don't
think this is a userspace bug.

-Sultan
Re: Linux messages full of `random: get_random_u32 called from`
On Thu, Apr 26, 2018 at 08:17:34AM -0700, Sultan Alsawaf wrote:
> > Hmm, can you let the boot hang for a while? It should continue after
> > a few minutes if you wait long enough, but wait a minute or two, then
> > give it entropy so the boot can continue. Then can you use
> > "systemd-analyze blame" or "systemd-analyze critical-chain" and we
> > can see what process was trying to get randomness during the boot
> > startup and blocking waiting for the CRNG to be fully initialized.
> >
> > - Ted
>
> systemd-analyze blame: https://hastebin.com/ikipavevew.css
> systemd-analyze critical-chain: https://hastebin.com/odoyuqeges.pl
> dmesg: https://hastebin.com/waracebeja.vbs

Hmm, it looks like the multiuser startup is getting blocked on snapd:

  29.060s snapd.service

graphical.target @1min 32.145s
└─multi-user.target @1min 32.145s
  └─hddtemp.service @6.512s +28ms
    └─network-online.target @6.508s
      └─NetworkManager-wait-online.service @2.428s +4.079s
        └─NetworkManager.service @2.016s +404ms
          └─dbus.service @1.869s
            └─basic.target @1.824s
              └─sockets.target @1.824s
                └─snapd.socket @1.821s +1ms
                  └─sysinit.target @1.812s
                    └─apparmor.service @587ms +1.224s
                      └─local-fs.target @585ms
                        └─local-fs-pre.target @585ms
                          └─keyboard-setup.service @235ms +346ms
                            └─systemd-journald.socket @226ms
                              └─system.slice @225ms
                                └─-.slice @220ms

This appears to be some kind of new package management system for
Ubuntu:

Description-en: Tool to interact with Ubuntu Core Snappy.
 Install, configure, refresh and remove snap packages. Snaps are
 'universal' packages that work across many different Linux systems,
 enabling secure distribution of the latest apps and utilities for
 cloud, servers, desktops and the internet of things.

Why the Ubuntu package believes it needs to be fully started before
the login screen can display is unclear to me. It might be worth
using systemctl to disable snapd.service and see if that makes things
work better for you.

- Ted
Re: Linux messages full of `random: get_random_u32 called from`
> Hmm, can you let the boot hang for a while? It should continue after
> a few minutes if you wait long enough, but wait a minute or two, then
> give it entropy so the boot can continue. Then can you use
> "systemd-analyze blame" or "systemd-analyze critical-chain" and we
> can see what process was trying to get randomness during the boot
> startup and blocking waiting for the CRNG to be fully initialized.
>
> - Ted

systemd-analyze blame: https://hastebin.com/ikipavevew.css
systemd-analyze critical-chain: https://hastebin.com/odoyuqeges.pl
dmesg: https://hastebin.com/waracebeja.vbs
Re: Linux messages full of `random: get_random_u32 called from`
On Wed, Apr 25, 2018 at 10:05:55PM -0700, Sultan Alsawaf wrote:
>
> Correct, I'm running Xubuntu 18.04 with my own kernel based off linux-stable.
>

Hmm, can you let the boot hang for a while? It should continue after
a few minutes if you wait long enough, but wait a minute or two, then
give it entropy so the boot can continue. Then can you use
"systemd-analyze blame" or "systemd-analyze critical-chain" and we
can see what process was trying to get randomness during the boot
startup and blocking waiting for the CRNG to be fully initialized.

- Ted
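
For anyone who wants to check from userspace whether a call would block on the CRNG, getrandom(2) can be asked with GRND_NONBLOCK: it then fails with EAGAIN instead of sleeping while the pool is not yet initialized. The following is only a minimal sketch, assuming glibc 2.25 or later for <sys/random.h>; the function name crng_ready_probe() is made up for this illustration and is not part of any library.

```c
#include <errno.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not a library function): probe whether the
 * kernel CRNG is initialized without ever blocking.
 *
 * Returns  1 if getrandom() delivered a byte (CRNG is ready),
 *          0 if it failed with EAGAIN (a blocking call would sleep),
 *         -1 on any other error (e.g. ENOSYS on very old kernels).
 */
int crng_ready_probe(void)
{
	unsigned char byte;
	ssize_t n = getrandom(&byte, sizeof byte, GRND_NONBLOCK);

	if (n == (ssize_t)sizeof byte)
		return 1;
	if (n < 0 && errno == EAGAIN)
		return 0;
	return -1;
}
```

A service that logs the result of such a probe at startup would show up in the journal as the one that was about to wait for "random: crng init done".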
Re: Linux messages full of `random: get_random_u32 called from`
Hi!

> Since Linux 4.17-rcX, Linux spams a lot of `random: get_random_u32 called
> from` messages. I believe, this setting should be reverted by default as
> otherwise a lot of other messages are not seen.
>
> Please find my configuration attached.

Same here, thinkpad X60:

[3.163839] systemd[1]: Failed to insert module 'ipv6'
[3.181266] systemd[1]: Set hostname to .
[3.267243] random: systemd-sysv-ge: uninitialized urandom read (16 bytes read)
[3.669590] random: systemd-sysv-ge: uninitialized urandom read (16 bytes read)
[3.696242] random: systemd: uninitialized urandom read (16 bytes read)
[3.700066] random: systemd: uninitialized urandom read (16 bytes read)
[3.703716] random: systemd: uninitialized urandom read (16 bytes read)
[3.756137] random: systemd: uninitialized urandom read (16 bytes read)
[3.760460] random: systemd: uninitialized urandom read (16 bytes read)
[3.764515] random: systemd: uninitialized urandom read (16 bytes read)
[3.835312] random: systemd: uninitialized urandom read (16 bytes read)
[4.173204] systemd[1]: Binding to IPv6 address not available since kernel does not support IPv6.
[4.176977] systemd[1]: [/lib/systemd/system/gpsd.socket:6] Failed to parse address value, ignoring: [::1]:2947
[4.186472] systemd[1]: Starting Forward Password Requests to Wall Directory Watch.
[4.188845] systemd[1]: Started Forward Password Requests to Wall Directory Watch.

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Linux messages full of `random: get_random_u32 called from`
> Thanks for the report!
>
> I assume since you're upgrading your own kernel, you must not be
> running Chrome OS on your Acer CB3-431 Chromebook (Edgar). Are you
> running Chromium --- or some Linux distribution on it?
>
> Thanks,
>
> - Ted

Correct, I'm running Xubuntu 18.04 with my own kernel based off linux-stable.
Re: Linux messages full of `random: get_random_u32 called from`
On Wed, Apr 25, 2018 at 09:11:08PM -0700, Sultan Alsawaf wrote:
> I noticed "systems without sufficient boot randomness" and would like to add
> to this.
>
> With the changes to /dev/random going from 4.16.3 to 4.16.4, my low-spec
> Chromebook does not reach the login screen upon boot (it stays stuck on a
> black screen) until I provide a source of entropy to the system via
> interrupts (e.g., holding down a key on the keyboard for 5 sec or moving my
> finger across the touchpad a lot). After providing a source of entropy for
> long enough, "random: crng init done" prints out in dmesg and the login
> screen finally pops up.

Thanks for the report!

I assume since you're upgrading your own kernel, you must not be
running Chrome OS on your Acer CB3-431 Chromebook (Edgar). Are you
running Chromium --- or some Linux distribution on it?

Thanks,

- Ted
Re: Linux messages full of `random: get_random_u32 called from`
I noticed "systems without sufficient boot randomness" and would like to add to this. With the changes to /dev/random going from 4.16.3 to 4.16.4, my low-spec Chromebook does not reach the login screen upon boot (it stays stuck on a black screen) until I provide a source of entropy to the system via interrupts (e.g., holding down a key on the keyboard for 5 sec or moving my finger across the touchpad a lot). After providing a source of entropy for long enough, "random: crng init done" prints out in dmesg and the login screen finally pops up. Detailed information on my system can be found on this bug report I recently worked on: https://bugzilla.kernel.org/show_bug.cgi?id=199463
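
When reproducing a hang like this, it can help to watch the kernel's entropy estimate from userspace. Below is a small sketch that reads /proc/sys/kernel/random/entropy_avail (described in random(4)). The helper name read_entropy_bits() is hypothetical, and the number is only the kernel's available-entropy estimate in bits; whether it tracks the crng_init state on a given kernel is an assumption that should be checked against dmesg.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read a single integer (entropy estimate in
 * bits) from the given path. Returns -1 if the file cannot be opened
 * or does not start with a number.
 */
int read_entropy_bits(const char *path)
{
	FILE *f;
	int bits = -1;

	assert(path != NULL);

	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fscanf(f, "%d", &bits) != 1)
		bits = -1;
	fclose(f);
	return bits;
}
```

Calling read_entropy_bits("/proc/sys/kernel/random/entropy_avail") in a loop while holding down a key should show the estimate rising before "random: crng init done" appears.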
Re: Linux messages full of `random: get_random_u32 called from`
Dear Theodore,


On 25.04.2018 at 09:41, Theodore Y. Ts'o wrote:
> Does this help on your system?

Thank you. After figuring out how to apply the paste: yes, it helped on my
Lenovo X60.

> commit 4e00b339e264802851aff8e73cde7d24b57b18ce
> Author: Theodore Ts'o
> Date: Wed Apr 25 01:12:32 2018 -0400
>
>     random: rate limit unseeded randomness warnings
>
>     On systems without sufficient boot randomness, no point spamming dmesg.

I guess this is a problem with old hardware?

[…]


Kind regards,

Paul
Re: Linux messages full of `random: get_random_u32 called from`
Does this help on your system?

- Ted

commit 4e00b339e264802851aff8e73cde7d24b57b18ce
Author: Theodore Ts'o
Date: Wed Apr 25 01:12:32 2018 -0400

    random: rate limit unseeded randomness warnings

    On systems without sufficient boot randomness, no point spamming dmesg.

    Signed-off-by: Theodore Ts'o
    Cc: sta...@vger.kernel.org

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 721dca8db9cf..cd888d4ee605 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -261,6 +261,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -438,6 +439,16 @@ static void _crng_backtrack_protect(struct crng_state *crng,
 static void process_random_ready_list(void);
 static void _get_random_bytes(void *buf, int nbytes);
 
+static struct ratelimit_state unseeded_warning =
+	RATELIMIT_STATE_INIT("warn_unseeded_randomness", HZ, 3);
+static struct ratelimit_state urandom_warning =
+	RATELIMIT_STATE_INIT("warn_urandom_randomness", HZ, 3);
+
+static int ratelimit_disable __read_mostly;
+
+module_param_named(ratelimit_disable, ratelimit_disable, int, 0644);
+MODULE_PARM_DESC(ratelimit_disable, "Disable random ratelimit suppression");
+
 /**
  *
  * OS independent entropy store. Here are the functions which handle
@@ -932,6 +943,18 @@ static void crng_reseed(struct crng_state *crng, struct entropy_store *r)
 		process_random_ready_list();
 		wake_up_interruptible(&crng_init_wait);
 		pr_notice("random: crng init done\n");
+		if (unseeded_warning.missed) {
+			pr_notice("random: %d get_random_xx warning(s) missed "
+				  "due to ratelimiting\n",
+				  unseeded_warning.missed);
+			unseeded_warning.missed = 0;
+		}
+		if (urandom_warning.missed) {
+			pr_notice("random: %d urandom warning(s) missed "
+				  "due to ratelimiting\n",
+				  urandom_warning.missed);
+			urandom_warning.missed = 0;
+		}
 	}
 }
 
@@ -1572,8 +1595,9 @@ static void _warn_unseeded_randomness(const char *func_name, void *caller,
 #ifndef CONFIG_WARN_ALL_UNSEEDED_RANDOM
 	print_once = true;
 #endif
-	pr_notice("random: %s called from %pS with crng_init=%d\n",
-		  func_name, caller, crng_init);
+	if (__ratelimit(&unseeded_warning))
+		pr_notice("random: %s called from %pS with crng_init=%d\n",
+			  func_name, caller, crng_init);
 }
 
 /*
@@ -1767,6 +1791,10 @@ static int rand_initialize(void)
 	init_std_data(&blocking_pool);
 	crng_initialize(&primary_crng);
 	crng_global_init_time = jiffies;
+	if (ratelimit_disable) {
+		urandom_warning.interval = 0;
+		unseeded_warning.interval = 0;
+	}
 	return 0;
 }
 early_initcall(rand_initialize);
@@ -1834,9 +1862,10 @@ urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
 
 	if (!crng_ready() && maxwarn > 0) {
 		maxwarn--;
-		printk(KERN_NOTICE "random: %s: uninitialized urandom read "
-		       "(%zd bytes read)\n",
-		       current->comm, nbytes);
+		if (__ratelimit(&urandom_warning))
+			printk(KERN_NOTICE "random: %s: uninitialized "
+			       "urandom read (%zd bytes read)\n",
+			       current->comm, nbytes);
 		spin_lock_irqsave(&primary_crng.lock, flags);
 		crng_init_cnt = 0;
 		spin_unlock_irqrestore(&primary_crng.lock, flags);
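
To illustrate the interval/burst idea behind `RATELIMIT_STATE_INIT(..., HZ, 3)` in the patch above, here is a minimal userspace sketch. It is not the kernel's `__ratelimit()` implementation, only a simplified model of it: `struct rl_state`, `rl_allow()` and `rl_take_missed()` are hypothetical names invented for this example, and time is passed in as a plain tick counter instead of being read from jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of an interval/burst rate limiter. */
struct rl_state {
	long interval;	/* window length in ticks; 0 disables limiting */
	int burst;	/* messages allowed per window */
	int printed;	/* messages let through in the current window */
	int missed;	/* messages suppressed since the last report */
	long begin;	/* tick at which the current window started */
	bool started;	/* false until the first call opens a window */
};

/* Returns true if a message may be printed at tick 'now'. */
bool rl_allow(struct rl_state *rs, long now)
{
	assert(rs != NULL);

	if (rs->interval == 0)
		return true;

	/* Open a fresh window on first use or once the old one expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/* Returns the number of suppressed messages and resets the counter. */
int rl_take_missed(struct rl_state *rs)
{
	int n;

	assert(rs != NULL);
	n = rs->missed;
	rs->missed = 0;
	return n;
}
```

With an interval of 100 ticks and a burst of 3, ten calls inside the same window let three messages through and record seven as missed, which corresponds to the count the patch prints once at "crng init done". Setting the interval to 0 models what `ratelimit_disable` does in `rand_initialize()`.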
Re: Linux messages full of `random: get_random_u32 called from`
Dear Theodore,


On 04/24/18 17:49, Theodore Y. Ts'o wrote:
> On Tue, Apr 24, 2018 at 09:56:21AM -0400, Theodore Y. Ts'o wrote:
> > On Tue, Apr 24, 2018 at 01:48:16PM +0200, Paul Menzel wrote:
> > > Since Linux 4.17-rcX, Linux spams a lot of `random: get_random_u32 called
> > > from` messages. I believe, this setting should be reverted by default as
> > > otherwise a lot of other messages are not seen.
> >
> > Can you tell me a bit about your system? What distribution, what
> > hardware is present in your system (what architecture, what
> > peripherals are attached, etc.)?
>
> Can you also send me your dmesg or kern.log so I can see where
> get_random_u32 is getting called from during your system startup?

Sorry for just attaching the unedited log file with the coreboot boot
messages. At time stamp 31 seconds (first column), the Linux messages are
also included. An excerpt follows; the full log is in my last message.

01.515: Jumping to boot code at 9000(7f733000)
31.117: [0.515017] 00:07: ttyS1 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
31.118: [0.523260] Linux agpgart interface v0.103
31.119: [0.528188] i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
31.130: [0.547244] serio: i8042 KBD port at 0x60,0x64 irq 1
31.130: [0.552286] serio: i8042 AUX port at 0x60,0x64 irq 12
31.130: [0.557653] rtc_cmos 00:03: RTC can wake from S4
31.131: [0.562627] rtc_cmos 00:03: registered as rtc0
31.131: [0.567197] rtc_cmos 00:03: alarms up to one month, y3k, 242 bytes nvram, hpet irqs
31.131: [0.575045] ledtrig-cpu: registered to indicate activity on CPUs
31.132: [0.581736] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=0
31.132: [0.590831] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=0
31.132: [0.599654] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=0
31.132: [0.608722] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=0
31.132: [0.617551] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=0
31.132: [0.626630] random: get_random_u32 called from cache_alloc_refill+0x5bb/0x13d0 with crng_init=0
31.132: [0.635438] random: get_random_u32 called from cache_random_seq_create+0xa3/0x1f0 with crng_init=0
31.133: [0.644556] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0

The problem on the Lenovo X60 is that the serial console seems to switch
from ttyS0 to ttyS1 during bootup with the dock attached; that's why you do
not see the messages at the beginning.


Kind regards,

Paul
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, Apr 24, 2018 at 09:56:21AM -0400, Theodore Y. Ts'o wrote:
> On Tue, Apr 24, 2018 at 01:48:16PM +0200, Paul Menzel wrote:
> > Dear Linux folks,
> >
> >
> > Since Linux 4.17-rcX, Linux spams a lot of `random: get_random_u32 called
> > from` messages. I believe, this setting should be reverted by default as
> > otherwise a lot of other messages are not seen.
>
> Can you tell me a bit about your system? What distribution, what
> hardware is present in your system (what architecture, what
> peripherals are attached, etc.)?

Can you also send me your dmesg or kern.log so I can see where
get_random_u32 is getting called from during your system startup?

Thanks!

- Ted
Re: Linux messages full of `random: get_random_u32 called from`
On Tue, Apr 24, 2018 at 01:48:16PM +0200, Paul Menzel wrote:
> Dear Linux folks,
>
>
> Since Linux 4.17-rcX, Linux spams a lot of `random: get_random_u32 called
> from` messages. I believe, this setting should be reverted by default as
> otherwise a lot of other messages are not seen.

Can you tell me a bit about your system? What distribution, what
hardware is present in your system (what architecture, what
peripherals are attached, etc.)?

There's a reason why we made this change --- we were declaring the random
number pool to be fully initialized before it really was, and that was
a potential security concern. It's not as bad as the weakness
discovered by Nadia Heninger in 2012. (See https://factorable.net for
more details.) However, this is not one of those things where we like
to fool around.

So I want to understand if this is an issue with a particular hardware
configuration, or whether it's just a badly designed Linux init system
or embedded setup, or something else. After all, you wouldn't want
the NSA spying on all of your network traffic, would you? :-)

- Ted