Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
On Thu, Feb 09, 2017 at 01:13:22AM -0700, Alden Tondettar wrote: > And using: > > $ qemu-system-x86_64 --version > QEMU emulator version 2.1.2 (Debian 1:2.1+dfsg-12+deb8u6), Copyright (c) > 2003-2008 Fabrice Bellard > $ qemu-system-x86_64 -nographic -enable-kvm -m 1024M -kernel bzImage -append > "root=/dev/sda1 loglevel=3 console=ttyS0" hd3 Hmm, I'm not seeing this at *all*. I assume you must be using Debian stable? I'm using Debain Testing, which has much newer version of qemu: % /usr/bin/kvm --version QEMU emulator version 2.8.0(Debian 1:2.8+dfsg-2) Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project developers And I'm using: /usr/bin/kvm -drive file=/usr/projects/xfstests-bld/build-32/kvm-xfstests/test-appliance/root_fs.img,if=virtio,snapshot=on -vga none -nographic -m 1024 --kernel /build/random/arch/x86/boot/bzImage --append "root=/dev/vda console=ttyS0,115200" See below for an excerpt of the log, but basically we don't get the first call to crng_fast_load until a good 2 seconds into the boot, when we're doing device probing. The only thing I think of is that your version of qemu is spewing a *huge* number of interrupts to the guest kernel, as soon as interrupts are enabled, and *before* the kernel even starts trying to talk to the devices. That's bad, because it's going to be destroying CPU efficiency of the VM, and even if we add a safety mechanism to prohibit calling crng_fast_load until after crng_initialize() has been called, it's likely that you're not getting much entropy from the interrupts, because qemu must be spewing interrupts as fast as possible, and there may not be a lot of unpredictability in that circumstance. So we can put in some changes to try to mitigate this, but even with your patch, there might not be a lot of entropy because qemu is clearly spewing interrupts at line rate. Hence, I'd call this a qemu BUG, and I'd strongly suggest you look at fixing it by upgrading qemu. - Ted [0.029226] mce: CPU supports 10 MCE banks [0.030077] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0 [0.09] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0 [0.041436] Freeing SMP alternatives memory: 20K [0.043621] ftrace: allocating 34091 entries in 67 pages [0.053659] smpboot: Max logical packages: 1 [0.056696] Enabling APIC mode: Flat. Using 1 I/O APICs [0.061854] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 [0.06] smpboot: CPU0: Intel QEMU Virtual CPU version 2.5+ (family: 0x6, model: 0x6, stepping: 0x3) [0.063588] Performance Events: PMU not available due to virtualization, using software events only. [0.067555] crng_initialize called [0.070107] smp: Bringing up secondary CPUs ... [0.072108] smp: Brought up 1 node, 1 CPU [0.073351] smpboot: Total of 1 processors activated (4801.01 BogoMIPS) [0.077456] devtmpfs: initialized [0.079945] clocksource: jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 6370867519511994 ns [2.063906] crng: dumping entropy [2.065382] crng_fast_load called [2.066747] crng_fast_load: 16/64 [2.066747] crng_fast_load: 16 [2.073526] tsc: Refined TSC clocksource calibration: 2399.998 MHz [2.076219] clocksource: tsc: mask: 0x max_cycles: 0x229835b7123, max_idle_ns: 440795242976 ns [2.134486] ata2.00: ATAPI: QEMU DVD-ROM, 2.5+, max UDMA/100 [2.144405] ata2.00: configured for MWDMA2 [2.153349] scsi 1:0:0:0: CD-ROMQEMU QEMU DVD-ROM 2.5+ PQ: 0 ANSI: 5 [2.187210] sr 1:0:0:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray [2.194289] cdrom: Uniform CD-ROM driver Revision: 3.20 [2.205026] sr 1:0:0:0: Attached scsi generic sg0 type 5 [2.277461] crng: dumping entropy [2.279017] crng_fast_load called [2.279017] crng_fast_load: 32/64 [2.279017] crng_fast_load: 16 [2.720393] crng: dumping entropy [2.723448] crng_fast_load called [2.723448] crng_fast_load: 48/64 [2.723448] crng_fast_load: 16 [2.744182] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3 [2.760954] EXT4-fs (vda): couldn't mount as ext3 due to feature incompatibilities [2.774648] EXT4-fs (vda): mounted filesystem with ordered data mode. Opts: (null) [2.779939] VFS: Mounted root (ext4 filesystem) readonly on device 254:0. [2.785356] devtmpfs: mounted [2.788127] Freeing unused kernel memory: 2404K [2.789833] Write protecting the kernel text: 7512k [2.791856] Write protecting the kernel read-only data: 3568k [2.793918] NX-protecting the kernel data: 8872k [2.822964] x86/mm: Checked W+X mappings: FAILED, 96 W+X pages found. [2.845398] crng: dumping entropy [2.846536] crng_fast_load called [2.847460] crng_fast_load: 64/64 [2.848137] random: fast init done [2.848137] crng_fast_load: 16 [2.914998] systemd[1]: systemd 215 running in system mode.
Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
OK, I figured out what is going on with your test results. If you use qemu-system-x86_64 **without** --enable-kvm, then on both the Debian Jessie version of qemu as well as the Debian Stretch version of qemu, crng_fast_load() will be called _twice_ before crng_initialize has a chance to be called. At least for my kernel configuration and my CPU. If you're using a different kernel configuration and a slower CPU, such that when qemu is doing instruction by instruction emulation, which slows down the boot sequence **massively**, then that probably explains your results. I'm not sure if there are any real life use cases where someone would be insane enough to use virtualization without enabling KVM, but at least we know what is happening now. This makes me feel better, because I've looked at kernel boot messags from a variety of systems, from big data center servers to laptops to mobile handsets, and I had **never** seen the sequence of crng initialization messages that you had been reporting. - Ted
Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
Am Donnerstag, 9. Februar 2017, 02:04:32 CET schrieb Alden Tondettar: Hi Alden, > On Thu, Feb 09, 2017 at 07:47:25AM +0100, Greg Kroah-Hartman wrote: > > On Wed, Feb 08, 2017 at 08:31:26PM -0700, Alden Tondettar wrote: > > > In short, the situation is: > > > > > > A) No usable hardware RNG or arch_get_random() (or we don't trust it...) > > > > Wait, why would you not trust arch_get_random()? Is it broken somehow > > on some arches? If so, why not fix that as well? > > arch_get_random() makes use of RDRAND and similar CPU features. Some people > do not wish to trust black-box RNG implementations. Furthermore, this function is only implemented on x86. On other arches, it is a noop. Ciao Stephan
Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
On Thu, Feb 09, 2017 at 02:04:32AM -0700, Alden Tondettar wrote: > On Thu, Feb 09, 2017 at 07:47:25AM +0100, Greg Kroah-Hartman wrote: > > On Wed, Feb 08, 2017 at 08:31:26PM -0700, Alden Tondettar wrote: > > > In short, the situation is: > > > > > > A) No usable hardware RNG or arch_get_random() (or we don't trust it...) > > > > Wait, why would you not trust arch_get_random()? Is it broken somehow > > on some arches? If so, why not fix that as well? > > arch_get_random() makes use of RDRAND and similar CPU features. Some people > do not wish to trust black-box RNG implementations. It does not use those features "raw", it uses the output to feed the entropy, which, from my understanding, should be the same as feeding any other form of data into the system, right? So while it is always nice to worry about different things, I think this is something that is very low on the "possible to cause an issue" scale. thanks, greg k-h
Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
On Thu, Feb 09, 2017 at 07:47:25AM +0100, Greg Kroah-Hartman wrote: > On Wed, Feb 08, 2017 at 08:31:26PM -0700, Alden Tondettar wrote: > > In short, the situation is: > > > > A) No usable hardware RNG or arch_get_random() (or we don't trust it...) > > Wait, why would you not trust arch_get_random()? Is it broken somehow > on some arches? If so, why not fix that as well? arch_get_random() makes use of RDRAND and similar CPU features. Some people do not wish to trust black-box RNG implementations. Alden
Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
On Wed, Feb 08, 2017 at 11:19:31PM -0500, Theodore Ts'o wrote: > How did you determine when crng_initialize() was being called? On a > VM generally there are fewer interrupts than on real hardware. On > KVM, for I see the random: fast_init message being printed 3.6 seconds > into the boot. > > On Google Compute Engine, the fast_init message happens 52 seconds into the > boot. > > So what VM where you using? I'm trying to figure out whether this is > hypothetical or real problem, and on what systems. Adding a few printk()s to the latest kernel: @@ -778,6 +778,8 @@ static void crng_initialize(struct crng_state *crng) int i; unsigned long rv; + printk("crng_initialize called\n"); + memcpy(>state[0], "expand 32-byte k", 16); if (crng == _crng) _extract_entropy(_pool, >state[4], @@ -1149,6 +1151,9 @@ void add_interrupt_randomness(int irq, int irq_flags) fast_mix(fast_pool); add_interrupt_bench(cycles); + if (fast_pool->count >= 64) + printk("add_interrupt_randomness: fast_pool->count >= 64, dumping entropy"); + if (!crng_ready()) { if ((fast_pool->count >= 64) && crng_fast_load((char *) fast_pool->pool, And using: $ qemu-system-x86_64 --version QEMU emulator version 2.1.2 (Debian 1:2.1+dfsg-12+deb8u6), Copyright (c) 2003-2008 Fabrice Bellard $ qemu-system-x86_64 -nographic -enable-kvm -m 1024M -kernel bzImage -append "root=/dev/sda1 loglevel=3 console=ttyS0" hd3 I get: [0.010247] mce: CPU supports 10 MCE banks [0.010317] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0 [0.010318] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0 [0.064002] add_interrupt_randomness: fast_pool->count >= 64, dumping entropy [0.128003] add_interrupt_randomness: fast_pool->count >= 64, dumping entropy [0.160364] Freeing SMP alternatives memory: 36K [0.160428] ftrace: allocating 35771 entries in 140 pages [0.172384] smpboot: Max logical packages: 1 [0.173964] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 [0.184000] add_interrupt_randomness: fast_pool->count >= 64, dumping entropy [0.184000] add_interrupt_randomness: fast_pool->count >= 64, dumping entropy [0.184000] random: fast init done [0.184000] smpboot: CPU0: Intel QEMU Virtual CPU version 2.1.2 (family: 0x6, model: 0x6, stepping: 0x3) [0.184000] Performance Events: PMU not available due to virtualization, usin g software events only. [0.184000] crng_initialize called [0.184000] crng_initialize called [0.184000] smp: Bringing up secondary CPUs ... [0.184000] smp: Brought up 1 node, 1 CPU Sometimes I get three add_interrupt_randomness lines instead of four which is fine but still cutting things awfully close.
Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
On Wed, Feb 08, 2017 at 08:31:26PM -0700, Alden Tondettar wrote: > In short, the situation is: > > A) No usable hardware RNG or arch_get_random() (or we don't trust it...) Wait, why would you not trust arch_get_random()? Is it broken somehow on some arches? If so, why not fix that as well? thanks, greg k-h
Re: [PATCH] random: Don't overwrite CRNG state in crng_initialize()
On Wed, Feb 08, 2017 at 08:31:26PM -0700, Alden Tondettar wrote: > The new non-blocking system introduced in commit e192be9d9a30 ("random: > replace non-blocking pool with a Chacha20-based CRNG") can under > some circumstances report itself initialized while it still contains > dangerously little entropy, as follows: > > Approximately every 64th call to add_interrupt_randomness(), the "fast" > pool of interrupt-timing-based entropy is fed into one of two places. At > calls numbered <= 256, the fast pool is XORed into the primary CRNG state. > At call 256, the CRNG is deemed initialized, getrandom(2) is unblocked, > and reading from /dev/urandom no longer gives warnings. > > At calls > 256, the fast pool is fed into the input pool, leaving the CRNG > untouched. > > The problem arises between call number 256 and 320. If crng_initialize() > is called at this time, it will overwrite the _entire_ CRNG state with > 48 bytes generated from the input pool. So in practice this isn't a problem because crng_initialize is called in early init. For reference, the ordering of init calls are: "early", <--- crng_initialize is here() "core", < ftrace is initialized here() "postcore", "arch", "subsys", < acpi_init is here() "fs", "device", < device probing is here "late", So in practice, call 256 typically happens **well** after crng_initialize. You can see where it is the boot messages, which is after 2.5 seconds into the boot: [2.570733] rtc_cmos 00:02: alarms up to one month, y3k, 114 bytes nvram, hpet irqs [2.570863] usbcore: registered new interface driver i2c-tiny-usb [2.571035] device-mapper: uevent: version 1.0.3 [2.571215] random: fast init done <- [2.571316] device-mapper: ioctl: 4.35.0-ioctl (2016-06-23) initialised: dm-de...@redhat.com [2.571678] device-mapper: multipath round-robin: version 1.1.0 loaded [2.571728] intel_pstate: Intel P-state driver initializing [2.572331] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input3 [2.572462] intel_pstate: HWP enabled [2.572464] sdhci: Secure Digital Host Controller Interface driver When is crng_initialize() called? Sometime *before* 0.05 seconds into the boot on my laptop: [0.054529] ftrace: allocating 29140 entries in 114 pages > In short, the situation is: > > A) No usable hardware RNG or arch_get_random() (or we don't trust it...) > B) add_interrupt_randomness() called 256-320 times but other >add_*_randomness() functions aren't adding much entropy. > C) then crng_initialize() is called > D) not enough calls to add_*_randomness() to push the entropy >estimate over 128 (yet) > E) getrandom(2) or /dev/urandom used for something important > > Based on a few experiments with VMs, A) through D) can occur easily in > practice. And with no HDD we have a window of about a minute or two for > E) to happen before add_interrupt_randomness() finally pushes the > estimate over 128 on its own. How did you determine when crng_initialize() was being called? On a VM generally there are fewer interrupts than on real hardware. On KVM, for I see the random: fast_init message being printed 3.6 seconds into the boot. On Google Compute Engine, the fast_init message happens 52 seconds into the boot. So what VM where you using? I'm trying to figure out whether this is hypothetical or real problem, and on what systems. - Ted