Re: Handling of entropy during boot
On Wed, 2019-09-11 at 15:52 -0400, Paul Thomas wrote:
> Hi All,
>
> First off, I want to acknowledge how great a system Debian is, very
> nice work!
>
> I know the issue with Entropy Starvation is understood, and I
> understand the security concern:
> https://wiki.debian.org/BoottimeEntropyStarvation
> https://daniel-lange.com/archives/152-hello-buster.html
>
> However, I would just like to indicate how nuts this has been driving
> me. First, with a new Buster install it took me a little while just to
> figure out what was going on with sshd. I did install haveged, and
> this helps for general cases. But then I have corner cases like when
> the root filesystem is read-only, where haveged doesn't work.
>
> I'm not using ancient hardware; I'm on a modern arm64 processor, but it
> is an embedded environment with no keyboard or mouse.

And no hardware RNG?

Ben.

--
Ben Hutchings
Unix is many things to many people, but it's never been everything to anybody.
Re: Handling of entropy during boot
Hi All,

First off, I want to acknowledge how great a system Debian is, very nice work!

I know the issue with Entropy Starvation is understood, and I understand the security concern:
https://wiki.debian.org/BoottimeEntropyStarvation
https://daniel-lange.com/archives/152-hello-buster.html

However, I would just like to indicate how nuts this has been driving me. First, with a new Buster install it took me a little while just to figure out what was going on with sshd. I did install haveged, and this helps for general cases. But then I have corner cases like when the root filesystem is read-only, where haveged doesn't work.

I'm not using ancient hardware; I'm on a modern arm64 processor, but it is an embedded environment with no keyboard or mouse.

-Paul
Re: Handling of entropy during boot
On Mon, 2019-01-21 at 21:46 +, Ben Hutchings wrote:
> On Mon, 2019-01-21 at 20:49 +, Andy Simpkins wrote:
> [...]
> > Should we add to or change the possible entropy sources?
> [...]
>
> Yes, we should (by default) enable use of available hardware RNGs to
> produce entropy and if none is available then we should (by default)
> install one of the various software entropy gathering daemons.

In linux version 4.19.20-1 I've enabled CONFIG_RANDOM_TRUST_CPU.

Ben.

> We should also document this so that users that distrust certain
> entropy sources will know how to disable them.
>
> Ben.
>

--
Ben Hutchings
The world is coming to an end. Please log off.
Re: Handling of entropy during boot
On Mon, 2019-01-21 at 20:49 +, Andy Simpkins wrote:
[...]
> Should we add to or change the possible entropy sources?
[...]

Yes, we should (by default) enable use of available hardware RNGs to produce entropy and if none is available then we should (by default) install one of the various software entropy gathering daemons.

We should also document this so that users that distrust certain entropy sources will know how to disable them.

Ben.

--
Ben Hutchings
Klipstein's 4th Law of Prototyping and Production: A fail-safe circuit will destroy others.
Re: Handling of entropy during boot
Hi,

This thread seems to have gone quiet for some time.

Re-reading the thread, I don't see any solutions being proposed that will truly suit everyone. If I have correctly understood the problem, we are seeing a change from a more open and trusting software environment to one with more emphasis on security that is also less trusting:

* More packages are requiring the use of the kernel's high quality entropy pool (including aspects of the kernel itself)
* At the same time questions are being asked over how much we can trust our entropy sources. There is no agreement on which sources we should trust; this appears to be based upon a cultural perspective rather than on evidence.
* Different platforms may have different entropy sources available to them (think desktops, mobile devices, headless servers, small IoT devices & virtualised instances)

What does this mean for Buster?

Some services may take a long time to start. I am not talking about a few seconds here, but instead minutes or even hours. I myself see sshd timing out and being restarted by systemd several times before finally starting some 7 min after the rest of the system on my ARM64 Mustang platform. I have seen reports of it taking literally several hours for all services to start on some NAS boxes.

Unfortunately some services fail to start completely; others are terminated and unlimited restart attempts are made. In all cases that I have seen, there is no mention that the reason for the failed start is insufficient available entropy. This itself is a bug, whatever your view on how to address lack of available entropy during start-up. We should at the very least state the reason a service has not started.

I believe that systemd has the ability to only start services when a given event has happened (i.e. wait for network). Should we be asking for wait for “entropy pool > x bytes” before starting a given service?

Should we add to or change the possible entropy sources?

Increasing the number of different sources of entropy may well reduce the time waiting for sufficient entropy (although this is not an excuse not to explain why a service has failed to start).

There has been some discussion about adding in further possible entropy sources, and whether or not each source is enabled by default. In general nobody appears to be arguing against having the ability to use additional entropy sources; the only debate is over which should be enabled by default within Debian. This debate appears to boil down to ‘do I trust this source’, and it is accepted that this is very much dependent upon what the installation is going to be used for AND your geo-political leanings, i.e. you may well trust a HRNG for an Intel device if you are an American, but be less inclined to trust one from China, and vice versa.

I don't think that we can OR SHOULD make a sensible decision for an out of the box experience that will be suitable for all users. Perhaps instead we should consider a tool (to be included in DI as well as just the archive) that can present the different options and allow the user to decide?

If this is the way we as a project decide to go I would very much like to be involved in this new package. Such a tool is probably beyond my ability to write, however I would be very happy to work on the design, UI and testing.

Is this the right approach to take?

Best regards
Andy
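To make the "state the reason a service has not started" point concrete, here is a minimal sketch of a status line a service wrapper could log. It reads the kernel's entropy estimate from /proc/sys/kernel/random/entropy_avail, which reports bits rather than bytes. The function names and the 128-bit threshold are hypothetical choices for illustration, and on recent kernels the estimate may simply stay constant once the pool is initialised, so treat this as a diagnostic aid rather than a readiness test.

```python
from pathlib import Path

# Standard procfs location of the kernel's entropy estimate, in bits.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy_bits(path: Path = ENTROPY_AVAIL):
    """Return the kernel's entropy estimate in bits, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe_entropy(bits, min_bits: int) -> str:
    """Build the status line a service could log before it starts waiting.

    min_bits is a hypothetical, caller-chosen threshold, not a kernel constant.
    """
    if bits is None:
        return "entropy estimate unavailable"
    if bits < min_bits:
        return f"waiting for entropy: {bits} bits available, {min_bits} wanted"
    return f"entropy ok: {bits} bits available"


if __name__ == "__main__":
    print(describe_entropy(read_entropy_bits(), min_bits=128))
```

Logging a line like this before blocking would at least tell the admin why sshd or another service appears to hang.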
Re: Handling of entropy during boot
On Jan 16, Guido Günther wrote:
> There's also jitterentropy-rngd which does the trick but I haven't
> looked at the security implications.

Nowadays rngd collects jitter entropy, so I would not use something else.

--
ciao,
Marco
Re: Handling of entropy during boot
On Wed, 2019-01-16 at 11:05 +0100, Guido Günther wrote:
> Hi,
> On Mon, Jan 14, 2019 at 05:56:20PM +0100, W. Martin Borgert wrote:
> > Quoting Michael Stone:
> > > Unless the cpu supports rdrand/rdseed, installing rng-tools5 won't
> > > really change anything. If it does support those, it probably
> > > makes more sense going forward to just enable
> > > CONFIG_RANDOM_TRUST_CPU rather than installing another package.
> >
> > This option is only available for some architectures (X86, S390,
> > PPC)? What about the others?
>
> There's also jitterentropy-rngd which does the trick but I haven't
> looked at the security implications.
> -- Guido

FWIW I've been using jitterentropy-rngd and rng-tools in production for years, in Azure/VMWare/AWS x86 VMs, exactly for this problem. Haven't been hacked so far... as far as I know :-)

--
Kind regards,
Luca Boccassi
Re: Handling of entropy during boot
Hi, On Mon, Jan 14, 2019 at 05:56:20PM +0100, W. Martin Borgert wrote: > Quoting Michael Stone : > > Unless the cpu supports rdrand/rdseed, installing rng-tools5 won't > > really change anything. If it does support those, it probably makes more > > sense going forward to just enable CONFIG_RANDOM_TRUST_CPU rather than > > installing another package. > > This option is only available for some architectures (X86, S390, PPC)? > What about the others? There's also jitterentropy-rngd which does the trick but I haven't looked at the security implications. -- Guido
Re: Handling of entropy during boot
On 1/14/19 7:07 AM, Thomas Goirand wrote:
> On 12/18/18 8:11 PM, Theodore Y. Ts'o wrote:
> > If you are firmly convinced that there is a good chance that the NSA
> > has suborned Intel in putting a backdoor into RDRAND, you won't want
> > to use that boot option.
>
> I have read numerous times that some people trust this or that part of
> the instruction set, and I always found it silly. Why should some
> instruction or part of the Intel CPU be more trusted? To me, either
> you trust the entire CPU, or you just don't trust it at all and
> consider using other CPU brands. Am I wrong with this reasoning?

I think the idea behind that is that the rest of the CPU has defined, verifiable behaviors. If NSA makes 1+1 sometimes equal 3, then that's detectable. So it'd be a fairly risky attack; someone might notice it. It also risks that other countries' NSA-equivalents make use of the backdoor.

OTOH, the RNG is not verifiable. It's supposed to take two entropy sources and apply AES to them to combine them. But how do you know it actually did that? You can't tell what the input to AES was, at least as long as AES remains secure. It could well be giving you the equivalent of 1, 2, 3, 4, etc. encrypted with a key known only to NSA. And there is much less risk of another country taking advantage, as the numbers still look like proper CSPRNG output to everyone but NSA.

(Also, see Dual_EC_DRBG)
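The "1, 2, 3, 4 encrypted with a secret key" scenario can be illustrated with a short sketch. Note the substitution: HMAC-SHA256 stands in for AES here only because it ships with the Python standard library, and the key and function names are hypothetical. This says nothing about how RDRAND is actually built; it only shows why the output of such a generator cannot be told apart from real randomness by a simple statistical check, while the key holder can reproduce every value.

```python
import hashlib
import hmac


def backdoored_rng(secret_key: bytes, counter: int) -> bytes:
    """Return 32 'random' bytes that are really just PRF(key, counter).

    HMAC-SHA256 is used as the keyed function in place of AES.
    """
    return hmac.new(secret_key, counter.to_bytes(8, "big"), hashlib.sha256).digest()


def ones_fraction(data: bytes) -> float:
    """Fraction of bits set to 1; close to 0.5 for random-looking data."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (8 * len(data))


if __name__ == "__main__":
    key = b"known only to the attacker"  # hypothetical secret
    stream = b"".join(backdoored_rng(key, n) for n in range(1, 1001))
    # An outside observer sees a balanced-looking bit stream...
    print(f"fraction of 1 bits: {ones_fraction(stream):.3f}")
    # ...while the key holder can regenerate any output exactly.
    print(backdoored_rng(key, 4) == stream[3 * 32 : 4 * 32])
```

A bit-balance check is of course far weaker than a real randomness test suite, but the same argument applies to any test that does not know the key.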
Re: Handling of entropy during boot
On January 14, 2019 11:56:20 AM EST, "W. Martin Borgert" wrote: >Quoting Michael Stone : >> Unless the cpu supports rdrand/rdseed, installing rng-tools5 won't >> really change anything. If it does support those, it probably makes >> more sense going forward to just enable CONFIG_RANDOM_TRUST_CPU >> rather than installing another package. > >This option is only available for some architectures (X86, S390, PPC)? >What about the others? I'm not aware of a good general solution for them. -- Michael Stone (From phone, please excuse typos)
Re: Handling of entropy during boot
Quoting Michael Stone : Unless the cpu supports rdrand/rdseed, installing rng-tools5 won't really change anything. If it does support those, it probably makes more sense going forward to just enable CONFIG_RANDOM_TRUST_CPU rather than installing another package. This option is only available for some architectures (X86, S390, PPC)? What about the others?
Re: Re: Handling of entropy during boot
Sam Hartman wrote:
> "Marco" == Marco d'Itri writes:
>
>     Marco> online. Is it enough to feed the host side of virtio-rng
>     Marco> with /dev/random or should everybody who has virtual machines
>     Marco> also install rngd in the host? Is rngd to be preferred to
>     Marco> haveged?
>
> I'd also like to point out that virtio-rng is only a solution for kvm.
>
> I recently discovered that Vmware appears to have no virtual RNG
> available to the guest at all.
>
> A buster vmware guest will boot but will be unable to start sshd
> because of lack of entropy for typically five minutes or so.
> A lot of stuff breaks in that configuration.
> virtio-rng doesn't help at all.
>
> You can claim that Vmware is broken all you want, but a lot of people
> use it, and we really should produce an operating system that you can
> ssh into when you boot a bunch of instances in a virtual environment.

Another data point: there exist high-profile KVM-based cloud providers that don't give their customers a virtio RNG device in the guest. One particular example is AliYun, also known as Alibaba Cloud. Note that in some locations they provide Xen, not KVM, instances, so try Shanghai if you want to confirm my statement.

--
Alexander E. Patrakov
Re: Handling of entropy during boot
On Mon, Jan 14, 2019 at 12:55:09PM +0100, Marco d'Itri wrote:
> Agreed. I think that d-i should install rngd (or haveged? And why?) if
> it detects a virtualized environment without virtio-rng.

Unless the cpu supports rdrand/rdseed, installing rng-tools5 won't really change anything. If it does support those, it probably makes more sense going forward to just enable CONFIG_RANDOM_TRUST_CPU rather than installing another package.

As for haveged, it's not clear how much better that is than the old practice of having rngd read from /dev/urandom.

Mike Stone
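Whether a given x86 machine or VM exposes rdrand/rdseed can be checked from the "flags" line in /proc/cpuinfo. The following is a rough sketch with hypothetical helper names; it is x86-specific, since other architectures list their features under different keys (arm64 uses "Features", for instance), so a negative result elsewhere means nothing.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags listed on x86 'flags' lines of /proc/cpuinfo."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_cpu_rng(cpuinfo_text: str) -> bool:
    """True if the CPU advertises RDRAND or RDSEED."""
    return bool({"rdrand", "rdseed"} & cpu_flags(cpuinfo_text))


if __name__ == "__main__":
    try:
        text = Path("/proc/cpuinfo").read_text()
    except OSError:
        text = ""
    print("rdrand/rdseed advertised:", has_cpu_rng(text))
```

In a VM this reflects what the hypervisor exposes to the guest, which is the case mentioned elsewhere in the thread where an emulated older CPU model hides the rdrand bit.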
Re: Handling of entropy during boot
On 12/18/18 8:11 PM, Theodore Y. Ts'o wrote: > If you are firmly convinced that there is a good > chance that the NSA has suborned Intel in putting a backdoor into > RDRAND, you won't want to use that boot option. I have read numerous times that some people trust this or that part of the instruction set, and I always found it silly. Why should some instruction or part of the Intel CPU be more trusted? To me, either you trust the entire CPU, or you just don't trust it at all and consider using other CPU brands. Am I wrong with this reasoning? Cheers, Thomas Goirand (zigo)
Re: Handling of entropy during boot
On Jan 13, Sam Hartman wrote:
> I recently discovered that Vmware appears to have no virtual RNG
> available to the guest at all.

AFAIK you are right.

> A buster vmware guest will boot but will be unable to start sshd
> because of lack of entropy for typically five minutes or so.
> A lot of stuff breaks in that configuration.
> virtio-rng doesn't help at all.
>
> You can claim that Vmware is broken all you want, but a lot of people
> use it, and we really should produce an operating system that you can
> ssh into when you boot a bunch of instances in a virtual environment.

Agreed. I think that d-i should install rngd (or haveged? And why?) if it detects a virtualized environment without virtio-rng.

--
ciao,
Marco
Re: Handling of entropy during boot
> "Marco" == Marco d'Itri writes:

    Marco> online. Is it enough to feed the host side of virtio-rng
    Marco> with /dev/random or should everybody who has virtual machines
    Marco> also install rngd in the host? Is rngd to be preferred to
    Marco> haveged?

I'd also like to point out that virtio-rng is only a solution for kvm.

I recently discovered that Vmware appears to have no virtual RNG available to the guest at all.

A buster vmware guest will boot but will be unable to start sshd because of lack of entropy for typically five minutes or so. A lot of stuff breaks in that configuration. virtio-rng doesn't help at all.

You can claim that Vmware is broken all you want, but a lot of people use it, and we really should produce an operating system that you can ssh into when you boot a bunch of instances in a virtual environment.

--Sam
Re: Handling of entropy during boot
On Jan 09, "Theodore Y. Ts'o" wrote:
> x86 systems have a high resolution timer; Raspberry Pis don't.
> Furthermore, if libvirt is misconfigured, it should just be fixed (and
> better yet, it should be configured to enable virtio-rng, which is
> *not* hard).

Can you clarify what is the best practice here? I am finding a lot of conflicting and often obviously clueless advice online. Is it enough to feed the host side of virtio-rng with /dev/random or should everybody who has virtual machines also install rngd in the host? Is rngd to be preferred to haveged?

Data points: none of my current virtualization hosts (very new HPE Gen10 and Cisco UCS M5 blades) have a hardware RNG available to the kernel, at least with RHEL 7. When rngd is installed it reports RDRAND and jitter entropy (the rngd internal source, not the kernel module) to be available.

--
ciao,
Marco
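For anyone wanting to reproduce the "no hardware RNG available to the kernel" data point, the kernel's hw_random framework lists its registered devices under /sys/class/misc/hw_random. The sketch below reads the rng_available and rng_current attributes; the helper names are hypothetical, and a missing directory is treated the same as an empty list.

```python
from pathlib import Path

# sysfs directory exposed by the kernel's hw_random framework.
HW_RANDOM = Path("/sys/class/misc/hw_random")


def parse_rng_names(text: str) -> list[str]:
    """Split the whitespace-separated driver names found in the sysfs files."""
    return text.split()


def read_rng_names(path: Path) -> list[str]:
    """Read one sysfs attribute; an unreadable or missing file yields an empty list."""
    try:
        return parse_rng_names(path.read_text())
    except OSError:
        return []


def hwrng_report(base: Path = HW_RANDOM) -> dict[str, list[str]]:
    """Summarise which hardware RNGs the kernel knows about and which one is in use."""
    return {
        "available": read_rng_names(base / "rng_available"),
        "current": read_rng_names(base / "rng_current"),
    }


if __name__ == "__main__":
    print(hwrng_report())
```

In a KVM guest with virtio-rng configured, one would expect a virtio entry to show up in the "available" list; an empty list suggests the guest has no such device.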
Re: Handling of entropy during boot
On Thu, Jan 10, 2019 at 03:57:00PM +0100, Michael Biebl wrote:
> with possible solutions like installing haveged

It still isn't clear to me that this is actually secure, so I'm not sure we should be telling people to do it in release notes.

Mike Stone
Re: Handling of entropy during boot
On Thu, 10 Jan 2019, Michael Biebl wrote: > Am 10.01.19 um 15:51 schrieb Stefan Fritsch: > > On Thu, 10 Jan 2019, Michael Biebl wrote: > >>> ACK, we also had to do the same in Grml[.org] and our latest release > >>> (2018.12). Now we automatically enable haveged when users boot using > >>> the ssh boot option (which is something Grml specific, taking care > >>> of setting user password and invoking the ssh service). > >> > >> And this is a perfect example why crediting the seed file (#914297) is > >> not a solution to this problem. > > > > While I still think this case should be handled by documentation, let's > > try to find a way forward that we can agree upon. > > > > I think the absolute minimum we need something that prints a big fat > > warning during boot if the RNG is not yet initialized, points out that > > further services may block and that the admin should add entropy sources > > like virtio-rng or rdrand. The time when this warning should be printed > > should probably be before network is started, because if the admin has > > configured vpn services in /etc/network/interfaces, those will already > > block because of lack of entropy. > > > > A second thing we need is a service that finishes when the RNG is > > initialized and that has a suitable large timeout for starting (maybe one > > day?). Services that need randomness can then depend on that service and > > don't need to set their own timeout to huge values. Also it is a lot > > easier to see what's wrong if the "wait for RNG" service is blocking than > > if some random network service is blocking. > > > > More things should be done but maybe we can figure those out while we > > implement the above two things. Can we agree on this? > > > > I'd prefer having this documented in the release notes: > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=916690 > with possible solutions like installing haveged, configuring virtio-rng, > etc. depending on the situation. 
That would be an extremely user-unfriendly "solution" and would lead to countless hours of debugging and useless bug reports.
Re: Handling of entropy during boot
Am 10.01.19 um 15:51 schrieb Stefan Fritsch: > On Thu, 10 Jan 2019, Michael Biebl wrote: >>> ACK, we also had to do the same in Grml[.org] and our latest release >>> (2018.12). Now we automatically enable haveged when users boot using >>> the ssh boot option (which is something Grml specific, taking care >>> of setting user password and invoking the ssh service). >> >> And this is a perfect example why crediting the seed file (#914297) is >> not a solution to this problem. > > While I still think this case should be handled by documentation, let's > try to find a way forward that we can agree upon. > > I think the absolute minimum we need something that prints a big fat > warning during boot if the RNG is not yet initialized, points out that > further services may block and that the admin should add entropy sources > like virtio-rng or rdrand. The time when this warning should be printed > should probably be before network is started, because if the admin has > configured vpn services in /etc/network/interfaces, those will already > block because of lack of entropy. > > A second thing we need is a service that finishes when the RNG is > initialized and that has a suitable large timeout for starting (maybe one > day?). Services that need randomness can then depend on that service and > don't need to set their own timeout to huge values. Also it is a lot > easier to see what's wrong if the "wait for RNG" service is blocking than > if some random network service is blocking. > > More things should be done but maybe we can figure those out while we > implement the above two things. Can we agree on this? > I'd prefer having this documented in the release notes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=916690 with possible solutions like installing haveged, configuring virtio-rng, etc. depending on the situation. Michael -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? 
Re: Handling of entropy during boot
On Thu, 10 Jan 2019, Michael Biebl wrote:
> > ACK, we also had to do the same in Grml[.org] and our latest release
> > (2018.12). Now we automatically enable haveged when users boot using
> > the ssh boot option (which is something Grml specific, taking care
> > of setting user password and invoking the ssh service).
>
> And this is a perfect example why crediting the seed file (#914297) is
> not a solution to this problem.

While I still think this case should be handled by documentation, let's try to find a way forward that we can agree upon.

I think the absolute minimum we need is something that prints a big fat warning during boot if the RNG is not yet initialized, points out that further services may block and that the admin should add entropy sources like virtio-rng or rdrand. The time when this warning should be printed should probably be before network is started, because if the admin has configured vpn services in /etc/network/interfaces, those will already block because of lack of entropy.

A second thing we need is a service that finishes when the RNG is initialized and that has a suitably large timeout for starting (maybe one day?). Services that need randomness can then depend on that service and don't need to set their own timeout to huge values. Also it is a lot easier to see what's wrong if the "wait for RNG" service is blocking than if some random network service is blocking.

More things should be done but maybe we can figure those out while we implement the above two things. Can we agree on this?

Now, in which packages should those services be shipped? Should they be part of the individual init system packages or go into some central package like initscripts? Any opinions?
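The two pieces proposed above (a warning while the RNG is uninitialized, and a "wait for RNG" step with a generous timeout) could be prototyped roughly as follows. This is a sketch under assumptions, not an existing Debian service: the function names, the polling approach and the one-day timeout are illustrative. It relies on getrandom(2) with GRND_NONBLOCK failing with EAGAIN until the kernel's pool is initialized, which Python exposes as os.getrandom() raising BlockingIOError.

```python
import os
import sys
import time


def rng_initialized() -> bool:
    """True once a non-blocking getrandom() succeeds, i.e. the kernel RNG is seeded."""
    try:
        os.getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        return False
    return True


def wait_for_rng(timeout: float, poll_interval: float = 1.0, probe=rng_initialized) -> bool:
    """Poll probe() until it returns True or timeout seconds have passed.

    Returns True if the RNG became ready in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(poll_interval, remaining))


if __name__ == "__main__":
    if not rng_initialized():
        print("WARNING: kernel RNG not yet initialized; services needing "
              "randomness may block. Consider adding an entropy source "
              "such as virtio-rng.", file=sys.stderr)
    # One day, as floated in the message above; purely illustrative.
    sys.exit(0 if wait_for_rng(timeout=24 * 60 * 60) else 1)
```

Other services would then order themselves after this step instead of each carrying its own large start timeout, and a stuck boot would point directly at the RNG wait.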
Re: Handling of entropy during boot
On Wed, 9 Jan 2019, Theodore Y. Ts'o wrote:
> On Wed, Jan 09, 2019 at 09:58:22AM +0100, Stefan Fritsch wrote:
> >
> > There have been a number of bug reports and blog posts about this,
> > despite buster not being released yet. So it's not that uncommon.
>
> Pointers, please? Let's see them and investigate. The primary issue
> I've been aware of to date has been on Fedora systems, and it's due to
> some Red Hat specific changes that they made for FEDRAMP compliance
> --- and Red Hat has dealt with those issues.
>
> If there are problems for people using Debian Testing, we should
> investigate them and understand what is going on.

Some other people already have sent you a few pointers (thanks!). The reason why I am looking into this is that it affects apache2 (see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914297 ). Apache does not call getrandom itself but libssl does, and it definitely needs secure randomness for Diffie-Hellman. So there is nothing that can or should be fixed in apache.

More links are at the end of https://lists.debian.org/debian-devel/2018/12/msg00184.html

Also, the thread on debian-kernel pointed to by Ben Hutchings is an interesting read; I had not noticed that before.

> > No, that's utterly wrong. If it's a hassle to use good entropy, people
> > will use gettimeofday() for getting "entropy" and they will use it for
> > security relevant purposes. In this way, you would achieve exactly the
> > opposite of what you want.
>
> If *users* do this, then if they end up releasing credit card numbers
> or PII or violate their customers privacy which brings the EU's GDPR
> enforcers down on them, it's on *their* heads. If *Debian* makes a
> local Debian-specific change which causes these really bad outcomes,
> then it's on *ours*.

Since many users and developers will take the shortest path to a "working" service, we must make sure that the secure way just works.

> > Any program that does secure network connections needs entropy for
> > Diffie-Hellman. And even seeds for hash buckets can be security relevant.
> > You really don't want that people need to distinguish between
> > security-critical and stupid uses of entropy, because they WILL get it
> > wrong.
>
> Sure, this is why developers need to investigate the bugs. You said
> you provided links, but I couldn't find any in your e-mail messages or
> earlier ones on this thread. Perhaps I missed them; in which case, my
> apologies. Can you please send/resend those links?
>
> Can you please prioritize reports from people running Debian Unstable
> or Debian Testing? As I said above, these issues tend to be very
> distro specific, especially when distros are messing around with
> crypto-related libraries in order to keep the US Government happy.

As far as I can see, all reports are from unstable/testing only, because stable does not cause getrandom() to block (see https://lists.debian.org/debian-release/2018/05/msg00130.html ).
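To illustrate why falling back to gettimeofday() as "entropy" is dangerous, here is a toy demonstration. Everything in it is hypothetical (the function names, the timestamp, the one-hour window); it simply derives a key from a PRNG seeded with a timestamp and then recovers the seed by trying every second in a plausible boot window.

```python
import random


def weak_key(timestamp: int) -> bytes:
    """Derive a 16-byte 'key' from a PRNG seeded only with a timestamp (do not do this)."""
    rng = random.Random(timestamp)
    return bytes(rng.randrange(256) for _ in range(16))


def recover_seed(key: bytes, earliest: int, latest: int):
    """Try every second in [earliest, latest] and return the seed that reproduces key."""
    for candidate in range(earliest, latest + 1):
        if weak_key(candidate) == key:
            return candidate
    return None


if __name__ == "__main__":
    boot_time = 1547078400  # hypothetical timestamp (2019-01-10 00:00:00 UTC)
    key = weak_key(boot_time)
    # An attacker who knows the machine booted within the last hour
    # has at most 3600 guesses to make.
    print(recover_seed(key, boot_time - 3599, boot_time) == boot_time)
```

Even with microsecond resolution the search space stays tiny compared with a properly seeded 128-bit key, which is the concern behind making the secure path the one that "just works".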
Re: Handling of entropy during boot
Am 10.01.19 um 14:23 schrieb Michael Prokop: > * Raphael Hertzog [Thu Jan 10, 2019 at 12:24:45PM +0100]: >> On Wed, 09 Jan 2019, Theodore Y. Ts'o wrote: > >>> Pointers, please? Let's see them and investigate. The primary issue >>> I've been aware of to date has been on Fedora systems, and it's due to >>> some Red Hat specific changes that they made for FEDRAMP compliance >>> --- and Red Hat has dealt with those issues. > >> In Kali I had to install haveged by default due to this problem. >> We got reports of having to wait up to 5 minutes to get to their desktop. >> We got reports of sshd not working on first boot (in fact just taking too >> long to start). > > ACK, we also had to do the same in Grml[.org] and our latest release > (2018.12). Now we automatically enable haveged when users boot using > the ssh boot option (which is something Grml specific, taking care > of setting user password and invoking the ssh service). And this is a perfect example why crediting the seed file (#914297) is not a solution to this problem. -- Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? signature.asc Description: OpenPGP digital signature
Re: Handling of entropy during boot
* Raphael Hertzog [Thu Jan 10, 2019 at 12:24:45PM +0100]: > On Wed, 09 Jan 2019, Theodore Y. Ts'o wrote: > > Pointers, please? Let's see them and investigate. The primary issue > > I've been aware of to date has been on Fedora systems, and it's due to > > some Red Hat specific changes that they made for FEDRAMP compliance > > --- and Red Hat has dealt with those issues. > In Kali I had to install haveged by default due to this problem. > We got reports of having to wait up to 5 minutes to get to their desktop. > We got reports of sshd not working on first boot (in fact just taking too > long to start). ACK, we also had to do the same in Grml[.org] and our latest release (2018.12). Now we automatically enable haveged when users boot using the ssh boot option (which is something Grml specific, taking care of setting user password and invoking the ssh service). We saw exactly what Daniel documented at https://daniel-lange.com/archives/152-Openssh-taking-minutes-to-become-available,-booting-takes-half-an-hour-...-because-your-server-waits-for-a-few-bytes-of-randomness.html regards, -mika- -- https://michael-prokop.at/ || https://adminzen.org/ https://grml-solutions.com/ || https://grml.org/ signature.asc Description: Digital signature
Re: Handling of entropy during boot
Hi,

On Wed, 09 Jan 2019, Theodore Y. Ts'o wrote:
> Pointers, please? Let's see them and investigate. The primary issue
> I've been aware of to date has been on Fedora systems, and it's due to
> some Red Hat specific changes that they made for FEDRAMP compliance
> --- and Red Hat has dealt with those issues.

In Kali I had to install haveged by default due to this problem. We got reports of users having to wait up to 5 minutes to get to their desktop. We got reports of sshd not working on first boot (in fact just taking too long to start).

https://bugs.kali.org/view.php?id=5124
https://bugs.kali.org/view.php?id=4994
https://bugs.kali.org/view.php?id=5011

I haven't looked, but it seems likely that thin.service is trying to generate some keys on initial startup, which would explain why it gets stalled.

Cheers,
--
Raphaël Hertzog ◈ Debian Developer

Support Debian LTS: https://www.freexian.com/services/debian-lts.html
Learn to master Debian: https://debian-handbook.info/get/
Re: Handling of entropy during boot
On Tue, Jan 08, 2019 at 10:41:55AM +0100, Stefan Fritsch wrote:
>
> If the security issue only affects a small percentage of the installations
> and fixing it means breaking many other installations, then there has to
> be a discussion if we really want to fix the issue or if a "don't do that"
> documentation is the better choice.

One of the questions which needs to be answered is exactly how many installations are actually broken. I don't think it's going to be as bad as you suspect.

> Raspberry Pis were only an example. There are also other systems, including
> old x86 systems, that don't have a HWRNG. Also, there are probably a load
> of x86 VMs that emulate an older CPU due to libvirt misconfiguration and
> don't expose the rdrand cpuid bit.

x86 systems have a high resolution timer; Raspberry Pis don't. Furthermore, if libvirt is misconfigured, it should just be fixed (and better yet, it should be configured to enable virtio-rng, which is *not* hard).

> Systems that don't suffer from blocking on entropy because they have other
> sources of entropy (hwrng, ...) won't have their security reduced because
> the good entropy will still be added to the pool, regardless of the seed
> file being credited or not.

The question is how long they have to block. *Very* unfortunately, there's a lot of busted software that tries to generate security-critical keys when the system is first booted, which is when the least entropy is available. Such packages include ssh and various packages which call openssl (such as CUPS) which are visible to the internet.

And if the system doesn't have good sources of entropy, and doesn't have sufficient interrupts to initialize the entropy pool, the question is what should we do? Should we just blindly proceed and let them generate insecure keys? At least, if the system blocks, they'll know something is wrong, and they can fix the problem (for example, by *fixing* their libvirt configuration).

Ultimately, I don't think it's a big problem, primarily because I'm not hearing a lot of yelling from Debian users. It may be annoying for your Raspberry Pi system, but the question is whether that is a common case or an isolated case.

> So, how could we go forward from here. Maybe we could limit the wait for
> entropy to some reasonable value (1 minute? 5 minutes?). This could be
> done by creating a program that does a blocking getrandom but with a
> timeout. If the timeout expires and the seed file has been read
> successfully before, it would then credit the read entropy. This program
> would be added as systemd unit so that services that need entropy can
> depend on it and don't get killed with a timeout. Is this a reasonable
> approach? Or do you (or anyone else) have any better suggestions?

My suggestion is to try and figure out *what* is blocking, and *why*. If it's because it's something security-critical, such as generating ssh keys, letting things continue even though we don't have secure entropy is a bad, bad, BAD idea. If it's for something stupid, like generating seeds for Python dictionaries (just as an example; that one has been fixed) then the application should be fixed not to request secure randomness in the first place.

That's the correct fix, as opposed to a short cut that might leave us in a worse place, from a security perspective.

- Ted
Re: Handling of entropy during boot
On Wed, 2019-01-09 at 11:40 -0500, Theodore Y. Ts'o wrote:
> On Wed, Jan 09, 2019 at 09:58:22AM +0100, Stefan Fritsch wrote:
[...]
> > No, that's utterly wrong. If it's a hassle to use good entropy, people
> > will use gettimeofday() for getting "entropy" and they will use it for
> > security relevant purposes. In this way, you would achieve exactly the
> > opposite of what you want.
>
> If *users* do this, then if they end up releasing credit card numbers
> or PII or violate their customers' privacy which brings the EU's GDPR
> enforcers down on them, it's on *their* heads. If *Debian* makes a
> local Debian-specific change which causes these really bad outcomes,
> then it's on *ours*.
>
> We tried to do this ten years ago, when well-meaning Debian
> Developers tried to "fix" OpenSSL's random number library, and it
> turned out to be a disaster[1]. So let's be careful not to replicate
> past mistakes, eh?

It's a bit late for that: https://lists.debian.org/debian-release/2018/05/msg00130.html

[...]
> Sure, this is why developers need to investigate the bugs. You said
> you provided links, but I couldn't find any in your e-mail messages or
> earlier ones on this thread. Perhaps I missed them; in which case, my
> apologies. Can you please send/resend those links?
[...]

I sent you a bunch of bug links in a message in August.

Ben.

--
Ben Hutchings
Every program is either trivial or else contains at least one bug
Re: Handling of entropy during boot
On Wed, Jan 9, 2019 at 12:13 PM Theodore Y. Ts'o wrote:
> On Wed, Jan 09, 2019 at 09:58:22AM +0100, Stefan Fritsch wrote:
> >
> > There have been a number of bug reports and blog posts about this, despite
> > buster not being released yet. So it's not that uncommon.
>
> Pointers, please? Let's see them and investigate.

https://daniel-lange.com/archives/152-Openssh-taking-minutes-to-become-available,-booting-takes-half-an-hour-...-because-your-server-waits-for-a-few-bytes-of-randomness.html
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=912087
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=912616

There's lots of chatter in the systemd GitHub issues, too.

I've been bitten by both ssh taking forever and puppet timing out on VMs. I'll need to look into virtio-rng.

I've got an embedded x86-64 system where lightdm starts quickly when I am plugged into an ethernet connection, but takes about 8 minutes when the ethernet is disconnected. I strongly suspect low entropy in this case, too.

-m
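One cheap way to check a suspicion like the lightdm one, short of reaching for strace, is to log the kernel's own entropy estimate during boot. The sketch below assumes a Linux system that exposes /proc/sys/kernel/random/entropy_avail; the helper names are made up for illustration, and how to interpret the number differs between kernel versions, so treat it as a hint only.

```python
from pathlib import Path

# Kernel's estimate of available entropy, in bits (Linux procfs).
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def parse_entropy_avail(text):
    """Parse the contents of entropy_avail into an integer bit count."""
    return int(text.strip())


def read_entropy_avail(path=ENTROPY_AVAIL):
    """Return the current estimate, or None if it cannot be read or parsed."""
    try:
        return parse_entropy_avail(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("entropy_avail:", read_entropy_avail())
```

Running this from an early boot unit with the ethernet plugged in and again with it unplugged would show whether the two cases really differ in what the kernel thinks it has collected.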
Re: Handling of entropy during boot
On Wed, Jan 09, 2019 at 09:58:22AM +0100, Stefan Fritsch wrote:
>
> There have been a number of bug reports and blog posts about this, despite
> buster not being released yet. So it's not that uncommon.

Pointers, please? Let's see them and investigate. The primary issue I've been aware of to date has been on Fedora systems, and it's due to some Red Hat specific changes that they made for FEDRAMP compliance --- and Red Hat has dealt with those issues. If there are problems for people using Debian Testing, we should investigate them and understand what is going on.

> > My suggestion is to try and figure out *what* is blocking, and *why*. If
> > it's because it's something security-critical, such as generating ssh
> > keys, letting things continue even though we don't have secure entropy
> > is a bad, bad, BAD idea. If it's for something stupid, like
> > generating seeds for Python dictionaries (just as an example; that one
> > has been fixed) then the application should be fixed not to request
> > secure randomness in the first place.
>
> No, that's utterly wrong. If it's a hassle to use good entropy, people
> will use gettimeofday() for getting "entropy" and they will use it for
> security relevant purposes. In this way, you would achieve exactly the
> opposite of what you want.

If *users* do this, then if they end up releasing credit card numbers or PII or violate their customers' privacy which brings the EU's GDPR enforcers down on them, it's on *their* heads. If *Debian* makes a local Debian-specific change which causes these really bad outcomes, then it's on *ours*.

We tried to do this ten years ago, when well-meaning Debian Developers tried to "fix" OpenSSL's random number library, and it turned out to be a disaster[1]. So let's be careful not to replicate past mistakes, eh?

[1] https://www.schneier.com/blog/archives/2008/05/random_number_b.html

> Any program that does secure network connections needs entropy for
> Diffie-Hellman. And even seeds for hash buckets can be security relevant.
> You really don't want that people need to distinguish between
> security-critical and stupid uses of entropy, because they WILL get it
> wrong.

Sure, this is why developers need to investigate the bugs. You said you provided links, but I couldn't find any in your e-mail messages or earlier ones on this thread. Perhaps I missed them; in which case, my apologies. Can you please send/resend those links? Can you please prioritize reports from people running Debian Unstable or Debian Testing? As I said above, these issues tend to be very distro specific, especially when distros are messing around with crypto-related libraries in order to keep the US Government happy.

- Ted
Re: Handling of entropy during boot
On Tue, 8 Jan 2019, Theodore Y. Ts'o wrote: > On Tue, Jan 08, 2019 at 10:41:55AM +0100, Stefan Fritsch wrote: > > > > If the security issue only affects a small percentage of the installations > > and fixing it means breaking many other installations, then there has to > > be a discussion if we really want fix the issue or if a "don't do that" > > documentation is the better choice. > > One of the questions which needs to be answered is exactly how many > installations are actually broken. I don't think it's going to be bad > as you suspect There have been a number of bug reports and blog posts about this, despite buster not being release yet. So it's not that uncommon. > > > Rasberry PIs were only an example. There are also other systems, including > > old x86 systems, that don't have a HWRNG. Also, there are probably a load > > of x86 VMs that emulate an older CPU due to libvirt misconfiguration and > > don't expose the rdrand cpuid bit. > > x86 systems have a high resolution timer; Rasberry PI's don't. > Furthermore, if libvirt is miconfigured, it should just be fixed (and > better yet, it should be configured to enable virtio-rng, which is > *not* hard). It can be very hard if the VM host is not under your control. > > Systems that don't suffer from blocking on entropy because they have other > > sources of entropy (hwrng, ...) won't have their security reduced because > > the good entropy will still be added to the pool, regardless of the seed > > file being credited or not. > > The question is how long they have to block. *Very* unfortunately, > there's a lot of busted software that try to generate > security-critical keys when the system is first booted, which is when > entropy available is the least available. Such packages include ssh > and various packages which call openssl (such as CUPS) which are > visible to the internet. 
> > And if the system doesn't have good sources of entropy, and don't have > sufficient interrupts to initialize the entropy pool, the question is > what should we do? Should we just blindly proceed and let them > generate insecure keys? At least, if the system blocks, they'll know > something is wrong, and they can fix the problem (for example, such as > *fixing* their libvirt configuration). At the very least, there must be a clear message what the problem is. People having to use strace to figure out what is broken is just not acceptable. > Ultimately, I don't think it's a big problem, primarily because I'm > not hearing a lot of yelling from Debian users. I think the amount of yelling is already quite high, considering that it's only for testing and the vast majority of large deployments only use stable. I have included some links in my first mail. > It may be annoying > for your Rasberry Pi system, but the question is whether that is a > common case or an isolated case. > > So, how could we go forward from here. Maybe we could limit the wait for > > entropy to some reasonable value (1 minute? 5 minutes?). This could be > > done by creating a program that does a blocking getrandom but with a > > timeout. If the timeout expires and the seed file has been read > > successfully before, it would then credit the read entropy. This program > > would be added as systemd unit so that services that need entropy can > > depend on it and don't get killed with a timeout. Is this a reasonable > > approach? Or do you (or anyone else) have any better suggestions? > > My suggest is to try and figure out *what* is blocking, and *why*. If > it's because it's something security-critical, such as generating ssh > keys, letting things continue even though we don't have secure entropy > is a bad, bad, BAD idea. 
If it's for something stupid, like > generating seeds for Python dictionaries (just as an example; that one > has been fixed) then the application should be fixed not to request > secure randomness in the first place. No, that's utterly wrong. If it's a hassle to use good entropy, people will use gettimeofday() for getting "entropy" and they will use it for security relevant purposes. In this way, you would achieve exactly the opposite of what you want. Any program that does secure network connections needs entropy for Diffie-Hellman. And even seeds for hash buckets can be security relevant. You really don't want that people need to distinguish between security-critical and stupid uses of entropy, because they WILL get it wrong. For the most part, daemons block during startup because openssl decides it wants entropy for something. This is really difficult to change without creating other security issues. > That's the correct fix, as opposed to a short cut that might leave us > in worst place, from a security perspective. We already were there with the random() library call, and that was not a good situation. People used it for everything, including security-critical stuff. Now people have been educated to use good entrop
Re: Handling of entropy during boot
On Sun, 23 Dec 2018, Theodore Y. Ts'o wrote: > On Sun, Dec 23, 2018 at 05:52:31PM +0100, Stefan Fritsch wrote: > > I think some other questions should be considered first. Did Debian protect > > from these attacks in the past? The answer is clearly no. Now, should we > > break > > the systems of those people who keep their random-seed file secret and > > don't > > clone their OS image, in order to offer some protection to other people? > > This > > is really what we need to answer first, and in my opinion, we should try > > very > > hard not to break the systems of those users. And I see no other way than > > to > > credit the random seed file with entropy. > > I don't think this line of reasoning is valid. Supposed there was a > horrific security hole, such that 10% of publically available SSH > hosts had insecurely shared public keys such that were vulnerable to > being guessed[1]. Cearly, in the past (before we knew about such a > vulnerability) we did not protect those systems against this attack. > Does this mean we shouldn't in the future? I don't think it so > follows! If the security issue only affects a small percentage of the installations and fixing it means breaking many other installations, then there has to be a discussion if we really want fix the issue or if a "don't do that" documentation is the better choice. > [1] Mining your p's and q's: Widespread Weak Keys in Network Devices. > https://factorable.net > There is a balancing test that has to go on here. And quite frankly > Rasberry PI's are extremely problematic devices from a security > perspective. They use a coarse-grained clock, so it's very hard to > get good entropy out of timing events, and very the hardware that they > have on them is such that there aren't many events that we can use to > generate entropy in the first place. Rasberry PIs were only an example. There are also other systems, including old x86 systems, that don't have a HWRNG. 
Also, there are probably a load of x86 VMs that emulate an older CPU due to libvirt misconfiguration and don't expose the rdrand cpuid bit. Will the Linux kernel try to detect rdrand by detecting the UD exception or does it trust the cpuid bit? > I'm not sure that it's a great idea to weaken *all* Debian systems to > the security of Rasberry PI's, including x86 servers and laptops, just > because one platform has crappy hardware with respect to getting > secure random numbers. Systems that don't suffer from blocking on entropy because they have other sources of entropy (hwrng, ...) won't have their security reduced because the good entropy will still be added to the pool, regardless of the seed file being credited or not. > So perhaps the right answer is we have one default value for certain > architectures, or maybe classes of devices (e.g., a server-class ARM64 > device is very different from a IOT-style ARM platform). > > > > > One could also make it harder for an attacker to regenerate key material > > from > > a system where he knows the seed file. For example, if there is a RTC one > > could > > put the boot time and all serial numbers / MAC addresses that one can find > > into > > an expensive function like PBKDF2 or bcrypt and feed the result to the > > random > > seed. This way, even if the attacker has an approximate knowledge of most > > of > > that information, he would still need to spend quite a bit of computing > > power > > to get all the possible random seeds that could be used. > > We mix things like serial numbers and MAC addresses into the random > pool already. Unfortunately, if the attacker can snoop the > random-seed file, it's likely he or she can simply obtain the MAC > addresses or serial numbers of the device. Including the boot time would help, if this was done with sufficient granularity, but the boot time can probably leak by stuff like tcp timestamps, too. 
Still, making it more expensive for an attacker to try all possible values may still be a good idea. > > If the number of rounds in the function depends on timing, like do > > as many rounds as possible in 1 second, things like the load of the > > VM host and the temperature of the CPU will also play a role in the > > result. A sha sum of dmesg would probably also help, because it > > contains a lot of timings that also depend on the load of the VM > > host. > > We are already mixing timing information into the entropy pool, and to > the extent that there is randomness there, it is cr editedi > appropriately. The problem is that the Rasberry Pi doesn't have a > fine-grained clock, and there is a lot less entropy from timing events > than most people might suppose. > > As I said, though; it's one thing for this to be added to the entropy > pool. It's quite another for it to be reflected in the random seed > file. Today, if the system was booted a year ago, the random seed > file will not have been updated for the past 12 months. The last time > it
Re: Handling of entropy during boot
On Sun, Dec 23, 2018 at 05:52:31PM +0100, Stefan Fritsch wrote:
> I think some other questions should be considered first. Did Debian protect
> from these attacks in the past? The answer is clearly no. Now, should we break
> the systems of those people who keep their random-seed file secret and don't
> clone their OS image, in order to offer some protection to other people? This
> is really what we need to answer first, and in my opinion, we should try very
> hard not to break the systems of those users. And I see no other way than to
> credit the random seed file with entropy.

I don't think this line of reasoning is valid. Suppose there was a horrific security hole, such that 10% of publicly available SSH hosts had insecurely shared public keys that were vulnerable to being guessed[1]. Clearly, in the past (before we knew about such a vulnerability) we did not protect those systems against this attack. Does this mean we shouldn't in the future? I don't think that follows!

[1] Mining your p's and q's: Widespread Weak Keys in Network Devices. https://factorable.net

There is a balancing test that has to go on here. And quite frankly Raspberry Pis are extremely problematic devices from a security perspective. They use a coarse-grained clock, so it's very hard to get good entropy out of timing events, and the hardware that they have on them is such that there aren't many events that we can use to generate entropy in the first place.

I'm not sure that it's a great idea to weaken *all* Debian systems to the security of Raspberry Pis, including x86 servers and laptops, just because one platform has crappy hardware with respect to getting secure random numbers.

So perhaps the right answer is we have one default value for certain architectures, or maybe classes of devices (e.g., a server-class ARM64 device is very different from an IOT-style ARM platform).

>
> One could also make it harder for an attacker to regenerate key material from
> a system where he knows the seed file. For example, if there is a RTC one could
> put the boot time and all serial numbers / MAC addresses that one can find into
> an expensive function like PBKDF2 or bcrypt and feed the result to the random
> seed. This way, even if the attacker has an approximate knowledge of most of
> that information, he would still need to spend quite a bit of computing power
> to get all the possible random seeds that could be used.

We mix things like serial numbers and MAC addresses into the random pool already. Unfortunately, if the attacker can snoop the random-seed file, it's likely he or she can simply obtain the MAC addresses or serial numbers of the device.

> If the number of rounds in the function depends on timing, like do
> as many rounds as possible in 1 second, things like the load of the
> VM host and the temperature of the CPU will also play a role in the
> result. A sha sum of dmesg would probably also help, because it
> contains a lot of timings that also depend on the load of the VM
> host.

We are already mixing timing information into the entropy pool, and to the extent that there is randomness there, it is credited appropriately. The problem is that the Raspberry Pi doesn't have a fine-grained clock, and there is a lot less entropy from timing events than most people might suppose.

As I said, though; it's one thing for this to be added to the entropy pool. It's quite another for it to be reflected in the random seed file. Today, if the system was booted a year ago, the random seed file will not have been updated for the past 12 months. The last time it would have been updated is shortly after the system was first booted. This is *terrible* if you want to assume that we should give full credit to the random-seed file --- because entropy means, "not known to the adversary". The adversary may well have had access to it in the meantime. Similarly, because the Raspberry Pi only keeps time to 100Hz granularity, and an outside attacker can look at the external timing of packets on the network, it is not clear that the timing of network interrupts is actually contributing entropy.

I understand that having Raspberry Pis take a long time to boot because they don't have entropy is frustrating. But is silently assuming they have entropy, when someone really determined could reverse-engineer the state of the pool, a preferable alternative? If someone is using one to prototype an IOT device (remember: the 'S' in IOT stands for security), maybe it's fine, since IOT devices are generally wide open to security problems anyway, so what's one more? Just don't put them on *my* home network. :-)

But is that *really* the best answer for Debian? My opinion is "no". At least, let's please not make the security for x86 servers and desktops worse just to please Raspberry Pi IOT developers.

- Ted
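To make the quoted PBKDF2 idea concrete, here is a rough sketch using Python's hashlib. Everything about the interface is an assumption made for illustration: the function name, the way the inputs are joined, the iteration count, and the choice to use the saved seed as the salt. As the reply above points out, this only raises the attacker's cost per guess. It adds no entropy of its own, so the output should not be credited as if it did.

```python
import hashlib


def stretch_device_identity(boot_time_ns, identifiers, seed, iterations=200_000):
    """Derive 64 bytes from boot time and device identifiers via PBKDF2.

    boot_time_ns: integer boot timestamp (hypothetical input)
    identifiers:  iterable of strings such as serial numbers or MAC addresses
    seed:         bytes of the saved random seed, used here as the salt
    """
    # Sort so the result does not depend on enumeration order, and use a
    # NUL separator so adjacent fields cannot run together ambiguously.
    ids = b"\0".join(sorted(i.encode() for i in identifiers))
    material = str(boot_time_ns).encode() + b"\0" + ids
    return hashlib.pbkdf2_hmac("sha256", material, seed, iterations, dklen=64)
```

An attacker who knows the seed file and the identifiers then has to pay for one full PBKDF2 run per candidate boot time, which only helps if the boot time is known coarsely and recorded finely.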
Re: Handling of entropy during boot
On Tuesday, 18 December 2018 20:11:58 CET you wrote: > On Mon, Dec 17, 2018 at 09:46:42PM +0100, Stefan Fritsch wrote: > > There is a random seed file stored by systemd-random-seed.service that > > saves entropy from one boot and loads it again after the next reboot. The > > random seed file is re-written immediately after the file is read, so the > > system not properly shutting down won't cause the same seed file to be > > used again. The problem is that systemd (and probably > > /etc/init.d/urandom, too) does not set the flag that allows the kernel to > > credit the randomness and so the kernel does not know about the entropy > > contained in that file. Systemd upstream argues that this is supposed to > > protect against the same OS image being used many times [3]. (More links > > to more discussion can be found at [4]). > > This is an issue which Debian should be deciding more than systemd, > since the issues involved involve how the entire OS is packaged and > installed. I definitely agree with that. > That being said, the issues involved are subtle. > > The decision to not credit any randomness for the contents of > /var/lib/systemd/random-seed is definitely the conservative thing to > do. One of the issues is indeed what happens if the OS image gets > reused. And it's not just for Virtual Machines, but it can also be an > issue any time an image is cloned --- for example, in some kind of > consumer electronic device. Another question is that has to be > considered is whether you trust that random-seed file hasn't been > tampered with or read between it was written and when the system is > next booted. For example, if the "Targetted Access Organization" at > NSA, or its equivalent at German BND, or Chinese MSS, etc., were to > intercept a specific device, and read the random-seed file, they > wouldn't need to make any changes to the devices (which might, after > all, be detectable). 
If the OS were to blindly trust the random-seed > file as having entropy that can't be guessed by an adversary, this > kind of attack becomes possible. > > Now, should Debian care about this particular attack? I think some other questions should be considered first. Did Debian protect from these attacks in the past? The answer is clearly no. Now, should we break the systems of those people who keep their random-seed file secret and don't clone their OS image, in order to offer some protection to other people? This is really what we need to answer first, and in my opinion, we should try very hard not to break the systems of those users. And I see no other way than to credit the random seed file with entropy. > If the kernel is only going to be used by a VM, you have to trust the > Host OS provider, and if you're paranoid enough that you doubt Intel's > ability to resist being suborned by the NSA, you're probably going to > be even more concerned of the hosting/cloud provider from being in bed > with the its local government authorities. So what the default should > be for Google's "Cloud Optimized OS" is pretty obvious. The COS > kernel trusts RDRAND, and this avoids any delays in the boot process > waiting for the random number to be securely initialized --- because > we trust RDRAND. RDRAND is not the answer here, simply because not all architectures have it. Do Raspberry Pis have a HW-RNG? I am pretty sure that they don't. My cubietruck definitely does not. Therefore the question what to do with RDRAND is not related to the question above, as it does not prevent breaking people's systems. > That being said, there are some thing we can do that can help > regardless of what the default ends up being, and how we enable users > or image installers to change the default. For example, at least > every day, or perhaps sooner (and maybe once an hour if the device is > powered by the AC mains) the contents of the random-seed file should > be refreshed. 
The reason for that is that if the system has been up > for weeks or month, and the user reboots the system by forcing power > down or if the kernel crashes, or if the user is in too much of a > hurry to wait for a clean shutdown sequence, and runs something like > "echo b > /proc/sysrq-trigger", there is an increased chance that the > random-seed file may have been snooped sometime in the past > week/month/quarter. One could also make it harder for an attacker to regenerate key material from a system where he knows the seed file. For example, if there is a RTC one could put the boot time and all serial numbers / MAC addresses that one can find into an expensive function like PBKDF2 or bcrypt and feed the result to the random seed. This way, even if the attacker has an approximate knowledge of most of that information, he would still need to spend quite a bit of computing power to get all the possible random seeds that could be used. If the number of rounds in the function depends on timing, like do as many rounds as possible in 1 second, t
Re: Handling of entropy during boot
On Mon, Dec 17, 2018 at 09:46:42PM +0100, Stefan Fritsch wrote:
>
> There is a random seed file stored by systemd-random-seed.service that saves
> entropy from one boot and loads it again after the next reboot. The random
> seed file is re-written immediately after the file is read, so the system not
> properly shutting down won't cause the same seed file to be used again. The
> problem is that systemd (and probably /etc/init.d/urandom, too) does not set
> the flag that allows the kernel to credit the randomness and so the kernel does
> not know about the entropy contained in that file. Systemd upstream argues that
> this is supposed to protect against the same OS image being used many times
> [3]. (More links to more discussion can be found at [4]).

This is an issue which Debian should be deciding more than systemd, since the issues involved concern how the entire OS is packaged and installed.

That being said, the issues involved are subtle.

The decision to not credit any randomness for the contents of /var/lib/systemd/random-seed is definitely the conservative thing to do. One of the issues is indeed what happens if the OS image gets reused. And it's not just for Virtual Machines, but it can also be an issue any time an image is cloned --- for example, in some kind of consumer electronic device. Another question that has to be considered is whether you trust that the random-seed file hasn't been tampered with or read between when it was written and when the system is next booted. For example, if the "Targetted Access Organization" at NSA, or its equivalent at German BND, or Chinese MSS, etc., were to intercept a specific device, and read the random-seed file, they wouldn't need to make any changes to the devices (which might, after all, be detectable). If the OS were to blindly trust the random-seed file as having entropy that can't be guessed by an adversary, this kind of attack becomes possible.

Now, should Debian care about this particular attack? I suspect people of good will could very well disagree. There is a similar issue with newer kernels which support the boot-command-line option random.trust_cpu=on. If you are firmly convinced that there is a good chance that the NSA has suborned Intel into putting a backdoor into RDRAND, you won't want to use that boot option. But from the perspective of the distro, especially one which is striving to be a "Universal OS", how should you set this default?

If the kernel is only going to be used by a VM, you have to trust the Host OS provider, and if you're paranoid enough that you doubt Intel's ability to resist being suborned by the NSA, you're probably going to be even more concerned about the hosting/cloud provider being in bed with its local government authorities. So what the default should be for Google's "Cloud Optimized OS" is pretty obvious. The COS kernel trusts RDRAND, and this avoids any delays in the boot process waiting for the random number generator to be securely initialized --- because we trust RDRAND.

But for the Universal OS, the answer to whether we should blindly trust the random-seed or RDRAND is not so easy. I can construct scenarios where we should obviously trust random-seed --- and scenarios where we shouldn't. And we could throw it up to the user, and ask them to answer a question at installation time --- but most users probably won't be equipped to answer the question with full understanding of the consequences one way or another.

That being said, there are some things we can do that can help regardless of what the default ends up being, and how we enable users or image installers to change the default. For example, at least every day, or perhaps sooner (and maybe once an hour if the device is powered by the AC mains) the contents of the random-seed file should be refreshed. The reason for that is that if the system has been up for weeks or months, and the user reboots the system by forcing power down or if the kernel crashes, or if the user is in too much of a hurry to wait for a clean shutdown sequence, and runs something like "echo b > /proc/sysrq-trigger", there is an increased chance that the random-seed file may have been snooped sometime in the past week/month/quarter.

> A refinement of the random seed handling could be to check if the hostname/
> virtual machine-id is the same when saving the seed, and only credit the
> entropy if it is unchanged since the last boot.

This is a good idea, but how you set the virtual machine-id is very cloud/hosting provider specific. Also, very often, in many cloud environments, the hostname is not set until after the network is brought up, since they end up querying the hostname for the VM via the metadata server.

Also, for a kernel meant for a virtualization or cloud environment, my recommendation is to use random.trust_cpu=on, or compile the kernel with CONFIG_RANDOM_TRUST_CPU, which makes random.trust_cpu default to on. Trusting RDRAND in a
Handling of entropy during boot
Hi,

since the getrandom() system call is used more and more, there have been bug reports about services that use it blocking for a long time at startup and/or getting killed by systemd because they don't start fast enough [1, 2].

There is a random seed file stored by systemd-random-seed.service that saves entropy from one boot and loads it again after the next reboot. The random seed file is re-written immediately after the file is read, so the system not properly shutting down won't cause the same seed file to be used again. The problem is that systemd (and probably /etc/init.d/urandom, too) does not set the flag that allows the kernel to credit the randomness and so the kernel does not know about the entropy contained in that file. Systemd upstream argues that this is supposed to protect against the same OS image being used many times [3]. (More links to more discussion can be found at [4]).

But an identical OS image needs to be modified anyway in order to be secure (re-create ssh host keys, change root password, re-create ssl-cert's private keys, etc.). Injecting some entropy in some way is just another task that needs to be done for that use case. So basically the current implementation of systemd-random-seed.service breaks stuff for everyone while not fixing the thing it claims to fix. Also, the breakage will cause people to invent their own workarounds which will probably create more security issues than those that are fixed by the systemd behavior.

Therefore I think it should be the default to credit the entropy of the saved random seed when loading it, and the special needs of identical OS images used many times should be documented in the release notes.

A refinement of the random seed handling could be to check if the hostname/virtual machine-id is the same when saving the seed, and only credit the entropy if it is unchanged since the last boot.

In case the random seed file is not present (or the hostname/machine-id check fails), services may still block for a long time until they start. To avoid that they are killed by systemd because of timeouts, there should be a oneshot service that waits for getrandom to unblock and that other services can use as a dependency. (This is not necessary with /etc/init.d/urandom because there are no timeouts).

The systemd maintainers argue that individual services should handle this problem [1,2]. But this does not scale and the whole point of the getrandom() syscall is that it cannot fail and that its users do not need fallback code that is not well-tested and probably buggy. [5]

Cheers,
Stefan

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=912087
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914297
[3] https://github.com/systemd/systemd/issues/4271
[4] https://daniel-lange.com/archives/152-Openssh-taking-minutes-to-become-available,-booting-takes-half-an-hour-...-because-your-server-waits-for-a-few-bytes-of-randomness.html
[5] https://lwn.net/Articles/605828/
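For anyone unfamiliar with what "set the flag that allows the kernel to credit the randomness" means in practice, here is a rough Python sketch of the two ways a seed can be handed to the Linux kernel, combined with the machine-id refinement suggested above. This is not how systemd-random-seed is implemented. The function names are made up, the way the saved and current ids are obtained is left to the caller, and the ioctl number is the value of RNDADDENTROPY from <linux/random.h>. The crediting path needs root (CAP_SYS_ADMIN), so only the helper functions are meant to be run casually.

```python
import fcntl
import struct

# _IOW('R', 0x03, int[2]) from <linux/random.h>
RNDADDENTROPY = 0x40085203


def should_credit(saved_id, current_id):
    """Credit only if an id was saved with the seed and it is unchanged."""
    return bool(saved_id) and saved_id == current_id


def build_rand_pool_info(seed, entropy_bits):
    """Pack struct rand_pool_info: int entropy_count, int buf_size, then data."""
    return struct.pack("ii", entropy_bits, len(seed)) + seed


def load_seed(seed, credit, device="/dev/urandom"):
    """Mix `seed` into the kernel pool, crediting it only if `credit` is set."""
    with open(device, "wb", buffering=0) as dev:
        if not credit:
            # A plain write mixes the data in without crediting any entropy.
            dev.write(seed)
            return
        payload = build_rand_pool_info(seed, len(seed) * 8)
        if len(payload) > 1024:
            # Python's fcntl.ioctl() limits immutable buffer arguments.
            raise ValueError("seed too large for this sketch")
        fcntl.ioctl(dev, RNDADDENTROPY, payload)
```

Crediting the full 8 bits per byte is itself a policy choice here. Whether the saved seed deserves that much trust is the question raised in this mail.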