On Sat, Jun 02, 2018 at 10:02:39PM +0100, Ken Moffat wrote:
> I've been seeing problems on some of my machines with recent kernels
> (first noticed in 4.17-rc, but it also now happens in 4.16.4 or
> later). The problem is that instead of unbound taking a handful of
> seconds to start (often, it is all-but immediate), on the affected
> machines it now takes up to two and a half minutes.
>
Finally, making slow progress on this. The problem is caused by the
fix for CVE-2018-1108. A little while ago Ted Ts'o offered a patch,
possibly as an RFC, to use entropy from the hwrng (unsafe for
critical things like key generation, but it allows less-important
things, e.g. in systemd units, to run, and therefore it lets the box
boot in the absence of real entropy).
Apparently he did this because Fedora are starting to derive
"entropy" from jitter so that e.g. VMs can boot in a meaningful
time.
For my haswell that was great, but for my kaveri it made no
difference - turns out that the kaveri does NOT have a hwrng (I
enabled the option, and /dev/hwrng exists, but reading it with dd
reports 'No such file').
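As an aside, a quicker check than dd might be to ask the hw_random
core which backend it has selected. This is only a minimal sketch: it
assumes the kernel exposes /sys/class/misc/hw_random/rng_current, and
the helper names (read_first_line, hwrng_backend_present) are made up
for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a small text file into out (NUL-terminated,
 * trailing newline stripped). Returns 0 on success, -1 if the file
 * cannot be opened or nothing can be read from it.
 */
static int read_first_line(const char *path, char *out, size_t outlen)
{
	FILE *f;

	if (outlen == 0)
		return -1;
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fgets(out, (int)outlen, f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}

/*
 * Returns 1 if a hwrng backend is reported as selected (its name is
 * left in name), 0 if the hw_random core is there but reports an
 * empty name or "none", and -1 if the sysfs file is unavailable.
 * ASSUMPTION: the sysfs path below; adjust it if your kernel differs.
 */
static int hwrng_backend_present(char *name, size_t namelen)
{
	if (read_first_line("/sys/class/misc/hw_random/rng_current",
	    name, namelen) != 0)
		return -1;
	return (name[0] != '\0' && strcmp(name, "none") != 0) ? 1 : 0;
}
```

Calling hwrng_backend_present() with a small buffer should say whether
there is any point in reading /dev/hwrng at all.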
And the patch which introduced this fix can no longer be reverted:
parts of the file, at least in 4.18-rc5, have been rewritten.
What I will now be looking at is twofold:
1. start the random bootscript earlier (currently it is S25, but
unbound is S21; S15 - just after sysklogd - looks likely).
For systemd, I've no idea how to change the dependencies.
AND
2. persuade unbound to use /dev/urandom.
Googling, mostly unsuccessfully, I found that NixOS creates
/var/lib/unbound/dev/random (sic) with /var/lib/unbound as the home
directory for the unbound user, and bind-mounts /dev/urandom onto it.
They also seem to move the root key, and perhaps unbound.conf, to that
directory. So, as well as moving the random script, the unbound
bootscript needs to be modified to set up that bind mount (and to
unmount it afterwards).
To recap, only some of my machines with an SSD (and no 'spinning
rust') are affected.
The alternative for the second part is to hack unbound. In 1.7.1,
the compat/getentropy_linux.c file has:
#if defined(SYS_getrandom) && defined(__NR_getrandom)
	/*
	 * Try descriptor-less getrandom()
	 */
	ret = getentropy_getrandom(buf, len);
	if (ret != -1)
		return (ret);
	if (errno != ENOSYS)
		return (-1);
#endif

	/*
	 * Try to get entropy with /dev/urandom
	 *
	 * This can fail if the process is inside a chroot or if file
	 * descriptors are exhausted.
	 */
	ret = getentropy_urandom(buf, len);
	if (ret != -1)
		return (ret);

#ifdef SYS__sysctl
	/*
	 * Try to use sysctl CTL_KERN, KERN_RANDOM, RANDOM_UUID.
	 * sysctl is a failsafe API, so it guarantees a result. This
	 * should work inside a chroot, or when file descriptors are
	 * exhausted.
	 *
	 * However this can fail if the Linux kernel removes support
	 * for sysctl. Starting in 2007, there have been efforts to
	 * deprecate the sysctl API/ABI, and push callers towards use
	 * of the chroot-unavailable fd-using /proc mechanism --
	 * essentially the same problems as /dev/urandom.
	 *
	 * Numerous setbacks have been encountered in their deprecation
	 * schedule, so as of June 2014 the kernel ABI still exists on
	 * most Linux architectures. The sysctl() stub in libc is missing
	 * on some systems. There are also reports that some kernels
	 * spew messages to the console.
	 */
	ret = getentropy_sysctl(buf, len);
	if (ret != -1)
		return (ret);
#endif /* SYS__sysctl */
If it gets to this point, on linux it then uses
getentropy_fallback().
What is happening is that it hangs until hammering on the keyboard
has generated enough entropy, so I'm currently assuming that the
initial ret = getentropy_getrandom(buf, len); now blocks until
sufficient entropy is available - and that is the expected behaviour
on linux.
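
One way to test that assumption without hanging would be to call
getrandom() with GRND_NONBLOCK, which fails with EAGAIN instead of
blocking while the pool is still uninitialised. A minimal sketch,
assuming glibc 2.25 or later for <sys/random.h>; crng_ready() is a
name I made up, it is not anything in unbound:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/random.h>	/* getrandom(), GRND_NONBLOCK (glibc >= 2.25) */

/*
 * Probe whether a plain getrandom() call would block right now.
 * Returns 1 if the kernel handed back a byte (pool initialised),
 * 0 if it reported EAGAIN (a blocking call would wait for entropy),
 * and -1 on any other error, e.g. ENOSYS on kernels without the
 * syscall.
 */
static int crng_ready(void)
{
	unsigned char byte;
	ssize_t n;

	do {
		n = getrandom(&byte, sizeof byte, GRND_NONBLOCK);
	} while (n == -1 && errno == EINTR);

	if (n == 1)
		return 1;
	if (n == -1 && errno == EAGAIN)
		return 0;
	return -1;
}
```

Run early in the boot (before thumping the keyboard), a result of 0
would support the theory that getentropy_getrandom() is what stalls.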
To be honest, deleting that chunk of code looks easiest, but it
brings an ongoing maintenance commitment (1.7.1 is no longer
current, and whatever else happens there will probably be newer
versions in the future). This is the sort of case where I like
patches: they either apply to a new version, or they don't (whereas
deleting lines with sed might remove the wrong content).
For the unbound systemd unit, again I have no idea what to change.
Opinions on whether it is better to change the bootscript (assuming
that works) or hack the code? In either case, urandom needs to be
seeded earlier.
Either way, this is not my number one priority. But it would be
nice to fix it before 8.3.
ĸen
--
Entropy not found, thump keyboard to continue