bug#37501: [core-updates] Entropy starvation during boot

Ludovic Courtès Fri, 04 Oct 2019 02:03:25 -0700

Hi Marius,

Marius Bakke <[email protected]> skribis:

> Ludovic Courtès <[email protected]> writes:

[...]

>> I read some of these, and our ‘urandom-seed-service-type’ has the same
>> bug as <https://github.com/systemd/systemd/issues/4271>.  Namely, we
>> write the previous seed to /dev/urandom but we don’t credit the
>> entropy.
>>
>> The attached patch fixes that, and I think it should fix the problem you
>> reported.  Could people give it a try?
>
> Good catch, LGTM.  Unfortunately it does not fix the problem.
>
>> I’m interested in seeing the value of
>> /proc/sys/kernel/random/entropy_avail with and without this patch right
>> after boot (don’t try it in ‘guix system vm’ because there’s no seed
>> there.)
>
> before -  243
> after  - 2419

Is that with or without sshd running?

Do we have strong evidence that sshd is stuck in getrandom(2)?

That seems weird to me because we use #:pid-file for sshd, and thus
either sshd produces its PID file and we’re done (‘ssh-daemon’ is
considered started and life goes on), or sshd fails to produce its PID
file within 15 seconds and we kill it and consider that ‘ssh-daemon’
failed to start.

This only way this can hang is if sshd calls getrandom(2) before
daemonizing.

Looking at ‘main’ in sshd.c, I see:

  seed_rng();
  […]
  already_daemon = daemonized();

which I think means sshd indeed calls getrandom(2) before it has forked.
That explains the situation.  :-/

(‘seed_rng’ uses ‘RAND_status’ from OpenSSL, which supports several
methods but presumably defaults to getrandom(2)?)

> I don't know why this change was insufficient.  Perhaps the kernel
> does not consider such a seed alone trustworthy enough?  I also tried to
> increase the seed size to no avail.

Can you try to do:

  (add-to-entropy-count urandom (expt 2 17))

to see if that changes anything at all?

I checked with strace and the RNDADDTOENTCNT binding seems to be passing
its argument as expected.

> I found this patch in the 5.4 kernel tree after reading the commit log
> of random.c:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f2dc2798b81531fd93a3b9b7c39da47ec689e55
>
> ...which *does* solve the problem.
>
> The comments in the merge commit suggests that it is not necessarily a
> good solution, so I think we should let it "settle" a bit upstream
> before pushing it.  It does look rather sledgehammer-y...
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f2dc2798b81531fd93a3b9b7c39da47ec689e55
>
> Thoughts?

If it has to be that way, we can use this patch and we can always remove
it later if we have a better solution.

At any rate, I’d rather not block ‘core-updates’ any longer.

Thoughts?

Thanks for investigating!

Ludo’.

bug#37501: [core-updates] Entropy starvation during boot

Reply via email to