Bug#912087: reassign to systemd #912087 | openssh-server: Slow startup after the upgrade to 7.9p1

2019-01-19 Thread Andy Simpkins
Follow up report:

Bare metal install onto an APM Mustang board (see debian arm64 buildds)
of debian-buster-DI-alpha4-arm64-DVD-1.iso  [1]

sshd takes > 7 min to start [2]


This is clearly going to be a problem for Buster as things stand...

/Andy



[1] DI alpha4 uses kernel 4.18.20-2 (2018-11-23)
I am not updating the current Buster kernel (4.19.0-1-arm64) because
I am seeing a problem with it causing mustangs to panic during init
(this is the next bug I will report)

[2] root@sally:~# systemctl status ssh
ssh.service - OpenBSD Secure Shell server
Loaded: loaded (/lib/systemd/system/ssh.service; enabled; vendor
   preset: enab
   Active: activating (start-pre) since Sat 2019-01-19 21:23:39 GMT;
   2min 16s ag
 Docs: man:sshd(8)
   man:sshd_config(5)
   Cntrl PID: 459 (sshd)
   Tasks: 1 (limit: 4915)
  Memory: 2.8M
  CGroup: /system.slice/ssh.service
  ��└��─459 /usr/sbin/sshd -t


   syslog extract:
 Jan 19 21:30:49 sally systemd[1]: Starting OpenBSD Secure Shell
 server...

 Jan 19 21:31:01 sally CRON[516]: (root) CMD (   test -x
 /etc/cron.daily/popularity-contest && /etc/cron.daily/popularity-
 contest --crond)

 Jan 19 21:31:58 sally kernel: [  442.940274] random: crng init done

 Jan 19 21:31:58 sally kernel: [  442.940280] random: 7 urandom
 warning(s) missed due to ratelimiting

 Jan 19 21:31:58 sally systemd[1]: Started OpenBSD Secure Shell
 server.









On Wed, 2 Jan 2019 14:45:18 +0100 Olaf van der Spek
 wrote:
> Op do 29 nov. 2018 om 14:58 schreef Olaf van der Spek :
> >
> > Op do 29 nov. 2018 om 14:54 schreef Sebastian Andrzej Siewior
> > :
> > >
> > > On 2018-11-28 13:43:07 [+0100], Olaf van der Spek wrote:
> > > > > > They might just as well install haveged or configure virtio-rng in 
> > > > > > such
> > > > > > a case.
> > > > >
> > > > > Right. Do you think, that it would necessary to add something to the
> > > > > release notes?
> > > >
> > > > I do. ;)
> > > > What's the workaround for VMware?
> > > >
> > > > Does it just take longer to start or do some services not start at all?
> > >
> > > It will take longer to start, it will start. Let me pass that workaround
> >
> > Are you sure?
> > I've had php-fpm not start due to this, I think:
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906686
> 
> Today on a Digital Ocean VM:
> Jan  2 13:22:29 www systemd[1]: php7.3-fpm.service: Start operation
> timed out. Terminating.
> Jan  2 13:22:29 www systemd[1]: ssh.service: Start-pre operation timed
> out. Terminating.
> Jan  2 13:22:29 www systemd[1]: nginx.service: Start-pre operation
> timed out. Terminating.
> Jan  2 13:22:29 www systemd[1]: php7.3-fpm.service: Main process
> exited, code=killed, status=15/TERM
> Jan  2 13:22:29 www systemd[1]: php7.3-fpm.service: Failed with result
> 'timeout'.
> Jan  2 13:22:29 www systemd[1]: Failed to start The PHP 7.3 FastCGI
> Process Manager.
> Jan  2 13:22:29 www systemd[1]: nginx.service: Control process exited,
> code=killed, status=15/TERM
> Jan  2 13:22:29 www systemd[1]: nginx.service: Failed with result 'timeout'.
> Jan  2 13:22:29 www systemd[1]: Failed to start A high performance web
> server and a reverse proxy server.
> Jan  2 13:22:29 www systemd[1]: ssh.service: Control process exited,
> code=killed, status=15/TERM
> Jan  2 13:22:29 www systemd[1]: ssh.service: Failed with result 'timeout'.
> Jan  2 13:22:29 www systemd[1]: Failed to start OpenBSD Secure Shell server.
> Jan  2 13:22:29 www systemd[1]: Reached target Multi-User System.
> Jan  2 13:22:29 www systemd[1]: Starting Execute cloud user/final scripts...
> Jan  2 13:22:29 www systemd[1]: Reached target Graphical Interface.
> Jan  2 13:22:29 www systemd[1]: Starting Update UTMP about System
> Runlevel Changes...
> Jan  2 13:22:29 www kernel: [   97.513700] urandom_read: 3 callbacks 
> suppressed
> Jan  2 13:22:29 www kernel: [   97.513702] random: cloud-init:
> uninitialized urandom read (24 bytes read)
> Jan  2 13:22:29 www systemd[1]: systemd-update-utmp-runlevel.service: 
> Succeeded.
> Jan  2 13:22:29 www systemd[1]: Started Update UTMP about System
> Runlevel Changes.
> Jan  2 13:22:29 www systemd[1]: ssh.service: Service RestartSec=100ms
> expired, scheduling restart.
> Jan  2 13:22:29 www systemd[1]: ssh.service: Scheduled restart job,
> restart counter is at 1.



Bug#912087: reassign to systemd #912087 | openssh-server: Slow startup after the upgrade to 7.9p1

2019-01-02 Thread Olaf van der Spek
Op do 29 nov. 2018 om 14:58 schreef Olaf van der Spek :
>
> Op do 29 nov. 2018 om 14:54 schreef Sebastian Andrzej Siewior
> :
> >
> > On 2018-11-28 13:43:07 [+0100], Olaf van der Spek wrote:
> > > > > They might just as well install haveged or configure virtio-rng in 
> > > > > such
> > > > > a case.
> > > >
> > > > Right. Do you think, that it would necessary to add something to the
> > > > release notes?
> > >
> > > I do. ;)
> > > What's the workaround for VMware?
> > >
> > > Does it just take longer to start or do some services not start at all?
> >
> > It will take longer to start, it will start. Let me pass that workaround
>
> Are you sure?
> I've had php-fpm not start due to this, I think:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906686

Today on a Digital Ocean VM:
Jan  2 13:22:29 www systemd[1]: php7.3-fpm.service: Start operation
timed out. Terminating.
Jan  2 13:22:29 www systemd[1]: ssh.service: Start-pre operation timed
out. Terminating.
Jan  2 13:22:29 www systemd[1]: nginx.service: Start-pre operation
timed out. Terminating.
Jan  2 13:22:29 www systemd[1]: php7.3-fpm.service: Main process
exited, code=killed, status=15/TERM
Jan  2 13:22:29 www systemd[1]: php7.3-fpm.service: Failed with result
'timeout'.
Jan  2 13:22:29 www systemd[1]: Failed to start The PHP 7.3 FastCGI
Process Manager.
Jan  2 13:22:29 www systemd[1]: nginx.service: Control process exited,
code=killed, status=15/TERM
Jan  2 13:22:29 www systemd[1]: nginx.service: Failed with result 'timeout'.
Jan  2 13:22:29 www systemd[1]: Failed to start A high performance web
server and a reverse proxy server.
Jan  2 13:22:29 www systemd[1]: ssh.service: Control process exited,
code=killed, status=15/TERM
Jan  2 13:22:29 www systemd[1]: ssh.service: Failed with result 'timeout'.
Jan  2 13:22:29 www systemd[1]: Failed to start OpenBSD Secure Shell server.
Jan  2 13:22:29 www systemd[1]: Reached target Multi-User System.
Jan  2 13:22:29 www systemd[1]: Starting Execute cloud user/final scripts...
Jan  2 13:22:29 www systemd[1]: Reached target Graphical Interface.
Jan  2 13:22:29 www systemd[1]: Starting Update UTMP about System
Runlevel Changes...
Jan  2 13:22:29 www kernel: [   97.513700] urandom_read: 3 callbacks suppressed
Jan  2 13:22:29 www kernel: [   97.513702] random: cloud-init:
uninitialized urandom read (24 bytes read)
Jan  2 13:22:29 www systemd[1]: systemd-update-utmp-runlevel.service: Succeeded.
Jan  2 13:22:29 www systemd[1]: Started Update UTMP about System
Runlevel Changes.
Jan  2 13:22:29 www systemd[1]: ssh.service: Service RestartSec=100ms
expired, scheduling restart.
Jan  2 13:22:29 www systemd[1]: ssh.service: Scheduled restart job,
restart counter is at 1.
Jan  2 13:22:29 www systemd[1]: Stopped OpenBSD Secure Shell server.
Jan  2 13:22:29 www systemd[1]: Starting OpenBSD Secure Shell server...

Jan  2 13:26:06 www kernel: [  314.100132] random: crng init done
Jan  2 13:26:06 www systemd[1]: Started OpenBSD Secure Shell server.

I think more than some release notes is required.
-- 
Olaf



Bug#912087: reassign to systemd #912087 | openssh-server: Slow startup after the upgrade to 7.9p1

2018-11-29 Thread Olaf van der Spek
Op do 29 nov. 2018 om 14:54 schreef Sebastian Andrzej Siewior
:
>
> On 2018-11-28 13:43:07 [+0100], Olaf van der Spek wrote:
> > > > They might just as well install haveged or configure virtio-rng in such
> > > > a case.
> > >
> > > Right. Do you think, that it would necessary to add something to the
> > > release notes?
> >
> > I do. ;)
> > What's the workaround for VMware?
> >
> > Does it just take longer to start or do some services not start at all?
>
> It will take longer to start, it will start. Let me pass that workaround

Are you sure?
I've had php-fpm not start due to this, I think:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=906686

Gr,

-- 
Olaf



Bug#912087: reassign to systemd #912087 | openssh-server: Slow startup after the upgrade to 7.9p1

2018-11-29 Thread Sebastian Andrzej Siewior
On 2018-11-28 13:43:07 [+0100], Olaf van der Spek wrote:
> > > They might just as well install haveged or configure virtio-rng in such
> > > a case.
> >
> > Right. Do you think, that it would necessary to add something to the
> > release notes?
> 
> I do. ;)
> What's the workaround for VMware?
> 
> Does it just take longer to start or do some services not start at all?

It will take longer to start, it will start. Let me pass that workaround
question to someone from vmware/virtualbox and #912087:

On a recent kernel you see something like that:
[   68.823013] random: crng init done

which means it took almost 69 seconds for the system to initialize its
rng. OpenSSH 7.9 can be compilied against OpenSSL 1.1.1 which in turn
switched to getrandom() (for its randomness).
This syscall will block until kernel's rng is ready which took in this
example almost 69 seconds.
Those "high" numbers are not a problem on decent/recent HW but occur
oftern on in virtualized environments.
For KVM we have CONFIG_HW_RANDOM_VIRTIO. Are there any plans to get
something similar for VMware/Vbox?

[0] http://bugs.debian.org/912087

Sebastian



Bug#912087: reassign to systemd #912087 | openssh-server: Slow startup after the upgrade to 7.9p1

2018-11-28 Thread Olaf van der Spek
Op di 27 nov. 2018 om 14:12 schreef Sebastian Andrzej Siewior
:
>
> On 2018-11-27 09:57:58 [+0100], Michael Biebl wrote:
> > Sigh, and there is nothing that systemd can do to fix this, so I don't
> > see a point re-assigning this to systemd (again).
>
> I was about to close it myself then I though maybe reassign in case you
> want close it if that PR gets merged it.
>
> > Even if this PR is merged, upstream made it very clear that this will be
> > explicitly opt-in, so users will still be affected by default.
>
> Yes, I am not a fan that.
>
> > They might just as well install haveged or configure virtio-rng in such
> > a case.
>
> Right. Do you think, that it would necessary to add something to the
> release notes?

I do. ;)
What's the workaround for VMware?

Does it just take longer to start or do some services not start at all?



-- 
Olaf



Bug#912087: reassign to systemd #912087 | openssh-server: Slow startup after the upgrade to 7.9p1

2018-11-27 Thread Sebastian Andrzej Siewior
On 2018-11-27 09:57:58 [+0100], Michael Biebl wrote:
> Sigh, and there is nothing that systemd can do to fix this, so I don't
> see a point re-assigning this to systemd (again).

I was about to close it myself then I though maybe reassign in case you
want close it if that PR gets merged it.

> Even if this PR is merged, upstream made it very clear that this will be
> explicitly opt-in, so users will still be affected by default.

Yes, I am not a fan that.
 
> They might just as well install haveged or configure virtio-rng in such
> a case.

Right. Do you think, that it would necessary to add something to the
release notes?

> Oh well, I case this means we just have to close this issue as wontfix then.
Thank you.

Sebastian



Bug#912087: reassign to systemd #912087 | openssh-server: Slow startup after the upgrade to 7.9p1

2018-11-26 Thread Sebastian Andrzej Siewior
control: reassign -1 systemd 239-13

I hereby reassign the bug to systemd.
The problem is that OpenSSL is now using getrandom() for entropy which
is not (yet) ready and therefore OpenSSH needs longer for startup by
simply waiting for entropy.

Theodore Y. Ts'o suggested adding hw-rng to KVM/virt setups [0].
Everything else should work just "fine".

Either way there is nothing OpenSSL wise that can be done. A similar
issue [1] has been reported once GnuTLS which to getrandom().

Systemd wise there is
https://github.com/systemd/systemd/issues/4271
https://github.com/systemd/systemd/pull/10621
where people want systemd to systemd to credit entropy. I'm not much a
fan of that but that is a different story.

[0] https://bugs.debian.org/912087#118
[1] https://bugs.debian.org/837597

Sebastian