Bug#982950: ssh.service starts sshd before network is online: please switch to After=network-online.target instead of just After=network.target
* Thomas Goirand [210218 17:03]:
> On 2/18/21 12:59 PM, Timo Weingärtner wrote:
> > 17.02.21 21:42 Chris Hofstaedtler:
> >> Services that use After=network-online.target are generally broken,
> >> please do not introduce that.
> >
> > Seconded. Just consider a node where one link is down on boot and you would
> > have to wait such a long time until you can examine the problem via ssh.
>
> I still don't understand this part. If network isn't online, how can I
> ssh to my server anyways?

That's exactly the problem description: everyone and every service has a
different idea of "network is online".

For ssh, I'm already happy once an "fe80::x" IPv6 link-local address is
ready. No need to wait for IPv4 DHCP or a default gateway or whatever.

Chris
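A minimal Python sketch of the point that "online" means different things to different services. The function name `online_by_policy` and the three policy names are hypothetical, invented for this illustration; only the standard `ipaddress` module is used, and the address lists are made-up examples.

```python
import ipaddress


def online_by_policy(addresses):
    """Report which (hypothetical) notions of "online" a set of addresses meets."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    # Loopback addresses are always present, so they say nothing about the network.
    usable = [a for a in parsed if not a.is_loopback]
    return {
        # Enough to reach sshd from a directly attached neighbour.
        "link_local_v6": any(a.version == 6 and a.is_link_local for a in usable),
        # Some non-link-local IPv4 address exists (static, DHCP, or routing daemon).
        "ipv4_configured": any(a.version == 4 and not a.is_link_local for a in usable),
        # Some non-link-local IPv6 address exists.
        "ipv6_configured": any(a.version == 6 and not a.is_link_local for a in usable),
    }


# Early in boot: only loopback and an IPv6 link-local address.
early = online_by_policy(["127.0.0.1", "::1", "fe80::1"])
# Later: an address such as the 10.56.17.7 from the report has been added.
later = online_by_policy(["127.0.0.1", "::1", "fe80::1", "10.56.17.7"])
print(early)
print(later)
```

A service that only needs the first criterion could start serving long before one that waits for the second, which is why a single "network is online" target cannot fit every consumer.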
On 2/18/21 12:59 PM, Timo Weingärtner wrote:
> 17.02.21 21:42 Chris Hofstaedtler:
>> Services that use After=network-online.target are generally broken,
>> please do not introduce that.
>
> Seconded. Just consider a node where one link is down on boot and you would
> have to wait such a long time until you can examine the problem via ssh.

I still don't understand this part. If network isn't online, how can I
ssh to my server anyways?

Cheers,

Thomas Goirand (zigo)
Hello,

17.02.21 21:42 Chris Hofstaedtler:
> * Thomas Goirand [210217 20:38]:
> > # cat /etc/systemd/system/ssh.service.d/override.conf
> > [Unit]
> > After=network-online.target auditd.service
> >
> > But IMO, this is very wrong to mandate doing this, and not having ssh
> > connectivity after a reboot, is kind of a grave problem.
> >
> > So, could you hard-wire this in the openssh-server package directly, so
> > Debian users can avoid such an override? Indeed After=network.target
> > doesn't tell you that network is ready. After=network-online.target does,
> > and that's IMO what the ssh daemon should be using.
>
> But if you do this, you'll end up delaying start of sshd for up to
> 120seconds in error cases. And even then, you might not get what you
> want (if you read systemd-networkd-wait-online.service(8)
> carefully).
>
> Services that use After=network-online.target are generally broken,
> please do not introduce that.

Seconded. Just consider a node where one link is down on boot and you would
have to wait such a long time until you can examine the problem via ssh.

> As discussed already, IP_FREEBIND is a thing. The system-wide sysctl
> is a common workaround, especially for "bgp-on-the-host" setups, for
> all sorts of servers/daemons.

That should work; systemd-sysctl.service is ordered before ssh.

Another option is in #965132 (ssh@.socket), but then the fix for #946180 and
#934663 (RuntimeDirectoryPreserve=yes for ssh*.service) is also needed.

Regards,
Timo

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=965132
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=946180
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=934663
Hi Colin,

Thanks a lot for taking the time to provide very valuable information.
It's helping me a lot.

On 2/17/21 12:11 PM, Colin Watson wrote:
> On Wed, Feb 17, 2021 at 11:46:57AM +0100, Thomas Goirand wrote:
>> On 2/17/21 10:14 AM, Colin Watson wrote:
>>> Are you using ListenAddress in sshd_config?
>>
>> Yes, with the same IP as above, in order to make sure ssh isn't
>> listening on a public IP (which would be a security concern for us).
>
> Oh, that's vital information for this bug

I'm surprised ...

> using ListenAddress changes
> the constraints on sshd startup, somewhat as described in README.Debian.

I've read it, and the only part of README.Debian that talks about
something related is the one about the removal of the if-up hook. I
don't see any startup constraint changes described there. What did I
miss?

The launchpad bug entry about the ifupdown hook removal specifically
discusses the fact that, in my case, I'd be affected. Indeed, I am. So I
wonder what's advised...

> In that case I think this is at least arguably a case of needing to keep
> your configuration in sync, isn't it? You've made a change to
> sshd_config, so you need to change other parts of the system to support
> that change.

I'm not convinced that using a custom ListenAddress implies repairing
the boot process, no. By default, sshd_config has this:

#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::

Having these commented out is an invitation to un-comment and use them
with custom values. Basically, what you wrote above is that doing this
breaks sshd. Hopefully, you'll agree that this isn't what one would
expect! :)

If that's really the case and one should do either the systemd ordering
hack I'm doing, or the net.ipv4.ip_nonlocal_bind sysctl tweak, then IMO
it'd be worth either:
1/ removing ListenAddress from the example sshd_config
2/ adding comments just above the directive, explaining what we're
discussing in this bug entry.

>> Maybe setting-up net.ipv4.ip_nonlocal_bind=1 (in sysctl.conf) would fix
>> my issue, no?
>
> That's the system-wide version of IP_FREEBIND. OpenSSH upstream seems
> to have decided not to support IP_FREEBIND
> (https://bugzilla.mindrot.org/show_bug.cgi?id=2512)

If I understand correctly what's in this bug entry, upstream seems to
suggest doing what I did:
1/ ListenAddress
2/ Add an override so that sshd starts After=network-online.target

However, it looks like you're saying one shouldn't do 2/ at all? Could
you explain why? I've been using After=network-online.target in most
daemons I maintain, and now I'm wondering if that's wrong...

Then if I shouldn't do After=network-online.target, do you believe that
the sysctl hack is better?

> but the sysctl should work if you're OK with it being system-wide.

I'd very much prefer if this could be a per-socket thing, but I already
do this because that's how I set up haproxy to bind on a VIP (which can
move from one host to another). Though now, I wonder if there's an
option in HAProxy so it could use IP_FREEBIND on its own. Which would
then lead me to say I would prefer not to use a system-wide thing...

> I'd also recommend at least considering other approaches to implementing
> your security policy that avoid the ordering complexities of
> ListenAddress, since there are other ways to prevent incoming
> connections on public IP addresses. Approaches I can think of include:
>
> * Reject connections to port 22 at the firewall level (perhaps a
>   firewall on the local host).

Considering the interaction with OpenStack Neutron, this could
potentially be hard to maintain: Neutron is doing a lot of iptables
stuff on its own, and I prefer not to touch that at all, either on
compute nodes or network nodes. So the option to use ListenAddress
looked way nicer for my use case.

> * It might be worth experimenting with Match LocalAddress in
>   sshd_config. I haven't played with that much myself, and it's
>   poorly-documented, but I *think* that might allow you to reject any
>   connections that aren't directed to appropriate addresses.

From my experience, it's best to not expose the ssh port at all if
possible, as brute-forcing the port may lead to brokenness.

On 2/17/21 9:42 PM, Chris Hofstaedtler wrote:
> But if you do this, you'll end up delaying start of sshd for up to
> 120seconds in error cases. And even then, you might not get what you
> want (if you read systemd-networkd-wait-online.service(8)
> carefully).

This talks about networkd. Unless things have changed, I don't think
Debian is using this by default (yet?).

> Services that use After=network-online.target are generally broken,
> please do not introduce that.

Can you explain why?

> As discussed already, IP_FREEBIND is a thing.

As per the bug entry Colin pointed out, it looks like it isn't a thing
in sshd at least...

> The system-wide sysctl
> is a common workaround, especially for "bgp-on-the-host" setups, for
> all sorts of servers/daemons.

I
* Thomas Goirand [210217 20:38]:
> # cat /etc/systemd/system/ssh.service.d/override.conf
> [Unit]
> After=network-online.target auditd.service
>
> But IMO, this is very wrong to mandate doing this, and not having ssh
> connectivity after a reboot, is kind of a grave problem.
>
> So, could you hard-wire this in the openssh-server package directly, so Debian
> users can avoid such an override? Indeed After=network.target doesn't tell you
> that network is ready. After=network-online.target does, and that's IMO what
> the ssh daemon should be using.

But if you do this, you'll end up delaying start of sshd for up to
120 seconds in error cases. And even then, you might not get what you
want (if you read systemd-networkd-wait-online.service(8)
carefully).

Services that use After=network-online.target are generally broken,
please do not introduce that.

As discussed already, IP_FREEBIND is a thing. The system-wide sysctl
is a common workaround, especially for "bgp-on-the-host" setups, for
all sorts of servers/daemons.

Chris
On Wed, Feb 17, 2021 at 11:46:57AM +0100, Thomas Goirand wrote:
> On 2/17/21 10:14 AM, Colin Watson wrote:
> > On Wed, Feb 17, 2021 at 09:36:15AM +0100, Thomas Goirand wrote:
> >> This means that, until FRR is fully up and running, with the BGP session
> >> established, the server IP (10.x.x.x/32 bound to the loopback interface)
> >> isn't set yet on the server, and the ssh daemon cannot bind on the IP
> >> (as it's not there yet).
> >
> > Are you using ListenAddress in sshd_config?
>
> Yes, with the same IP as above, in order to make sure ssh isn't
> listening on a public IP (which would be a security concern for us).

Oh, that's vital information for this bug - using ListenAddress changes
the constraints on sshd startup, somewhat as described in README.Debian.

In that case I think this is at least arguably a case of needing to keep
your configuration in sync, isn't it? You've made a change to
sshd_config, so you need to change other parts of the system to support
that change. I'd be happy to try to clarify documentation once we work
out what works.

> > See also
> > https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/, which
> > among other things (in general the tone of that page is that a
> > well-written service should not use After=network-online.target) says:
> >
> > "If you write a server: listen on [::], [::1], 0.0.0.0 and 127.0.0.1
> > only. These pseudo-addresses are unconditionally available."
> >
> > That's what sshd does in its default configuration. If it doesn't work,
> > the systemd documentation suggests that something else is not fulfilling
> > its end of a contract somewhere.
>
> Maybe setting-up net.ipv4.ip_nonlocal_bind=1 (in sysctl.conf) would fix
> my issue, no?

That's the system-wide version of IP_FREEBIND. OpenSSH upstream seems
to have decided not to support IP_FREEBIND
(https://bugzilla.mindrot.org/show_bug.cgi?id=2512), but the sysctl
should work if you're OK with it being system-wide.

I'd also recommend at least considering other approaches to implementing
your security policy that avoid the ordering complexities of
ListenAddress, since there are other ways to prevent incoming
connections on public IP addresses. Approaches I can think of include:

 * Reject connections to port 22 at the firewall level (perhaps a
   firewall on the local host).

 * It might be worth experimenting with Match LocalAddress in
   sshd_config. I haven't played with that much myself, and it's
   poorly-documented, but I *think* that might allow you to reject any
   connections that aren't directed to appropriate addresses.

--
Colin Watson (he/him) [cjwat...@debian.org]
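The difference between a plain bind and an IP_FREEBIND bind can be sketched in a few lines of Python. This is only an illustration of the socket option on Linux, not something sshd does (upstream declined it, per the bug linked above). The helper name `can_bind` is hypothetical, 192.0.2.1 is a documentation address standing in for an address that has not been configured yet, and the numeric fallback 15 is the Linux value of IP_FREEBIND, used in case the Python build does not export the constant.

```python
import socket

# Linux value from <linux/in.h>; fallback in case this Python build does
# not export socket.IP_FREEBIND (an assumption, not verified everywhere).
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def can_bind(address, port=0, freebind=False):
    """Return True if a TCP socket can be bound to (address, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            if freebind:
                # Per-socket counterpart of net.ipv4.ip_nonlocal_bind=1.
                sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
            sock.bind((address, port))
        except OSError:
            # Typically EADDRNOTAVAIL when the address is not configured.
            return False
        return True


if __name__ == "__main__":
    for addr in ("127.0.0.1", "192.0.2.1"):
        print(addr, "plain:", can_bind(addr), "freebind:", can_bind(addr, freebind=True))
```

On a Linux host where the address is absent and net.ipv4.ip_nonlocal_bind is 0, the plain bind is expected to fail and the freebind one to succeed; with the sysctl set to 1, both should succeed.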
On 2/17/21 10:14 AM, Colin Watson wrote:
> Control: severity -1 important
>
> On Wed, Feb 17, 2021 at 09:36:15AM +0100, Thomas Goirand wrote:
>> Package: openssh-server
>> Version: 1:8.4p1-4
>> Severity: grave
>
> No. It may yet need to be sorted out before release, but this is a rare
> situation and I'm not having it being release-critical right now,
> especially because it's not clear that it's openssh-server's problem.

Let's not discuss the severity: let's try to fix the issue instead.

>> On a Sid/Testing system, currently we have in
>> /lib/systemd/system/ssh.service:
>>
>> After=network.target auditd.service
>>
>> While this isn't a problem in most installation, it didn't work under our
>> setup, because we use "bgp-to-the-host" networking. In this setup, we need
>> FRR (the BGP routing daemon which is a fork of Quagga, if you didn't know)
>> to provide network connectivity to the server. Our configuration is
>> something like this:
>>
>> # cat /etc/frr/frr.conf
>> [...]
>> !
>> int lo
>> ip address 10.56.17.7/32
>> !
>> [...]
>>
>> This means that, until FRR is fully up and running, with the BGP session
>> established, the server IP (10.x.x.x/32 bound to the loopback interface)
>> isn't set yet on the server, and the ssh daemon cannot bind on the IP
>> (as it's not there yet).
>
> Are you using ListenAddress in sshd_config?

Yes, with the same IP as above, in order to make sure ssh isn't
listening on a public IP (which would be a security concern for us).

> Normally sshd doesn't
> require network interfaces to be online - it just binds to 0.0.0.0 or
> [::] and that's enough for it to be bound to interfaces as they come up.
>
> If lo has to be up for this to work (which is surprising to me, but
> maybe), then I think there's a decent argument for frr to be part of
> network.target on such systems.

The interface is up before FRR starts. However, the IP on the loopback
interface is only added once FRR has established a working BGP session
with its peers (here, the 2 switches the machine is connected to).

> See also
> https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/, which
> among other things (in general the tone of that page is that a
> well-written service should not use After=network-online.target) says:
>
> "If you write a server: listen on [::], [::1], 0.0.0.0 and 127.0.0.1
> only. These pseudo-addresses are unconditionally available."
>
> That's what sshd does in its default configuration. If it doesn't work,
> the systemd documentation suggests that something else is not fulfilling
> its end of a contract somewhere.

Maybe setting up net.ipv4.ip_nonlocal_bind=1 (in sysctl.conf) would fix
my issue, no?

Your thoughts?

Cheers,

Thomas Goirand (zigo)
Control: severity -1 important

On Wed, Feb 17, 2021 at 09:36:15AM +0100, Thomas Goirand wrote:
> Package: openssh-server
> Version: 1:8.4p1-4
> Severity: grave

No. It may yet need to be sorted out before release, but this is a rare
situation and I'm not having it being release-critical right now,
especially because it's not clear that it's openssh-server's problem.

> On a Sid/Testing system, currently we have in /lib/systemd/system/ssh.service:
>
> After=network.target auditd.service
>
> While this isn't a problem in most installation, it didn't work under our
> setup, because we use "bgp-to-the-host" networking. In this setup, we need
> FRR (the BGP routing daemon which is a fork of Quagga, if you didn't know)
> to provide network connectivity to the server. Our configuration is
> something like this:
>
> # cat /etc/frr/frr.conf
> [...]
> !
> int lo
> ip address 10.56.17.7/32
> !
> [...]
>
> This means that, until FRR is fully up and running, with the BGP session
> established, the server IP (10.x.x.x/32 bound to the loopback interface) isn't
> set yet on the server, and the ssh daemon cannot bind on the IP (as it's not
> there yet).

Are you using ListenAddress in sshd_config? Normally sshd doesn't
require network interfaces to be online - it just binds to 0.0.0.0 or
[::] and that's enough for it to be bound to interfaces as they come up.

If lo has to be up for this to work (which is surprising to me, but
maybe), then I think there's a decent argument for frr to be part of
network.target on such systems.

> Our fix was pretty simple:
>
> # cat /etc/systemd/system/ssh.service.d/override.conf
> [Unit]
> After=network-online.target auditd.service
>
> But IMO, this is very wrong to mandate doing this, and not having ssh
> connectivity after a reboot, is kind of a grave problem.
>
> So, could you hard-wire this in the openssh-server package directly, so Debian
> users can avoid such an override? Indeed After=network.target doesn't tell you
> that network is ready. After=network-online.target does, and that's IMO what
> the ssh daemon should be using.

I don't agree that network-online.target is appropriate in general, from
its documentation:

    network-online.target
        Units that strictly require a configured network connection
        should pull in network-online.target (via a Wants= type
        dependency) and order themselves after it. This target unit is
        intended to pull in a service that delays further execution
        until the network is sufficiently set up. What precisely this
        requires is left to the implementation of the network managing
        service.

        Note the distinction between this unit and network.target. This
        unit is an active unit (i.e. pulled in by the consumer rather
        than the provider of this functionality) and pulls in a service
        which possibly adds substantial delays to further execution. In
        contrast, network.target is a passive unit (i.e. pulled in by
        the provider of the functionality, rather than the consumer)
        that usually does not delay execution much. Usually,
        network.target is part of the boot of most systems, while
        network-online.target is not, except when at least one unit
        requires it. Also see Running Services After the Network is
        up[1] for more information.

sshd does not in general require a configured network connection just to
start up, and making it do so would be a pretty significant change to
the unit dependency graph on most systems; it would make "is not [part
of the boot process]" above typically untrue, for one thing.

See also
https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/, which
among other things (in general the tone of that page is that a
well-written service should not use After=network-online.target) says:

  "If you write a server: listen on [::], [::1], 0.0.0.0 and 127.0.0.1
  only. These pseudo-addresses are unconditionally available."

That's what sshd does in its default configuration. If it doesn't work,
the systemd documentation suggests that something else is not fulfilling
its end of a contract somewhere.

--
Colin Watson (he/him) [cjwat...@debian.org]
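Whether a given sshd_config departs from the wildcard default can be checked mechanically. The following is a rough Python sketch, not an sshd_config parser: the function name `specific_listen_addresses` is hypothetical, and it only handles the whitespace-separated `ListenAddress host`, `host:port` and `[host]:port` forms, ignoring everything after a `#`.

```python
# Addresses that are always bindable, whatever the state of the network.
WILDCARDS = {"0.0.0.0", "::"}


def specific_listen_addresses(config_text):
    """Return the ListenAddress hosts that are not wildcard addresses."""
    found = []
    for line in config_text.splitlines():
        # Drop comments, then split into keyword and arguments.
        fields = line.split("#", 1)[0].split()
        # sshd_config keywords are case-insensitive.
        if len(fields) < 2 or fields[0].lower() != "listenaddress":
            continue
        host = fields[1]
        if host.startswith("["):
            # [address]:port form
            host = host[1:].split("]", 1)[0]
        elif host.count(":") == 1:
            # address:port form (IPv4 address or hostname)
            host = host.split(":", 1)[0]
        if host not in WILDCARDS:
            found.append(host)
    return found


sample = """\
#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::
ListenAddress 10.56.17.7
"""
print(specific_listen_addresses(sample))
```

An empty result means sshd only binds to pseudo-addresses and does not depend on any particular address being configured; a non-empty one means the address must exist, or non-local binds must be allowed, before sshd starts.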
Package: openssh-server
Version: 1:8.4p1-4
Severity: grave

Hi there,

On a Sid/Testing system, currently we have in /lib/systemd/system/ssh.service:

After=network.target auditd.service

While this isn't a problem in most installations, it didn't work under our
setup, because we use "bgp-to-the-host" networking. In this setup, we need
FRR (the BGP routing daemon which is a fork of Quagga, if you didn't know)
to provide network connectivity to the server. Our configuration is
something like this:

# cat /etc/frr/frr.conf
[...]
!
int lo
ip address 10.56.17.7/32
!
[...]

This means that, until FRR is fully up and running, with the BGP session
established, the server IP (10.x.x.x/32 bound to the loopback interface)
isn't set yet on the server, and the ssh daemon cannot bind on the IP (as
it's not there yet).

Our fix was pretty simple:

# cat /etc/systemd/system/ssh.service.d/override.conf
[Unit]
After=network-online.target auditd.service

But IMO, it is very wrong to mandate doing this, and not having ssh
connectivity after a reboot is kind of a grave problem.

So, could you hard-wire this in the openssh-server package directly, so
Debian users can avoid such an override? Indeed, After=network.target
doesn't tell you that network is ready. After=network-online.target does,
and that's IMO what the ssh daemon should be using.

Thanks for maintaining openssh in Debian,
Cheers,

Thomas Goirand (zigo)