Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Shreenidhi Shedi
On Wed, Mar 6, 2024 at 8:57 PM Lennart Poettering 
wrote:

> On Mi, 06.03.24 14:44, Shreenidhi Shedi (shreenidhi.sh...@broadcom.com)
> wrote:
>
> > > Lennart Poettering, Berlin
> >
> > Thanks a lot for the responses, Andrei, Poettering.
> > We took it from blfs in PhotonOS.
> >
> https://www.linuxfromscratch.org/blfs/view/11.3-systemd/introduction/systemd-units.html
> > We need to do some more work on these unit files.
>
> But that tarball actually contains a correct sshd -i line that
> includes the "-" that causes the return value to be ignored, as it
> should.  Hence if your distro didn't do this even though it imported
> this from LFS, then it's your distro that broke it...
>
> Lennart
>
> --
> Lennart Poettering, Berlin
>

I'm not sure which file you are referring to.
Please check this one: blfs-systemd-units-20220720/blfs/units/sshd@.service

-- 
Shedi


Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Lennart Poettering
On Mi, 06.03.24 13:06, Arseny Maslennikov (a...@cs.msu.ru) wrote:

> > The question of course is how many SSH instances you serve every
> > minute. My educated guess is that most SSH installations have a use
> > pattern that's more on the "sporadic use" side of things. There are
> > certainly heavy use scenarios though (e.g. let's say you are github
> > and serve git via sshd).
>
> A more relevant source of problems here IMO is not the "fair use"
> pattern, but the misuse pattern.
>
> The per-connection template unit mode, unfortunately, is really unfit
> for any machine with ssh daemons exposed to the IPv4 internet: within
> several months of operation such a machine starts getting at least 3-5
> unauthed connections a second from hierarchically and geographically
> distributed sources. Those clients are probing for vulnerabilities and
> dictionary passwords, they are doomed to never be authenticated on a
> reasonable system, so this is junk traffic at the end of the day.
>
> If sshd is deployed the classic way (№1 or №3), each junk connection is
> accepted and possibly rate-limited by the sshd program itself, and the
> pid1-manager's state is unaffected. Units are only created for
> authorized connections via PAM hooks in the "session stack";
> same goes for other accounting entities and resources.
> If sshd is deployed the per-connection unit way (№2), each junk
> connection will fiddle with system manager state, IOW make the machine
> create and immediately destroy a unit: fork-exec, accounting and
> sandboxing setup costs, etc. If the instance units for junk connections
> are not automatically collected (e.g. via the
> `CollectMode=inactive-or-failed` property), this leads to unlimited
> memory use for pid1 on an unattended machine (really bad), powered by
> external actors.

Well, whatever rate-limiting sshd does, systemd can do too,
afaics. I.e. the sshd@.service definitions we suggest and that the
big distros use all get the ExecStart=- thing right, so that an
unclean exit of sshd does not result in a pinned unit. Moreover, there are
PollLimitIntervalSec=/PollLimitBurst=, MaxConnectionsPerSource= and
MaxConnections=, which ensure that any attempt to flood the socket
is reasonably contained, and the system recovers from it.

Current versions of systemd enable these settings by default, hence I
think we actually should be fine by default, even if you do not tune
these .socket parameters.
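As a concrete illustration of those defaults being tuned, the socket options mentioned above could be adjusted with a drop-in like the following. This is a hedged sketch: the drop-in path and all the numeric values are illustrative assumptions, not recommendations, and the PollLimit* options require a recent systemd (they were added in v255).

```ini
# /etc/systemd/system/sshd.socket.d/flood.conf — illustrative values only
[Socket]
# Stop polling the listening socket if more than 150 connections arrive
# within 2 s; systemd resumes accepting automatically afterwards.
PollLimitIntervalSec=2s
PollLimitBurst=150
# With Accept=yes, cap the number of concurrently running instances,
# overall and per source IP address.
MaxConnections=64
MaxConnectionsPerSource=10
```

After editing, `systemctl daemon-reload` and restarting sshd.socket would apply the limits.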

> > I'd suggest that distros default to mode
> > 2, and alternatively support mode 3 if possible (and mode 1 if they
> > don't want to patch in the support for mode 3)
>
> So mode 2 only really makes sense for deployments which are only ever
> accessible from intranets with little junk traffic.

What precisely do you think is missing in systemd that
PollLimitIntervalSec=/PollLimitBurst=, MaxConnectionsPerSource=,
MaxConnections= can't cover?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Lennart Poettering
On Mi, 06.03.24 14:44, Shreenidhi Shedi (shreenidhi.sh...@broadcom.com) wrote:

> > Lennart Poettering, Berlin
>
> Thanks a lot for the responses, Andrei, Poettering.
> We took it from blfs in PhotonOS.
> https://www.linuxfromscratch.org/blfs/view/11.3-systemd/introduction/systemd-units.html
> We need to do some more work on these unit files.

But that tarball actually contains a correct sshd -i line that
includes the "-" that causes the return value to be ignored, as it
should.  Hence if your distro didn't do this even though it imported
this from LFS, then it's your distro that broke it...
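For reference, the directive in question looks roughly like this (a sketch, with the sshd path assumed; not a verbatim copy of the BLFS unit):

```ini
# sshd@.service fragment — per-connection template (path assumed)
[Service]
# The "-" prefix tells systemd to ignore the exit status, so an unclean
# client disconnect does not leave a failed unit instance behind.
ExecStart=-/usr/sbin/sshd -i
# Connect sshd's stdin/stdout to the accepted connection socket.
StandardInput=socket
```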

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Mantas Mikulėnas
On Wed, Mar 6, 2024 at 12:21 PM Arseny Maslennikov  wrote:

> So mode 2 only really makes sense for deployments which are only ever
> accessible from intranets with little junk traffic.
>

Which is the case for "deployments" that are *not servers* in the first
place. Many distros are oriented towards personal computers, which are
usually behind a firewall so junk traffic is not a concern, but which you
might want to SSH/VNC/RDP into at unexpected moments.

For example, when I first started using systemd in ~2011, my laptop still
had a 5400 rpm HDD, and its boot time mattered far more than it does for
"deployments", so systemd's promise of on-demand startup of everything (to
reduce the boot I/O contention while still keeping the actual service
available) was particularly attractive.

(Of course, these days most systems have SSDs while even the baseline
systemd startup process runs twice as many Assorted Things as my full
desktop environment did in the past, so maybe the issue is no longer
relevant.)

-- 
Mantas Mikulėnas


Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Arseny Maslennikov
On Wed, Mar 06, 2024 at 09:31:52AM +0100, Lennart Poettering wrote:
> On Mi, 06.03.24 11:11, Shreenidhi Shedi (shreenidhi.sh...@broadcom.com) wrote:
> 
> > Hi All,
> >
> > What is the rationale behind using sshd.socket other than not keeping sshd
> > daemon running always and reducing memory consumption?
> 
> <...>
> 1. Traditional mode (i.e. no socket activation)
>+ connections are served immediately, minimal latency during
>  connection setup
>- takes up resources all the time, even if not used
> 
> 2. Per-connection socket activation mode
>+ takes up almost no resources when not used
>+ zero state shared between connections
>+ robust updates: socket stays connectible throughout updates
>+ robust towards failures in sshd: the bad instance dies, but sshd
>  stays connectible in general
>+ resource accounting/enforcement separate for each connection
>- slightly bigger latency for each connection coming in
>- slightly more resources being used if many connections are
>  established in parallel, since each will get a whole sshd
>  instance of its own.
> 
> 3. Single-instance socket activation mode
>+ takes up almost no resources when not used
>+ robust updates: socket stays connectible throughout updates
> 
> > With sshd.socket, systemd does a fork/exec on each connection, which is
> > expensive; with the sshd.service approach the server just accepts the
> > connection from the client, which is less expensive and faster than
> > sshd.socket.
> 
> The question of course is how many SSH instances you serve every
> minute. My educated guess is that most SSH installations have a use
> pattern that's more on the "sporadic use" side of things. There are
> certainly heavy use scenarios though (e.g. let's say you are github
> and serve git via sshd).

A more relevant source of problems here IMO is not the "fair use"
pattern, but the misuse pattern.

The per-connection template unit mode, unfortunately, is really unfit
for any machine with ssh daemons exposed to the IPv4 internet: within
several months of operation such a machine starts getting at least 3-5
unauthed connections a second from hierarchically and geographically
distributed sources. Those clients are probing for vulnerabilities and
dictionary passwords, they are doomed to never be authenticated on a
reasonable system, so this is junk traffic at the end of the day.

If sshd is deployed the classic way (№1 or №3), each junk connection is
accepted and possibly rate-limited by the sshd program itself, and the
pid1-manager's state is unaffected. Units are only created for
authorized connections via PAM hooks in the "session stack";
same goes for other accounting entities and resources.
If sshd is deployed the per-connection unit way (№2), each junk connection will
fiddle with system manager state, IOW make the machine create and
immediately destroy a unit: fork-exec, accounting and sandboxing setup
costs, etc. If the instance units for junk connections are not
automatically collected (e.g. via the `CollectMode=inactive-or-failed`
property), this leads to unlimited memory use for pid1 on an unattended
machine (really bad), powered by external actors.
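The garbage-collection property mentioned above can be set via a drop-in on the per-connection template unit. A minimal sketch, assuming the template is named sshd@.service:

```ini
# /etc/systemd/system/sshd@.service.d/collect.conf
[Unit]
# Let pid1 garbage-collect instances that end up inactive *or* failed,
# so junk connections cannot accumulate failed units in manager memory.
CollectMode=inactive-or-failed
```

Without this, only inactive (cleanly exited) instances are collected, and each failed instance remains pinned in pid1 until `systemctl reset-failed` is run.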

> I'd suggest that distros default to mode
> 2, and alternatively support mode 3 if possible (and mode 1 if they
> don't want to patch in the support for mode 3)

So mode 2 only really makes sense for deployments which are only ever
accessible from intranets with little junk traffic.




Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Shreenidhi Shedi
On Wed, Mar 6, 2024 at 2:01 PM Lennart Poettering 
wrote:

> On Mi, 06.03.24 11:11, Shreenidhi Shedi (shreenidhi.sh...@broadcom.com)
> wrote:
>
> > Hi All,
> >
> > What is the rationale behind using sshd.socket other than not keeping
> sshd
> > daemon running always and reducing memory consumption?
>
> Note that there are two distinct modes to running sshd via socket
> activation: the per-connection mode (using sshd's native inetd mode),
> where there's a separate instance forked off by systemd for each
> connection, and a mode where systemd just binds the socket, but
> it's served by a single instance. The latter is only supported via an
> out-of-tree patch afaik though, which at least debian/ubuntu ship:
>
>
> https://salsa.debian.org/ssh-team/openssh/-/commit/7fa10262be3c7d9fd2fca9c9710ac4ef3f788b08
>
> Unless you have a gazillion connections coming in every second I'd
> probably just use the per-connection inetd mode, simply because it's
> supported upstream. It would be great of course if openssh would just add
> support for the single-instance mode upstream too, but as I
> understand it, ssh upstream is a bit special, and doesn't want to play
> ball on this.
>
> To summarize the benefits of each mode:
>
> 1. Traditional mode (i.e. no socket activation)
>+ connections are served immediately, minimal latency during
>  connection setup
>- takes up resources all the time, even if not used
>
> 2. Per-connection socket activation mode
>+ takes up almost no resources when not used
>+ zero state shared between connections
>+ robust updates: socket stays connectible throughout updates
>+ robust towards failures in sshd: the bad instance dies, but sshd
>  stays connectible in general
>+ resource accounting/enforcement separate for each connection
>- slightly bigger latency for each connection coming in
>- slightly more resources being used if many connections are
>  established in parallel, since each will get a whole sshd
>  instance of its own.
>
> 3. Single-instance socket activation mode
>+ takes up almost no resources when not used
>+ robust updates: socket stays connectible throughout updates
>
> > With sshd.socket, systemd does a fork/exec on each connection, which is
> > expensive; with the sshd.service approach the server just accepts the
> > connection from the client, which is less expensive and faster than
> > sshd.socket.
>
> The question of course is how many SSH instances you serve every
> minute. My educated guess is that most SSH installations have a use
> pattern that's more on the "sporadic use" side of things. There are
> certainly heavy use scenarios though (e.g. let's say you are github
> and serve git via sshd). I'd suggest that distros default to mode
> 2, and alternatively support mode 3 if possible (and mode 1 if they
> don't want to patch in the support for mode 3)
>
> > And if there are issues in unit files like in
> > https://github.com/systemd/systemd/issues/29897 it will make the system
> > unusable.
>
> Did any distro ship a unit file like that? That was clearly a buggy
> (local?) unit file; I am not aware of any big distro shipping such a
> unit file.
>
> Lennart
>
> --
> Lennart Poettering, Berlin
>

Thanks a lot for the responses, Andrei, Poettering.
We took it from blfs in PhotonOS.
https://www.linuxfromscratch.org/blfs/view/11.3-systemd/introduction/systemd-units.html
We need to do some more work on these unit files.

--
Shedi


Re: [systemd-devel] Query on sshd.socket sshd.service approaches

2024-03-06 Thread Lennart Poettering
On Mi, 06.03.24 11:11, Shreenidhi Shedi (shreenidhi.sh...@broadcom.com) wrote:

> Hi All,
>
> What is the rationale behind using sshd.socket other than not keeping sshd
> daemon running always and reducing memory consumption?

Note that there are two distinct modes to running sshd via socket
activation: the per-connection mode (using sshd's native inetd mode),
where there's a separate instance forked off by systemd for each
connection, and a mode where systemd just binds the socket, but
it's served by a single instance. The latter is only supported via an
out-of-tree patch afaik though, which at least debian/ubuntu ship:

https://salsa.debian.org/ssh-team/openssh/-/commit/7fa10262be3c7d9fd2fca9c9710ac4ef3f788b08

Unless you have a gazillion connections coming in every second I'd
probably just use the per-connection inetd mode, simply because it's
supported upstream. It would be great of course if openssh would just add
support for the single-instance mode upstream too, but as I
understand it, ssh upstream is a bit special, and doesn't want to play
ball on this.

To summarize the benefits of each mode:

1. Traditional mode (i.e. no socket activation)
   + connections are served immediately, minimal latency during
 connection setup
   - takes up resources all the time, even if not used

2. Per-connection socket activation mode
   + takes up almost no resources when not used
   + zero state shared between connections
   + robust updates: socket stays connectible throughout updates
   + robust towards failures in sshd: the bad instance dies, but sshd
 stays connectible in general
   + resource accounting/enforcement separate for each connection
   - slightly bigger latency for each connection coming in
   - slightly more resources being used if many connections are
 established in parallel, since each will get a whole sshd
 instance of its own.

3. Single-instance socket activation mode
   + takes up almost no resources when not used
   + robust updates: socket stays connectible throughout updates
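Mode 2 above boils down to a socket unit in inetd-style accept mode, paired with a per-connection service template. A hedged sketch (unit names and port are assumptions, not a verbatim distro unit):

```ini
# sshd.socket — per-connection (inetd-style) activation, i.e. mode 2
[Socket]
ListenStream=22
# Accept=yes: systemd accepts each incoming connection itself and
# spawns one sshd@<instance>.service (running "sshd -i") to serve it.
Accept=yes

[Install]
WantedBy=sockets.target
```

Mode 3 would instead keep the default Accept=no, so the patched sshd receives the listening socket itself via the sd_listen_fds() protocol and serves all connections from a single instance.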

> With sshd.socket, systemd does a fork/exec on each connection, which is
> expensive; with the sshd.service approach the server just accepts the
> connection from the client, which is less expensive and faster than
> sshd.socket.

The question of course is how many SSH instances you serve every
minute. My educated guess is that most SSH installations have a use
pattern that's more on the "sporadic use" side of things. There are
certainly heavy use scenarios though (e.g. let's say you are github
and serve git via sshd). I'd suggest that distros default to mode
2, and alternatively support mode 3 if possible (and mode 1 if they
don't want to patch in the support for mode 3)

> And if there are issues in unit files like in
> https://github.com/systemd/systemd/issues/29897 it will make the system
> unusable.

Did any distro ship a unit file like that? That was clearly a buggy
(local?) unit file; I am not aware of any big distro shipping such a
unit file.

Lennart

--
Lennart Poettering, Berlin