Re: [systemd-devel] protecting sshd against forkbombs, excessive memory usage by other processes
Hi,

I would suggest trying the following:

 * Set a MemoryLow allocation
 * Enable the CPU cgroup controller

For the first, it makes sense to set MemoryLow= on system.slice and
also to set DefaultMemoryLow= or MemoryLow= on sshd.service. Otherwise
things might be somewhat unexpected for now, see
https://github.com/systemd/systemd/pull/16559

I guess one could also do something similar for user-0.slice.

The second part ensures CPU is allocated to users fairly, meaning that
the user-X.slice units compete against each other, rather than the
individual processes. This will effectively give the root login and
SSH service a higher CPU priority in relation to the fork bomb. You
can do this by setting CPUWeight=100 on user-.slice. It'll also result
in system.slice and user.slice competing for CPU on an equal footing.

Benjamin

On Wed, 2020-08-12 at 12:57 +0900, Tomasz Chmielewski wrote:
> I've made a mistake and have executed a forkbomb-like task. Almost
> immediately, the system became unresponsive, ssh sessions froze or
> were very slow to output even single characters; some ssh sessions
> timed out and were disconnected.
>
> It was not possible to connect a new ssh session to interrupt the
> runaway task - new connection attempts were simply timing out.
>
> SSH is the only way to access the server. Eventually, after some 30
> mins, the system "unfroze" - but - I wonder - can systemd help
> sysadmins get out of such situations?
>
> I realize it's a bit tricky, as there are two cases here:
>
> 1) misbehaving program is a child process of sshd (i.e. user logged
> in and executed a forkbomb)
>
> 2) misbehaving program is not a child process of sshd (i.e. some
> system service is using a lot of resources)
>
>
> Given that - how can we tune systemd so that system admin is almost
> always able to log in via a new SSH connection, in both cases
> outlined above? My use case assumes user error rather than
> malicious system resource usage.
>
> Tomasz Chmielewski

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
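
For illustration, the suggestions in the reply above could be laid out as drop-in files roughly like the following. This is only a sketch: the drop-in file names and the memory amounts are assumptions chosen for the example, not values from the thread. MemoryLow= and CPUWeight= need the unified cgroup hierarchy (cgroup v2), and on some distributions the SSH unit is called ssh.service rather than sshd.service.

```ini
# /etc/systemd/system/system.slice.d/50-memory-low.conf  (hypothetical file name)
[Slice]
# Assumed amount; size it to what the system services really need.
# DefaultMemoryLow= is the alternative mentioned in the reply.
MemoryLow=512M

# /etc/systemd/system/sshd.service.d/50-memory-low.conf  (hypothetical file name)
[Service]
# Assumed amount of memory to protect for the SSH daemon itself.
MemoryLow=64M

# /etc/systemd/system/user-.slice.d/50-cpu-weight.conf  (hypothetical file name)
[Slice]
# Value taken from the reply; setting it enables the CPU controller
# so that the user-X.slice units compete with each other for CPU.
CPUWeight=100
```

Each commented path marks a separate file. After creating them, `systemctl daemon-reload` makes systemd pick up the changes.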
Re: [systemd-devel] protecting sshd against forkbombs, excessive memory usage by other processes
On 2020-08-12 22:07, Mantas Mikulėnas wrote:
> On Wed, Aug 12, 2020 at 7:03 AM Tomasz Chmielewski wrote:
>> I've made a mistake and have executed a forkbomb-like task. Almost
>> immediately, the system became unresponsive, ssh sessions froze or
>> were very slow to output even single characters; some ssh sessions
>> timed out and were disconnected.
>>
>> It was not possible to connect a new ssh session to interrupt the
>> runaway task - new connection attempts were simply timing out.
>>
>> SSH is the only way to access the server. Eventually, after some 30
>> mins, the system "unfroze" - but - I wonder - can systemd help
>> sysadmins get out of such situations?
>>
>> I realize it's a bit tricky, as there are two cases here:
>>
>> 1) misbehaving program is a child process of sshd (i.e. user logged
>> in and executed a forkbomb)
>
> I don't think "child process of sshd" is the useful part, as
> logged-in user processes are actually moved to a separate cgroup for
> the session - so yes, they're sshd children, but they actually have
> resource limits fully separate from the main sshd daemon process.
> Which means that with systemd, each user already has their own limit
> on the number of processes/tasks (the default in user-.slice.d is
> TasksMax=33% of...something, but it could be lowered to e.g. 10% or
> to 4096) without affecting the service itself.
>
> So I'm sure that sshd.service and user-0.slice could be tweaked
> somehow to give root a higher priority at cgroup level, but that
> depends on what your system actually ran out of...

It ran out of memory.

Tomasz
Re: [systemd-devel] protecting sshd against forkbombs, excessive memory usage by other processes
On Wed, Aug 12, 2020 at 7:03 AM Tomasz Chmielewski wrote:
> I've made a mistake and have executed a forkbomb-like task. Almost
> immediately, the system became unresponsive, ssh sessions froze or
> were very slow to output even single characters; some ssh sessions
> timed out and were disconnected.
>
> It was not possible to connect a new ssh session to interrupt the
> runaway task - new connection attempts were simply timing out.
>
> SSH is the only way to access the server. Eventually, after some 30
> mins, the system "unfroze" - but - I wonder - can systemd help
> sysadmins get out of such situations?
>
> I realize it's a bit tricky, as there are two cases here:
>
> 1) misbehaving program is a child process of sshd (i.e. user logged
> in and executed a forkbomb)

I don't think "child process of sshd" is the useful part, as logged-in
user processes are actually moved to a separate cgroup for the
session - so yes, they're sshd children, but they actually have
resource limits fully separate from the main sshd daemon process.
Which means that with systemd, each user already has their own limit
on the number of processes/tasks (the default in user-.slice.d is
TasksMax=33% of...something, but it could be lowered to e.g. 10% or to
4096) without affecting the service itself.

So I'm sure that sshd.service and user-0.slice could be tweaked
somehow to give root a higher priority at cgroup level, but that
depends on what your system actually ran out of...

-- 
Mantas Mikulėnas
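
As a sketch of the per-user task limit mentioned in the reply above, a drop-in for the user-.slice template might look like the following. The file name is an assumption made for the example; the two values are the ones floated in the reply, and which of them (if either) fits a given machine is left open.

```ini
# /etc/systemd/system/user-.slice.d/50-tasks-max.conf  (hypothetical file name)
[Slice]
# Applies to every user-X.slice; accepts a percentage or an absolute number.
TasksMax=10%
# TasksMax=4096
```

This caps only the number of tasks per user. It does not limit memory, so it helps with a fork bomb that exhausts tasks rather than with one that exhausts RAM.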
[systemd-devel] protecting sshd against forkbombs, excessive memory usage by other processes
I've made a mistake and have executed a forkbomb-like task. Almost
immediately, the system became unresponsive, ssh sessions froze or
were very slow to output even single characters; some ssh sessions
timed out and were disconnected.

It was not possible to connect a new ssh session to interrupt the
runaway task - new connection attempts were simply timing out.

SSH is the only way to access the server. Eventually, after some 30
mins, the system "unfroze" - but - I wonder - can systemd help
sysadmins get out of such situations?

I realize it's a bit tricky, as there are two cases here:

1) misbehaving program is a child process of sshd (i.e. user logged in
and executed a forkbomb)

2) misbehaving program is not a child process of sshd (i.e. some
system service is using a lot of resources)


Given that - how can we tune systemd so that system admin is almost
always able to log in via a new SSH connection, in both cases outlined
above? My use case assumes user error rather than malicious system
resource usage.


Tomasz Chmielewski