Re: [systemd-devel] Automatically moving forked processes in a different cgroup based on children's UID

2022-01-09 Thread Mantas Mikulėnas
On Fri, Jan 7, 2022 at 4:11 PM Michal Koutný  wrote:

> A more viable way, it seems to me, is to modify apache2-mpm-itk to put
> its children into their respective cgroups.
>

I'm assuming that the resource usage primarily comes from something like
webapps running via mod_php, rather than Apache itself, in which case a
better approach would be to move the webapps out of apache entirely. At
work we also looked into mpm-itk for a student "shared hosting" server,
but instead ended up with a setup where each user automatically gets their
own PHP-FPM pool (with each vhost configured to talk to their own PHP-FPM
socket), so resource limits could be set either at PHP-FPM level, or if
needed multiple php-fpm@.service instances could be run with their own
cgroups.
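
To make the per-user pool idea concrete, here is a minimal sketch that renders one PHP-FPM pool stanza per user. The socket directory, the www-data web server account, and the pm settings are assumptions for illustration, not the configuration used on the server described above.

```python
def render_pool(user: str, socket_dir: str = "/run/php-fpm",
                web_user: str = "www-data", max_children: int = 5) -> str:
    """Return a PHP-FPM pool stanza whose workers run as `user`."""
    lines = [
        f"[{user}]",
        f"user = {user}",
        f"group = {user}",
        # Each vhost would point at its own socket (path layout is assumed).
        f"listen = {socket_dir}/{user}.sock",
        # Let the web server account connect to the socket.
        f"listen.owner = {web_user}",
        f"listen.group = {web_user}",
        "pm = ondemand",
        f"pm.max_children = {max_children}",
    ]
    return "\n".join(lines) + "\n"
```

Calling render_pool("alice") yields a stanza that could be written to a pool file; running a separate php-fpm@.service instance per user instead would give each user a cgroup of their own.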

-- 
Mantas Mikulėnas


Re: [systemd-devel] Automatically moving forked processes in a different cgroup based on children's UID

2022-01-07 Thread Michal Koutný
Hello Wadih.

On Sat, Jan 01, 2022 at 04:41:12PM -0500, Wadih  wrote:
> Is there a way to automatically classify child processes of a process
> in a different cgroup than the spawning process with systemd based on
> the children's new UID? I know apache2-mpm-itk calls setuid() on its
> children, so we would have to somehow hook on that. 

You can summon the whole PAM machinery and include pam_systemd in the
stack which would create a new session scope for the user. (Or do it
yourself from the process via DBus call
org.freedesktop.systemd1.Manager.StartTransientUnit() that gives you
more freedom for that). (Note that to keep the service lifecycle
tracking under the name of apache2.service, the forked children should
not reparent under PID 1 so that service parent can properly track
them.)
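
For illustration, the StartTransientUnit() call can also be issued with busctl. The sketch below only builds the argument vector for a transient scope; the unit naming scheme (vhost-UID-PID.scope) and the choice of user-UID.slice as the parent slice are assumptions, not something this thread prescribes.

```python
from typing import List


def start_scope_argv(pid: int, uid: int) -> List[str]:
    """Build a busctl command that asks systemd to wrap `pid` in a scope."""
    unit = f"vhost-{uid}-{pid}.scope"  # hypothetical naming scheme
    return [
        "busctl", "call",
        "org.freedesktop.systemd1",
        "/org/freedesktop/systemd1",
        "org.freedesktop.systemd1.Manager",
        "StartTransientUnit",
        "ssa(sv)a(sa(sv))",             # name, mode, properties, aux units
        unit, "fail",
        "2",                            # two properties follow
        "PIDs", "au", "1", str(pid),    # the process to adopt
        "Slice", "s", f"user-{uid}.slice",
        "0",                            # no auxiliary units
    ]
```

The resulting list can be passed to subprocess.run() by a privileged process. A process adopted this way leaves the apache2.service cgroup, which is the lifecycle caveat mentioned above.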

> I'd like to have the child processes that apache2-mpm-itk spawns go
> under their respective user, e.g.
> [...]
> system.slice/apache2.service/vhosts/%UID%

That's the alternative: maintaining the (relative) (sub)hierarchy
yourself (and it doesn't require special treatment wrt the
apache2.service lifecycle).
Note that for this cgroup tree you'd need to set the Delegate=
directive on apache2.service, though.
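
A minimal sketch of maintaining such a sub-hierarchy, assuming cgroup v2 is mounted at /sys/fs/cgroup and apache2.service has Delegate= enabled. The classify() helper and the vhosts/UID layout are illustrative only. Keep in mind that under cgroup v2 a cgroup that hands controllers to its children may not hold processes itself, so the Apache parent would have to live in a leaf cgroup of its own as well.

```python
import os


def classify(service_cgroup: str, uid: int, pid: int) -> str:
    """Create <service_cgroup>/vhosts/<uid> if needed and move `pid` there."""
    leaf = os.path.join(service_cgroup, "vhosts", str(uid))
    os.makedirs(leaf, exist_ok=True)
    # Writing a PID to cgroup.procs migrates that process into the cgroup.
    with open(os.path.join(leaf, "cgroup.procs"), "w") as f:
        f.write(str(pid))
    return leaf
```

On a real system service_cgroup would be something like /sys/fs/cgroup/system.slice/apache2.service, and the caller needs write access to the delegated subtree.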

> I've been able to do this with cgrulesengd and cgconfigparser for 3
> years, it's been rock solid.

I'm glad it work(s|ed) for you. The asynchronous classification via
cgrulesengd is racy and may not always be reliable (wrt resource
control). It's much better to do fork-classify-exec or
clone3(CLONE_INTO_CGROUP)-exec synchronously in the migrated task.
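
The fork-classify-exec pattern can be sketched as follows. This is a simplified illustration, not how apache2-mpm-itk works internally: spawn_in_cgroup() is a hypothetical helper, and real code would also drop supplementary groups and the GID before calling setuid().

```python
import os
import sys


def spawn_in_cgroup(cgroup_dir, argv, uid=None):
    """Fork, move the child into cgroup_dir, optionally setuid, then exec."""
    pid = os.fork()
    if pid == 0:
        try:
            # Classify first: the child joins the target cgroup before it
            # runs any of the workload, so nothing is charged elsewhere.
            with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as f:
                f.write(str(os.getpid()))
            if uid is not None:
                os.setuid(uid)
            os.execvp(argv[0], argv)
        finally:
            # Reached only if classification, setuid, or exec failed.
            os._exit(127)
    return pid
```

Because the child migrates itself before exec, there is no window in which an external classifier has to notice the process and catch up with it.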

> Would the only solution be for me to create a daemon which monitors for
> setuid() calls of the parent apache process, and classify the children
> as per the new setuid user? 

I'd discourage you from going down the cgrulesengd path again. (And
cgroupify too :-p)

> Or perhaps, I think root parent processes spawning specific UID
> children is a common security practise, perhaps there should be
> something in systemd out of the box for classifying the children under
> their respective cgroups?

Yes, on the low level it's the StartTransientUnit() DBus call or its
specialized extensions for logind or machinectl.

> If my only solution is to create a daemon which monitors for setuid()
> I'll do it, although I've never done it before, not sure where I'd have
> to start. Any guidance would be great! 

A more viable way, it seems to me, is to modify apache2-mpm-itk to put
its children into their respective cgroups.

HTH,
Michal


Re: [systemd-devel] Automatically moving forked processes in a different cgroup based on children's UID

2022-01-03 Thread Benjamin Berg
Hi,

systemd will not help you with managing the cgroup sub-hierarchy
underneath the daemon. I suppose the most generic solution would be
something like cgrulesengd for cgroup v2. No idea if something like
that exists.

I assume you have had a look at
  https://systemd.io/CGROUP_DELEGATION/#three-scenarios
and other parts of that document. And that you are choosing option #2
for good reasons.

Managing the cgroup hierarchy is quite simple in principle (mkdir and
then a write to cgroup.procs), or, even better, use
CLONE_INTO_CGROUP when creating the processes. It is not that hard to
write a small daemon that does this.
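
As a rough sketch of what one pass of such a daemon might look like: read the PIDs in a source cgroup, look up each process's UID in /proc, and move non-root processes into a per-UID cgroup. The function names and the directory layout are assumptions. Classification after the fact is inherently racy, so treat this as a fallback rather than a substitute for classifying at fork time.

```python
import os


def real_uid(status_text: str) -> int:
    """Extract the real UID from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Uid:"):
            return int(line.split()[1])
    raise ValueError("no Uid: line in status text")


def classify_once(source_cgroup, vhosts_cgroup, proc_root="/proc"):
    """Move every non-root process from source_cgroup to vhosts/<uid>."""
    moved = 0
    with open(os.path.join(source_cgroup, "cgroup.procs")) as f:
        pids = [int(p) for p in f.read().split()]
    for pid in pids:
        try:
            with open(os.path.join(proc_root, str(pid), "status")) as f:
                uid = real_uid(f.read())
            if uid == 0:
                continue  # leave root-owned processes where they are
            leaf = os.path.join(vhosts_cgroup, str(uid))
            os.makedirs(leaf, exist_ok=True)
            with open(os.path.join(leaf, "cgroup.procs"), "w") as f:
                f.write(str(pid))
            moved += 1
        except OSError:
            continue  # the process exited while we were looking at it
    return moved
```

A daemon would call classify_once() in a loop or on a notification; doing the same work inside the process that forks avoids the polling altogether.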

If you want to do so, then you could look into the cgroupify hack[1]
that is in uresourced to move each process into its own cgroup. This is
done for browsers in Fedora as systemd-oomd always kills an entire
cgroup. That said, it is not perfect and you'll need different logic
overall. But it may be a good reference if you want to implement
something similar yourself.

Having the cgroup management inside apache itself would probably be
better overall and may not be much harder.

Benjamin

[1] https://gitlab.freedesktop.org/benzea/uresourced/-/blob/master/cgroupify/cgroupify.c
Startup works by installing a small template service
https://gitlab.freedesktop.org/benzea/uresourced/-/blob/master/data/user/cgroup...@.service.in
and a simple drop-in unit for every service that should be managed
https://gitlab.freedesktop.org/benzea/uresourced/-/blob/master/data/user/cgroupify.service







[systemd-devel] Automatically moving forked processes in a different cgroup based on children's UID

2022-01-01 Thread Wadih
Hi,

I've been using apache2-mpm-itk with cgrulesengd in cgroup v1 for
about 3 years to automatically classify the child processes that
apache2-mpm-itk spawns when servicing web requests for different
vhosts, and it's been working great: when a vhost starts using up too
much CPU/RAM, the OOM killer takes care of that specific vhost and
leaves the others alone, as well as the parent process.

I'm now preparing to move to Debian 11 as part of my yearly updates,
and I'm finding out that I need to use cgroup v2 now. So I'm trying to
bring my resource control solution to the new world.

When I create e.g. /etc/systemd/system/user-UID.slice.d/override.conf
with the resource controls for that user, they don't apply to the
forked processes the way the cgrulesengd rules did, as I can confirm
with systemd-cgls. Instead, the parent and all its children still
belong to the same apache2.service cgroup, which makes sense since it
wasn't systemd that spawned the child processes.

Is there a way to automatically classify child processes of a process
in a different cgroup than the spawning process with systemd based on
the children's new UID? I know apache2-mpm-itk calls setuid() on its
children, so we would have to somehow hook on that. 

By default, the processes are now all in:

system.slice/apache2.service

I'd like to have the child processes that apache2-mpm-itk spawns go
under their respective user, e.g.

system.slice/apache2.service/vhosts/%UID%


And then I would set a memory limit of 1G on
system.slice/apache2.service/vhosts

Then when the sum of the memory usage of the vhosts goes above 1G, the
OOM killer will choose the biggest offending group under
system.slice/apache2.service/vhosts/ and terminate that group, without
touching the others or the parent process. I've been able to do this
with cgrulesengd and cgconfigparser for 3 years, and it's been rock
solid. I'm trying to bring that to the new systemd world.
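
Under cgroup v2, the limits described above roughly map to memory.max on the vhosts cgroup and memory.oom.group on each per-UID cgroup. The sketch below is an assumption-laden illustration: set_vhost_limits() is a hypothetical helper, and on a real system the memory controller must already be enabled for the subtree so that these files exist. The kernel picks the OOM victim itself; memory.oom.group only ensures that the victim's whole cgroup is killed together.

```python
import os

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(size: str) -> int:
    """Convert '1G', '512M', or a plain byte count to an integer."""
    suffix = size[-1].upper()
    if suffix in UNITS:
        return int(size[:-1]) * UNITS[suffix]
    return int(size)


def set_vhost_limits(vhosts_cgroup: str, uid: int, total: str = "1G") -> None:
    """Cap the vhosts subtree and make the per-UID cgroup die as a unit."""
    # Upper bound for the sum of all vhost cgroups.
    with open(os.path.join(vhosts_cgroup, "memory.max"), "w") as f:
        f.write(str(to_bytes(total)))
    leaf = os.path.join(vhosts_cgroup, str(uid))
    os.makedirs(leaf, exist_ok=True)
    # Ask the OOM killer to treat this cgroup as one indivisible workload.
    with open(os.path.join(leaf, "memory.oom.group"), "w") as f:
        f.write("1")
```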

Would the only solution be for me to create a daemon which monitors for
setuid() calls of the parent apache process, and classify the children
as per the new setuid user? 

Or, since root parent processes spawning children under specific UIDs
is a common security practice, perhaps there should be something in
systemd out of the box for classifying the children under their
respective cgroups?

If my only solution is to create a daemon which monitors for setuid(),
I'll do it, although I've never done it before and am not sure where
to start. Any guidance would be great!

Thank you so much,

Wadih Maalouf