Re: [systemd-devel] Unable to check 'effective' cgroup limits

2022-06-09 Thread Michal Koutný
Hello.

On Thu, Jun 09, 2022 at 11:40:02AM +0100, Lewis Gaul wrote:
> [Disclaimer: cross posting from
> https://github.com/containers/podman/discussions/14538]
> 
> Apologies that this is more of a Linux cgroup question than specific to
> systemd, but I was wondering if someone here might be able to enlighten
> me...

Yes, this is most suitable for cgro...@vger.kernel.org. (Feel free to
continue there.)

> Two questions:
> 
>    - Why on cgroups v1 do the cpuset controller's
>    cpuset.effective_{cpus,mems} seem to simply not work?

It's how it evolved, and instead of changing the accustomed behavior,
there's a whole different v2.

> Didn't expect this to fail - shouldn't it automatically impose a stricter
> limit on any child cgroups? Do I need to manually update all child cgroups
> first?

The v1 API simply doesn't implement the hierarchical configuration well
(such that ancestors can always override descendants).

> But can't relax the child's cgroup restriction (i.e. need awareness of CPU
> restrictions already imposed above - how are you supposed to check this in
> a private cgroup namespace?).

Binary^WExhaustive search?

> Memory/Hugetlb effective limits
> [...]
> There is a memory.limit_in_bytes file, but no
> memory.effective_limit_in_bytes to reflect parent cgroup restrictions.
> 
> Similarly on cgroups v2:
> [...]
> I guess you could traverse up the cgroup hierarchy to find the smallest
> limit being imposed... But this isn't possible inside a private cgroup
> namespace. Is there any way to find the actual cgroup limit imposed?

I've actually been pondering .effective analogues for limits on v2
for this reason. Short answer is that it's not implemented.

More generally, why would you want to know the inherited limit?
(For regular memory, the idea is that you watch memory.pressure and
adjust your behavior based on that, instead of adapting to whatever is
left over under memory.max.)
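
As a rough illustration of the pressure-based approach, the sketch below
reads the avg10 figure from a cgroup v2 memory.pressure file and compares
it against a threshold. The function names psi_avg10 and psi_above and the
threshold value are hypothetical; only the file layout (one "some" line and
one "full" line, each with avg10=, avg60=, avg300= and total= fields) comes
from the kernel.

```sh
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Usage: psi_avg10 PRESSURE_FILE [some|full]
# Prints the avg10 value of the requested line ("some" by default).
psi_avg10() {
    file=$1
    kind=${2:-some}
    while read -r k a10 rest; do
        if [ "$k" = "$kind" ]; then
            # Strip the "avg10=" prefix, leaving just the number.
            echo "${a10#avg10=}"
            return 0
        fi
    done < "$file"
    return 1
}

# Usage: psi_above PRESSURE_FILE THRESHOLD [some|full]
# Succeeds (exit 0) when avg10 is strictly greater than THRESHOLD.
psi_above() {
    v=$(psi_avg10 "$1" "${3:-some}") || return 2
    # awk does the floating-point comparison that plain sh cannot.
    awk -v v="$v" -v t="$2" 'BEGIN { exit !(v > t) }'
}
```

For example, `psi_above /sys/fs/cgroup/memory.pressure 5.0 && echo "back off"`
would react once the 10-second "some" average exceeds 5%, assuming the
cgroup of interest is mounted at /sys/fs/cgroup.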

HTH,
Michal


[systemd-devel] Unable to check 'effective' cgroup limits

2022-06-09 Thread Lewis Gaul
Hi everyone,

[Disclaimer: cross posting from
https://github.com/containers/podman/discussions/14538]

Apologies that this is more of a Linux cgroup question than specific to
systemd, but I was wondering if someone here might be able to enlighten
me...

Two questions:

   - Why on cgroups v1 do the cpuset controller's
   cpuset.effective_{cpus,mems} seem to simply not work?
   - Is there any way to check effective cgroup memory or hugetlb limits?
   (cgroups v1 or v2)

Cpuset effective limits

root@ubuntu:~# podman run --rm -it --privileged -w /sys/fs/cgroup fedora
[root@7b9b67c7e1d4 cgroup]# mkdir cpuset/my-group
[root@7b9b67c7e1d4 cgroup]# cat cpuset/cpuset.cpus
0-5
[root@7b9b67c7e1d4 cgroup]# cat cpuset/my-group/cpuset.cpus

[root@7b9b67c7e1d4 cgroup]# cat cpuset/my-group/cpuset.effective_cpus

Expected cpuset/my-group/cpuset.effective_cpus to give 0-5 as set in the
parent cgroup. Works as expected on cgroups v2.

[root@7b9b67c7e1d4 cgroup]# echo 0-5 > cpuset/my-group/cpuset.cpus
[root@7b9b67c7e1d4 cgroup]# cat cpuset/my-group/cpuset.{effective_,}cpus
0-5
0-5
[root@7b9b67c7e1d4 cgroup]# echo 0-4 > cpuset/cpuset.cpus
bash: echo: write error: Device or resource busy

Didn't expect this to fail - shouldn't it automatically impose a stricter
limit on any child cgroups? Do I need to manually update all child cgroups
first?

[root@7b9b67c7e1d4 cgroup]# echo 0-4 > cpuset/my-group/cpuset.cpus
[root@7b9b67c7e1d4 cgroup]# cat cpuset/my-group/cpuset.{effective_,}cpus
0-4
0-4

Can impose a stricter limit on child cgroups, as expected.
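
The ordering used here on cgroups v1 (narrow the children first, then the
parent) could be scripted roughly as follows. This is only a sketch: the
helper names expand_cpulist, intersect_cpulists and shrink_cpuset are made
up for illustration, it only touches cpuset.cpus (not cpuset.mems), and it
does not handle a child that still has tasks but would end up with an empty
CPU list.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical, error handling is minimal.

# Expand a cpulist such as "0-2,5" into one CPU number per line.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        [ -n "$lo" ] || continue
        seq "$lo" "${hi:-$lo}"
    done
}

# Print the CPUs present in both cpulists as a comma-separated list.
intersect_cpulists() {
    { expand_cpulist "$1" | sort -nu; echo ---; expand_cpulist "$2" | sort -nu; } |
        awk '$0 == "---" { second = 1; next }
             !second     { seen[$0] = 1; next }
             ($0 in seen) { out = out (out == "" ? "" : ",") $0 }
             END { print out }'
}

# Usage: shrink_cpuset CGROUP_DIR NEW_CPULIST
# Narrows every descendant to (current AND new), deepest first, then the
# top-level group itself. If a descendant could not be narrowed, the final
# write is expected to fail the same way a direct write would.
shrink_cpuset() {
    top=$1
    new=$2
    # -depth lists a directory's contents before the directory itself.
    find "$top" -mindepth 1 -depth -type d | while read -r d; do
        [ -f "$d/cpuset.cpus" ] || continue
        cur=$(cat "$d/cpuset.cpus")
        intersect_cpulists "$cur" "$new" > "$d/cpuset.cpus"
    done
    echo "$new" > "$top/cpuset.cpus"
}
```

With the groups from this session, `shrink_cpuset cpuset 0-4` would rewrite
cpuset/my-group/cpuset.cpus before touching cpuset/cpuset.cpus.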

[root@7b9b67c7e1d4 cgroup]# echo 0-4 > cpuset/cpuset.cpus
[root@7b9b67c7e1d4 cgroup]# echo 0-5 > cpuset/my-group/cpuset.cpus
bash: echo: write error: Permission denied

But can't relax the child's cgroup restriction (i.e. need awareness of CPU
restrictions already imposed above - how are you supposed to check this in
a private cgroup namespace?).

Memory/Hugetlb effective limits

On cgroups v1:

[root@7b9b67c7e1d4 cgroup]# ls memory/
cgroup.clone_children           memory.kmem.tcp.failcnt             memory.oom_control
cgroup.event_control            memory.kmem.tcp.limit_in_bytes      memory.pressure_level
cgroup.procs                    memory.kmem.tcp.max_usage_in_bytes  memory.soft_limit_in_bytes
memory.failcnt                  memory.kmem.tcp.usage_in_bytes      memory.stat
memory.force_empty              memory.kmem.usage_in_bytes          memory.swappiness
memory.kmem.failcnt             memory.limit_in_bytes               memory.usage_in_bytes
memory.kmem.limit_in_bytes      memory.max_usage_in_bytes           memory.use_hierarchy
memory.kmem.max_usage_in_bytes  memory.move_charge_at_immigrate     notify_on_release
memory.kmem.slabinfo            memory.numa_stat                    tasks

There is a memory.limit_in_bytes file, but no
memory.effective_limit_in_bytes to reflect parent cgroup restrictions.

Similarly on cgroups v2:

[root@0c0d71230663 cgroup]# ls memory.*
memory.current       memory.max        memory.stat
memory.events        memory.min        memory.swap.current
memory.events.local  memory.numa_stat  memory.swap.events
memory.high          memory.oom.group  memory.swap.high
memory.low           memory.pressure   memory.swap.max

There is a memory.max file, but no memory.max.effective (corresponding to
cpuset.cpus.effective).

I guess you could traverse up the cgroup hierarchy to find the smallest
limit being imposed... But this isn't possible inside a private cgroup
namespace. Is there any way to find the actual cgroup limit imposed?
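
For reference, the traversal I have in mind would look something like the
sketch below on cgroups v2. The function name effective_memory_max is made
up, and it only sees the levels visible from where it runs, so inside a
private cgroup namespace it stops at the namespace root and can miss
stricter limits set further up.

```sh
#!/bin/sh
# Sketch only: walks from CGROUP_DIR up to STOP_DIR and reports the
# smallest memory.max seen. Levels without a memory.max file are skipped.

# Usage: effective_memory_max CGROUP_DIR STOP_DIR
# Prints the smallest numeric limit found, or "max" if none is set.
effective_memory_max() {
    dir=$1
    stop=$2
    min=max
    while :; do
        if [ -r "$dir/memory.max" ]; then
            val=$(cat "$dir/memory.max")
            case $val in
                max|'') ;;        # no limit at this level
                *[!0-9]*) ;;      # unexpected content, ignore it
                *)
                    if [ "$min" = max ] || [ "$val" -lt "$min" ]; then
                        min=$val
                    fi
                    ;;
            esac
        fi
        [ "$dir" = "$stop" ] && break
        parent=$(dirname "$dir")
        [ "$parent" = "$dir" ] && break   # reached "/", nothing above
        dir=$parent
    done
    echo "$min"
}
```

On the host this could be called as
`effective_memory_max /sys/fs/cgroup/some/group /sys/fs/cgroup`, where
some/group stands in for whatever cgroup is being inspected.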



Any insights welcome!


Thanks,

Lewis