On Thu, Nov 19, 2020 at 10:14:18PM +0300, Andrei Enshin <b...@bk.ru> wrote:
> For you it might be interesting in sake of improving robustness of
> systemd in case of such invaders as kubelet+cgroupfs : )
I think the interface is clearly defined in the CGROUP_DELEGATION
document though.
I'm happy whenever a bug is found, and happier still when it's a
well-defined and reproducible case.

> ########## (1) abandoned cgroup ##########
> > systemd isn't aware of it and it would clean the hierarchy according to its 
> > configuration
That was related to a controller hierarchy (which, as I understood, is
what the k8s issue was about).

What you show below is a named hierarchy; there the situation is
different again.

> systemd hasn’t deleted the unknown hierarchy, it’s still presented:
> [...]
> cgroup.procs here and in it’s child cgroup 
> 8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15 are empty.
> Seems there are no processes attached to these cgroups. Date of creation is 
> Jul 16-17.
What systemd version is it? What cgroup setup is it (legacy or hybrid)?
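
To answer the second question, the cgroup setup can be read off the mount table. Below is a minimal Python sketch, assuming the conventional mount points (cgroup2 on /sys/fs/cgroup for unified, cgroup2 on /sys/fs/cgroup/unified for hybrid, neither for legacy). The function name `cgroup_setup` is made up for illustration, and inspecting mountinfo text is a simplification of the statfs()-based check.

```python
def cgroup_setup(mountinfo_text):
    """Classify the cgroup setup from the text of /proc/self/mountinfo."""
    cgroup2_mount_points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # A lone "-" separates the per-mount fields from the
            # filesystem type, source and superblock options.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # not a well-formed mountinfo line
        if len(fields) > sep + 1 and fields[sep + 1] == "cgroup2":
            cgroup2_mount_points.add(fields[4])
    if "/sys/fs/cgroup" in cgroup2_mount_points:
        return "unified"
    if "/sys/fs/cgroup/unified" in cgroup2_mount_points:
        return "hybrid"
    # Assumption: cgroup v1 is mounted at all; this sketch does not check.
    return "legacy"
```

It would be called as `cgroup_setup(open("/proc/self/mountinfo").read())`; `systemctl --version` answers the first question.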


> ########## (2) mysterious mount of systemd hierarchy ########## 
> [...]
>   Seems to be cyclic mount. Questions are who, why and when did the second 
> mysterious mount?
> I have two candidates:
> - runc during container creation;
> - systemd, probably because it was confused by kubelet and it’s unexpected 
> usage of cgroups.
I don't see why or how systemd (PID 1) would do this (I'm not sure about
nspawn). Anyway, you can try tracing mounts system-wide (e.g.
`perf trace -a -e syscalls:sys_enter_mount`) to find out who does the
mount.

> ########## (3) suspected owner of mysterious mount is systemd-nspawn machine 
> ##########
> [...]
> Let’s explore cgroups of centos75 machine:
> # ls -lah 
> /sys/fs/cgroup/systemd/machine.slice/systemd-nspawn\@centos75.service/payload/system.slice/
>  | grep sys-fs-cgroup-systemd
> 
> drwxr-xr-x.   2 root root 0 Nov  9 20:07 
> host\x2drootfs-sys-fs-cgroup-systemd-kubepods-burstable-pod7ffde41a\x2dfa85\x2d4b01\x2d8023\x2d69a4e4b50c55-8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15.mount
> 
> drwxr-xr-x.   2 root root 0 Jul 16 08:05 
> host\x2drootfs-sys-fs-cgroup-systemd.mount
> 
> drwxr-xr-x.   2 root root 0 Jul 16 08:05 
> host\x2drootfs-var-lib-machines-centos75-sys-fs-cgroup-systemd.mount
>   There are three interesting cgroups in container. First one seems to be in 
> relation with the abandoned cgroup and mysterious mount on the host.
Note those are cgroups created for .mount units (under the nested
payload's system.slice). The first one tells us that a mount point at
  /host-rootfs/sys/fs/cgroup/systemd/kubepods/burstable/pod7ffde41a-fa85-4b01-8023-69a4e4b50c55/8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15
was visible within the container (in unit names, `-` stands for `/` and
`\x2d` for a literal `-`). It doesn't mean that the mount was done within
the container.

I can't tell why that was; it depends on how systemd-nspawn was
instructed to set up mounts for the container.
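
For reference, mapping a .mount unit name back to the mount point it describes can be sketched in Python. This is only a minimal sketch of the path escaping rules (`-` separates path components, `\xXX` encodes a literal byte); the function name `unit_name_to_path` is made up for illustration, and systemd-escape(1) remains the authoritative tool.

```python
import re


def unit_name_to_path(unit):
    """Translate a path-escaped unit name (e.g. 'foo-bar.mount') to a path."""
    # Drop the unit type suffix such as ".mount".
    name = unit.rsplit(".", 1)[0]
    if name == "-":
        return "/"  # the root directory is escaped as a single dash
    # Unescaped dashes separate path components.
    raw = ("/" + name.replace("-", "/")).encode()
    # "\xXX" sequences encode literal bytes, e.g. "\x2d" is a real dash.
    raw = re.sub(
        rb"\\x([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        raw,
    )
    return raw.decode(errors="replace")
```

Applied to `host\x2drootfs-sys-fs-cgroup-systemd.mount` from the listing, this yields `/host-rootfs/sys/fs/cgroup/systemd`.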

> Creation date is Nov  9 20:07. I’ve updated kubelet at Nov  8 12:01. 
> Сoincidence?! I don't think so.
Yes, it can be related. For instance:
- the cyclic bind mount happened,
- its visibility was propagated into the nspawn container,
- and the inner systemd created a cgroup for the (generated) .mount unit
  (possibly after daemon-reload).

> Q1. Let me ask, what is the meaning of mount inside centos75 container?
> /system.slice/host\x2drootfs-sys-fs-cgroup-systemd-kubepods-burstable-pod7ffde41a\x2dfa85\x2d4b01\x2d8023\x2d69a4e4b50c55-8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15.mount
> 
> Q2. Why the mount appeared in the container at Nov 9, 20:07 ?
Hopefully, it's answered above.

> ##### mind-blowing but migh be important note #####
> [...]
> The node already seems to have not healthy mounts:
Is the conflicting cgroup driver in use again?

> # cat /proc/self/mountinfo |grep systemd | grep cgr
> 26 25 0:23 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:6 
> - cgroup cgroup 
> rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> 866 865 0:23 / 
> /var/lib/rkt/pods/run/3720606d-535b-4e59-a137-ee00246a20c1/stage1/rootfs/opt/stage2/hyperkube-amd64/rootfs/sys/fs/cgroup/systemd
>  rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup 
> rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> 5253 26 0:23 
> /kubepods/burstable/pod64ad01cf-5dd4-4283-abe0-8fb8f3f13dc3/4a81a28292c3250e03c27a7270cdf58a07940e462999ab3e2be51c01b3a6bf10
>  
> /sys/fs/cgroup/systemd/kubepods/burstable/pod64ad01cf-5dd4-4283-abe0-8fb8f3f13dc3/4a81a28292c3250e03c27a7270cdf58a07940e462999ab3e2be51c01b3a6bf10
>  rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup 
> rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> 5251 866 0:23 
> /kubepods/burstable/pod64ad01cf-5dd4-4283-abe0-8fb8f3f13dc3/4a81a28292c3250e03c27a7270cdf58a07940e462999ab3e2be51c01b3a6bf10
>  
> /var/lib/rkt/pods/run/3720606d-535b-4e59-a137-ee00246a20c1/stage1/rootfs/opt/stage2/hyperkube-amd64/rootfs/sys/fs/cgroup/systemd/kubepods/burstable/pod64ad01cf-5dd4-4283-abe0-8fb8f3f13dc3/4a81a28292c3250e03c27a7270cdf58a07940e462999ab3e2be51c01b3a6bf10
>  rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup 
> rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
>   Also seems systemd-nspawn is not affected yet, since there is no such 
> cgroup inside centos75 container (we have it on each machine) but only 
> abandoned one, with empty cgroup.procs:
That depends on how mounts propagate into that container and on what
systemd inside the container has done so far (i.e. the mount unit may
not have been created yet).
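
To spot such mounts on other nodes, the mountinfo output can be filtered for cgroup mounts whose root field is not `/`, i.e. bind mounts of a cgroup subtree like entries 5253 and 5251 above. A minimal Python sketch follows; the names `Mount`, `parse_mountinfo` and `subtree_cgroup_mounts` are made up for illustration, and octal escapes such as `\040` in paths are not decoded.

```python
from typing import NamedTuple


class Mount(NamedTuple):
    mount_id: int
    parent_id: int
    root: str         # path within the filesystem that is mounted
    mount_point: str  # where it is mounted
    fstype: str
    super_opts: str


def parse_mountinfo(text):
    """Parse the text of /proc/<pid>/mountinfo into Mount tuples."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        try:
            # A lone "-" ends the variable-length optional fields.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # not a well-formed mountinfo line
        if len(fields) < sep + 4:
            continue
        mounts.append(Mount(int(fields[0]), int(fields[1]), fields[3],
                            fields[4], fields[sep + 1], fields[sep + 3]))
    return mounts


def subtree_cgroup_mounts(mounts):
    """Return cgroup v1 mounts that expose only a subtree of a hierarchy."""
    return [m for m in mounts if m.fstype == "cgroup" and m.root != "/"]


# Sample built from two of the entries quoted above.
POD = ("/kubepods/burstable/pod64ad01cf-5dd4-4283-abe0-8fb8f3f13dc3/"
       "4a81a28292c3250e03c27a7270cdf58a07940e462999ab3e2be51c01b3a6bf10")
OPTS = ("rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup "
        "rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,"
        "name=systemd")
SAMPLE = (
    "26 25 0:23 / /sys/fs/cgroup/systemd " + OPTS + "\n"
    "5253 26 0:23 " + POD + " /sys/fs/cgroup/systemd" + POD + " " + OPTS + "\n"
)
```

On a live system the input would be `open("/proc/self/mountinfo").read()`; for `SAMPLE`, `subtree_cgroup_mounts(parse_mountinfo(SAMPLE))` returns only the entry with mount ID 5253.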

Michal
