On Thu, Nov 19, 2020 at 10:14:18PM +0300, Andrei Enshin <b...@bk.ru> wrote:
> For you it might be interesting in sake of improving robustness of
> systemd in case of such invaders as kubelet+cgroupfs : )

I think the interface is clearly defined in the CGROUP_DELEGATION document
though. I'm happy if a bug can be found in general. I'm happier when it's
a well defined and reproducible case.

> ########## (1) abandoned cgroup ##########
> > systemd isn't aware of it and it would clean the hierarchy according to its
> > configuration

That was related to a controller hierarchy (which, as I understood it, is
what the k8s issue was about). Below it is a named hierarchy, where the
situation is different again.

> systemd hasn’t deleted the unknown hierarchy, it’s still presented:
> [...]
> cgroup.procs here and in it’s child cgroup
> 8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15 are empty.
> Seems there are no processes attached to these cgroups. Date of creation is
> Jul 16-17.

What systemd version is it? What cgroup setup is it (legacy or hybrid)?

> ########## (2) mysterious mount of systemd hierarchy ##########
> [...]
> Seems to be cyclic mount. Questions are who, why and when did the second
> mysterious mount?
> I have two candidates:
> - runc during container creation;
> - systemd, probably because it was confused by kubelet and it’s unexpected
> usage of cgroups.

I don't see why or how systemd (PID 1) would do this (not sure about nspawn).
Anyway, you can try tracing mounts system-wide
(e.g. `perf trace -a -e syscalls:sys_enter_mount`) to find out who does the
mount.

> ########## (3) suspected owner of mysterious mount is systemd-nspawn machine
> ##########
> [...]
> Let’s explore cgroups of centos75 machine:
> # ls -lah
> /sys/fs/cgroup/systemd/machine.slice/systemd-nspawn\@centos75.service/payload/system.slice/
> | grep sys-fs-cgroup-systemd
>
> drwxr-xr-x. 2 root root 0 Nov 9 20:07
> host\x2drootfs-sys-fs-cgroup-systemd-kubepods-burstable-pod7ffde41a\x2dfa85\x2d4b01\x2d8023\x2d69a4e4b50c55-8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15.mount
>
> drwxr-xr-x. 2 root root 0 Jul 16 08:05
> host\x2drootfs-sys-fs-cgroup-systemd.mount
>
> drwxr-xr-x. 2 root root 0 Jul 16 08:05
> host\x2drootfs-var-lib-machines-centos75-sys-fs-cgroup-systemd.mount
> There are three interesting cgroups in container.
> First one seems to be in
> relation with the abandoned cgroup and mysterious mount on the host.

Note that those are cgroups created for .mount units (under the nested
payload's system.slice). It tells us that within the container a mount point
at
  /host-rootfs/sys/fs/cgroup/systemd/kubepods/burstable/pod7ffde41a-fa85-4b01-8023-69a4e4b50c55/8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15
was visible (in unit names, `-` encodes `/` and `\x2d` encodes a literal
`-`). It doesn't mean that the mount was done within the container. I can't
tell why that was; it depends on how systemd-nspawn was instructed to set up
mounts for the container.

> Creation date is Nov 9 20:07. I’ve updated kubelet at Nov 8 12:01.
> Coincidence?! I don't think so.

Yes, it can be related. For instance:
- the cyclic bind mount happened,
- its visibility was propagated into the nspawn container,
- and the inner systemd created a cgroup for the (generated) .mount unit
  (possibly after daemon-reload).

> Q1. Let me ask, what is the meaning of mount inside centos75 container?
> /system.slice/host\x2drootfs-sys-fs-cgroup-systemd-kubepods-burstable-pod7ffde41a\x2dfa85\x2d4b01\x2d8023\x2d69a4e4b50c55-8842def241fac72cb34fdce90297b632f098289270fa92ec04643837f5748c15.mount
>
> Q2. Why the mount appeared in the container at Nov 9, 20:07 ?

Hopefully, it's answered above.

> ##### mind-blowing but migh be important note #####
> [...]
> The node already seems to have not healthy mounts:

Is the conflicting cgroup driver in use there again?

> # cat /proc/self/mountinfo |grep systemd | grep cgr
> 26 25 0:23 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:6
> - cgroup cgroup
> rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> 866 865 0:23 /
> /var/lib/rkt/pods/run/3720606d-535b-4e59-a137-ee00246a20c1/stage1/rootfs/opt/stage2/hyperkube-amd64/rootfs/sys/fs/cgroup/systemd
> rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup
> rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> 5253 26 0:23
> /kubepods/burstable/pod64ad01cf-5dd4-4283-abe0-8fb8f3f13dc3/4a81a28292c3250e03c27a7270cdf58a07940e462999ab3e2be51c01b3a6bf10
>
> /sys/fs/cgroup/systemd/kubepods/burstable/pod64ad01cf-5dd4-4283-abe0-8fb8f3f13dc3/4a81a28292c3250e03c27a7270cdf58a07940e462999ab3e2be51c01b3a6bf10
> rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup
> rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> 5251 866 0:23
> /kubepods/burstable/pod64ad01cf-5dd4-4283-abe0-8fb8f3f13dc3/4a81a28292c3250e03c27a7270cdf58a07940e462999ab3e2be51c01b3a6bf10
>
> /var/lib/rkt/pods/run/3720606d-535b-4e59-a137-ee00246a20c1/stage1/rootfs/opt/stage2/hyperkube-amd64/rootfs/sys/fs/cgroup/systemd/kubepods/burstable/pod64ad01cf-5dd4-4283-abe0-8fb8f3f13dc3/4a81a28292c3250e03c27a7270cdf58a07940e462999ab3e2be51c01b3a6bf10
> rw,nosuid,nodev,noexec,relatime shared:6 - cgroup cgroup
> rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> Also seems systemd-nspawn is not affected yet, since there is no such
> cgroup inside centos75 container (we have it on each machine) but only
> abandoned one, with empty cgroup.procs:

That would depend on mount propagation into that container and on what
systemd inside that container did (i.e. the mount unit may not have been
created yet).

Michal
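
The suspicious entries in the mountinfo output quoted above (IDs 5253 and 5251) differ from the regular ones in that their root field is a subdirectory of the cgroup hierarchy rather than `/`. As a rough sketch, not something taken from this thread, entries like that can be picked out mechanically. The function and type names below are made up for illustration, and the parser assumes the mountinfo field layout documented in proc(5) and paths without octal-escaped characters.

```python
from typing import List, NamedTuple


class SubtreeMount(NamedTuple):
    """Hypothetical record for a cgroup mount whose root is not '/'."""
    mount_id: int
    parent_id: int
    root: str
    mount_point: str
    onto_itself: bool  # True if the mount point path ends with the root path


def find_cgroup_subtree_mounts(mountinfo_text: str) -> List[SubtreeMount]:
    """Return cgroup mounts that expose only a subtree of a hierarchy.

    Layout per proc(5): mount ID, parent ID, major:minor, root, mount point,
    mount options, zero or more optional fields, a lone '-', fs type,
    mount source, super options.
    """
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # The '-' separator can appear no earlier than index 6.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # not a well-formed mountinfo line
        if len(fields) <= sep + 1:
            continue
        fstype = fields[sep + 1]
        root, mount_point = fields[3], fields[4]
        if fstype in ("cgroup", "cgroup2") and root != "/":
            found.append(SubtreeMount(
                mount_id=int(fields[0]),
                parent_id=int(fields[1]),
                root=root,
                mount_point=mount_point,
                onto_itself=mount_point.endswith(root),
            ))
    return found
```

Feeding it the contents of `/proc/self/mountinfo` on the affected node should list the subtree bind mounts. It only shows that they exist, not who created them, so tracing the mount syscall is still needed for that.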
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel