Package: sbuild,debianutils,libc6,systemd-sysv
Severity: important

Hello lots of maintainers,

I am faced with a very crazy interaction bug. Roughly speaking, when you
use sbuild to build a package and your build-depends happen to include
systemd-sysv and you happen to install (cross building) or upgrade
libc6, installing build-depends reliably fails. Since upgrading libc6 is
a thing, I guess that this now affects buildds, which is why I am filing
this at important severity. Regenerating buildd chroots will "heal"
buildds, so it is self-recovering there.

Without further ado, let's dive into the details. The issue is
reproducible using mmdebstrap:

    mmdebstrap unstable --verbose --architectures amd64,arm64 --variant=apt /dev/null --include=systemd-sysv,libc6:arm64 --essential-hook='ln -sf /bin/false $1/usr/bin/ischroot'

This is using a cross build setting, because libc6 is installed early
during bootstrap and reproducing the bug takes configuring libc6 after
systemd-sysv has been unpacked. So I simply install a foreign libc6 and
apt happens to configure it late enough in my tests.

So we now look into libc6.postinst. We take the "$1" = "configure"
branch. We eventually run into:

| # Restart init. Currently handles chroots, systemd and upstart, and
| # assumes anything else is going to not fail at behaving like
| # sysvinit:
| TELINIT=yes
| if ischroot 2>/dev/null; then
|     # Don't bother trying to re-exec init from a chroot:
|     TELINIT=no

I note that mmdebstrap creates a number of namespaces and then
externally runs apt. If I understand things correctly, it also runs an
external dpkg --root ... without --force-script-chrootless. Hence dpkg
performs a chroot for every maintainer script and ischroot correctly
detects this, so we would be setting TELINIT=no if I were not replacing
ischroot in the --essential-hook.

In sbuild, the namespace setup is different. apt is entirely run inside
the namespace. ischroot compares /proc/1/mountinfo to
/proc/self/mountinfo. If both are readable and equal, it concludes that
we're not in a chroot.
If they differ, it concludes that we are in a chroot. For mmdebstrap,
pid 1 happens to be an mmdebstrap process in the initial namespace and
the ischroot process sees fewer mounts. Hence ischroot exits
successfully there (chroot detected). For sbuild, pid 1 is a runuser
process already running chrooted. Hence the mountinfo files are equal
and ischroot concludes that we are not running in a chroot.

| elif [ -n "${DPKG_ROOT:-}" ]; then
|     # Do not re-exec init if we are operating on a chroot from outside:
|     TELINIT=no

In neither case is DPKG_ROOT non-empty.

| elif [ -d /run/systemd/system ]; then
|     # Restart systemd on upgrade, but carefully.
|     # The restart is wanted because of LP: #1942276 and Bug: #993821
|     # The care is needed because of https://bugs.debian.org/753725
|     # (if systemd --help fails the system might still be quite broken but
|     # that seems better than the kernel panic that results if systemd
|     # cannot reexec itself).
|     TELINIT=no

In neither case does /run/systemd/system exist.

|     if systemd --help >/dev/null 2>/dev/null; then
|         systemctl daemon-reexec
|     else
|         echo "Error: Could not restart systemd, systemd binary not working" >&2
|     fi
| fi
| if [ "$TELINIT" = "yes" ]; then
|     telinit u 2>/dev/null || true ; sleep 1
| fi

And finally we run telinit u when running inside sbuild or faking
ischroot in mmdebstrap. Running telinit u doesn't go well. This actually
has been a known problem with different symptoms recently. Earlier,
cross build nodes would get stuck in libc6.postinst hanging in telinit
forever. The reason was that telinit was re-executing itself over and
over again attempting to forward to another init system but always
returning back to itself. This has been fixed by Luca Boccassi:
https://github.com/systemd/systemd/pull/31251 and #1063147

telinit no longer reexecs itself and rather does what it is supposed to
do: kill(1, SIGTERM). Sadly this doesn't go well. In case of sbuild, we
kill the runuser process.
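
For reference, the branch selection walked through above can be
condensed into a small sketch. This is not the actual libc6.postinst:
the function name decide_telinit and its three arguments are made up
for illustration, and the real probes (ischroot, DPKG_ROOT,
/run/systemd/system) are passed in as plain values so that the logic
can be run anywhere without signalling pid 1.

```sh
#!/bin/sh
# Hypothetical stand-in for the TELINIT decision quoted from libc6.postinst.
# The three probes are replaced by arguments (an assumption for illustration):
#   $1: "yes" if ischroot would succeed (chroot detected), otherwise "no"
#   $2: the value DPKG_ROOT would have (may be empty)
#   $3: "yes" if /run/systemd/system would exist, otherwise "no"
# Prints the resulting TELINIT value; nothing is signalled or restarted.
decide_telinit() {
    probe_chroot=$1
    probe_dpkg_root=$2
    probe_systemd=$3

    if [ "$probe_chroot" = "yes" ]; then
        echo no     # chroot detected: do not re-exec init
    elif [ -n "$probe_dpkg_root" ]; then
        echo no     # operating on a chroot from outside
    elif [ "$probe_systemd" = "yes" ]; then
        echo no     # systemd branch: daemon-reexec is used instead of telinit
    else
        echo yes    # nothing matched: the postinst goes on to run "telinit u"
    fi
}

# mmdebstrap without the hook: ischroot detects the chroot.
decide_telinit yes "" no    # prints: no

# sbuild, or mmdebstrap with ischroot replaced by /bin/false.
decide_telinit no "" no     # prints: yes
```

The second call corresponds to sbuild, where the signal sent by
telinit u lands on the runuser process.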
It exits non-zero and sbuild considers this a failure to install
Build-Depends. This is bad.

So I'm not exactly sure which part is broken here.

We might argue that sbuild is setting up a container that looks too much
like a container and should have pid 1 outside the chroot area or that
the init process should handle SIGTERM more like an init system would
handle that.

We might argue that ischroot should detect init-less application
container environments.

We might argue that ischroot is not meant for detecting application
containers and libc6.postinst asks the wrong question and should be
skipping telinit for such environments as well.

We might argue that telinit should not kill a pid 1 that isn't systemd.

At this time, I am really unsure which of these four packages we
consider at fault. Possibly, we select multiple options to harden things
in depth. I am now seeking feedback from the various maintainers:

 - debianutils
 - glibc
 - sbuild
 - systemd

Do you think that your package handles this situation correctly and that
some other package should change or do you see your package behaving
wrongly?

Thanks in advance for replying

Helmut