Re: [systemd-devel] [PATCH] Fix systemd crash (on assert) during shutdown/reboot in unprivileged container
On Fri, 16.01.15 19:32, Andrei Borzenkov (arvidj...@gmail.com) wrote: В Thu, 15 Jan 2015 19:24:25 -0500 Stéphane Graber stgra...@ubuntu.com пишет: @@ -871,6 +871,14 @@ static void mount_enter_unmounting(Mount *m) { m-control_command_id = MOUNT_EXEC_UNMOUNT; m-control_command = m-exec_command + MOUNT_EXEC_UNMOUNT; +/* Ignore any mounts under /dev, /proc or /sys */ +if (path_startswith(m-where, /dev/) || +path_startswith(m-where, /proc/) || +path_startswith(m-where, /sys/)) { +mount_set_state(m, MOUNT_DEAD); +return; +} + This does not look right either. I'd rather expect to a) set DefaultDependencies=no for these special mounts so that they are left at shutdown and b) ignore them in the final killing spree if needed (unless happens already). I pretty much implemented this now, though instead of setting DefaultDependencies=no for these mounts I just made DefaultDependencies=yes have no effect for them. But the general approach I agree with. If I as user do systemctl stop /dev/pts I expect it to unmount /dev/pts not fake dead state. Well, /dev/pts is not a mount point systemd picks up anyway, so the unit for that doesn't even exist. It's already exempted. But I do get your point and I agree. Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] Fix systemd crash (on assert) during shutdown/reboot in unprivileged container
On Thu, 15.01.15 19:24, Stéphane Graber (stgra...@ubuntu.com) wrote: On Thu, Jan 15, 2015 at 07:20:55PM +0100, Lennart Poettering wrote: diff --git a/src/core/mount.c b/src/core/mount.c index 612d150..4de878e 100644 --- a/src/core/mount.c +++ b/src/core/mount.c @@ -871,6 +871,14 @@ static void mount_enter_unmounting(Mount *m) { m-control_command_id = MOUNT_EXEC_UNMOUNT; m-control_command = m-exec_command + MOUNT_EXEC_UNMOUNT; +/* Ignore any mounts under /dev, /proc or /sys */ +if (path_startswith(m-where, /dev/) || +path_startswith(m-where, /proc/) || +path_startswith(m-where, /sys/)) { +mount_set_state(m, MOUNT_DEAD); +return; +} + r = exec_command_set(m-control_command, /bin/umount, m-where, NULL); if (r = 0 UNIT(m)-manager-running_as == SYSTEMD_SYSTEM) r = exec_command_append(m-control_command, -n, NULL); Ah, nah, that patch wouldn't work, the state would be restored later when we read from /proc/self/mountinfo again... Anyway, I kinda missed that this is already an issue witht the shutdown logic of the main unit engine. I assumed this was only about the final unmount spree in systemd-shutdown. I now made this change: http://cgit.freedesktop.org/systemd/systemd/commit/?id=874d3404cbf2363604106c8f86683db4082691ea This does two things: - Exempts all file systems below /dev, /proc, /sys from getting a Conflicts= dependency with umount.target. THis means during shutdown, where umount.target is pulled in these mount units will not be stopped. - In the final umount loop in systemd-shutdown we simply won't bother anymore with these file systems too. Hope this makes things work for you. Please test! Thanks, Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] Fix systemd crash (on assert) during shutdown/reboot in unprivileged container
Andrei Borzenkov [2015-01-16 19:32 +0300]: If I as user do systemctl stop /dev/pts I expect it to unmount /dev/pts not fake dead state. Well, not all such commands can be expected to do something useful. Stopping mounts which the system depends on is going to break it, so trying to systemctl stop -- -.mount is rightfully refused already, while unmounting other things like /dev/pts will just eventually break your system anyway. Personally I think that Stephane's second patch was just about the right compromise: http://lists.freedesktop.org/archives/systemd-devel/attachments/20150115/f56f60ce/attachment-0001.patch It is reasonably future-proof (unlike the first patch) and avoids the problem you mentioned too, i. e. silently not doing operations while they appear to be successful. @Lennart: We now have three patches -- could you make an executive decision which one we take? :-) Thanks, Martin -- Martin Pitt| http://www.piware.de Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org) signature.asc Description: Digital signature ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] Fix systemd crash (on assert) during shutdown/reboot in unprivileged container
On Thu, 15.01.15 12:14, Stéphane Graber (stgra...@ubuntu.com) wrote: Hello, The last big issue I'm running into when running systemd in an unprivileged LXC container is that it's crashing on an assert in the shutdown/reboot path right after unmounting all devices. That's because due to mknod not being allowed inside a user namespace, we have to bind-mount all the required device nodes from the host's /dev on top of empty files in the container's /dev. This all works great until systemd unmounts everything. At which point, all of those are 0 byte files. Systemd then opens /dev/urandom and attempts to read some bytes from there, gets 0 bytes back and trips an assertion. To fix that, I've got two different approaches, both with an associated patch attached to this e-mail: - 0001-Add-dev-urandom-to-ignore_paths.patch: This very simply adds /dev/urandom to the ignore_paths list alongside /dev/console. That way all the other mount entries are unmounted but /dev/urandom isn't, fixing the issue we're currently seeing. - 0001-Ignore-devices-bind-mounts.patch: This one is a more generic take on the problem and should be more future-proof. Rather than hardcoding /dev/urandom, it extends the existing mount_point_ignore function to ignore any mountpoint which is a character or block device. I think I'd prefer if we simply would avoid unmounting anything that sits below /sys, /dev, /proc. i.e. a simple path_startswith() check before the unmount... Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH] Fix systemd crash (on assert) during shutdown/reboot in unprivileged container
Hello, The last big issue I'm running into when running systemd in an unprivileged LXC container is that it's crashing on an assert in the shutdown/reboot path right after unmounting all devices. That's because due to mknod not being allowed inside a user namespace, we have to bind-mount all the required device nodes from the host's /dev on top of empty files in the container's /dev. This all works great until systemd unmounts everything. At which point, all of those are 0 byte files. Systemd then opens /dev/urandom and attempts to read some bytes from there, gets 0 bytes back and trips an assertion. To fix that, I've got two different approaches, both with an associated patch attached to this e-mail: - 0001-Add-dev-urandom-to-ignore_paths.patch: This very simply adds /dev/urandom to the ignore_paths list alongside /dev/console. That way all the other mount entries are unmounted but /dev/urandom isn't, fixing the issue we're currently seeing. - 0001-Ignore-devices-bind-mounts.patch: This one is a more generic take on the problem and should be more future-proof. Rather than hardcoding /dev/urandom, it extends the existing mount_point_ignore function to ignore any mountpoint which is a character or block device. I tend to prefer the latter because it's future-proof and avoids hardcoding paths, however it certainly is more likely to have side-effects than the first (though I can't think of any obvious one). -- Stéphane Graber Ubuntu developer http://www.ubuntu.com From ef87932352d505b477c27c40f33af6015ea2b2b5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?St=C3=A9phane=20Graber?= stgra...@ubuntu.com Date: Thu, 15 Jan 2015 12:04:54 -0500 Subject: [PATCH] Add /dev/urandom to ignore_paths MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit systemd opens /dev/urandom during the shutdown sequence after it's done unmounting everything. If /dev/urandom is a bind-mounted file, that will unmount it and systemd will then crash on an assert when attempting to read some random bytes and getting 0 back. Signed-off-by: Stéphane Graber stgra...@ubuntu.com --- src/core/mount-setup.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/core/mount-setup.c b/src/core/mount-setup.c index bd3a035..b8ada98 100644 --- a/src/core/mount-setup.c +++ b/src/core/mount-setup.c @@ -130,6 +130,7 @@ static const char ignore_paths[] = /* Container bind mounts */ /proc/sys\0 /dev/console\0 +/dev/urandom\0 /proc/kmsg\0; bool mount_point_is_api(const char *path) { -- 1.9.1 From 251cd89d77befb77238e4f6dd7903adcc8bbf035 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?St=C3=A9phane=20Graber?= stgra...@ubuntu.com Date: Thu, 15 Jan 2015 10:45:28 -0500 Subject: [PATCH] Ignore devices bind-mounts MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Devices bind-mounts are typically used in unprivileged containers where mknod isn't possible. On those systems, all entries under /dev are empty files with the device node bind-mounted on top. Current systemd code will unmount all of those during the shutdown sequence even though systemd may still need them, ultimately leading to crashes in the shutdown/reboot sequence. Signed-off-by: Stéphane Graber stgra...@ubuntu.com --- src/core/mount-setup.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/core/mount-setup.c b/src/core/mount-setup.c index bd3a035..faa7ab2 100644 --- a/src/core/mount-setup.c +++ b/src/core/mount-setup.c @@ -147,11 +147,21 @@ bool mount_point_is_api(const char *path) { bool mount_point_ignore(const char *path) { const char *i; +struct stat st; +/* Skip any mountpoint that's listed in ignore_paths */ NULSTR_FOREACH(i, ignore_paths) if (path_equal(path, i)) return true; +/* Additionaly skip any block/character device bind-mount as + those can't possibly be blocking anything and are very likely + to be required by systemd itself. */ +if (lstat(path, st) == 0 +(st.st_mode S_IFBLK || st.st_mode S_IFCHR)) { +return true; +} + return false; } -- 1.9.1 signature.asc Description: Digital signature ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] Fix systemd crash (on assert) during shutdown/reboot in unprivileged container
On Thu, Jan 15, 2015 at 07:20:55PM +0100, Lennart Poettering wrote: On Thu, 15.01.15 12:14, Stéphane Graber (stgra...@ubuntu.com) wrote: Hello, The last big issue I'm running into when running systemd in an unprivileged LXC container is that it's crashing on an assert in the shutdown/reboot path right after unmounting all devices. That's because due to mknod not being allowed inside a user namespace, we have to bind-mount all the required device nodes from the host's /dev on top of empty files in the container's /dev. This all works great until systemd unmounts everything. At which point, all of those are 0 byte files. Systemd then opens /dev/urandom and attempts to read some bytes from there, gets 0 bytes back and trips an assertion. To fix that, I've got two different approaches, both with an associated patch attached to this e-mail: - 0001-Add-dev-urandom-to-ignore_paths.patch: This very simply adds /dev/urandom to the ignore_paths list alongside /dev/console. That way all the other mount entries are unmounted but /dev/urandom isn't, fixing the issue we're currently seeing. - 0001-Ignore-devices-bind-mounts.patch: This one is a more generic take on the problem and should be more future-proof. Rather than hardcoding /dev/urandom, it extends the existing mount_point_ignore function to ignore any mountpoint which is a character or block device. I think I'd prefer if we simply would avoid unmounting anything that sits below /sys, /dev, /proc. i.e. a simple path_startswith() check before the unmount... Lennart Something like that? -- Stéphane Graber Ubuntu developer http://www.ubuntu.com From efbb09d4b0be9a059e4a0444ceeedb873c598b31 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?St=C3=A9phane=20Graber?= stgra...@ubuntu.com Date: Thu, 15 Jan 2015 19:22:06 -0500 Subject: [PATCH] Skip anything in dev, sys or proc on unmount When getting around to unmounting things, don't actually call /sbin/umount for anything inside /dev, /sys or /proc. You really shouldn't have any block device mounted in there and any remaining mount may well be used by systemd itself during the last few steps of shutdown. --- src/core/mount.c | 8 1 file changed, 8 insertions(+) diff --git a/src/core/mount.c b/src/core/mount.c index 612d150..4de878e 100644 --- a/src/core/mount.c +++ b/src/core/mount.c @@ -871,6 +871,14 @@ static void mount_enter_unmounting(Mount *m) { m-control_command_id = MOUNT_EXEC_UNMOUNT; m-control_command = m-exec_command + MOUNT_EXEC_UNMOUNT; +/* Ignore any mounts under /dev, /proc or /sys */ +if (path_startswith(m-where, /dev/) || +path_startswith(m-where, /proc/) || +path_startswith(m-where, /sys/)) { +mount_set_state(m, MOUNT_DEAD); +return; +} + r = exec_command_set(m-control_command, /bin/umount, m-where, NULL); if (r = 0 UNIT(m)-manager-running_as == SYSTEMD_SYSTEM) r = exec_command_append(m-control_command, -n, NULL); -- 1.9.1 signature.asc Description: Digital signature ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel