On 1/8/26 11:37, Dima Zavin wrote:
> Commit c04b565204eb6b7e3508ac8dd42539ab97752635 reworked how
> switch_root moves mounts into the new root, but it
> inadvertently removed the moving of the root itself onto / for the mount
> namespace before chrooting.
> This confuses future users of the mount namespaces since root mount gets
> preserved and thus entering any derived mount namespace retains the pre-chroot
> structure.
Sigh, switch_root is one of the commands I need to get scripts/test.sh
to run under mkroot to automatically regression test.
> Found in yocto (scarthgap, toybox 0.8.11) where the mount namespaces
> contained just /rootfs in their /. Repro is simple:
> Before:
```
% sudo nsenter -m -t $$
nsenter: failed to execute /bin/sh: No such file or directory
Huh, does nsenter -m effectively do a chdir / ? Does it _always_ break
out of a normal chroot?
$ cd toybox/root/x86_64
$ sudo chroot fs
password:
$ mount -t proc proc /proc
$ nsenter -m -t $$ /bin/sh
# ls
# head -n 1 /etc/os-release
PRETTY_NAME="Devuan GNU/Linux 5 (daedalus)"
Apparently so. Good to know, I guess. (Dear lkml: what the? I know you
refused to patch the cd ../../../.. hole but this is just silly.)
% sudo nsenter -m -t $$ /rootfs/usr/lib64/ld-linux-x86-64.so.2 \
--library-path /rootfs/lib:/rootfs/lib64:/rootfs/usr/lib64:/rootfs/usr/lib \
/rootfs/usr/sbin/chroot.coreutils /rootfs
You manually ran the dynamic linker against chroot.coreutils, to chroot
into /rootfs, within which I'm assuming it ran /bin/sh. Not sure what
that proved, you just chrooted _back_ without the mount --move a second
time.
#
```
After:
```
% sudo nsenter -m -t $$
#
```
Fixes #557
Signed-off-by: Dima Zavin <[email protected]>
---
toys/other/switch_root.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/toys/other/switch_root.c b/toys/other/switch_root.c
index 1c750608f..b63b92ec3 100644
--- a/toys/other/switch_root.c
+++ b/toys/other/switch_root.c
@@ -97,6 +97,12 @@ void switch_root_main(void)
// Ok, enough safety checks: wipe root partition.
dirtree_read("/", del_node);
+ // Fix the appearance of the mount table in the newroot chroot
+ if (mount(".", "/", NULL, MS_MOVE, NULL)) {
+ perror_msg("mount");
+ goto panic;
+ }
+
// Enter the new root before starting init
if (chroot(".")) {
perror_msg("chroot");
In theory the dirtree_read("/") is supposed to operate on "/" as well as
the children. In practice there's a sequencing issue with mounts being
under other mounts (which this is a trivial case of). If you have two
mount points arranged dir1/dir2, you need to move dir2 to /tmp, move
dir1 to new, and then move /tmp to new/dir1/dir2. (There's no MS_MOVE_ALL
flag I'm aware of.)
The easy fix for the current case is to DIRTREE_COMEAGAIN and handle all
the moves in the second callback, that way all children are handled
before their parents. (This avoids adding a second explicit mount() call
when the first mount() call can theoretically already handle it. Single
Point of Truth and all that...)
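To illustrate the "children before parents" ordering, here is a minimal standalone sketch. It does NOT use toybox's dirtree API: POSIX nftw() with FTW_DEPTH stands in for the DIRTREE_COMEAGAIN second callback, and the names (record_dir, walk_children_first, visited) are made up for the example. It only records the order in which directories are finished; a real version would presumably do its mount() work at that point instead.

```c
#define _XOPEN_SOURCE 700  // for nftw() and mkdtemp()
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

#define MAX_VISITS 64

// Directories in the order the walk finished them (hypothetical names).
static char visited[MAX_VISITS][256];
static int visit_count;

// nftw() callback: with FTW_DEPTH set, a directory is reported as FTW_DP
// only after everything beneath it has been reported.
static int record_dir(const char *path, const struct stat *st, int type,
  struct FTW *ftw)
{
  (void)st;
  (void)ftw;
  if (type == FTW_DP && visit_count < MAX_VISITS)
    snprintf(visited[visit_count++], sizeof(visited[0]), "%s", path);

  return 0;
}

// Walk top depth-first, recording each directory after its children.
int walk_children_first(const char *top)
{
  assert(top != NULL);
  visit_count = 0;

  return nftw(top, record_dir, 16, FTW_DEPTH | FTW_PHYS);
}
```

For a tree laid out as top/dir1/dir2, the recorded order is top/dir1/dir2, then top/dir1, then top, so the innermost directory is always handled first.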
This doesn't solve the larger problem (ala /dev being a devtmpfs and
/dev/pts being a devpts), but might address _this_ issue without adding
significant code.
Do I _want_ to try to fix the larger issue? I'd need an arbitrary number
of mountpoints to hold arbitrarily deep trees while moving them, and I'm
not guaranteed to have any writeable space to mkdir in. That's why I
didn't try to tackle it before. In theory "switch_root before doing your
setup" has been the order of the day... in which case you don't need to
care about any child mounts, you just want to swap two mounts the way
pivot_root does.
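For reference, the simple case with no child mounts to care about might look roughly like the sketch below. It follows the same order as the patch (MS_MOVE onto /, then chroot), but it is a standalone illustration rather than the code in switch_root.c, and enter_new_root is a hypothetical name. It assumes new_root is already a mount point and that the caller has CAP_SYS_ADMIN.

```c
#define _DEFAULT_SOURCE  // for chroot()
#include <assert.h>
#include <sys/mount.h>
#include <unistd.h>

// Hypothetical helper (not toybox's switch_root_main): make new_root the
// root of the current mount namespace, then chroot into it.
// Returns 0 on success, -1 on failure with errno set by the failing call.
int enter_new_root(const char *new_root)
{
  assert(new_root != NULL);

  // Work relative to the new root so "." names it in the calls below.
  if (chdir(new_root)) return -1;

  // Move the mount at "." on top of "/" so the mount table matches what
  // the chroot will see.
  if (mount(".", "/", NULL, MS_MOVE, NULL)) return -1;

  // Enter the new root, then make sure the cwd is inside it.
  if (chroot(".")) return -1;

  return chdir("/");
}
```

Each step bails out on the first failure, so a caller can report errno and fall back to whatever panic path it already has.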
Would the simpler non-recursive version break anybody? I have no idea.
The case where you'd want to move /dev is if CONFIG_DEVTMPFS_MOUNT worked,
but the kernel guys have refused
https://landley.net/bin/mkroot/0.8.13/linux-patches/0003-Wire-up-CONFIG_DEVTMPFS_MOUNT-to-initramfs.patch
and friends for NINE YEARS now. (Which is why the stupid "static
initramfs has no stdin/stdout/stderr when it launches PID 1" bug keeps
cropping back up, because the kernel has inconsistent behavior in
different codepaths...)
Are there existing users who would be broken by doing less, or is everybody
just calling switch_root as the first thing and then having the "real"
init script live in the new filesystem?
Rob