Hi all, I have a question about `systemd-nspawn` internals.
When creating the child process, it does something like:

    parent
    |
    clone(MOUNT)
    | `------------,
    |              outer_child()
    |              |
    |              clone(rest)
    |              | `------------,
    |              return         inner_child()
    | ,-----------'               |
    wait()                        |
    |                             exec()
    |                             |||
    |                             exit()
    | ,----------------------------'
    wait()

where in the first `clone()` it unshares the mount namespace, and in the second `clone()` it unshares all of the other namespaces (except for the cgroup namespace).

Initially, I was confused by the awkward dance with having two children; I couldn't imagine a reason why it is necessary to do this with a separate `inner_child` and `outer_child`; why can't everything be done in a single child process?:

    parent
    |
    clone(MOUNT)
    | `------------,
    |              child()
    |              |
    |              unshare(rest)
    |              |
    |              exec()
    |              |||
    |              exit()
    | ,------------'
    wait()

It has used the current two-child approach since user-namespace support was first completed in 03cfe0d5, which only has the brief commit message "nspawn: finish user namespace support"; so there aren't too many clues to be found in the commit log.

Part of the answer lies in the behavior of `unshare(CLONE_NEWPID)`. Unlike all of the other namespaces that may be unshared, calling `unshare(CLONE_NEWPID)` doesn't actually unshare the PID namespace in *this* process, it says to unshare the PID namespace at the next `fork()`/`clone()` call. So even if we changed `systemd-nspawn` to the `clone(MOUNT)/unshare(rest)` model, it would still have to `clone()` (or plain `fork()` at that point) a second, inner, child process.

So then, I'm left wondering why unsharing the PID namespace can't be moved up to the initial `clone()`, allowing everything else to be `unshare`(2)ed in the initial child process:

    parent
    |
    clone(MOUNT|PID)
    | `------------,
    |              child()
    |              |
    |              unshare(rest)
    |              |
    |              exec()
    |              |||
    |              exit()
    | ,------------'
    wait()

So my question becomes: what has to be done *after* unsharing the mount namespace, but *before* unsharing the PID namespace?
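For reference, here is a minimal sketch of the `unshare(CLONE_NEWPID)` behavior I describe above. It is not code from `systemd-nspawn`; the helper name `first_child_pid_after_unshare()` and its out-parameters are made up for illustration. It assumes Linux and needs `CAP_SYS_ADMIN` (e.g. run as root); without that, `unshare()` fails and the helper returns -1.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not nspawn code): unshare the PID namespace in the
 * calling process, fork once, and report the PID the child sees for itself.
 *
 * *caller_before / *caller_after receive the caller's own getpid() before
 * and after unshare(); the caller is not moved into the new namespace.
 *
 * Returns the child's view of its own PID, or -1 if any step fails
 * (for example EPERM when run without CAP_SYS_ADMIN).
 *
 * Note: this changes which PID namespace later children of the caller
 * land in, so call it at most once per process.
 */
int first_child_pid_after_unshare(pid_t *caller_before, pid_t *caller_after)
{
        int pipefd[2];
        pid_t child, seen = -1;

        if (pipe(pipefd) < 0)
                return -1;

        *caller_before = getpid();
        if (unshare(CLONE_NEWPID) < 0) {
                perror("unshare(CLONE_NEWPID)");
                close(pipefd[0]);
                close(pipefd[1]);
                return -1;
        }
        *caller_after = getpid();

        /* Only this child ends up in the new PID namespace. */
        child = fork();
        if (child < 0) {
                perror("fork");
                close(pipefd[0]);
                close(pipefd[1]);
                return -1;
        }
        if (child == 0) {
                pid_t self = getpid();
                if (write(pipefd[1], &self, sizeof(self)) != (ssize_t) sizeof(self))
                        _exit(1);
                _exit(0);
        }

        close(pipefd[1]);
        if (read(pipefd[0], &seen, sizeof(seen)) != (ssize_t) sizeof(seen))
                seen = -1;
        close(pipefd[0]);
        waitpid(child, NULL, 0);

        return (int) seen;
}
```

When called once from a privileged process, I would expect the two caller PIDs to be equal and the return value to be 1, since the forked child becomes the init process of the new PID namespace.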
-- 
Happy hacking,
~ Luke Shumaker