On Thu, 08.01.15 18:55, Stéphane Graber (stgra...@ubuntu.com) wrote:

> On Fri, Jan 09, 2015 at 12:39:23AM +0100, Lennart Poettering wrote:
> > On Thu, 08.01.15 15:33, Stéphane Graber (stgra...@ubuntu.com) wrote:
> >
> > > As far as I know there's no obvious way to detect this case (well,
> > > short of trying a bunch of restricted syscalls). The only way I'm
> > > aware of is by comparing the target of /proc/self/ns/user to that of
> > > /proc/<real host pid 1>/ns/user which is doable at the host level
> > > but isn't once you are in a container with your own pid namespace
> > > (which since we're talking about pid 1 systemd there can probably be
> > > assumed).
> >
> > Hmm, if this is so unreliable to detect maybe we shouldn't after all.
> >
> > Given that git is no longer fatally failing if it cannot write to oom
> > adjust I think all is good now?
>
> Yeah, I think we're good for now. I've got systemd running fine in an
> unprivileged container here, booting without problems to a shell and
> with all the basic services running as expected (and those I was
> expecting to fail, failed but didn't block the boot in any way).
>
> I expect we'll run into some more problems when dealing with units that
> start with their own view of /dev since mknod in a userns isn't allowed
> but I haven't run into one of those yet so it's not very high on my list.
>
> Once that happens, I expect we can solve it either by again just
> ignoring the failure or by catching the failure and falling back to
> doing a bind-mount of the device in question from the parent /dev (which
> works fine in a userns and is what we do today for nested containers
> with LXC).
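
For illustration only, the namespace comparison described in the quoted text could look roughly like the Python sketch below. It is not code from this thread or from systemd; the function name same_userns_as_pid1 is made up for the example. It reads both symlinks with os.readlink() and returns None when either one cannot be read (no /proc, no user namespace support, or no permission to inspect the other process). As the quoted text points out, inside a container with its own pid namespace /proc/1 is the container's own init, so the answer only means something where pid 1 is the real host init.

```python
import os


def same_userns_as_pid1(pid=1):
    """Compare our user namespace with that of `pid` (default: pid 1).

    Returns True or False when both links are readable, None otherwise.
    Hypothetical helper for illustration; not part of systemd or LXC.
    """
    try:
        mine = os.readlink("/proc/self/ns/user")
        theirs = os.readlink("/proc/%d/ns/user" % pid)
    except OSError:
        # Missing /proc, kernel without user namespaces, or no permission
        # to look at the other process: the comparison is not possible.
        return None
    # The link targets name the namespace by inode, so identical targets
    # mean both processes are in the same user namespace.
    return mine == theirs


if __name__ == "__main__":
    print(same_userns_as_pid1())
```

A None result is deliberately kept separate from False, since "cannot tell" is the common case for an unprivileged caller and should not be read as "different namespace".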

Note that most of systemd's own daemons use PrivateDevices=,
PrivateTmp= and suchlike by default, hence you couldn't really start
much if this didn't work...

A while back we added some changes to make permission problems with fs
namespaces graceful. This was done to support CAP_SYS_ADMIN-less
containers, which cannot even mount:

http://cgit.freedesktop.org/systemd/systemd/tree/src/core/execute.c#n1584

I figure userns containers are only slightly less limited than
CAP_SYS_ADMIN-less containers are, hence I think for most purposes you
should already be fine...

Lennart

-- 
Lennart Poettering, Red Hat