On Tue, 09.10.12 00:28, Marti Raudsepp ([email protected]) wrote:

> Hi list,
>
> Recently I upgraded to Gnome 3.6 on my Arch Linux desktop, but
> gnome-session didn't work no matter what I tried. Ages of debugging
> later, strace revealed this:
>
> [pid 2063] open("/proc/1/cgroup", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
> [...]
> [pid 2063] writev(2, [{"gnome-session[2063]: WARNING: Could not get session id for session. Check that logind is properly in"..., 150}], 1) = 150
>
> Turns out it happens because I was mounting /proc with hidepid=2 on my
> systems. It's a nice security feature introduced in Linux 3.3 which
> hides all other users' processes from unprivileged users.
>
> Jan Steffens pointed out that this open call actually comes from
> systemd's sd-login. What's the reason why sd-login needs to poke
> around in init's cgroups? It's being called by sd_pid_get_owner_uid
> and sd_pid_get_session, but I'm not entirely clear what's happening in
> that code.

So, basically, this is because cgroups are not virtualized for
containers. To figure out which service cgroup a process is in, we
need to find the cgroup of our entire system, so that we can chop off
the head of the full cgroup path. This is necessary to make systemd
work nicely inside of containers.

Example:

A) On the host, sshd is in /system/sshd.service.

B) A container is in /user/lennart/1/nspawn-4711/, i.e. the
   container's PID 1 is in this group.

C) sshd inside that container is in
   /user/lennart/1/nspawn-4711/system/sshd.service.

To determine the service of a process we hence detect the cgroup of
PID 1, then chop that off the process's full cgroup path, then remove
the /system, and there we are. This logic hence makes unit cgroups
work fine regardless of how deeply our hierarchies are nested.

> AFAICT on regular systems, init's cgroup is always "/system", in which
> case it gets ignored entirely by the code. Would be safe assume that
> on failure to open? Are there any other ways to solve this?

Well, such a constant fallback would fix your immediate problem but
would leave things broken for container setups. I'd really like to see
this fixed properly in some way.

My first reaction would be to see hidepid= fixed so that the kernel
does not hide PID 1 (it's kinda stupid to hide it anyway, simply
because it's the ancestor of all processes, and is quite frequently
assumed to exist).

Also note that systemd actually relies on /proc/1 being around for
other purposes too, for example, to detect chroots, to read system env
vars and other things.

Dunno. Other ideas?

Lennart

-- 
Lennart Poettering - Red Hat, Inc.
_______________________________________________
systemd-devel mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
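
For illustration, the prefix-chopping described in the A/B/C example
above can be sketched roughly as follows. This is not systemd's actual
code: the function names are made up for this sketch, the cgroup paths
are passed in as plain strings rather than read from /proc, and the
handling of a PID 1 that itself sits in "/system" is an assumption
based on the quoted remark that this case is ignored by the code.

```python
from typing import Optional


def system_root(init_cgroup: str) -> str:
    """Return the cgroup prefix of the whole system, given PID 1's cgroup.

    Hypothetical helper: a trailing "/system" on PID 1's own cgroup is
    dropped, so "/" and "/system" both yield the empty prefix.
    """
    root = init_cgroup.rstrip("/")
    if root.endswith("/system"):
        root = root[: -len("/system")]
    return root


def service_of(process_cgroup: str, init_cgroup: str) -> Optional[str]:
    """Return the service name for a process, or None if it is not in one."""
    root = system_root(init_cgroup)
    # The process must live below the system's own cgroup.
    if not process_cgroup.startswith(root + "/"):
        return None
    # Chop off the head of the path, i.e. the cgroup of the system.
    relative = process_cgroup[len(root):]
    # Then remove the leading "/system".
    prefix = "/system/"
    if not relative.startswith(prefix):
        return None
    return relative[len(prefix):].split("/", 1)[0]


# A) sshd on the host, assuming PID 1 is in "/" there.
host = service_of("/system/sshd.service", "/")

# B) + C) sshd inside the container whose PID 1 is in the nspawn group.
container = service_of(
    "/user/lennart/1/nspawn-4711/system/sshd.service",
    "/user/lennart/1/nspawn-4711/",
)
```

Both calls give the same service name, which is the point: the result
does not depend on how deep the container's cgroup is nested. It also
shows why PID 1's cgroup has to be readable for this to work at all.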
