Dear all, first, thanks for the friendly and supportive help you all provide in issue trackers, on mailing lists, etc. – it is very helpful to find all this online.
However, I am struggling to run unprivileged containers (Debian Buster) on a Debian Buster host. LXC does not seem to mount the cgroup mount points for the container, so the container's systemd tries to mount them itself and fails due to insufficient permissions. The log reports no writable cgroup hierarchies and no available controllers; could there be a common cause?

I have not opened an issue so far, since I am not sure whether I am just doing something wrong or whether there is an actual bug. If we find an actual bug, I will of course move this to the issue tracker. Please find all the configuration dumps and logs below.

IIRC, I tried to run the script provided in https://github.com/lxc/lxc/issues/1998#issuecomment-353241255, as well as various other things, without success. However, I am unsure how the available information applies, since a few things changed in LXC 3, no? And systemd seems to be a moving target as well.

Also, I am working on an Ansible automation to set up a host that can run unprivileged containers. This will be publicly available once everything works.

Cheers,
Lukas

========================================================================

symptom
-------

``lxc-start -n rproxy -l TRACE -o lxc.log -F``::

    Failed to mount cgroup at /sys/fs/cgroup/systemd: Permission denied
    [!!!!!!] Failed to mount API filesystems.
    Exiting PID 1...

``lxc.log``: https://bin.privacytools.io/?28c8377e545ce6a9#9I2a28JuaYf7yHDNIxtxCQxox6LvTrxT4l4scUDgQNc=

host details
------------

* ``cat /etc/debian_version``: 10.0
* ``lxc-start --version``: 3.0.3
* ``lxc-checkconfig``::

    Kernel configuration not found at /proc/config.gz; searching...
    Kernel configuration found at /boot/config-4.19.0-5-amd64
    --- Namespaces ---
    Namespaces: enabled
    Utsname namespace: enabled
    Ipc namespace: enabled
    Pid namespace: enabled
    User namespace: enabled
    Network namespace: enabled

    --- Control groups ---
    Cgroups: enabled

    Cgroup v1 mount points:
    /sys/fs/cgroup/systemd
    /sys/fs/cgroup/memory
    /sys/fs/cgroup/cpuset
    /sys/fs/cgroup/cpu,cpuacct
    /sys/fs/cgroup/blkio
    /sys/fs/cgroup/net_cls,net_prio
    /sys/fs/cgroup/perf_event
    /sys/fs/cgroup/rdma
    /sys/fs/cgroup/freezer
    /sys/fs/cgroup/devices
    /sys/fs/cgroup/pids

    Cgroup v2 mount points:
    /sys/fs/cgroup/unified

    Cgroup v1 clone_children flag: enabled
    Cgroup device: enabled
    Cgroup sched: enabled
    Cgroup cpu account: enabled
    Cgroup memory controller: enabled
    Cgroup cpuset: enabled

    --- Misc ---
    Veth pair device: enabled, not loaded
    Macvlan: enabled, not loaded
    Vlan: enabled, not loaded
    Bridges: enabled, loaded
    Advanced netfilter: enabled, loaded
    CONFIG_NF_NAT_IPV4: enabled, loaded
    CONFIG_NF_NAT_IPV6: enabled, loaded
    CONFIG_IP_NF_TARGET_MASQUERADE: enabled, not loaded
    CONFIG_IP6_NF_TARGET_MASQUERADE: enabled, not loaded
    CONFIG_NETFILTER_XT_TARGET_CHECKSUM: enabled, not loaded
    CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled, loaded
    FUSE (for use with lxcfs): enabled, loaded

    --- Checkpoint/Restore ---
    checkpoint restore: enabled
    CONFIG_FHANDLE: enabled
    CONFIG_EVENTFD: enabled
    CONFIG_EPOLL: enabled
    CONFIG_UNIX_DIAG: enabled
    CONFIG_INET_DIAG: enabled
    CONFIG_PACKET_DIAG: enabled
    CONFIG_NETLINK_DIAG: enabled
    File capabilities:

    Note : Before booting a new kernel, you can check its configuration
    usage : CONFIG=/path/to/config /usr/bin/lxc-checkconfig

* ``uname -a``: Linux hive 4.19.0-5-amd64 #1 SMP Debian 4.19.37-3 (2019-05-15) x86_64 GNU/Linux
* ``cat /proc/self/cgroup``::

    11:pids:/user.slice/user-1000.slice/session-4.scope
    10:devices:/user.slice
    9:freezer:/user/lxc/0
    8:rdma:/
    7:perf_event:/
    6:net_cls,net_prio:/
    5:blkio:/user.slice
    4:cpu,cpuacct:/user/lxc/0
    3:cpuset:/user/lxc/0
    2:memory:/user/lxc/0
    1:name=systemd:/user/lxc/0
    0::/user.slice/user-1000.slice/session-4.scope/user/lxc/0

* ``cat /proc/self/mountinfo``::

    20 25 0:19 / /sys rw,nosuid,nodev,noexec,relatime shared:7 - sysfs sysfs rw
    21 25 0:4 / /proc rw,relatime shared:14 - proc proc rw,hidepid=2
    22 25 0:6 / /dev rw,nosuid,relatime shared:2 - devtmpfs udev rw,size=6134028k,nr_inodes=1533507,mode=755
    23 22 0:20 / /dev/pts rw,nosuid,noexec,relatime shared:3 - devpts devpts rw,gid=5,mode=620,ptmxmode=000
    24 25 0:21 / /run rw,nosuid,noexec,relatime shared:5 - tmpfs tmpfs rw,size=1229916k,mode=755
    25 0 0:22 / / rw,relatime shared:1 - btrfs /dev/sda4 rw,compress=lzo,space_cache,user_subvol_rm_allowed,subvolid=5,subvol=/
    26 20 0:7 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:8 - securityfs securityfs rw
    27 22 0:24 / /dev/shm rw,nosuid,nodev shared:4 - tmpfs tmpfs rw
    28 24 0:25 / /run/lock rw,nosuid,nodev,noexec,relatime shared:6 - tmpfs tmpfs rw,size=5120k
    29 20 0:26 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755
    30 29 0:27 / /sys/fs/cgroup/unified rw,nosuid,nodev,noexec,relatime shared:10 - cgroup2 cgroup2 rw
    31 29 0:28 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,xattr,name=systemd
    32 20 0:29 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:12 - pstore pstore rw
    33 20 0:30 / /sys/fs/bpf rw,nosuid,nodev,noexec,relatime shared:13 - bpf bpf rw,mode=700
    34 29 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,memory
    35 29 0:32 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,cpuset,clone_children
    36 29 0:33 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpu,cpuacct
    37 29 0:34 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,blkio
    38 29 0:35 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,net_cls,net_prio
    39 29 0:36 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:20 - cgroup cgroup rw,perf_event
    40 29 0:37 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,rdma
    41 29 0:38 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:22 - cgroup cgroup rw,freezer
    42 29 0:39 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,devices
    43 29 0:40 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,pids
    45 22 0:18 / /dev/mqueue rw,relatime shared:25 - mqueue mqueue rw
    44 22 0:41 / /dev/hugepages rw,relatime shared:26 - hugetlbfs hugetlbfs rw,pagesize=2M
    46 20 0:8 / /sys/kernel/debug rw,relatime shared:27 - debugfs debugfs rw
    47 21 0:42 / /proc/sys/fs/binfmt_misc rw,relatime shared:28 - autofs systemd-1 rw,fd=41,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=1678
    230 25 0:49 / /var/lib/lxcfs rw,nosuid,nodev,relatime shared:122 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
    245 20 0:50 / /sys/fs/fuse/connections rw,relatime shared:161 - fusectl fusectl rw
    266 24 0:51 / /run/user/1000 rw,nosuid,nodev,relatime shared:169 - tmpfs tmpfs rw,size=1229912k,mode=700,uid=1000,gid=1000

* ``grep cgfs /etc/pam.d/common-session*``::

    session optional pam_cgfs.so -c freezer,memory,cpu,cpuset,cpuacct,unified,name=systemd
    session optional pam_cgfs.so -c freezer,memory,cpu,cpuset,cpuacct,unified,name=systemd

container config
----------------

* ``cat rproxy/config``::

    lxc.include = /home/lxc/.config/lxc/common.conf
    lxc.uts.name = rproxy
    lxc.rootfs.path = btrfs:/home/lxc/rproxy/rootfs
    lxc.net.0.link = lxc-br-rproxy
    lxc.net.0.ipv6.address = fd00::2/16

* ``cat /home/lxc/.config/lxc/common.conf``::

    lxc.include = /usr/share/lxc/config/common.conf
    lxc.include = /usr/share/lxc/config/userns.conf
    lxc.include = /etc/lxc/default.conf
    lxc.apparmor.profile = unconfined
    lxc.arch = x86_64
    lxc.start.auto = 1
    lxc.start.delay = 20
    lxc.net.0.type = veth
    lxc.net.0.name = eth0
    lxc.net.0.flags = up
    lxc.net.0.ipv6.gateway = auto
    lxc.idmap = u 0 165536 65536
    lxc.idmap = g 0 165536 65536

* ``/etc/lxc/default.conf``::

    lxc.net.0.type = empty
    lxc.apparmor.profile = generated
    lxc.apparmor.allow_nesting = 1

* ``cat /usr/share/lxc/config/userns.conf``::

    lxc.cgroup.devices.deny =
    lxc.cgroup.devices.allow =
    lxc.cap.drop =
    lxc.cap.keep =
    lxc.tty.dir =
    lxc.mount.auto = sys:rw

* ``cat /usr/share/lxc/config/common.conf``::

    # Setup the LXC devices in /dev/lxc/
    lxc.tty.dir = lxc
    # Allow for 1024 pseudo terminals
    lxc.pty.max = 1024
    # Setup 4 tty devices
    lxc.tty.max = 4
    # Drop some harmful capabilities
    lxc.cap.drop = mac_admin mac_override sys_time sys_module sys_rawio
    # Ensure hostname is changed on clone
    lxc.hook.clone = /usr/share/lxc/hooks/clonehostname
    # CGroup whitelist
    lxc.cgroup.devices.deny = a
    ## Allow any mknod (but not reading/writing the node)
    lxc.cgroup.devices.allow = c *:* m
    lxc.cgroup.devices.allow = b *:* m
    ## Allow specific devices
    ### /dev/null
    lxc.cgroup.devices.allow = c 1:3 rwm
    ### /dev/zero
    lxc.cgroup.devices.allow = c 1:5 rwm
    ### /dev/full
    lxc.cgroup.devices.allow = c 1:7 rwm
    ### /dev/tty
    lxc.cgroup.devices.allow = c 5:0 rwm
    ### /dev/console
    lxc.cgroup.devices.allow = c 5:1 rwm
    ### /dev/ptmx
    lxc.cgroup.devices.allow = c 5:2 rwm
    ### /dev/random
    lxc.cgroup.devices.allow = c 1:8 rwm
    ### /dev/urandom
    lxc.cgroup.devices.allow = c 1:9 rwm
    ### /dev/pts/*
    lxc.cgroup.devices.allow = c 136:* rwm
    ### fuse
    lxc.cgroup.devices.allow = c 10:229 rwm
    lxc.mount.auto = cgroup:mixed proc:mixed sys:mixed
    lxc.mount.entry = /sys/fs/fuse/connections sys/fs/fuse/connections none bind,optional 0 0
    lxc.seccomp.profile = /usr/share/lxc/config/common.seccomp
    lxc.include = /usr/share/lxc/config/common.conf.d/

* ``grep lxc /etc/sub{g,u}id``::

    /etc/subgid:lxc:165536:65536
    /etc/subuid:lxc:165536:65536

* ``umask``: 077
* I also tried this (overkill) approach to make the cgroups writable (I guess?), without success::

    for x in `find /sys/fs/cgroup -name lxc`; do echo; echo $x; chgrp -R lxc $x; chmod g+rw $x; done