Hi,
I hit the same issue.
I upgraded from 1:4.0.4-6 to 1:4.0.5-2, and from kernel 5.9.0-4-amd64 to
5.10.0-2-amd64, and some of my containers that used to work no longer start.
The ones that still work don't drop sys_admin.
Running lxc-start under strace, I see this:
openat2(33</usr/lib/x86_64-linux-gnu/lxc/rootfs>, "/sys/fs/cgroup",
{flags=O_RDONLY|O_CLOEXEC|O_PATH,
resolve=RESOLVE_NO_XDEV|RESOLVE_NO_MAGICLINKS|RESOLVE_NO_SYMLINKS|RESOLVE_BENEATH},
24) = -1 EXDEV (Invalid cross-device link)
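For anyone who wants to try the same call outside lxc-start, here is a minimal
Python sketch that issues openat2() through ctypes. Assumptions: Linux 5.6 or
later, syscall number 437 (the x86_64 value), and the RESOLVE_* constants from
<linux/openat2.h>. The names openat2(), OpenHow and LXC_RESOLVE are mine, not
anything from LXC. Per openat2(2), EXDEV comes back either when RESOLVE_NO_XDEV
is set and a path component crosses a mount point, or when RESOLVE_BENEATH is
set and resolution would escape dirfd (an absolute pathname counts as an
escape).

```python
import ctypes
import os

SYS_OPENAT2 = 437  # assumption: syscall number on x86_64

# RESOLVE_* flags as defined in <linux/openat2.h>
RESOLVE_NO_XDEV = 0x01
RESOLVE_NO_MAGICLINKS = 0x02
RESOLVE_NO_SYMLINKS = 0x04
RESOLVE_BENEATH = 0x08

# The resolve mask seen in the strace output above.
LXC_RESOLVE = (RESOLVE_NO_XDEV | RESOLVE_NO_MAGICLINKS
               | RESOLVE_NO_SYMLINKS | RESOLVE_BENEATH)


class OpenHow(ctypes.Structure):
    """Mirror of struct open_how: three 64-bit fields, 24 bytes in total."""
    _fields_ = [
        ("flags", ctypes.c_uint64),
        ("mode", ctypes.c_uint64),
        ("resolve", ctypes.c_uint64),
    ]


def openat2(dirfd, path, flags, resolve):
    """Call openat2(2) directly; return the new fd or raise OSError."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    how = OpenHow(flags=flags, mode=0, resolve=resolve)
    fd = libc.syscall(
        ctypes.c_long(SYS_OPENAT2),
        ctypes.c_int(dirfd),
        ctypes.c_char_p(os.fsencode(path)),
        ctypes.byref(how),
        ctypes.c_size_t(ctypes.sizeof(how)),
    )
    if fd < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)
    return fd
```

Calling it with a directory fd opened with os.O_PATH, the path from the trace,
and flags os.O_RDONLY | os.O_CLOEXEC | os.O_PATH should raise OSError with
errno.EXDEV if the same condition is hit.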
The corresponding message from lxc-start with loglevel debug is:
lxc-start unifiadmin 20210125231743.129 ERROR conf -
conf.c:lxc_mount_auto_mounts:727 - Invalid cross-device link - Failed to mount
"/sys/fs/cgroup"
Some context from lxc-start log output:
lxc-start unifiadmin 20210125231742.854 INFO start - start.c:lxc_init:837 -
Container "unifiadmin" is initialized
lxc-start unifiadmin 20210125231742.876 WARN cgfsng -
cgroups/cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create
directory "/sys/fs/cgroup/cpuset//lxc.monitor.unifiadmin"
lxc-start unifiadmin 20210125231742.886 INFO cgfsng -
cgroups/cgfsng.c:cgfsng_monitor_create:1368 - The monitor process uses
"lxc.monitor.unifiadmin" as cgroup
lxc-start unifiadmin 20210125231742.904 WARN cgfsng -
cgroups/cgfsng.c:mkdir_eexist_on_last:1152 - File exists - Failed to create
directory "/sys/fs/cgroup/cpuset//lxc.payload.unifiadmin"
lxc-start unifiadmin 20210125231742.916 INFO cgfsng -
cgroups/cgfsng.c:cgfsng_payload_create:1471 - The container process uses
"lxc.payload.unifiadmin" as cgroup
lxc-start unifiadmin 20210125231742.944 INFO start - start.c:lxc_spawn:1700
- Cloned CLONE_NEWNS
lxc-start unifiadmin 20210125231742.944 INFO start - start.c:lxc_spawn:1700
- Cloned CLONE_NEWPID
lxc-start unifiadmin 20210125231742.945 INFO start - start.c:lxc_spawn:1700
- Cloned CLONE_NEWUTS
lxc-start unifiadmin 20210125231742.945 INFO start - start.c:lxc_spawn:1700
- Cloned CLONE_NEWIPC
lxc-start unifiadmin 20210125231742.945 INFO start - start.c:lxc_spawn:1700
- Cloned CLONE_NEWNET
lxc-start unifiadmin 20210125231742.945 DEBUG start -
start.c:lxc_try_preserve_namespaces:166 - Preserved mnt namespace via fd 31
lxc-start unifiadmin 20210125231742.945 DEBUG start -
start.c:lxc_try_preserve_namespaces:166 - Preserved pid namespace via fd 32
lxc-start unifiadmin 20210125231742.946 DEBUG start -
start.c:lxc_try_preserve_namespaces:166 - Preserved uts namespace via fd 33
lxc-start unifiadmin 20210125231742.946 DEBUG start -
start.c:lxc_try_preserve_namespaces:166 - Preserved ipc namespace via fd 34
lxc-start unifiadmin 20210125231742.946 DEBUG start -
start.c:lxc_try_preserve_namespaces:166 - Preserved net namespace via fd 35
lxc-start unifiadmin 20210125231742.949 INFO cgfsng -
cgroups/cgfsng.c:cgfsng_setup_limits_legacy:2881 - Limits for the legacy cgroup
hierarchies have been setup
lxc-start unifiadmin 20210125231742.955 WARN cgfsng -
cgroups/cgfsng.c:cgfsng_setup_limits:2942 - Invalid argument - Ignoring cgroup2
limits on legacy cgroup system
lxc-start unifiadmin 20210125231743.315 INFO network -
network.c:instantiate_veth:285 - Retrieved mtu 1500 from intra
lxc-start unifiadmin 20210125231743.666 INFO network -
network.c:instantiate_veth:333 - Attached "veth-unifi" to bridge "intra"
lxc-start unifiadmin 20210125231743.687 DEBUG network -
network.c:instantiate_veth:449 - Instantiated veth tunnel "veth-unifi <-->
vethv7jzuF"
lxc-start unifiadmin 20210125231743.699 WARN start - start.c:do_start:1166
- Using /dev/null from the host for container init's standard file descriptors.
Migration will not work
lxc-start unifiadmin 20210125231743.704 INFO start - start.c:do_start:1198
- Unshared CLONE_NEWCGROUP
lxc-start unifiadmin 20210125231743.731 DEBUG storage -
storage/storage.c:get_storage_by_name:211 - Detected rootfs type "dir"
lxc-start unifiadmin 20210125231743.734 DEBUG conf -
conf.c:lxc_mount_rootfs:1259 - Mounted rootfs "/var/lib/lxc/unifiadmin/rootfs"
onto "/usr/lib/x86_64-linux-gnu/lxc/rootfs" with options "(null)"
lxc-start unifiadmin 20210125231743.738 INFO conf -
conf.c:setup_utsname:751 - Set hostname to "unifiadmin"
lxc-start unifiadmin 20210125231743.740 DEBUG network -
network.c:lxc_network_setup_in_child_namespaces_common:3510 - Network device ""
has been setup
lxc-start unifiadmin 20210125231743.977 DEBUG network -
network.c:setup_hw_addr:3360 - Mac address "00:16:3e:11:22:33" on "eth0" has
been setup
lxc-start unifiadmin 20210125231743.103 DEBUG network -
network.c:lxc_network_setup_in_child_namespaces_common:3510 - Network device
"eth0" has been setup
lxc-start unifiadmin 20210125231743.103 INFO network -
network.c:lxc_setup_network_in_child_namespaces:3532 - Network has been setup
lxc-start unifiadmin 20210125231743.116 DEBUG conf - conf.c:mount_entry:1943
- Remounting "/shared/cache/apt/lists" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/var/lib/apt/lists" to respect bind or
remount options
lxc-start unifiadmin 20210125231743.116 DEBUG conf - conf.c:mount_entry:1962
- Flags for "/shared/cache/apt/lists" were 1038, required extra flags are 14
lxc-start unifiadmin 20210125231743.117 DEBUG conf - conf.c:mount_entry:1971
- Mountflags already were 5134, skipping remount
lxc-start unifiadmin 20210125231743.117 DEBUG conf - conf.c:mount_entry:2006
- Mounted "/shared/cache/apt/lists" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/var/lib/apt/lists" with filesystem type
"none"
lxc-start unifiadmin 20210125231743.118 DEBUG conf - conf.c:mount_entry:1943
- Remounting "/shared/cache/apt/archives" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/var/cache/apt/archives" to respect bind
or remount options
lxc-start unifiadmin 20210125231743.118 DEBUG conf - conf.c:mount_entry:1962
- Flags for "/shared/cache/apt/archives" were 1038, required extra flags are 14
lxc-start unifiadmin 20210125231743.118 DEBUG conf - conf.c:mount_entry:1971
- Mountflags already were 5134, skipping remount
lxc-start unifiadmin 20210125231743.118 DEBUG conf - conf.c:mount_entry:2006
- Mounted "/shared/cache/apt/archives" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/var/cache/apt/archives" with filesystem
type "none"
lxc-start unifiadmin 20210125231743.119 DEBUG conf - conf.c:mount_entry:1943
- Remounting "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/null" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/kcore" to respect bind or remount
options
lxc-start unifiadmin 20210125231743.119 DEBUG conf - conf.c:mount_entry:1962
- Flags for "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/null" were 1024, required
extra flags are 0
lxc-start unifiadmin 20210125231743.120 DEBUG conf - conf.c:mount_entry:1971
- Mountflags already were 4096, skipping remount
lxc-start unifiadmin 20210125231743.120 DEBUG conf - conf.c:mount_entry:2006
- Mounted "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/null" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/proc/kcore" with filesystem type "none"
lxc-start unifiadmin 20210125231743.123 DEBUG conf - conf.c:mount_entry:1943
- Remounting "/sys/fs/fuse/connections" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" to respect bind
or remount options
lxc-start unifiadmin 20210125231743.123 DEBUG conf - conf.c:mount_entry:1962
- Flags for "/sys/fs/fuse/connections" were 4110, required extra flags are 14
lxc-start unifiadmin 20210125231743.123 DEBUG conf - conf.c:mount_entry:2006
- Mounted "/sys/fs/fuse/connections" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" with filesystem
type "none"
lxc-start unifiadmin 20210125231743.125 DEBUG conf - conf.c:mount_entry:1943
- Remounting "/sys/fs/fuse/connections" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" to respect bind
or remount options
lxc-start unifiadmin 20210125231743.125 DEBUG conf - conf.c:mount_entry:1962
- Flags for "/sys/fs/fuse/connections" were 4110, required extra flags are 14
lxc-start unifiadmin 20210125231743.125 DEBUG conf - conf.c:mount_entry:2006
- Mounted "/sys/fs/fuse/connections" on
"/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/fuse/connections" with filesystem
type "none"
lxc-start unifiadmin 20210125231743.127 DEBUG conf - conf.c:mount_entry:2006
- Mounted "run" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/run" with filesystem
type "tmpfs"
lxc-start unifiadmin 20210125231743.128 DEBUG conf - conf.c:mount_entry:2006
- Mounted "none" on "/usr/lib/x86_64-linux-gnu/lxc/rootfs/dev/shm" with
filesystem type "tmpfs"
lxc-start unifiadmin 20210125231743.129 ERROR conf -
conf.c:lxc_mount_auto_mounts:727 - Invalid cross-device link - Failed to mount
"/sys/fs/cgroup"
lxc-start unifiadmin 20210125231743.130 ERROR conf - conf.c:lxc_setup:3365 -
Failed to setup remaining automatic mounts
lxc-start unifiadmin 20210125231743.130 ERROR start - start.c:do_start:1218
- Failed to setup container "unifiadmin"
lxc-start unifiadmin 20210125231743.131 ERROR sync - sync.c:__sync_wait:36 -
An error occurred in another process (expected sequence number 5)
lxc-start unifiadmin 20210125231743.132 DEBUG network -
network.c:lxc_delete_network:3665 - Deleted network devices
lxc-start unifiadmin 20210125231743.133 ERROR lxccontainer -
lxccontainer.c:wait_on_daemonized_start:859 - Received container state
"ABORTING" instead of "RUNNING"
lxc-start unifiadmin 20210125231743.134 ERROR lxc_start -
tools/lxc_start.c:main:308 - The container failed to start
lxc-start unifiadmin 20210125231743.135 ERROR lxc_start -
tools/lxc_start.c:main:311 - To get more details, run the container in
foreground mode
lxc-start unifiadmin 20210125231743.135 ERROR start -
start.c:__lxc_start:1999 - Failed to spawn container "unifiadmin"
lxc-start unifiadmin 20210125231743.136 ERROR lxc_start -
tools/lxc_start.c:main:313 - Additional information can be obtained by setting
the --logfile and --logpriority options
lxc-start unifiadmin 20210125231743.136 WARN start - start.c:lxc_abort:1012
- No such process - Failed to send SIGKILL via pidfd 30 for process 15227
lxc-start unifiadmin 20210125231743.748 INFO conf -
conf.c:run_script_argv:342 - Executing script
"/usr/share/lxcfs/lxc.reboot.hook" for container "unifiadmin"
lxc-start unifiadmin 20210125231744.288 INFO conf -
conf.c:run_script_argv:342 - Executing script
"/usr/share/lxcfs/lxc.reboot.hook" for container "unifiadmin"
If I don't drop the sys_admin capability, it works again.
Before the upgrade, it also worked if I dropped sys_admin.
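Since the debug log above is long, here is a small Python sketch that keeps
only the ERROR entries. The line layout in the regex is inferred from the
output in this mail, and it assumes one entry per line as in the real logfile
(not wrapped as in this mail). The names LOG_LINE and errors_only() are
made up for illustration.

```python
import re

# Assumed layout of one unwrapped log line:
#   lxc-start <container> <timestamp> <LEVEL> <subsystem> - <file:func:line> - <message>
LOG_LINE = re.compile(
    r"^lxc-start (?P<name>\S+) (?P<ts>\d{14}\.\d{3}) (?P<level>[A-Z]+)\s+"
    r"(?P<subsys>\S+) - (?P<where>\S+) - (?P<msg>.*)$"
)


def errors_only(lines):
    """Yield (timestamp, location, message) for every ERROR entry."""
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and m.group("level") == "ERROR":
            yield m.group("ts"), m.group("where"), m.group("msg")
```

Feeding it the lines of the file given to --logfile should leave just the
chain of failures that starts at lxc_mount_auto_mounts.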
The config file for this guest is:
----- 8< -----
lxc.include = /usr/share/lxc/config/common.conf
lxc.apparmor.profile = generated
lxc.apparmor.allow_nesting = 0
lxc.hook.version = 1
lxc.mount.entry = run run tmpfs
rw,nodev,relatime,mode=755,size=20m,create=dir 0 0
lxc.mount.entry = none dev/shm tmpfs
rw,nosuid,nodev,mode=1777,size=100m,create=dir 0 0
lxc.cap.drop = sys_resource audit_write block_suspend linux_immutable mac_admin
mac_override sys_admin sys_module sys_pacct sys_rawio sys_resource sys_time
sys_tty_config syslog
lxc.start.auto = 1
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 1:7 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 10:229 rwm
lxc.cgroup.devices.allow = c 254:0 rm
lxc.cgroup.devices.allow = c 10:200 rwm
lxc.cgroup.devices.allow = c 10:228 rwm
lxc.cgroup.devices.allow = c 10:232 rwm
lxc.autodev = 0
lxc.tty.dir =
lxc.tty.max = 0
# Container specific configuration
lxc.rootfs.path = dir:/var/lib/lxc/unifiadmin/rootfs
lxc.uts.name = unifiadmin
lxc.arch = amd64
# Network configuration
lxc.net.0.type = empty
lxc.net.1.type = veth
lxc.net.1.link = intra
lxc.net.1.flags = up
lxc.net.1.name = eth0
lxc.net.1.veth.pair = veth-unifi
lxc.net.1.hwaddr = 00:16:3e:11:22:33
lxc.mount.fstab = /var/lib/lxc/unifiadmin/fstab
----- >8 -----
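To check which of several containers drop sys_admin, something like the
following Python sketch could be run over each config. It is only an
illustration: dropped_caps() is a made-up helper, it does not follow
lxc.include, and it expects each lxc.cap.drop value on a single line (as in
the file on disk, not wrapped as in this mail). Treating an empty value as
"clear the list" follows my reading of lxc.container.conf(5).

```python
def dropped_caps(config_text):
    """Return the set of capabilities named by lxc.cap.drop lines."""
    caps = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not "key = value".
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key != "lxc.cap.drop":
            continue
        if value:
            caps.update(value.lower().split())
        else:
            # Assumption: an empty value resets the drop list.
            caps.clear()
    return caps
```

For the config above, "sys_admin" in dropped_caps(text) would be true, which
matches the containers that stopped working after the upgrade.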
The fstab doesn't reference cgroup or /sys.
I googled around and found this post from 2012:
https://lists.linuxfoundation.org/pipermail/containers/2012-November/030827.html
Based on this, maybe the problem is that cap_sys_admin is now dropped too
early?
Also, https://github.com/lxc/lxc/issues/1737 looks related.
https://blog.iwakd.de/lxc-cap_sys_admin-jessie also suggests that running
containers without cap_sys_admin used to be possible.
Or maybe I should be using cgroup2?
FWIW, both host and guest use runit, so systemd is not involved; runit doesn't
interfere with cgroups or capabilities on its own in any way.
András
--
Nothing screws up a good story like an eyewitness.