If I have umask 007 (or any other value that masks the world-execute bit) when I run lxc-start for the first time after logging in, my host system enters a state with the following problems:
* lxc-stop hangs forever instead of stopping any container, even one that wasn't started with umask 007.
* lxc-stop --kill --nolock hangs in the same way.
* Attempts to reboot or shut down the host system fail, requiring a hard reset to recover.

When lxc-stop hangs, messages like these appear in syslog every couple of minutes:

Nov 22 14:36:55 xenialbox kernel: [ 484.506570] INFO: task systemd:3086 blocked for more than 120 seconds.
Nov 22 14:36:55 xenialbox kernel: [ 484.506578]       Not tainted 4.9.0-040900rc6-generic #201611201731
Nov 22 14:36:55 xenialbox kernel: [ 484.506579] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 22 14:36:55 xenialbox kernel: [ 484.506582] systemd         D    0  3086   3076 0x00000104
Nov 22 14:36:55 xenialbox kernel: [ 484.506589]  ffff8eaa1810bc00 ffff8eaa58f4d800 ffff8eaa57f59d00 ffff8eaa5fc19340
Nov 22 14:36:55 xenialbox kernel: [ 484.506593]  ffff8eaa54bec880 ffff9bab80f17b78 ffffffff8bc87183 ffffffff8c244480
Nov 22 14:36:55 xenialbox kernel: [ 484.506596]  00ff8eaa57f59d00 ffff8eaa5fc19340 0000000000000000 ffff8eaa57f59d00
Nov 22 14:36:55 xenialbox kernel: [ 484.506600] Call Trace:
Nov 22 14:36:55 xenialbox kernel: [ 484.506622]  [<ffffffff8bc87183>] ? __schedule+0x233/0x6e0
Nov 22 14:36:55 xenialbox kernel: [ 484.506626]  [<ffffffff8bc87666>] schedule+0x36/0x80
Nov 22 14:36:55 xenialbox kernel: [ 484.506629]  [<ffffffff8bc8a4da>] rwsem_down_write_failed+0x20a/0x380
Nov 22 14:36:55 xenialbox kernel: [ 484.506634]  [<ffffffff8b463f3e>] ? kvm_sched_clock_read+0x1e/0x30
Nov 22 14:36:55 xenialbox kernel: [ 484.506643]  [<ffffffff8b6ba5e0>] ? kernfs_sop_show_options+0x40/0x40
Nov 22 14:36:55 xenialbox kernel: [ 484.506651]  [<ffffffff8b8265a7>] call_rwsem_down_write_failed+0x17/0x30
Nov 22 14:36:55 xenialbox kernel: [ 484.506655]  [<ffffffff8bc89b1d>] down_write+0x2d/0x40
Nov 22 14:36:55 xenialbox kernel: [ 484.506658]  [<ffffffff8b639fa0>] grab_super+0x30/0xa0
Nov 22 14:36:55 xenialbox kernel: [ 484.506661]  [<ffffffff8b63a58f>] sget_userns+0x18f/0x4d0
Nov 22 14:36:55 xenialbox kernel: [ 484.506663]  [<ffffffff8b6ba670>] ? kernfs_sop_show_path+0x50/0x50
Nov 22 14:36:55 xenialbox kernel: [ 484.506666]  [<ffffffff8b6ba89e>] kernfs_mount_ns+0x7e/0x230
Nov 22 14:36:55 xenialbox kernel: [ 484.506674]  [<ffffffff8b5230b8>] cgroup_mount+0x328/0x840
Nov 22 14:36:55 xenialbox kernel: [ 484.506679]  [<ffffffff8b6036f5>] ? alloc_pages_current+0x95/0x140
Nov 22 14:36:55 xenialbox kernel: [ 484.506682]  [<ffffffff8b63b578>] mount_fs+0x38/0x150
Nov 22 14:36:55 xenialbox kernel: [ 484.506686]  [<ffffffff8b659177>] vfs_kern_mount+0x67/0x110
Nov 22 14:36:55 xenialbox kernel: [ 484.506688]  [<ffffffff8b65baf1>] do_mount+0x1e1/0xcb0
Nov 22 14:36:55 xenialbox kernel: [ 484.506691]  [<ffffffff8b6330df>] ? __check_object_size+0xff/0x1d6
Nov 22 14:36:55 xenialbox kernel: [ 484.506695]  [<ffffffff8b60efe7>] ? kmem_cache_alloc_trace+0xd7/0x190
Nov 22 14:36:55 xenialbox kernel: [ 484.506697]  [<ffffffff8b65c8d3>] SyS_mount+0x83/0xd0
Nov 22 14:36:55 xenialbox kernel: [ 484.506700]  [<ffffffff8b403b6b>] do_syscall_64+0x5b/0xc0
Nov 22 14:36:55 xenialbox kernel: [ 484.506702]  [<ffffffff8bc8c46f>] entry_SYSCALL64_slow_path+0x25/0x25

When system shutdown hangs, similar messages appear on the console every couple of minutes.

I can reproduce this at will on both real hardware and a freshly-installed, fully-updated host OS in VirtualBox, with either an old-ish container or a new one, and with either the current Ubuntu LTS kernel or a mainline kernel.
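For concreteness, this is the sequence that triggers the hang for me (the container name here is just a placeholder):

    $ umask 0007                 # or any other umask that masks the world-execute bit
    $ lxc-start -n mycontainer   # first container start since logging in
    $ lxc-stop -n mycontainer    # hangs forever; --kill --nolock hangs the same way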
Here is the uname output of the two kernels I most recently used:

Linux xenialbox 4.4.0-47-generic #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Linux xenialbox 4.9.0-040900rc6-generic #201611201731 SMP Sun Nov 20 22:33:21 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

I'm running lxc 2.0.5-0ubuntu1~ubuntu16.04.2 on xubuntu 16.04.1 LTS amd64. My containers are all unprivileged.

My umask at container creation time does not seem to matter. As far as I have seen, my umask only matters the first time I start a container in my login session. I can work around the bug by manually setting my umask to something more permissive before I start my first container of the day, and then setting it back afterwards, but that's rather a hassle; a sketch of the workaround appears at the end of this message. (Even worse, it's very easy to forget this workaround and be left with containers that can't be stopped and a host system that won't shut down cleanly.)

Possibly related: when the problem is triggered, I notice that my guest instances start with no /etc/resolv.conf and no inet address.

Ubuntu bug report: https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1642767
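For reference, a minimal sketch of the workaround as a POSIX shell function; the function name is mine, and 0022 is just one example of a umask that leaves the world-execute bit unmasked:

    # Start a container with a permissive umask, without disturbing
    # the umask of the interactive shell.
    lxc_start_permissive() {
        # The subshell keeps the umask change from leaking back
        # into the calling shell, so there is nothing to restore.
        ( umask 0022 && lxc-start -n "$1" )
    }

    # Usage (container name is a placeholder):
    #   lxc_start_permissive mycontainer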