Bug#875733: lxc.mount.auto = cgroup:mixed doesn't seem to work in Stretch anymore

2018-01-14 Thread Matthijs Kooijman
Package: lxc
Version: 1:2.0.7-2+deb9u1
Followup-For: Bug #875733

Hi folks,

I also ran into this exact issue. It seems upstream fixed this bug, see
https://github.com/lxc/lxc/issues/1737

I've backported this fix (along with some other commits it needs) to the
Debian stretch version, which works as expected. I've attached a patch
to the Debian packaging that does this. Since this is a regression from
earlier Debian versions, I guess this would be worth including in a
stretch update?

One caveat to note: In my setup, I had `lxc.cgroup.use=@all` in my
`lxc.conf` file, which prevented this fix from working. See
https://github.com/lxc/lxc/issues/2084 for more details.

Gr.

Matthijs
From f1aa85a4b1c1c38211a9fa15afac72b3df142b3d Mon Sep 17 00:00:00 2001
From: Matthijs Kooijman 
Date: Sun, 14 Jan 2018 15:47:31 +0100
Subject: [PATCH] Backport upstream commits to fix running without
 CAP_SYS_ADMIN

Closes: #875733
---
 .../0011-lxc-cgroups-move-helper-functions.patch   | 166 ++
 .../0012-lxc-cgroups-handle-hubrid-layouts.patch   | 358 +
 .../0013-lxc-cgroups-without-cap-sys-admin.patch   | 157 +
 debian/patches/series  |   3 +
 4 files changed, 684 insertions(+)
 create mode 100644 debian/patches/0011-lxc-cgroups-move-helper-functions.patch
 create mode 100644 debian/patches/0012-lxc-cgroups-handle-hubrid-layouts.patch
 create mode 100644 debian/patches/0013-lxc-cgroups-without-cap-sys-admin.patch

diff --git a/debian/patches/0011-lxc-cgroups-move-helper-functions.patch b/debian/patches/0011-lxc-cgroups-move-helper-functions.patch
new file mode 100644
index 000..b1e7cea
--- /dev/null
+++ b/debian/patches/0011-lxc-cgroups-move-helper-functions.patch
@@ -0,0 +1,166 @@
+commit 04ad7ffe2a42fb2fa2e78e694990b385fd2dd5e0
+Author: Christian Brauner 
+Date:   Wed Jul 26 14:57:35 2017 +0200
+
+utils: move helpers from cgfsng.c to utils.{c,h}
+
+Signed-off-by: Christian Brauner 
+
+---
+
+This commit was backported from upstream to Debian without changes, to
+make the patch 0013-lxc-cgroups-without-cap-sys-admin.patch apply.
+
+--- a/src/lxc/cgroups/cgfsng.c
++++ b/src/lxc/cgroups/cgfsng.c
+@@ -112,36 +112,12 @@ static void free_string_list(char **clis
+   }
+ }
+ 
+-/* Re-alllocate a pointer, do not fail */
+-static void *must_realloc(void *orig, size_t sz)
+-{
+-  void *ret;
+-
+-  do {
+-  ret = realloc(orig, sz);
+-  } while (!ret);
+-  return ret;
+-}
+-
+ /* Allocate a pointer, do not fail */
+ static void *must_alloc(size_t sz)
+ {
+   return must_realloc(NULL, sz);
+ }
+ 
+-/* return copy of string @entry;  do not fail. */
+-static char *must_copy_string(const char *entry)
+-{
+-  char *ret;
+-
+-  if (!entry)
+-  return NULL;
+-  do {
+-  ret = strdup(entry);
+-  } while (!ret);
+-  return ret;
+-}
+-
+ /*
+  * This is a special case - return a copy of @entry
+  * prepending 'name='.  I.e. turn systemd into name=systemd.
+@@ -253,8 +229,6 @@ struct hierarchy *get_hierarchy(const ch
+   return NULL;
+ }
+ 
+-static char *must_make_path(const char *first, ...) __attribute__((sentinel));
+-
+ #define BATCH_SIZE 50
+ static void batch_realloc(char **mem, size_t oldlen, size_t newlen)
+ {
+@@ -1165,33 +1139,6 @@ out_free:
+   return NULL;
+ }
+ 
+-/*
+- * Concatenate all passed-in strings into one path.  Do not fail.  If any piece is
+- * not prefixed with '/', add a '/'.
+- */
+-static char *must_make_path(const char *first, ...)
+-{
+-  va_list args;
+-  char *cur, *dest;
+-  size_t full_len = strlen(first);
+-
+-  dest = must_copy_string(first);
+-
+-  va_start(args, first);
+-  while ((cur = va_arg(args, char *)) != NULL) {
+-  full_len += strlen(cur);
+-  if (cur[0] != '/')
+-  full_len++;
+-  dest = must_realloc(dest, full_len + 1);
+-  if (cur[0] != '/')
+-  strcat(dest, "/");
+-  strcat(dest, cur);
+-  }
+-  va_end(args);
+-
+-  return dest;
+-}
+-
+ static int cgroup_rmdir(char *dirname)
+ {
+   struct dirent *direntp;
+--- a/src/lxc/utils.c
++++ b/src/lxc/utils.c
+@@ -2083,3 +2083,50 @@ int lxc_setgroups(int size, gid_t list[]
+ 
+   return 0;
+ }
++
++char *must_make_path(const char *first, ...)
++{
++  va_list args;
++  char *cur, *dest;
++  size_t full_len = strlen(first);
++
++  dest = must_copy_string(first);
++
++  va_start(args, first);
++  while ((cur = va_arg(args, char *)) != NULL) {
++  full_len += strlen(cur);
++  if (cur[0] != '/')
++  full_len++;
++  dest = must_realloc(dest, full_len + 1);
++  if (cur[0] != '/')
++  strcat(dest, "/");
++  strcat(dest, cur);
++  }
++  va_end(args);
++
++  return dest;
++}
++
++char *must_copy_string(const char *entry)
++{
+
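
The path helper moved by this patch joins its arguments with '/' separators and expects a NULL sentinel as the last argument. A minimal standalone sketch of the same idea follows. The name join_path is hypothetical, and unlike the helper in the patch, which retries until allocation succeeds, this version returns NULL on allocation failure.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/*
 * Join all passed-in strings into one path, inserting a '/' before any
 * piece that does not already start with one. The argument list must end
 * with a NULL pointer. Returns a malloc()ed string, or NULL on allocation
 * failure (a deliberate difference from the "do not fail" helper above).
 */
char *join_path(const char *first, ...)
{
    va_list args;
    const char *cur;
    size_t len;
    char *dest;

    assert(first != NULL);

    len = strlen(first);
    dest = malloc(len + 1);
    if (!dest)
        return NULL;
    memcpy(dest, first, len + 1);

    va_start(args, first);
    while ((cur = va_arg(args, const char *)) != NULL) {
        /* One extra byte for the separator if the piece lacks one. */
        size_t add = strlen(cur) + (cur[0] != '/' ? 1 : 0);
        char *tmp = realloc(dest, len + add + 1);

        if (!tmp) {
            free(dest);
            dest = NULL;
            break;
        }
        dest = tmp;
        if (cur[0] != '/')
            strcat(dest, "/");
        strcat(dest, cur);
        len += add;
    }
    va_end(args);

    return dest;
}
```

A call such as join_path("/sys/fs/cgroup", "systemd", "/lxc", (char *)NULL) builds a single path from the three pieces; the caller frees the result.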

Bug#875733: [pkg-lxc-devel] Bug#875733: lxc.mount.auto = cgroup:mixed doesn't seem to work in Stretch anymore

2017-09-17 Thread Evgeni Golov
On Sun, Sep 17, 2017 at 10:40:27AM +0200, Evgeni Golov wrote:
> TL;DR: I can reproduce the "does not create cgroups" behaviour, but I
> don't know why yet.
> 
> Either way, you are right, the cgroups are missing in Stretch, and I don't 
> yet understand why.

This happens because cgfsng_mount() is a NOOP when cgroup namespaces are
supported [4].

This seems intentional, but I don't know who is supposed to use the
namespaced cgroups then.

[4] https://github.com/lxc/lxc/blob/master/src/lxc/cgroups/cgfsng.c#L1627-L1628
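
To see whether a given host falls into the "cgroup namespaces are supported" case, one can look for /proc/self/ns/cgroup, which kernels with cgroup namespace support expose. The following is only a sketch of such a check; the function names are hypothetical and the code is not taken from LXC.

```c
#include <assert.h>
#include <unistd.h>

/* Return 1 if `path` exists, 0 otherwise. */
int path_exists(const char *path)
{
    assert(path != NULL);
    return access(path, F_OK) == 0;
}

/*
 * Assumption: the presence of /proc/self/ns/cgroup is used as the
 * indicator that the running kernel supports cgroup namespaces.
 * Returns 1 if it is present, 0 otherwise.
 */
int cgroup_ns_available(void)
{
    return path_exists("/proc/self/ns/cgroup");
}
```

On a host where cgroup_ns_available() returns 1, the behaviour described above (no cgroup mounts set up by LXC) would be the one to expect with the affected versions.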



Bug#875733: lxc.mount.auto = cgroup:mixed doesn't seem to work in Stretch anymore

2017-09-17 Thread Evgeni Golov
control: found -1 1:2.0.8-2

Hi,

TL;DR: I can reproduce the "does not create cgroups" behaviour, but I
don't know why yet.

On Thu, Sep 14, 2017 at 10:01:41AM +0200, Yves-Alexis Perez wrote:
> On Thu, 2017-09-14 at 09:23 +0200, Yves-Alexis Perez wrote:
> > Package: lxc
> > Version: 1:2.0.7-2
> > Severity: normal
> > 
> > I'll set up a simpler container and config so I can provide it and
> > some logs to you so you can reproduce.
> 
> lxc-create -n test -t debian
> 
> I added:
> 
> lxc.autodev = 1
> lxc.mount.auto = proc:mixed
> lxc.mount.auto = sys:mixed
> lxc.mount.auto = cgroup:mixed

This is default in LXC 2.0 [3].

> lxc.cap.drop = sys_admin
> 
> to the lxc configuration but I think for now only the last two lines matter:
> dropping CAP_SYS_ADMIN will prevent systemd from doing the mounts itself,
> lxc.mount.auto = cgroup:mixed should have lxc mount /sys/fs/cgroup properly
> (and thus systemd should be happy), but it's not working.
> 
> I'm starting with:
> 
> lxc-start -n test -o /tmp/lxc.log -l DEBUG -F
> Failed to mount tmpfs at /dev/shm: Operation not permitted
> Failed to mount tmpfs at /run: Operation not permitted
> Failed to mount tmpfs at /run/lock: Operation not permitted
> Failed to mount tmpfs at /sys/fs/cgroup: Operation not permitted
> Failed to mount cgroup at /sys/fs/cgroup/systemd: No such file or directory
> [!!] Failed to mount API filesystems, freezing.
> Freezing execution.

As mentioned on IRC, the behaviour I see is a bit different.
I am using the official Debian Vagrant boxes [1][2], where I just did:
# apt install lxc (1:1.0.6-6+deb8u6 on jessie, 1:2.0.7-2 on stretch)
# lxc-create -n debian8onX -t debian -- -r jessie
# lxc-create -n debian9onX -t debian -- -r stretch

The Jessie version needed two small tweaks to the Debian template to be able to
bootstrap Stretch.

Without any config changes of the containers, they start just fine with 
`lxc-start -n  -d` and I can attach to them using `lxc-attach -n `.

Jessie host:

root@debian8on8:~# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.3  0.0  27900  4316 ?        Ss   11:10   0:00 /sbin/init
root        19  0.2  0.0  32968  4348 ?        Ss   11:10   0:00 /lib/systemd/systemd-journald
root        68  0.0  0.0  55188  5448 ?        Ss   11:10   0:00 /usr/sbin/sshd -D
root        71  0.0  0.0  12668  1852 tty4     Ss+  11:10   0:00 /sbin/agetty --noclear tty4 linux
root        72  0.0  0.0  12668  1864 tty1     Ss+  11:10   0:00 /sbin/agetty --noclear tty1 linux
root        73  0.0  0.0  12668  1860 tty3     Ss+  11:10   0:00 /sbin/agetty --noclear tty3 linux
root        74  0.0  0.0  12668  1872 tty2     Ss+  11:10   0:00 /sbin/agetty --noclear tty2 linux
root        75  0.0  0.0  14240  2244 console  Ss+  11:10   0:00 /sbin/agetty --noclear --keep-baud console 115200 38400 9600 vt102
root        82  0.0  0.0  21868  3704 ?        S    11:10   0:00 /bin/bash
root        83  0.0  0.0  19076  2332 ?        R+   11:10   0:00 ps aux

root@debian8on8:~# mount |grep cgroup
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset,clone_children)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)

root@debian9on8:~# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.6  0.0  56664  6564 ?        Ss   11:49   0:00 /sbin/init
root        16  0.1  0.0  46092  5744 ?        Ss   11:49   0:00 /lib/systemd/systemd-journald
root        46  0.0  0.0  69944  5704 ?        Ss   11:49   0:00 /usr/sbin/sshd -D
root        48  0.0  0.0  12672  1756 tty4     Ss+  11:49   0:00 /sbin/agetty --noclear tty4 linux
root        49  0.0  0.0  12672  1664 tty2     Ss+  11:49   0:00 /sbin/agetty --noclear tty2 linux
root        50  0.0  0.0  12672  1740 tty3     Ss+  11:49   0:00 /sbin/agetty --noclear tty3 linux
root        51  0.0  0.0  12672  1660 tty1     Ss+  11:49   0:00 /sbin/agetty --noclear tty1 linux
root        52  0.0  0.0  14316  2076 console  Ss+  11:49   0:00 /sbin/agetty --noclear --keep-baud console 115200,38400,9600 vt220
root        54  0.0  0.0  19828  3560 ?        S    11:50   0:00 /bin/bash
root        55  0.0  0.0  38276  3268 ?        R+   11:50   0:00 ps aux

root@debian9on8:~# mount |grep cgroup
cgroup on /sys/fs/cgroup/systemd type cgroup 
(rw,nosuid,node

Bug#875733: lxc.mount.auto = cgroup:mixed doesn't seem to work in Stretch anymore

2017-09-14 Thread Yves-Alexis Perez
On Thu, 2017-09-14 at 09:23 +0200, Yves-Alexis Perez wrote:
> Package: lxc
> Version: 1:2.0.7-2
> Severity: normal
> 
> I'll set up a simpler container and config so I can provide it and
> some logs to you so you can reproduce.

lxc-create -n test -t debian

I added:

lxc.autodev = 1
lxc.mount.auto = proc:mixed
lxc.mount.auto = sys:mixed
lxc.mount.auto = cgroup:mixed
lxc.cap.drop = sys_admin

to the lxc configuration but I think for now only the last two lines matter:
dropping CAP_SYS_ADMIN will prevent systemd from doing the mounts itself,
lxc.mount.auto = cgroup:mixed should have lxc mount /sys/fs/cgroup properly
(and thus systemd should be happy), but it's not working.
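
To confirm from inside the container that CAP_SYS_ADMIN really is gone, one can inspect the CapBnd line of /proc/self/status. The sketch below assumes that lxc.cap.drop removes the capability from the bounding set; the function names are hypothetical, and the value 21 is the number of CAP_SYS_ADMIN in <linux/capability.h>.

```c
#include <assert.h>
#include <stdio.h>

#define CAP_SYS_ADMIN_NR 21 /* CAP_SYS_ADMIN in <linux/capability.h> */

/* Return 1 if bit `cap` is set in the capability mask, 0 otherwise. */
int mask_has_cap(unsigned long long mask, int cap)
{
    assert(cap >= 0 && cap < 64);
    return (int)((mask >> cap) & 1ULL);
}

/*
 * Parse a "CapBnd:\t<hex mask>" line as found in /proc/<pid>/status.
 * Returns 1 and stores the mask on success, 0 for any other line.
 */
int parse_capbnd_line(const char *line, unsigned long long *mask)
{
    assert(line != NULL && mask != NULL);
    return sscanf(line, "CapBnd: %llx", mask) == 1;
}

/*
 * Returns 1 if CAP_SYS_ADMIN is in the bounding set of the calling
 * process, 0 if it is not, and -1 if this could not be determined.
 */
int bounding_set_has_sys_admin(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    unsigned long long mask;
    int ret = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (parse_capbnd_line(line, &mask)) {
            ret = mask_has_cap(mask, CAP_SYS_ADMIN_NR);
            break;
        }
    }
    fclose(f);
    return ret;
}
```

If bounding_set_has_sys_admin() returns 0 for a process in the container, no process there can perform mounts, so /sys/fs/cgroup has to be mounted by lxc before init starts.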

I'm starting with:

lxc-start -n test -o /tmp/lxc.log -l DEBUG -F
Failed to mount tmpfs at /dev/shm: Operation not permitted
Failed to mount tmpfs at /run: Operation not permitted
Failed to mount tmpfs at /run/lock: Operation not permitted
Failed to mount tmpfs at /sys/fs/cgroup: Operation not permitted
Failed to mount cgroup at /sys/fs/cgroup/systemd: No such file or directory
[!!] Failed to mount API filesystems, freezing.
Freezing execution.

and I'm attaching the lxc.log here. There are some more errors in the console
logs because I don't setup some of the mounts, but they don't look critical
since they don't prevent the boot.

Regards,
-- 
Yves-Alexis
  lxc-start 20170914075446.754 INFO lxc_start_ui - tools/lxc_start.c:main:275 - using rcfile /var/lib/lxc/test/config
  lxc-start 20170914075446.755 WARN lxc_confile - confile.c:config_pivotdir:1910 - lxc.pivotdir is ignored.  It will soon become an error.
  lxc-start 20170914075446.756 INFO lxc_lsm - lsm/lsm.c:lsm_init:48 - LSM security driver nop
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:402 - processing: .reject_force_umount  # comment this to allow umount -f;  not recommended.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:567 - Adding native rule for reject_force_umount action 0.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:do_resolve_add_rule:251 - Setting Seccomp rule to reject force umounts.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:570 - Adding compat rule for reject_force_umount action 0.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:do_resolve_add_rule:251 - Setting Seccomp rule to reject force umounts.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:402 - processing: .[all].
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:402 - processing: .kexec_load errno 1.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:567 - Adding native rule for kexec_load action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:570 - Adding compat rule for kexec_load action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:402 - processing: .open_by_handle_at errno 1.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:567 - Adding native rule for open_by_handle_at action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:570 - Adding compat rule for open_by_handle_at action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:402 - processing: .init_module errno 1.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:567 - Adding native rule for init_module action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:570 - Adding compat rule for init_module action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:402 - processing: .finit_module errno 1.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:567 - Adding native rule for finit_module action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:570 - Adding compat rule for finit_module action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:402 - processing: .delete_module errno 1.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:567 - Adding native rule for delete_module action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:570 - Adding compat rule for delete_module action 327681.
  lxc-start 20170914075446.756 INFO lxc_seccomp - seccomp.c:parse_config_v2:580 - Merging in the compat Seccomp ctx into the main one.
  lxc-start 20170914075446.756 WARN lxc_monitor - monitor.c:lxc_monitor_fifo_send:111 - Failed to open fifo to send message: No such file or directory.
  lxc-start 20170914075446.756 DEBUG lxc_start - start.c:setup_signal_

Bug#875733: lxc.mount.auto = cgroup:mixed doesn't seem to work in Stretch anymore

2017-09-14 Thread Yves-Alexis Perez
Package: lxc
Version: 1:2.0.7-2
Severity: normal

Hi,

In Jessie I was using a container setup with LXC and unprivileged
containers. By unprivileged, I mean container config had a bunch of
lxc.cap.drop lines, especially including sys_admin.

That means the init system inside the container (systemd) is not able to
do any privileged operation, including mounts, so the mounts need to be
done before starting the containers. It worked fine in Jessie (both host
and guests) with lines such as:

auto = proc:mixed sys:ro cgroup:mixed

Which takes care of mounting /proc, /sys and /sys/fs/cgroup for the
container.

Now in Stretch with lxc 2.0.7-2, it doesn't work anymore. Console output
for a Jessie container shows:

Failed to mount tmpfs at /sys/fs/cgroup: Operation not permitted

While for a Stretch container I have:

Failed to mount tmpfs at /sys/fs/cgroup: Operation not permitted
Failed to mount cgroup at /sys/fs/cgroup/systemd: No such file or directory
[!!] Failed to mount API filesystems, freezing.
Freezing execution.

So it looks like systemd is trying to mount /sys/fs/cgroup and fails
(because it doesn't have CAP_SYS_ADMIN, which is expected). That means
lxc somehow failed to mount /sys/fs/cgroup in the container, which looks
like a regression from Jessie.
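
Whether lxc actually mounted anything at /sys/fs/cgroup can be checked from inside the container by scanning /proc/self/mounts for that mount point. This is a minimal sketch with hypothetical function names; it assumes mount table lines shorter than the buffer and mount points without escaped characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if a /proc/self/mounts-style line ("device dir type options ...")
 * describes a mount whose mount point is exactly `target`, 0 otherwise.
 */
int line_mounts_at(const char *line, const char *target)
{
    char dev[256], dir[256];

    assert(line != NULL && target != NULL);
    if (sscanf(line, "%255s %255s", dev, dir) != 2)
        return 0;
    return strcmp(dir, target) == 0;
}

/*
 * Return 1 if something is mounted at `target`, 0 if not, and -1 if
 * /proc/self/mounts cannot be read.
 */
int is_mounted(const char *target)
{
    FILE *f = fopen("/proc/self/mounts", "r");
    char line[1024];
    int found = 0;

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (line_mounts_at(line, target)) {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

With the failure described above, is_mounted("/sys/fs/cgroup") would be expected to return 0 in the Stretch container, which matches systemd then trying (and failing) to mount it itself.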

I'll set up a simpler container and config so I can provide it and
some logs to you so you can reproduce.

Regards,
-- 
Yves-Alexis

-- System Information:
Debian Release: 9.1
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.0-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=fr_FR.utf8, LC_CTYPE=fr_FR.utf8 (charmap=UTF-8), LANGUAGE=fr_FR.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages lxc depends on:
ii  init-system-helpers  1.48
ii  libapparmor1         2.11.0-3
ii  libc6                2.24-11+deb9u1
ii  libcap2              1:2.25-1
ii  libgnutls30          3.5.8-5+deb9u2
ii  liblxc1              1:2.0.7-2
ii  libseccomp2          2.3.1-2.1
ii  libselinux1          2.6-3+b1
ii  lsb-base             9.20161125
ii  python3              3.5.3-1
ii  python3-lxc          1:2.0.7-2

Versions of packages lxc recommends:
pn  bridge-utils  
ii  debootstrap   1.0.89
ii  dirmngr   2.1.18-6
pn  dnsmasq-base  
ii  gnupg 2.1.18-6
ii  iptables  1.6.0+snapshot20161117-6
pn  libpam-cgfs   
pn  lxcfs 
ii  openssl   1.1.0f-3
ii  rsync 3.1.2-1
pn  uidmap

Versions of packages lxc suggests:
pn  apparmor
pn  btrfs-tools
ii  lvm2         2.02.168-2

-- debconf information:
* lxc/directory: /srv/lxc
  lxc/shutdown: /usr/bin/lxc-halt
  lxc/title:
  lxc/auto: