Looks good, except one thing - please see below.

On 5/5/21 6:40 PM, Konstantin Khorenko wrote:
Recent versions of containerd (as a part of k3s-1.19.5)
started to apply strict rules when parsing the contents of
'devices.list' files located in the devices cgroup.
Namely, the access token is allowed to contain only those values [rwm],
that are described in
https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt

In vzkernel we do have an extra permission in device cgroup to allow
mount of a block device inside a Container ('M'), so this upsets
containerd.

Let's leave 'devices.{allow,deny}' files to be able to handle vz
specific "M" permission, but 'devices.list' to show only [rwm]
permissions suppressing possible "M" presence.

Let's introduce another file 'devices.extra_list' to show all
permissions, including possible "M".

  $ echo "b 253:3182 rmM" > devices.allow
  $ cat devices.list
  ...
  b 253:3182 rm
  $ cat devices.extra_list
  ...
  b 253:3182 rmM

https://jira.sw.ru/browse/PSBM-123743

Signed-off-by: Konstantin Khorenko <khore...@virtuozzo.com>
---
  include/linux/device_cgroup.h |  4 ++--
  security/device_cgroup.c      | 13 ++++++++++++-
  2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/linux/device_cgroup.h b/include/linux/device_cgroup.h
index 5a8fb7f78962..5353e22a6ad0 100644
--- a/include/linux/device_cgroup.h
+++ b/include/linux/device_cgroup.h
@@ -6,8 +6,8 @@
  #define DEVCG_ACC_READ  2
  #define DEVCG_ACC_WRITE 4
  #define DEVCG_ACC_MOUNT 64
-#define DEVCG_ACC_MASK (DEVCG_ACC_MKNOD | DEVCG_ACC_READ | DEVCG_ACC_WRITE | \
-                       DEVCG_ACC_MOUNT)
+#define DEVCG_ACC_MASK (DEVCG_ACC_MKNOD | DEVCG_ACC_READ | DEVCG_ACC_WRITE)
+#define DEVCG_ACC_EXTRA_MASK (DEVCG_ACC_MASK | DEVCG_ACC_MOUNT)
#define DEVCG_DEV_BLOCK 1
  #define DEVCG_DEV_CHAR  2
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 97dbc72969ce..d20fa5386bc9 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -230,6 +230,7 @@ static void devcgroup_css_free(struct cgroup_subsys_state 
*css)
  #define DEVCG_ALLOW 1
  #define DEVCG_DENY 2
  #define DEVCG_LIST 3
+#define DEVCG_EXTRA_LIST 32
#define MAJMINLEN 13
  #define ACCLEN 5
@@ -272,6 +273,11 @@ static int devcgroup_seq_show(struct seq_file *m, void *v)
        struct dev_cgroup *devcgroup = css_to_devcgroup(seq_css(m));
        struct dev_exception_item *ex;
        char maj[MAJMINLEN], min[MAJMINLEN], acc[ACCLEN];
+       short type, mask;
+
+       type = (short)seq_cft(m)->private;
+       mask = (type == DEVCG_EXTRA_LIST) ?
+               DEVCG_ACC_EXTRA_MASK : DEVCG_ACC_MASK;
rcu_read_lock();
        /*
@@ -288,7 +294,7 @@ static int devcgroup_seq_show(struct seq_file *m, void *v)
                           maj, min, acc);

Should not we also patch DEVCG_DEFAULT_ALLOW branch of this if? We can print "a *:* rwm" in devices.list, but "a *:* rwmM" on devices.extra_list for consistency.

        } else {

Note: In vz8 we don't have commit a6dba9fbee35f ("ve/device_cgroup: show all devices allowed in ct to fool docker") which also need to be adopted.

                list_for_each_entry_rcu(ex, &devcgroup->exceptions, list) {
-                       set_access(acc, ex->access);
+                       set_access(acc, ex->access & mask);
                        set_majmin(maj, ex->major);
                        set_majmin(min, ex->minor);
                        seq_printf(m, "%c %s:%s %s\n", type_to_char(ex->type),
@@ -799,6 +805,11 @@ static struct cftype dev_cgroup_files[] = {
                .seq_show = devcgroup_seq_show,
                .private = DEVCG_LIST,
        },
+       {
+               .name = "extra_list",
+               .seq_show = devcgroup_seq_show,
+               .private = DEVCG_EXTRA_LIST,
+       },
        { }     /* terminate */
  };

--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.
_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel

Reply via email to