[
https://issues.apache.org/jira/browse/MESOS-9070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541713#comment-16541713
]
Qian Zhang commented on MESOS-9070:
-----------------------------------
RR: https://reviews.apache.org/r/67896/
> Support systemd and freezer cgroup subsystems bind mount for container with
> rootfs.
> -----------------------------------------------------------------------------------
>
> Key: MESOS-9070
> URL: https://issues.apache.org/jira/browse/MESOS-9070
> Project: Mesos
> Issue Type: Task
> Components: containerization
> Reporter: Gilbert Song
> Assignee: Qian Zhang
> Priority: Major
> Labels: cgroups, containerizer, systemd
>
> Since MESOS-8327, cgroup subsystems are bind mounted into the container's
> rootfs, but the systemd and freezer cgroups are not bind mounted yet, because
> they come from the Linux launcher rather than from the cgroup isolator.
> Some applications (e.g., dockerd) read /proc/self/cgroup to find the enabled
> subsystems and then look in /proc/self/mountinfo for a matching mount of
> each one. Here is an example:
> {noformat}
> ➜ aws dcos task exec --interactive
> test.bf2fad80-846b-11e8-b5a0-eaa1bec34306 /bin/bash
> cat /proc/self/cgroup
> 11:blkio:/mesos/87899f08-53e5-47bf-aba3-712c31c33543
> 10:perf_event:/mesos/87899f08-53e5-47bf-aba3-712c31c33543
> 9:cpuset:/mesos/87899f08-53e5-47bf-aba3-712c31c33543
> 8:memory:/mesos/87899f08-53e5-47bf-aba3-712c31c33543
> 7:pids:/mesos/87899f08-53e5-47bf-aba3-712c31c33543
> 6:devices:/mesos/87899f08-53e5-47bf-aba3-712c31c33543
> 5:cpu,cpuacct:/mesos/87899f08-53e5-47bf-aba3-712c31c33543
> 4:freezer:/mesos/87899f08-53e5-47bf-aba3-712c31c33543/mesos/12fde554-5262-473c-a20c-7dd201148b11
> 3:net_cls,net_prio:/mesos/87899f08-53e5-47bf-aba3-712c31c33543
> 2:hugetlb:/mesos/87899f08-53e5-47bf-aba3-712c31c33543
> 1:name=systemd:/mesos/87899f08-53e5-47bf-aba3-712c31c33543/mesos/12fde554-5262-473c-a20c-7dd201148b11
>
> cat /proc/self/mountinfo
> 388 387 202:9 / / rw,relatime master:1 - ext4 /dev/xvda9
> rw,seclabel,data=ordered
> 389 388 254:0 / /usr ro,relatime master:2 - ext4 /dev/mapper/usr
> ro,seclabel,block_validity,delalloc,barrier,user_xattr,acl
> 390 389 202:6 / /usr/share/oem rw,nodev,relatime master:32 - ext4 /dev/xvda6
> rw,seclabel,commit=600,data=ordered
> 391 388 0:6 / /dev rw,nosuid master:3 - devtmpfs devtmpfs
> rw,seclabel,size=8201844k,nr_inodes=2050461,mode=755
> 392 391 0:19 / /dev/shm rw,nosuid,nodev master:4 - tmpfs tmpfs rw,seclabel
> 393 391 0:20 / /dev/pts rw,nosuid,noexec,relatime master:5 - devpts devpts
> rw,seclabel,gid=5,mode=620,ptmxmode=000
> 394 391 0:15 / /dev/mqueue rw,relatime master:26 - mqueue mqueue rw,seclabel
> 395 391 0:37 / /dev/hugepages rw,relatime master:27 - hugetlbfs hugetlbfs
> rw,seclabel
> 396 388 0:4 / /proc rw,nosuid,nodev,noexec,relatime master:6 - proc proc rw
> 397 396 0:35 / /proc/sys/fs/binfmt_misc rw,relatime master:24 - autofs
> systemd-1 rw,fd=23,pgrp=0,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=1017
> 398 396 0:40 / /proc/xen rw,relatime master:31 - xenfs xenfs rw
> 399 388 0:18 / /sys rw,nosuid,nodev,noexec,relatime master:7 - sysfs sysfs
> rw,seclabel
> 400 399 0:17 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime master:8
> - securityfs securityfs rw
> 401 399 0:22 / /sys/fs/cgroup ro,nosuid,nodev,noexec master:9 - tmpfs tmpfs
> ro,seclabel,mode=755
> 402 401 0:23 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime
> master:10 - cgroup cgroup
> rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> 403 401 0:25 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime
> master:11 - cgroup cgroup rw,hugetlb
> 404 401 0:26 / /sys/fs/cgroup/net_cls,net_prio
> rw,nosuid,nodev,noexec,relatime master:12 - cgroup cgroup rw,net_cls,net_prio
> 405 401 0:27 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime
> master:13 - cgroup cgroup rw,freezer
> 406 401 0:28 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime
> master:14 - cgroup cgroup rw,cpu,cpuacct
> 407 401 0:29 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime
> master:15 - cgroup cgroup rw,devices
> 408 401 0:30 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime master:16
> - cgroup cgroup rw,pids
> 409 401 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime
> master:17 - cgroup cgroup rw,memory
> 410 401 0:32 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime
> master:18 - cgroup cgroup rw,cpuset
> 411 401 0:33 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime
> master:19 - cgroup cgroup rw,perf_event
> 412 401 0:34 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime master:20
> - cgroup cgroup rw,blkio
> 413 399 0:24 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime master:21 -
> pstore pstore rw,seclabel
> 414 399 0:16 / /sys/fs/selinux rw,relatime master:22 - selinuxfs selinuxfs rw
> 415 399 0:7 / /sys/kernel/debug rw,relatime master:29 - debugfs debugfs
> rw,seclabel
> 416 388 0:21 / /run rw,nosuid,nodev master:23 - tmpfs tmpfs
> rw,seclabel,mode=755
> 417 388 0:36 / /boot rw,relatime master:25 - autofs systemd-1
> rw,fd=33,pgrp=0,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=10774
> 418 417 202:1 / /boot rw,relatime master:33 - vfat /dev/xvda1
> rw,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro
> 419 388 0:38 / /media rw,nosuid,nodev,noexec,relatime master:28 - tmpfs tmpfs
> rw,seclabel
> 420 388 0:39 / /tmp rw,nosuid,nodev master:30 - tmpfs tmpfs rw,seclabel
> 421 388 202:16 / /var/lib rw,relatime master:218 - ext4 /dev/xvdb
> rw,seclabel,data=ordered
> 422 421 202:16 /docker/overlay /var/lib/docker/overlay rw,relatime - ext4
> /dev/xvdb rw,seclabel,data=ordered
> 423 421 202:16
> /mesos/slave/volumes/roles/kubernetes-role/b12a0508-c837-4d89-b1e3-d1400355833c
>
> /var/lib/mesos/slave/slaves/cbb0007d-bcc7-4fe8-b47d-3d67604a2eb2-S0/frameworks/cbb0007d-bcc7-4fe8-b47d-3d67604a2eb2-0002/executors/kubernetes__etcd__465602c0-ad54-4f46-960e-3a5e8e18f3e8/runs/300d07e7-319d-4642-b9c9-63b9293765fd/data-dir
> rw,relatime master:218 - ext4 /dev/xvdb rw,seclabel,data=ordered
> 424 421 202:16
> /mesos/slave/volumes/roles/kubernetes-role/a60b4165-e5ee-4847-8437-2a7f78f38c5d
>
> /var/lib/mesos/slave/slaves/cbb0007d-bcc7-4fe8-b47d-3d67604a2eb2-S0/frameworks/cbb0007d-bcc7-4fe8-b47d-3d67604a2eb2-0002/executors/kubernetes__etcd__465602c0-ad54-4f46-960e-3a5e8e18f3e8/runs/300d07e7-319d-4642-b9c9-63b9293765fd/wal-pv
> rw,relatime master:218 - ext4 /dev/xvdb rw,seclabel,data=ordered
> 426 396 0:51 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
> 427 421 0:52 /
> /var/lib/mesos/slave/slaves/cbb0007d-bcc7-4fe8-b47d-3d67604a2eb2-S0/frameworks/cbb0007d-bcc7-4fe8-b47d-3d67604a2eb2-0001/executors/test.bf2fad80-846b-11e8-b5a0-eaa1bec34306/runs/87899f08-53e5-47bf-aba3-712c31c33543/.secret-113d83da-d9ce-4a5f-9565-9179ed8bd94a
> rw,relatime - ramfs ramfs rw
> ➜ aws dcos task exec --interactive
> debian.6c333651-846c-11e8-b5a0-eaa1bec34306 /bin/bash
> cat /proc/self/cgroup
> 11:freezer:/mesos/66896178-3726-439f-ac45-6eb025b944fc/mesos/e69b6a82-4c4a-4758-99c8-6afac41ae1a5
> 10:devices:/mesos/66896178-3726-439f-ac45-6eb025b944fc
> 9:hugetlb:/mesos/66896178-3726-439f-ac45-6eb025b944fc
> 8:blkio:/mesos/66896178-3726-439f-ac45-6eb025b944fc
> 7:cpuset:/mesos/66896178-3726-439f-ac45-6eb025b944fc
> 6:pids:/mesos/66896178-3726-439f-ac45-6eb025b944fc
> 5:perf_event:/mesos/66896178-3726-439f-ac45-6eb025b944fc
> 4:cpu,cpuacct:/mesos/66896178-3726-439f-ac45-6eb025b944fc
> 3:memory:/mesos/66896178-3726-439f-ac45-6eb025b944fc
> 2:net_cls,net_prio:/mesos/66896178-3726-439f-ac45-6eb025b944fc
> 1:name=systemd:/mesos/66896178-3726-439f-ac45-6eb025b944fc/mesos/e69b6a82-4c4a-4758-99c8-6afac41ae1a5
> cat /proc/self/mountinfo
> 466 423 0:51 / / rw,relatime master:148 - overlay overlay
> rw,lowerdir=/tmp/xRzx5s/1:/tmp/xRzx5s/0,upperdir=/var/lib/mesos/slave/provisioner/containers/66896178-3726-439f-ac45-6eb025b944fc/backends/overlay/scratch/704eebdc-1862-4054-9245-2025563a1919/upperdir,workdir=/var/lib/mesos/slave/provisioner/containers/66896178-3726-439f-ac45-6eb025b944fc/backends/overlay/scratch/704eebdc-1862-4054-9245-2025563a1919/workdir
> 467 466 202:9 /etc/resolv.conf//deleted /etc/resolv.conf
> ro,nosuid,nodev,noexec,relatime master:1 - ext4 /dev/xvda9
> rw,seclabel,data=ordered
> 468 466 202:9 /etc/hostname /etc/hostname ro,nosuid,nodev,noexec,relatime
> master:1 - ext4 /dev/xvda9 rw,seclabel,data=ordered
> 469 466 202:9 /etc/hosts /etc/hosts ro,nosuid,nodev,noexec,relatime master:1
> - ext4 /dev/xvda9 rw,seclabel,data=ordered
> 470 466 202:16
> /mesos/slave/slaves/cbb0007d-bcc7-4fe8-b47d-3d67604a2eb2-S1/frameworks/cbb0007d-bcc7-4fe8-b47d-3d67604a2eb2-0001/executors/debian.6c333651-846c-11e8-b5a0-eaa1bec34306/runs/66896178-3726-439f-ac45-6eb025b944fc
> /mnt/mesos/sandbox rw,relatime master:218 - ext4 /dev/xvdb
> rw,seclabel,data=ordered
> 471 466 0:52 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
> 472 471 0:52 /bus /proc/bus ro,nosuid,nodev,noexec,relatime - proc proc rw
> 473 471 0:52 /fs /proc/fs ro,nosuid,nodev,noexec,relatime - proc proc rw
> 474 471 0:52 /irq /proc/irq ro,nosuid,nodev,noexec,relatime - proc proc rw
> 475 471 0:52 /sys /proc/sys ro,nosuid,nodev,noexec,relatime - proc proc rw
> 476 471 0:52 /sysrq-trigger /proc/sysrq-trigger
> ro,nosuid,nodev,noexec,relatime - proc proc rw
> 477 466 0:18 / /sys ro,nosuid,nodev,noexec,relatime - sysfs sysfs rw,seclabel
> 478 477 0:54 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs tmpfs
> rw,seclabel,mode=755
> 479 466 0:55 / /dev rw,nosuid,noexec - tmpfs tmpfs rw,seclabel,mode=755
> 480 479 0:56 / /dev/pts rw,nosuid,noexec,relatime - devpts devpts
> rw,seclabel,mode=600,ptmxmode=666
> 481 479 0:57 / /dev/shm rw,nosuid,nodev - tmpfs tmpfs rw,seclabel
> 482 478 0:31 /mesos/66896178-3726-439f-ac45-6eb025b944fc /sys/fs/cgroup/blkio
> rw,nosuid,nodev,noexec,relatime master:17 - cgroup cgroup rw,blkio
> 483 478 0:27 /mesos/66896178-3726-439f-ac45-6eb025b944fc
> /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime master:13 - cgroup
> cgroup rw,cpu,cpuacct
> 484 478 0:30 /mesos/66896178-3726-439f-ac45-6eb025b944fc
> /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime master:16 - cgroup
> cgroup rw,cpuset
> 485 478 0:33 /mesos/66896178-3726-439f-ac45-6eb025b944fc
> /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime master:19 - cgroup
> cgroup rw,devices
> 486 478 0:32 /mesos/66896178-3726-439f-ac45-6eb025b944fc
> /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime master:18 - cgroup
> cgroup rw,hugetlb
> 487 478 0:26 /mesos/66896178-3726-439f-ac45-6eb025b944fc
> /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime master:12 - cgroup
> cgroup rw,memory
> 488 478 0:25 /mesos/66896178-3726-439f-ac45-6eb025b944fc
> /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime master:11 -
> cgroup cgroup rw,net_cls,net_prio
> 489 478 0:28 /mesos/66896178-3726-439f-ac45-6eb025b944fc
> /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime master:14 - cgroup
> cgroup rw,perf_event
> 490 478 0:29 /mesos/66896178-3726-439f-ac45-6eb025b944fc /sys/fs/cgroup/pids
> rw,nosuid,nodev,noexec,relatime master:15 - cgroup cgroup rw,pids
> {noformat}
> The first task runs without an image; the second uses a Debian image and
> has no systemd or freezer cgroup mount under /sys/fs/cgroup. So any
> application that relies on the systemd or freezer cgroup mounts may fail:
> {noformat}
> returned error: cgroups: cannot find cgroup mount destination: unknown
> ./docker/docker: Error response from daemon: cgroups: cannot find cgroup
> mount destination: unknown.
> {noformat}
> So we should consider adding systemd and freezer cgroup bind mounts in the
> cgroup isolator and documenting this behavior with a *NOTE*.
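
The check described in the issue (take each subsystem listed in /proc/self/cgroup and look for a matching cgroup mount in /proc/self/mountinfo) can be sketched roughly as below. This is only an illustration in Python, not dockerd's actual implementation; the function names are hypothetical, and the sample input is a trimmed, unwrapped excerpt of the Debian task output quoted above.

```python
def cgroup_subsystems(cgroup_text):
    """Return the controller field of each line of /proc/<pid>/cgroup."""
    subsystems = []
    for line in cgroup_text.splitlines():
        # Line format: hierarchy-ID:controller-list:cgroup-path
        parts = line.strip().split(":", 2)
        if len(parts) == 3 and parts[1]:
            subsystems.append(parts[1])
    return subsystems


def mounted_cgroup_options(mountinfo_text):
    """Return the super options of every cgroup (v1) mount in mountinfo."""
    mounted = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue
        # After the "-" separator: filesystem type, source, super options.
        sep = fields.index("-")
        if len(fields) > sep + 3 and fields[sep + 1] == "cgroup":
            mounted.append(set(fields[sep + 3].split(",")))
    return mounted


def missing_cgroup_mounts(cgroup_text, mountinfo_text):
    """List subsystems from the cgroup file that have no cgroup mount."""
    mounted = mounted_cgroup_options(mountinfo_text)
    missing = []
    for subsystem in cgroup_subsystems(cgroup_text):
        wanted = set(subsystem.split(","))
        if not any(wanted <= options for options in mounted):
            missing.append(subsystem)
    return missing


# Trimmed excerpt of the Debian task output from the issue description.
SAMPLE_CGROUP = """\
11:freezer:/mesos/66896178-3726-439f-ac45-6eb025b944fc/mesos/e69b6a82-4c4a-4758-99c8-6afac41ae1a5
3:memory:/mesos/66896178-3726-439f-ac45-6eb025b944fc
1:name=systemd:/mesos/66896178-3726-439f-ac45-6eb025b944fc/mesos/e69b6a82-4c4a-4758-99c8-6afac41ae1a5
"""

SAMPLE_MOUNTINFO = """\
478 477 0:54 / /sys/fs/cgroup rw,nosuid,nodev,noexec,relatime - tmpfs tmpfs rw,seclabel,mode=755
487 478 0:26 /mesos/66896178-3726-439f-ac45-6eb025b944fc /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime master:12 - cgroup cgroup rw,memory
"""

print(missing_cgroup_mounts(SAMPLE_CGROUP, SAMPLE_MOUNTINFO))
# → ['freezer', 'name=systemd']
```

Inside a running container the same functions could be fed the contents of /proc/self/cgroup and /proc/self/mountinfo; a non-empty result would point at the subsystems that still need a bind mount.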