[jira] [Commented] (MESOS-10131) Agent frequently dies with error "Cycle found in mount table hierarchy"
[ https://issues.apache.org/jira/browse/MESOS-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17321152#comment-17321152 ]

Charles Natali commented on MESOS-10131:

I think this can happen even without an actual loop in {{/proc/PID/mountinfo}}, because reading from {{/proc/PID/mountinfo}} isn't atomic, and definitely not when it can't be read in a single {{read}} syscall, which is very likely the case here since it's larger than 30K. That could explain why it happens randomly, especially if many short-lived tasks are being started.

Since it didn't recur and a potential fix would be far from trivial, it's probably time to close.

> Agent frequently dies with error "Cycle found in mount table hierarchy"
> ---
>
> Key: MESOS-10131
> URL: https://issues.apache.org/jira/browse/MESOS-10131
> Project: Mesos
> Issue Type: Bug
> Components: agent, framework
> Affects Versions: 1.9.0
> Reporter: Thomas Plummer
> Assignee: Andrei Budnik
> Priority: Major
> Attachments: log.txt
>
> Our mesos agent frequently dies with the following error in the slave logs:
>
> {code:java}
> F0509 22:10:33.036993 17723 fs.cpp:217] Check failed: !visitedParents.contains(parentId) Cycle found in mount table hierarchy at entry '1954':
> 18 41 0:18 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw,seclabel
> 19 41 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
> 20 41 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,seclabel,size=65852208k,nr_inodes=16463052,mode=755
> 21 18 0:17 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:7 - securityfs securityfs rw
> 22 20 0:19 / /dev/shm rw,nosuid,nodev,noexec shared:3 - tmpfs tmpfs rw,seclabel
> 23 20 0:12 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
> 24 41 0:20 / /run rw,nosuid,nodev shared:24 - tmpfs tmpfs rw,seclabel,mode=755
> 25 18 0:21 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:8 - tmpfs tmpfs ro,seclabel,mode=755
> 26 25 0:22 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
> 27 18 0:23 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:20 - pstore pstore rw
> 28 18 0:24 / /sys/firmware/efi/efivars rw,nosuid,nodev,noexec,relatime shared:21 - efivarfs efivarfs rw
> 29 25 0:25 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:10 - cgroup cgroup rw,seclabel,perf_event
> 30 25 0:26 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,seclabel,net_prio,net_cls
> 31 25 0:27 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:12 - cgroup cgroup rw,seclabel,cpuset
> 32 25 0:28 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,seclabel,blkio
> 33 25 0:29 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,seclabel,freezer
> 34 25 0:30 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,seclabel,hugetlb
> 35 25 0:31 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,seclabel,devices
> 36 25 0:32 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,seclabel,cpuacct,cpu
> 37 25 0:33 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,seclabel,memory
> 38 25 0:34 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,seclabel,pids
> 39 18 0:35 / /sys/kernel/config rw,relatime shared:22 - configfs configfs rw
> 41 0 253:0 / / rw,relatime shared:1 - xfs /dev/mapper/vg_system-root rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
> 42 18 0:16 / /sys/fs/selinux rw,relatime shared:23 - selinuxfs selinuxfs rw
> 43 19 0:37 / /proc/sys/fs/binfmt_misc rw,relatime shared:25 - autofs systemd-1 rw,fd=32,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=11414
> 44 18 0:6 / /sys/kernel/debug rw,relatime shared:26 - debugfs debugfs rw
> 45 20 0:15 / /dev/mqueue rw,relatime shared:27 - mqueue mqueue rw,seclabel
> 46 20 0:38 / /dev/hugepages rw,relatime shared:28 - hugetlbfs hugetlbfs rw,seclabel
> 47 41 8:2 / /boot rw,relatime shared:29 - xfs /dev/sda2 rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
> 48 47 8:1 / /boot/efi rw,relatime shared:30 - vfat /dev/sda1 rw,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro
> 49 41 253:2 / /var rw,relatime shared:31 - xfs /dev/mapper/vg_system-var rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
> 50 41 253:5 / /home rw,nodev,relatime shared:32 - xfs /dev/mapper/vg_system-home rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
> 51 41 253:4 / /tmp rw,nosuid,nodev,noexec,relatime shared:33 - xfs /dev/mapper/vg_system-tmp rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,sw
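
To illustrate the non-atomic read hypothesis from the comment in this message: one common heuristic for files that can change between read syscalls is to re-read until two consecutive reads match. The sketch below is only an illustration of that idea. The names read_file and read_stable are hypothetical and are not part of Mesos, and two identical reads do not prove that either read was a consistent snapshot; they only make a torn read less likely.

```python
import time


def read_file(path):
    """Read the whole file; a large file may take several read syscalls."""
    with open(path, "rb") as f:
        return f.read()


def read_stable(path, read=read_file, attempts=5, delay=0.01):
    """Re-read `path` until two consecutive reads return identical bytes.

    Hypothetical helper (an assumption, not Mesos code). `read` is
    injectable so the retry logic can be exercised without /proc.
    """
    previous = read(path)
    for _ in range(attempts):
        time.sleep(delay)
        current = read(path)
        if current == previous:
            return current
        previous = current
    raise RuntimeError(f"{path} kept changing across {attempts} re-reads")
```

On a Linux host this could be called as read_stable("/proc/self/mountinfo"). On an agent that starts many short-lived tasks the mount table may change too often for the reads to converge, which is consistent with the fix being far from trivial.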
[jira] [Commented] (MESOS-10131) Agent frequently dies with error "Cycle found in mount table hierarchy"
[ https://issues.apache.org/jira/browse/MESOS-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135978#comment-17135978 ]

Thomas Plummer commented on MESOS-10131:

We haven't seen this issue in over two weeks now. We are not sure what the root cause was, but here are a few of the things we changed:

1) We set --gc_delay=1days, down from the default of 1 week.
2) We upgraded to new hardware.
3) We found some issues with our custom mesos framework (we were taking a long time to respond to status updates and resource offers).

We are comfortable with closing this issue, and we will be sure to reopen it if the error comes back.
[jira] [Commented] (MESOS-10131) Agent frequently dies with error "Cycle found in mount table hierarchy"
[ https://issues.apache.org/jira/browse/MESOS-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119576#comment-17119576 ]

Andrei Budnik commented on MESOS-10131:
---

Please keep posting the error messages whenever the agent crashes. Hopefully, we'll capture the part of `mountinfo` that contains the loop.

I think it might also be worth capturing the mount info right after a crash happens. We could then check whether there are duplicate records, detect a loop, or find some other anomalies.

`mount && cat /proc/1/mountinfo` && `cat /proc//mountinfo`
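
The checks suggested in this message (duplicate records, loops) can be run offline on a captured mountinfo dump. The following is a minimal sketch under stated assumptions: it relies only on the first two fields of each mountinfo line being the mount ID and the parent ID, it treats a mount that is its own parent as a root rather than a loop, and the function names are hypothetical. It is not the check that fs.cpp performs.

```python
from collections import Counter


def parse_mountinfo(text):
    """Return (mount_id, parent_id) pairs from mountinfo-formatted text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # The first two fields are the mount ID and the parent ID.
        if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # skip blank, quoted, or truncated lines
        entries.append((int(fields[0]), int(fields[1])))
    return entries


def find_duplicates(entries):
    """Return mount IDs that appear in more than one record."""
    counts = Counter(mount_id for mount_id, _ in entries)
    return sorted(mount_id for mount_id, n in counts.items() if n > 1)


def find_cycles(entries):
    """Return mount IDs whose chain of parents runs into a loop."""
    parent = dict(entries)  # for duplicate IDs the last record wins
    cyclic = []
    for start in parent:
        seen = set()
        current = start
        # Stop at an unknown parent or at a self-parented root.
        while current in parent and parent[current] != current:
            if current in seen:
                cyclic.append(start)
                break
            seen.add(current)
            current = parent[current]
    return sorted(cyclic)
```

Running parse_mountinfo on the saved output of the commands above and passing the result to find_duplicates and find_cycles would show whether a captured table really contains a loop or repeated IDs.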
[jira] [Commented] (MESOS-10131) Agent frequently dies with error "Cycle found in mount table hierarchy"
[ https://issues.apache.org/jira/browse/MESOS-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119177#comment-17119177 ]

Rick Naik commented on MESOS-10131:
---

Andrei, is this something we need to capture at the moment it happens? If so, that will be challenging. I think what Tom posted is the extent of what we can capture hours after the fact.
[jira] [Commented] (MESOS-10131) Agent frequently dies with error "Cycle found in mount table hierarchy"
[ https://issues.apache.org/jira/browse/MESOS-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118896#comment-17118896 ]

Andrei Budnik commented on MESOS-10131:
---

I think the message containing the whole mount table is long enough (~30k bytes) to reach the limit of the logger buffer...

[~tomplummer] Could you capture both the truncated log message and the output of "cat /proc//mountinfo" the next time it crashes?
[jira] [Commented] (MESOS-10131) Agent frequently dies with error "Cycle found in mount table hierarchy"
[ https://issues.apache.org/jira/browse/MESOS-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118003#comment-17118003 ]

Thomas Plummer commented on MESOS-10131:

[~abudnik] I have attached the appropriate portion of the log. I was initially copying from a terminal, not realizing that the message was several terminal screens long.
[jira] [Commented] (MESOS-10131) Agent frequently dies with error "Cycle found in mount table hierarchy"
[ https://issues.apache.org/jira/browse/MESOS-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117903#comment-17117903 ]

Andrei Budnik commented on MESOS-10131:
---

[~tomplummer] It seems that the tail of the log message is missing. Could you please provide the whole log message containing the mount table? We will try to reproduce the problem by running a unit test to ensure that this is not a bug in the code.
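The failing check in the reported log (`!visitedParents.contains(parentId)`) guards against a mount entry whose chain of parent IDs loops back on itself. The following is a minimal Python sketch of that kind of check, not the Mesos implementation in `fs.cpp`. The `find_cycle` helper, its dict-based input, and the choice to treat a self-parented entry as a root are all assumptions made for illustration.

```python
from typing import Dict, Optional


def find_cycle(parents: Dict[int, int]) -> Optional[int]:
    """Return a mount ID that lies on a parent cycle, or None if there is none.

    `parents` maps mount ID -> parent mount ID (the first two fields of a
    mountinfo entry). A parent ID that is not itself a key (for example 0 for
    the root mount) ends the walk.
    """
    for start in parents:
        seen = set()
        current = start
        while current in parents:
            if current in seen:
                # We came back to an ID already visited on this walk.
                return current
            seen.add(current)
            parent = parents[current]
            if parent == current:
                # Assumption: an entry that is its own parent is a root.
                break
            current = parent
    return None


if __name__ == "__main__":
    # IDs taken from the first entries of the mount table in this issue.
    table = {18: 41, 19: 41, 21: 18, 25: 18, 41: 0}
    print(find_cycle(table))
    # A hypothetical inconsistent table in which three entries point in a loop.
    print(find_cycle({1: 2, 2: 3, 3: 1}))
```

Walking from every entry is quadratic in the worst case, which is acceptable for a sketch; a production check would normally remember which IDs have already been proven cycle-free.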
[jira] [Commented] (MESOS-10131) Agent frequently dies with error "Cycle found in mount table hierarchy"
[ https://issues.apache.org/jira/browse/MESOS-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117892#comment-17117892 ]

Andrei Budnik commented on MESOS-10131:
---

Mount table without extra newlines:
{code:java}
18 41 0:18 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw,seclabel
19 41 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
20 41 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,seclabel,size=65852208k,nr_inodes=16463052,mode=755
21 18 0:17 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:7 - securityfs securityfs rw
22 20 0:19 / /dev/shm rw,nosuid,nodev,noexec shared:3 - tmpfs tmpfs rw,seclabel
23 20 0:12 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
24 41 0:20 / /run rw,nosuid,nodev shared:24 - tmpfs tmpfs rw,seclabel,mode=755
25 18 0:21 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:8 - tmpfs tmpfs ro,seclabel,mode=755
26 25 0:22 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
27 18 0:23 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:20 - pstore pstore rw
28 18 0:24 / /sys/firmware/efi/efivars rw,nosuid,nodev,noexec,relatime shared:21 - efivarfs efivarfs rw
29 25 0:25 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:10 - cgroup cgroup rw,seclabel,perf_event
30 25 0:26 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,seclabel,net_prio,net_cls
31 25 0:27 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:12 - cgroup cgroup rw,seclabel,cpuset
32 25 0:28 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,seclabel,blkio
33 25 0:29 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,seclabel,freezer
34 25 0:30 / /sys/fs/cgroup/hugetlb rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,seclabel,hugetlb
35 25 0:31 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,seclabel,devices
36 25 0:32 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,seclabel,cpuacct,cpu
37 25 0:33 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,seclabel,memory
38 25 0:34 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,seclabel,pids
39 18 0:35 / /sys/kernel/config rw,relatime shared:22 - configfs configfs rw
41 0 253:0 / / rw,relatime shared:1 - xfs /dev/mapper/vg_system-root rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
42 18 0:16 / /sys/fs/selinux rw,relatime shared:23 - selinuxfs selinuxfs rw
43 19 0:37 / /proc/sys/fs/binfmt_misc rw,relatime shared:25 - autofs systemd-1 rw,fd=32,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=11414
44 18 0:6 / /sys/kernel/debug rw,relatime shared:26 - debugfs debugfs rw
45 20 0:15 / /dev/mqueue rw,relatime shared:27 - mqueue mqueue rw,seclabel
46 20 0:38 / /dev/hugepages rw,relatime shared:28 - hugetlbfs hugetlbfs rw,seclabel
47 41 8:2 / /boot rw,relatime shared:29 - xfs /dev/sda2 rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
48 47 8:1 / /boot/efi rw,relatime shared:30 - vfat /dev/sda1 rw,fmask=0077,dmask=0077,codepage=437,iocharset=ascii,shortname=winnt,errors=remount-ro
49 41 253:2 / /var rw,relatime shared:31 - xfs /dev/mapper/vg_system-var rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
50 41 253:5 / /home rw,nodev,relatime shared:32 - xfs /dev/mapper/vg_system-home rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
51 41 253:4 / /tmp rw,nosuid,nodev,noexec,relatime shared:33 - xfs /dev/mapper/vg_system-tmp rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
53 49 253:4 / /var/tmp rw,nosuid,nodev,noexec,relatime shared:33 - xfs /dev/mapper/vg_system-tmp rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
52 49 253:3 / /var/log rw,relatime shared:34 - xfs /dev/mapper/vg_system-varlog rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
54 52 253:6 / /var/log/audit rw,relatime shared:35 - xfs /dev/mapper/vg_system-varlogaudit rw,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota
187 41 0:41 / /mnt/receipt rw,relatime shared:165 - nfs4 dtmetlnfsa01p.a.carfax.us:/ rw,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.18.154.117,local_lock=none,addr=172.18.138.237
188 41 0:42 / /mnt/receipt_web_dev rw,relatime shared:169 - nfs4 dtmetlnfsa01b.a.carfax.us:/ rw,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.18.154.117,local_lock=none,addr=172.18.137.248
192 41 0:41 / /mnt/rece
[jira] [Commented] (MESOS-10131) Agent frequently dies with error "Cycle found in mount table hierarchy"
[ https://issues.apache.org/jira/browse/MESOS-10131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17117873#comment-17117873 ]

Andrei Budnik commented on MESOS-10131:
---

I've copy-pasted the mount table from the log excerpt into one of our unit tests (`FsTest.MountInfoTableReadSortedParentOfSelf`). It failed with the following error message:
{code:java}
../../src/tests/containerizer/fs_tests.cpp:344: Failure
table: Failed to parse entry 'docker/overlay2/l/LOG7DILAFLJBIQ7CKDQVFXJLP7:/var/lib/docker/overlay2/l/6JVIPP3XCCWKZPFAUWKXCDWYXL:/var/lib/docker/overlay2/l/L5VKHJHVOWG24VJPJCAKGTQX5G:/var/lib/docker/overlay2/l/ZIIS5MWCIF4C6KXI2LVKVU4TMF:/var/lib/docker/overlay2/l/4JXI': Could not find separator ' - '
{code}
It seems that there was a memory corruption. I'm investigating what could be the cause.
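The "Could not find separator ' - '" failure reported in the comment above can be illustrated with a short Python sketch. Per proc(5), each `/proc/[pid]/mountinfo` line has six fixed fields, zero or more optional fields, a lone `-` separator, and then the filesystem type, mount source, and super options. The `MountEntry` type and `parse_mountinfo_entry` function below are hypothetical names for illustration and do not mirror the Mesos parser.

```python
from typing import List, NamedTuple


class MountEntry(NamedTuple):
    mount_id: int
    parent_id: int
    devno: str
    root: str
    target: str
    options: str
    optional_fields: List[str]
    fs_type: str
    source: str
    super_options: str


def parse_mountinfo_entry(line: str) -> MountEntry:
    """Parse one mountinfo line; raise ValueError if it is malformed."""
    head, sep, tail = line.partition(" - ")
    if not sep:
        # A fragment that starts mid-entry (such as a truncated overlay
        # option string) has no separator and is rejected here.
        raise ValueError(f"Could not find separator ' - ' in entry {line!r}")
    left = head.split()
    right = tail.split()
    if len(left) < 6 or len(right) < 3:
        raise ValueError(f"Too few fields in entry {line!r}")
    return MountEntry(
        mount_id=int(left[0]),
        parent_id=int(left[1]),
        devno=left[2],
        root=left[3],
        target=left[4],
        options=left[5],
        optional_fields=left[6:],
        fs_type=right[0],
        source=right[1],
        super_options=right[2],
    )


if __name__ == "__main__":
    line = "19 41 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw"
    print(parse_mountinfo_entry(line))
```

A line that begins partway through an entry, like the `docker/overlay2/l/...` fragment in the test failure, raises `ValueError` at the separator check, which matches the kind of error the unit test reported.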