[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu updated MESOS-6118:
--
    Fix Version/s: 1.1.0

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Blocker
>           Labels: linux, slave
>          Fix For: 1.0.2, 1.1.0
>
>      Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, cycle6.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu updated MESOS-6118:
--
    Target Version/s:   (was: 1.0.2, 1.1.0)

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Blocker
>           Labels: linux, slave
>          Fix For: 1.0.2, 1.1.0
>
>      Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, cycle6.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu updated MESOS-6118:
--
    Fix Version/s: 1.0.2

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Blocker
>           Labels: linux, slave
>          Fix For: 1.0.2, 1.1.0
>
>      Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, cycle6.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kone updated MESOS-6118:
--
    Shepherd: Jie Yu

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Blocker
>           Labels: linux, slave
>      Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, cycle6.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Klues updated MESOS-6118:
---
    Priority: Blocker  (was: Critical)

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Blocker
>           Labels: linux, slave
>      Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, cycle6.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kone updated MESOS-6118:
--
    Target Version/s: 1.1.0, 1.0.2
       Fix Version/s:   (was: 1.0.2)  (was: 1.1.0)

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Critical
>           Labels: linux, slave
>      Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, cycle6.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jamie Briant updated MESOS-6118:
    Attachment: cycle6.log

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Critical
>           Labels: linux, slave
>          Fix For: 1.1.0, 1.0.2
>
>      Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, cycle6.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jamie Briant updated MESOS-6118:
    Attachment: cycle5.log

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Critical
>           Labels: linux, slave
>          Fix For: 1.1.0, 1.0.2
>
>      Attachments: crashlogfull.log, cycle2.log, cycle3.log, cycle5.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jamie Briant updated MESOS-6118:
    Attachment: cycle3.log

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Critical
>           Labels: linux, slave
>          Fix For: 1.1.0, 1.0.2
>
>      Attachments: crashlogfull.log, cycle2.log, cycle3.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jamie Briant updated MESOS-6118:
    Attachment: cycle2.log

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Critical
>           Labels: linux, slave
>          Fix For: 1.1.0, 1.0.2
>
>      Attachments: crashlogfull.log, cycle2.log, slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.
[jira] [Updated] (MESOS-6118) Agent would crash with docker container tasks due to host mount table read.
[ https://issues.apache.org/jira/browse/MESOS-6118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gilbert Song updated MESOS-6118:
     Labels: linux slave  (was: )
    Summary: Agent would crash with docker container tasks due to host mount table read.  (was: Agent crashes under load)

> Agent would crash with docker container tasks due to host mount table read.
> ---
>
>              Key: MESOS-6118
>              URL: https://issues.apache.org/jira/browse/MESOS-6118
>          Project: Mesos
>       Issue Type: Bug
>       Components: slave
> Affects Versions: 1.0.1
>      Environment: Build: 2016-08-26 23:06:27 by centos
> Version: 1.0.1
> Git tag: 1.0.1
> Git SHA: 3611eb0b7eea8d144e9b2e840e0ba16f2f659ee3
> systemd version `219` detected
> Inializing systemd state
> Created systemd slice: `/run/systemd/system/mesos_executors.slice`
> Started systemd slice `mesos_executors.slice`
> Using isolation: posix/cpu,posix/mem,filesystem/posix,network/cni
> Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> Linux ip-10-254-192-40 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
>         Reporter: Jamie Briant
>         Assignee: Kevin Klues
>         Priority: Critical
>           Labels: linux, slave
>          Fix For: 1.1.0, 1.0.2
>
>      Attachments: slave-crash.log
>
>
> I have a framework which schedules thousands of short-running tasks (a few
> seconds to a few minutes each) over a period of several minutes. In 1.0.1, the
> slave process will crash every few minutes (with systemd restarting it).
> Crash is:
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: F0901 20:52:23.905678 1232 fs.cpp:140] Check failed: !visitedParents.contains(parentId)
> Sep 01 20:52:23 ip-10-254-192-99 mesos-slave: *** Check failure stack trace: ***
> Version 1.0.0 works without this issue.