[jira] [Commented] (MESOS-8051) Killing TASK_GROUP fail to kill some tasks

2017-10-30 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226257#comment-16226257
 ] 

Qian Zhang commented on MESOS-8051:
---

commit 05c7dd88f269692b7248c1087a3f57759eba6853
Author: Qian Zhang 
Date:   Mon Oct 9 09:01:15 2017 +0800

Ignored the tasks already being killed when killing the task group.

When the scheduler tries to kill multiple tasks in the task group
simultaneously, the default executor will kill the tasks one by
one. When the first task is killed, the default executor will kill
all the other tasks in the task group; however, we need to ignore
the tasks which are already being killed, otherwise the check
`CHECK(!container->killing);` in `DefaultExecutor::kill()` will fail.

Review: https://reviews.apache.org/r/62836
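
For illustration, here is a minimal, self-contained sketch of the guard this commit describes. It is not the actual Mesos source; the {{Container}} struct and the {{killTaskGroup()}} helper below are made-up names, and only the {{killing}} flag and the skip mirror the fix:
{code}
// Hypothetical sketch, not Mesos code: skip containers that are already
// being killed when killing the rest of the task group, so the kill path
// is never re-entered for the same container (which is what tripped
// CHECK(!container->killing) before the fix).
#include <iostream>
#include <string>
#include <vector>

struct Container
{
  std::string taskId;
  bool killing = false;  // Set once a kill has been initiated for this task.
};

void killTaskGroup(std::vector<Container>& containers)
{
  for (Container& container : containers) {
    if (container.killing) {
      continue;  // Already being killed; ignore it.
    }

    container.killing = true;
    std::cout << "Killing task " << container.taskId << std::endl;
  }
}

int main()
{
  // ct1 is already being killed (e.g. the scheduler killed it directly),
  // so only ct2 is killed here and no invariant is violated.
  std::vector<Container> group = {{"ct1", true}, {"ct2", false}};
  killTaskGroup(group);
  return 0;
}
{code}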

commit 28831de34d098c894042246dd6fef402eb3b960d
Author: Qian Zhang 
Date:   Mon Oct 9 14:25:31 2017 +0800

Added a test `DefaultExecutorTest.KillMultipleTasks`.

Review: https://reviews.apache.org/r/62837

> Killing TASK_GROUP fail to kill some tasks
> --
>
> Key: MESOS-8051
> URL: https://issues.apache.org/jira/browse/MESOS-8051
> Project: Mesos
> Issue Type: Bug
> Components: agent, executor
> Affects Versions: 1.4.0
> Reporter: A. Dukhovniy
> Assignee: Qian Zhang
> Priority: Critical
> Attachments: dcos-mesos-master.log.gz, dcos-mesos-slave.log.gz, screenshot-1.png
>
>
> When starting the following pod definition via Marathon:
> {code:java}
> {
>   "id": "/simple-pod",
>   "scaling": {
>     "kind": "fixed",
>     "instances": 3
>   },
>   "environment": {
>     "PING": "PONG"
>   },
>   "containers": [
>     {
>       "name": "ct1",
>       "resources": {
>         "cpus": 0.1,
>         "mem": 32
>       },
>       "image": {
>         "kind": "MESOS",
>         "id": "busybox"
>       },
>       "exec": {
>         "command": {
>           "shell": "while true; do echo the current time is $(date) > ./test-v1/clock; sleep 1; done"
>         }
>       },
>       "volumeMounts": [
>         {
>           "name": "v1",
>           "mountPath": "test-v1"
>         }
>       ]
>     },
>     {
>       "name": "ct2",
>       "resources": {
>         "cpus": 0.1,
>         "mem": 32
>       },
>       "exec": {
>         "command": {
>           "shell": "while true; do echo -n $PING ' '; cat ./etc/clock; sleep 1; done"
>         }
>       },
>       "volumeMounts": [
>         {
>           "name": "v1",
>           "mountPath": "etc"
>         },
>         {
>           "name": "v2",
>           "mountPath": "docker"
>         }
>       ]
>     }
>   ],
>   "networks": [
>     {
>       "mode": "host"
>     }
>   ],
>   "volumes": [
>     {
>       "name": "v1"
>     },
>     {
>       "name": "v2",
>       "host": "/var/lib/docker"
>     }
>   ]
> }
> {code}
> Mesos will successfully kill all {{ct2}} containers but fail to kill some or 
> all of the {{ct1}} containers. I've attached both master and agent logs. The 
> interesting part starts after Marathon issues 6 kills:
> {code:java}
> Oct 04 14:58:25 ip-10-0-5-229.eu-central-1.compute.internal 
> mesos-master[4708]: I1004 14:58:25.209966  4746 master.cpp:5297] Processing 
> KILL call for task 'simple-pod.instance-3c1098e5-a914-11e7-bcd5-e63c853d
> bf20.ct1' of framework bae11d5d-20c2-4d66-9ec3-773d1d717e58-0001 (marathon) 
> at scheduler-c61c493c-728f-4bd9-be60-7373574749af@10.0.5.229:15101
> Oct 04 14:58:25 ip-10-0-5-229.eu-central-1.compute.internal 
> mesos-master[4708]: I1004 14:58:25.210033  4746 master.cpp:5371] Telling 
> agent bae11d5d-20c2-4d66-9ec3-773d1d717e58-S1 at slave(1)@10.0.1.207:5051 (
> 10.0.1.207) to kill task 
> simple-pod.instance-3c1098e5-a914-11e7-bcd5-e63c853dbf20.ct1 of framework 
> bae11d5d-20c2-4d66-9ec3-773d1d717e58-0001 (marathon) at 
> scheduler-c61c493c-728f-4bd9-be60-7373574749af@10.0.5
> .229:15101
> Oct 04 14:58:25 ip-10-0-5-229.eu-central-1.compute.internal 
> mesos-master[4708]: I1004 14:58:25.210471  4748 master.cpp:5297] Processing 
> KILL call for task 'simple-pod.instance-3c1098e5-a914-11e7-bcd5-e63c853d
> bf20.ct2' of framework bae11d5d-20c2-4d66-9ec3-773d1d717e58-0001 (marathon) 
> at scheduler-c61c493c-728f-4bd9-be60-7373574749af@10.0.5.229:15101
> Oct 04 14:58:25 ip-10-0-5-229.eu-central-1.compute.internal 
> mesos-master[4708]: I1004 14:58:25.210518  4748 master.cpp:5371] Telling 
> agent bae11d5d-20c2-4d66-9ec3-773d1d717e58-S1 at slave(1)@10.0.1.207:5051 (
> 10.0.1.207) to kill task 
> simple-pod.instance-3c1098e5-a914-11e7-bcd5-e63c853dbf20.ct2 of framework 
> bae11d5d-20c2-4d66-9ec3-773d1d717e58-0001 (marathon) at 
> scheduler-c61c493c-728f-4bd9-be60-7373574749af@10.0.5
> .229:15101
> Oct 04 14:58:25 
> {code}

[jira] [Commented] (MESOS-8051) Killing TASK_GROUP fail to kill some tasks

2017-10-08 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196406#comment-16196406
 ] 

Qian Zhang commented on MESOS-8051:
---

Review request:
https://reviews.apache.org/r/62836/


[jira] [Commented] (MESOS-8051) Killing TASK_GROUP fail to kill some tasks

2017-10-08 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196097#comment-16196097
 ] 

Qian Zhang commented on MESOS-8051:
---

I can reproduce this issue with the latest Mesos code (master branch) + 
Marathon v1.5.1. I created the same task group (pod) as [~zen-dog]'s, and while 
it was running, ran the following command to delete it:
{code}
curl -X DELETE http://192.168.1.3:8080/v2/pods/simple-pod
{code}

Then I can see that one task is in {{TASK_KILLED}} status, which is good:
{code}
I1008 19:58:59.928489 20665 slave.cpp:4411] Handling status update TASK_KILLED 
(UUID: e4b6562b-629a-4da8-85b8-ec7a57d1157e) for task 
simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct2 of framework 
206eabc8-a6ba-430f-8ecb-b0900b26b820-
{code}

But the other task is in {{TASK_FAILED}} status:
{code}
I1008 19:59:00.413954 20665 slave.cpp:4411] Handling status update TASK_FAILED 
(UUID: 65467177-066c-4a5d-b04a-f35c66a6d89c) for task 
simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct1 of framework 
206eabc8-a6ba-430f-8ecb-b0900b26b820- from @0.0.0.0:0
{code}

And I see that the executor core dumped (I found the same core dump in 
[~zen-dog]'s agent log as well):
{code}
I1008 19:59:00.412891 20665 slave.cpp:5424] Executor 
'instance-simple-pod.c034f882-ac1f-11e7-bc9c-024255337c79' of framework 
206eabc8-a6ba-430f-8ecb-b0900b26b820- terminated with signal Aborted (core 
dumped)
{code}

And then in the {{stderr}} of the executor, I found the root cause of its core 
dump:
{code}
# cat 
/opt/mesos/slaves/097ad1b0-4a33-400d-9d06-b554f9c7c009-S0/frameworks/206eabc8-a6ba-430f-8ecb-b0900b26b820-/executors/instance-simple-pod.c034f882-ac1f-11e7-bc9c-024255337c79/runs/019b7103-b5e4-4fa9-aaa6-e4b092167000/stderr
I1008 19:56:43.578035 20917 executor.cpp:192] Version: 1.5.0
I1008 19:56:43.612187 20938 default_executor.cpp:185] Received SUBSCRIBED event
I1008 19:56:43.614189 20938 default_executor.cpp:189] Subscribed executor on 
workstation
I1008 19:56:43.619079 20932 default_executor.cpp:185] Received LAUNCH_GROUP 
event
I1008 19:56:43.621937 20931 default_executor.cpp:394] Setting 
'MESOS_CONTAINER_IP' to: 192.168.1.3
I1008 19:56:43.703390 20933 default_executor.cpp:624] Successfully launched 
tasks [ simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct1, 
simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct2 ] in child 
containers [ 
019b7103-b5e4-4fa9-aaa6-e4b092167000.3d104d9e-9e66-4a55-b067-ec58e782c62a, 
019b7103-b5e4-4fa9-aaa6-e4b092167000.6f655d39-7a60-4fb5-84d3-5c9ef61070e8 ]
I1008 19:56:43.707593 20936 default_executor.cpp:697] Waiting for child 
container 
019b7103-b5e4-4fa9-aaa6-e4b092167000.3d104d9e-9e66-4a55-b067-ec58e782c62a of 
task 'simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct1'
I1008 19:56:43.707864 20936 default_executor.cpp:697] Waiting for child 
container 
019b7103-b5e4-4fa9-aaa6-e4b092167000.6f655d39-7a60-4fb5-84d3-5c9ef61070e8 of 
task 'simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct2'
I1008 19:56:43.730860 20938 default_executor.cpp:185] Received ACKNOWLEDGED 
event
I1008 19:56:43.762722 20937 default_executor.cpp:185] Received ACKNOWLEDGED 
event
I1008 19:58:59.719534 20934 default_executor.cpp:185] Received KILL event
I1008 19:58:59.719614 20934 default_executor.cpp:1119] Received kill for task 
'simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct1'
I1008 19:58:59.719633 20934 default_executor.cpp:1004] Killing task 
simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct1 running in child 
container 
019b7103-b5e4-4fa9-aaa6-e4b092167000.3d104d9e-9e66-4a55-b067-ec58e782c62a with 
SIGTERM signal
I1008 19:58:59.719643 20934 default_executor.cpp:1026] Scheduling escalation to 
SIGKILL in 3secs from now
I1008 19:58:59.723572 20931 default_executor.cpp:185] Received KILL event
I1008 19:58:59.723633 20931 default_executor.cpp:1119] Received kill for task 
'simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct2'
I1008 19:58:59.723646 20931 default_executor.cpp:1004] Killing task 
simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct2 running in child 
container 
019b7103-b5e4-4fa9-aaa6-e4b092167000.6f655d39-7a60-4fb5-84d3-5c9ef61070e8 with 
SIGTERM signal
I1008 19:58:59.723654 20931 default_executor.cpp:1026] Scheduling escalation to 
SIGKILL in 3secs from now
I1008 19:58:59.896623 20936 default_executor.cpp:842] Child container 
019b7103-b5e4-4fa9-aaa6-e4b092167000.6f655d39-7a60-4fb5-84d3-5c9ef61070e8 of 
task 'simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct2' in state 
TASK_KILLED terminated with signal Terminated
I1008 19:58:59.896730 20936 default_executor.cpp:879] Killing task group 
containing tasks [ 
simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct1, 
simple-pod.instance-c034f882-ac1f-11e7-bc9c-024255337c79.ct2 ]
F1008 19:58:59.896762 20936 default_executor.cpp:979] Check failed: 
{code}
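
Based on the stderr above and the check named in the fix commit above ({{CHECK(!container->killing);}} in {{DefaultExecutor::kill()}}), what appears to happen is: both kill requests mark their tasks as being killed, and when the first child container terminates, the group-kill path runs the kill logic again for a task whose {{killing}} flag is already set. A minimal, self-contained model of that failing invariant (hypothetical names; {{assert()}} stands in for the {{CHECK}}):
{code}
// Hypothetical model of the pre-fix behaviour, not Mesos code.
#include <cassert>
#include <string>
#include <vector>

struct Container
{
  std::string taskId;
  bool killing = false;
};

std::vector<Container> containers = {{"ct1"}, {"ct2"}};

void killTask(Container& container)
{
  assert(!container.killing);  // Stands in for CHECK(!container->killing);
  container.killing = true;
  // ... send SIGTERM, schedule the SIGKILL escalation ...
}

// Invoked when any child container of the group terminates: kill the rest.
void killTaskGroup()
{
  for (Container& container : containers) {
    killTask(container);  // Pre-fix: no skip for tasks already being killed.
  }
}

int main()
{
  killTask(containers[0]);  // Scheduler kills ct1.
  killTask(containers[1]);  // Scheduler kills ct2.

  // ct2's container terminates first; the executor then kills the whole
  // task group and re-enters the kill path for ct1, whose `killing` flag
  // is already set, so the assertion fires, mirroring the abort above.
  killTaskGroup();
  return 0;
}
{code}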

[jira] [Commented] (MESOS-8051) Killing TASK_GROUP fail to kill some tasks

2017-10-08 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196014#comment-16196014
 ] 

Qian Zhang commented on MESOS-8051:
---

[~vinodkone] Sure, I am working on it.
