Cosmin Lehene created MESOS-8111:
------------------------------------
Summary: Mesos sees task as running, but cannot kill it because
the agent is offline
Key: MESOS-8111
URL: https://issues.apache.org/jira/browse/MESOS-8111
Project: Mesos
Issue Type: Bug
Components: master
Affects Versions: 1.2.3
Environment: DC/OS 1.9.4
Reporter: Cosmin Lehene
After scaling down a cluster, the master still reports a task as running
even though the agent it ran on is long gone.
At the same time, the master reports that it cannot kill the task because
the agent is offline:
{noformat}
I1018 16:55:22.000000 6976 master.cpp:4913] Processing KILL call for task
'spark.7b59a77b-b353-11e7-addd-b29ecbf071e1' of framework
4d2a982a-0e62-4471-88e8-8df9cc0ae437-0001 (marathon) at
[email protected]:15101
W1018 16:55:22.000000 6976 master.cpp:5000] Cannot kill task
spark.7b59a77b-b353-11e7-addd-b29ecbf071e1 of framework
4d2a982a-0e62-4471-88e8-8df9cc0ae437-0001 (marathon) at
[email protected]:15101 because the
agent 4d2a982a-0e62-4471-88e8-8df9cc0ae437-S129 at slave(1)@10.0.0.81:5051
(10.0.0.81) is disconnected. Kill will be retried if the agent re-registers
{noformat}
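
One way to spot tasks stuck in this state is to cross-check the master's state JSON: look for tasks in TASK_RUNNING whose agent is missing or not active. The sketch below is only a diagnostic aid, not a fix. It assumes the JSON served by the master's /state endpoint has the usual "slaves" and "frameworks" arrays, and that a disconnected agent shows up with "active" set to false (an assumption worth verifying on the affected version). The function name and the sample data are hypothetical; the sample is hand-written for illustration and is not real cluster output.

```python
from typing import Any, Dict, List


def find_orphaned_running_tasks(state: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return TASK_RUNNING tasks whose agent is absent or inactive in `state`.

    `state` is assumed to be the parsed JSON from the master's /state endpoint.
    """
    # IDs of agents the master lists as active.
    active_agents = {
        agent["id"] for agent in state.get("slaves", []) if agent.get("active")
    }

    orphaned = []
    for framework in state.get("frameworks", []):
        for task in framework.get("tasks", []):
            if task.get("state") != "TASK_RUNNING":
                continue
            if task.get("slave_id") in active_agents:
                continue
            orphaned.append(
                {
                    "task_id": task.get("id", ""),
                    "framework_id": framework.get("id", ""),
                    "agent_id": task.get("slave_id", ""),
                }
            )
    return orphaned


# Hand-written sample (illustrative only): one inactive agent that still has a
# running task, and one hypothetical healthy agent with its own running task.
sample_state = {
    "slaves": [
        {"id": "4d2a982a-0e62-4471-88e8-8df9cc0ae437-S129", "active": False},
        {"id": "example-agent-S1", "active": True},
    ],
    "frameworks": [
        {
            "id": "4d2a982a-0e62-4471-88e8-8df9cc0ae437-0001",
            "tasks": [
                {
                    "id": "spark.7b59a77b-b353-11e7-addd-b29ecbf071e1",
                    "state": "TASK_RUNNING",
                    "slave_id": "4d2a982a-0e62-4471-88e8-8df9cc0ae437-S129",
                },
                {
                    "id": "example-task",
                    "state": "TASK_RUNNING",
                    "slave_id": "example-agent-S1",
                },
            ],
        }
    ],
}

for entry in find_orphaned_running_tasks(sample_state):
    print(entry["task_id"], "on", entry["agent_id"])
```

Against a live cluster, the same function could be fed the parsed response from the leading master's /state endpoint instead of the sample dictionary.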