You are getting endless TASK_RUNNING messages because your framework scheduler never acknowledges them, so the slave resends each update after a timeout. You need to either acknowledge the updates yourself via `SchedulerDriver::acknowledgeStatusUpdate`, or set `implicitAcknowledgements` to true when constructing the MesosSchedulerDriver so the driver does that for you automatically:

https://github.com/apache/mesos/blob/0.27.0/include/mesos/scheduler.hpp#L369-L373
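
For reference, here is a minimal sketch of the explicit-acknowledgement path (the class name, framework name, and master address below are placeholders, not taken from your setup):

#include <string>
#include <vector>

#include <mesos/scheduler.hpp>

class AckExampleScheduler : public mesos::Scheduler
{
public:
  void statusUpdate(mesos::SchedulerDriver* driver,
                    const mesos::TaskStatus& status) override
  {
    // React to TASK_RUNNING / TASK_KILLED / ... here.

    // Explicit acknowledgement: required when the driver is constructed
    // with implicitAcknowledgements == false. Without it the slave keeps
    // resending the update, which is the behaviour in your logs.
    driver->acknowledgeStatusUpdate(status);
  }

  // The remaining Scheduler callbacks are stubbed out to keep the sketch short.
  void registered(mesos::SchedulerDriver*, const mesos::FrameworkID&,
                  const mesos::MasterInfo&) override {}
  void reregistered(mesos::SchedulerDriver*, const mesos::MasterInfo&) override {}
  void disconnected(mesos::SchedulerDriver*) override {}
  void resourceOffers(mesos::SchedulerDriver*,
                      const std::vector<mesos::Offer>&) override {}
  void offerRescinded(mesos::SchedulerDriver*, const mesos::OfferID&) override {}
  void frameworkMessage(mesos::SchedulerDriver*, const mesos::ExecutorID&,
                        const mesos::SlaveID&, const std::string&) override {}
  void slaveLost(mesos::SchedulerDriver*, const mesos::SlaveID&) override {}
  void executorLost(mesos::SchedulerDriver*, const mesos::ExecutorID&,
                    const mesos::SlaveID&, int) override {}
  void error(mesos::SchedulerDriver*, const std::string&) override {}
};

int main()
{
  mesos::FrameworkInfo framework;
  framework.set_user("");              // let Mesos fill in the current user
  framework.set_name("ack-example");   // placeholder framework name

  AckExampleScheduler scheduler;

  // implicitAcknowledgements == false: the scheduler must call
  // acknowledgeStatusUpdate() itself, as above. Pass true instead and the
  // driver acknowledges every update for you; in that case drop the
  // explicit call from statusUpdate().
  mesos::MesosSchedulerDriver driver(
      &scheduler, framework, "zk://localhost:2181/mesos", false);

  int status = driver.run() == mesos::DRIVER_STOPPED ? 0 : 1;
  driver.stop();
  return status;
}

Pick one of the two approaches; the explicit call is only meant to be used when the implicitAcknowledgements flag is false.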
On Fri, Feb 26, 2016 at 10:06 AM, suppandi <[email protected]> wrote:

> My framework did not get a status update about the task being in
> task_killed. All I get are a bunch of task_running updates.
>
> On Thu, Feb 25, 2016 at 7:12 PM, Chris Baker <[email protected]> wrote:
>
> > Does your framework acknowledge the task killed status update?
> >
> > On Thu, Feb 25, 2016, 17:34 suppandi <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I am experimenting with some framework code, and when I want to reduce
> > > the number of running tasks I do a killTask() to kill one of the tasks.
> > > After that, the Mesos console says the state of the task is TASK_KILLED,
> > > but it never moves from the 'active tasks' list into the 'completed
> > > tasks' list.
> > >
> > > I tried to follow the logs and I see that a message was sent to the
> > > slave where the task is running, and the (standard) executor on the
> > > slave cleaned up the task - so the app is not running anymore. But the
> > > slave keeps reporting a state of task_running back to the master and
> > > the framework.
> > >
> > > Here are the relevant logs from the slave (Mesos is at version 0.27.1).
> > > Is killTask the right way to scale the number of tasks, or am I missing
> > > something?
> > >
> > > Thanks
> > > suppandi
> > >
> > > Feb 25 14:47:44 snode2-0 mesos-slave[13459]: I0225 14:47:44.961199 13482 slave.cpp:3001] Handling status update TASK_KILLED (UUID: b8583953-08c1-4360-a25b-035041b7abb5) for task es.4f1380de-a062-4ae6-ac15-11e6f08c8ed7 of framework 146a47b2-9470-4526-9a2c-f46c056cee65-0003 from executor(1)@10.57.8.198:46526
> > > Feb 25 14:47:45 snode2-0 mesos-slave[13459]: I0225 14:47:45.038777 13480 status_update_manager.cpp:320] Received status update TASK_KILLED (UUID: b8583953-08c1-4360-a25b-035041b7abb5) for task es.4f1380de-a062-4ae6-ac15-11e6f08c8ed7 of framework 146a47b2-9470-4526-9a2c-f46c056cee65-0003
> > > Feb 25 14:47:45 snode2-0 mesos-slave[13459]: I0225 14:47:45.038910 13480 slave.cpp:3263] Sending acknowledgement for status update TASK_KILLED (UUID: b8583953-08c1-4360-a25b-035041b7abb5) for task es.4f1380de-a062-4ae6-ac15-11e6f08c8ed7 of framework 146a47b2-9470-4526-9a2c-f46c056cee65-0003 to executor(1)@10.57.8.198:46526
> > > Feb 25 14:47:45 snode2-0 mesos-slave[13459]: I0225 14:47:45.967486 13482 slave.cpp:3481] executor(1)@10.57.8.198:46526 exited
> > > Feb 25 14:47:46 snode2-0 mesos-slave[13459]: I0225 14:47:46.043051 13485 docker.cpp:1654] Executor for container 'c6096776-8501-4812-8aac-c78ed295cadc' has exited
> > > Feb 25 14:47:46 snode2-0 mesos-slave[13459]: I0225 14:47:46.043125 13485 docker.cpp:1455] Destroying container 'c6096776-8501-4812-8aac-c78ed295cadc'
> > > Feb 25 14:47:46 snode2-0 mesos-slave[13459]: I0225 14:47:46.043182 13485 docker.cpp:1557] Running docker stop on container 'c6096776-8501-4812-8aac-c78ed295cadc'
> > > Feb 25 14:47:46 snode2-0 mesos-slave[13459]: I0225 14:47:46.043356 13483 slave.cpp:3816] Executor 'es.4f1380de-a062-4ae6-ac15-11e6f08c8ed7' of framework 146a47b2-9470-4526-9a2c-f46c056cee65-0003 exited with status 0
> > > Feb 25 14:48:10 snode2-0 mesos-slave[13459]: I0225 14:48:10.694290 13484 slave.cpp:4304] Current disk usage 0.02%. Max allowed age: 6.298540038254745days
> > > Feb 25 14:48:18 snode2-0 mesos-slave[13459]: W0225 14:48:18.327950 13485 status_update_manager.cpp:475] Resending status update TASK_RUNNING (UUID: 08b270c5-a467-403f-8a0f-fc673577ea02) for task es.4f1380de-a062-4ae6-ac15-11e6f08c8ed7 of framework 146a47b2-9470-4526-9a2c-f46c056cee65-0003
> > >
> > > and from then on, every 10 min I see this:
> > > ...
> > > Feb 25 15:47:38 snode2-0 mesos-slave[13459]: W0225 15:47:38.333694 13481 status_update_manager.cpp:475] Resending status update TASK_RUNNING (UUID: 08b270c5-a467-403f-8a0f-fc673577ea02) for task es.4f1380de-a062-4ae6-ac15-11e6f08c8ed7 of framework 146a47b2-9470-4526-9a2c-f46c056cee65-0003
> > > Feb 25 15:47:38 snode2-0 mesos-slave[13459]: I0225 15:47:38.333880 13481 slave.cpp:3353] Forwarding the update TASK_RUNNING (UUID: 08b270c5-a467-403f-8a0f-fc673577ea02) for task es.4f1380de-a062-4ae6-ac15-11e6f08c8ed7 of framework 146a47b2-9470-4526-9a2c-f46c056cee65-0003 to [email protected]:5050
