We use explicit acknowledgements from the scheduler.
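For context, this is roughly how the explicit-acknowledgement path is wired up on our side. This is a simplified sketch: OurScheduler and handle() are illustrative names, the other Scheduler callbacks are elided, and only the driver construction and the statusUpdate() callback matter here.

    #include <mesos/scheduler.hpp>

    using namespace mesos;

    // Illustrative Scheduler implementation: process the update first,
    // then explicitly acknowledge it.
    class OurScheduler : public Scheduler
    {
      void statusUpdate(SchedulerDriver* driver, const TaskStatus& status) override
      {
        handle(status);  // illustrative helper: persist/act on the update
        driver->acknowledgeStatusUpdate(status);
      }
      // ... other Scheduler callbacks elided ...
    };

    int main()
    {
      FrameworkInfo frameworkInfo;  // framework details elided
      OurScheduler scheduler;

      // 'false' disables implicit acknowledgements, so nothing is acked
      // until the scheduler calls acknowledgeStatusUpdate() itself.
      MesosSchedulerDriver driver(
          &scheduler, frameworkInfo, "10.160.41.62:5050",
          false /* implicitAcknowledgements */);

      driver.run();
    }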
Here is a snippet of the logs; please see the entries for status update UUID a918f5ed-a604-415a-ad62-5a34fb6334ef:

W0416 00:41:25.843505 124530 status_update_manager.cpp:761] Duplicate status update acknowledgment (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8) for update TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef) for task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
E0416 00:41:25.843559 124542 slave.cpp:2951] Failed to handle status update acknowledgement (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8) for task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190: Duplicate acknowledgement
W0416 00:41:28.416702 124539 status_update_manager.cpp:478] Resending status update TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e) for task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
I0416 00:41:28.416910 124530 slave.cpp:4051] Forwarding the update TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e) for task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190 to master@10.160.41.62:5050
W0416 00:41:28.425237 124537 status_update_manager.cpp:478] Resending status update TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef) for task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
I0416 00:41:28.425361 124525 slave.cpp:4051] Forwarding the update TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef) for task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190 to master@10.160.41.62:5050
I0416 00:41:36.723466 124518 status_update_manager.cpp:395] Received status update acknowledgement (UUID: b99b15d6-d9af-4395-9c46-709f91db90e1) for task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
W0416 00:41:36.723588 124518 status_update_manager.cpp:761] Duplicate status update acknowledgment (UUID: b99b15d6-d9af-4395-9c46-709f91db90e1) for update TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e) for task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
E0416 00:41:36.723650 124518 slave.cpp:2951] Failed to handle status update acknowledgement (UUID: b99b15d6-d9af-4395-9c46-709f91db90e1) for task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190: Duplicate acknowledgement
I0416 00:41:36.730357 124527 status_update_manager.cpp:395] Received status update acknowledgement (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8) for task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
W0416 00:41:36.730417 124527 status_update_manager.cpp:761] Duplicate status update acknowledgment (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8) for update TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef) for task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
E0416 00:41:36.730465 124518 slave.cpp:2951] Failed to handle status update acknowledgement (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8) for task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190: Duplicate acknowledgement
I0416 00:41:43.769918 124532 http.cpp:277] HTTP GET for /slave(1)/state.json from 10.65.5.13:42600 with User-Agent='filebundle-agent'
W0416 00:41:48.417913 124539 status_update_manager.cpp:478] Resending status update TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e) for task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
I0416 00:41:48.418166 124538 slave.cpp:4051] Forwarding the update TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e) for task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190 to master@10.160.41.62:5050
W0416 00:41:48.425532 124539 status_update_manager.cpp:478] Resending status update TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef) for task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190

I discussed this with @zhitao. Here is a plausible explanation of the bug: the executor sends updates to the agent every 60 seconds, and the agent maintains them in the pending queue <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.hpp#L173>. Acknowledgements can arrive in any order relative to the queued status updates, but the _handle section pops <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L888> the last update from the queue without making sure the acknowledgement was for that status update.
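To make the suspected race concrete, here is a minimal sketch of the behavior I believe we are hitting. The types and names below are illustrative stand-ins, not the actual Mesos code:

    #include <queue>
    #include <string>

    // Illustrative stand-in for a stream's pending queue in
    // task_status_update_manager.cpp.
    struct Update { std::string uuid; };

    std::queue<Update> pending;  // updates awaiting acknowledgement

    void acknowledge(const std::string& ackUuid)
    {
      if (pending.empty()) return;

      // Suspected behavior: the front update is popped as soon as any
      // acknowledgement arrives...
      Update update = pending.front();
      pending.pop();

      // ...without a guard like the following, which would reject an
      // acknowledgement whose UUID does not match the popped update:
      //
      //   if (ackUuid != update.uuid) { /* log and ignore */ }
    }

If that is accurate, an out-of-order or duplicate acknowledgement would retire the wrong update, which would match the infinite TASK_RUNNING retries and "Duplicate acknowledgement" errors in the logs above.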
On Tue, Apr 10, 2018 at 6:43 PM, Benjamin Mahler <bmah...@apache.org> wrote:

> Do you have logs? Which acknowledgements did the agent receive? Which
> TASK_RUNNING in the sequence was it re-sending?
>
> On Tue, Apr 10, 2018 at 6:41 PM, Benjamin Mahler <bmah...@apache.org>
> wrote:
>
> > The issue is that *the old executor reference is held by the slave*
> > (assuming it did not receive acknowledgements, whereas the master and
> > scheduler have processed the status updates), so it continues to retry
> > TASK_RUNNING infinitely.
> >
> > The agent only retries so long as it does not get an acknowledgement; is
> > the scheduler acknowledging the duplicate updates or ignoring them?
> >
> > On Mon, Apr 9, 2018 at 12:10 PM, Varun Gupta <var...@uber.com> wrote:
> >
> >> Hi,
> >>
> >> We are running into an issue with the slave status update manager.
> >> Below is the behavior I am seeing.
> >>
> >> Our use case is that we run a stateful container (a Cassandra process).
> >> The executor polls the JMX port at a 60-second interval to get the
> >> Cassandra state and sends the state to agent -> master -> framework.
> >>
> >> *A RUNNING Cassandra process translates to TASK_RUNNING.*
> >> *A CRASHED or DRAINED Cassandra process translates to TASK_FAILED.*
> >>
> >> At some point the slave has multiple TASK_RUNNING status updates in the
> >> stream, followed by TASK_FAILED, while acknowledgements are pending. We
> >> use explicit acknowledgements, and I see that the Mesos master receives
> >> all the TASK_RUNNING updates and then TASK_FAILED, and that the
> >> framework likewise receives all the TASK_RUNNING updates followed by
> >> TASK_FAILED. After receiving TASK_FAILED, the framework restarts a
> >> different executor on the same machine using the old persistent volume.
> >>
> >> The issue is that *the old executor reference is held by the slave*
> >> (assuming it did not receive acknowledgements, whereas the master and
> >> scheduler have processed the status updates), so it continues to retry
> >> TASK_RUNNING infinitely. Here, the old executor process is not running,
> >> while the new executor process is running and continues to work as-is.
> >> This makes me believe there is a bug in the slave status update manager.
> >>
> >> I read the slave status update manager code: recover
> >> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L203>
> >> has a constraint
> >> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L239>
> >> to ignore status updates from the stream if the last executor run is
> >> completed.
> >>
> >> I think a similar constraint should apply to status update
> >> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L318>
> >> and acknowledge
> >> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L760>.
> >>
> >> Thanks,
> >> Varun
> >>
> >> On Fri, Mar 16, 2018 at 7:47 PM, Benjamin Mahler <bmah...@apache.org>
> >> wrote:
> >>
> >> > (1) Assuming you're referring to the scheduler's acknowledgement of a
> >> > status update, the agent will not forward TS2 until TS1 has been
> >> > acknowledged. So, TS2 will not be acknowledged before TS1 is
> >> > acknowledged. FWICT, we'll ignore any violation of this ordering and
> >> > log a warning.
> >> >
> >> > (2) To reverse the question, why would it make sense to ignore them?
> >> > Assuming you're looking to reduce the number of round trips needed
> >> > for schedulers to see the terminal update, I would point you to:
> >> > https://issues.apache.org/jira/browse/MESOS-6941
> >> >
> >> > (3) When the agent sees an executor terminate, it will transition all
> >> > non-terminal tasks assigned to that executor to TASK_GONE (partition
> >> > aware framework) or TASK_LOST (non partition aware framework) or
> >> > TASK_FAILED (if the container OOMed). There may be other cases; it
> >> > looks a bit convoluted to me.
> >> >
> >> > On Thu, Mar 15, 2018 at 10:35 AM, Zhitao Li <zhitaoli...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > While designing the correct behavior with one of our frameworks, we
> >> > > encountered some questions about the behavior of status updates:
> >> > >
> >> > > The executor continuously polls the workload probe to get the
> >> > > current mode of the workload (a Cassandra server), and sends various
> >> > > status update states (STARTING, RUNNING, FAILED, etc.).
> >> > >
> >> > > The executor polls every 30 seconds and sends a status update. Here,
> >> > > we are seeing congestion on task update acknowledgements somewhere
> >> > > (still unknown).
> >> > >
> >> > > There are three scenarios that we want to understand.
> >> > >
> >> > > 1. The agent queue has task updates TS1, TS2 & TS3 (in this order)
> >> > > waiting on acknowledgement. If TS2 receives an acknowledgement, what
> >> > > will happen to the TS1 update in the queue?
> >> > >
> >> > > 2. The agent queue has task updates TS1, TS2, TS3 & TASK_FAILED.
> >> > > Here, TS1, TS2 and TS3 are non-terminal updates. Once the agent has
> >> > > received a terminal status update, does it make sense to ignore the
> >> > > non-terminal updates in the queue?
> >> > >
> >> > > 3. As per the ExecutorDriver code comment
> >> > > <https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/ExecutorDriver.java#L86>,
> >> > > if the executor is terminated, does the agent send TASK_LOST? If so,
> >> > > does it send it once or for each unacknowledged status update?
> >> > >
> >> > > I'll study the code in the status update manager and the agent
> >> > > separately, but an official answer will definitely help.
> >> > >
> >> > > Many thanks!
> >> > >
> >> > > --
> >> > > Cheers,
> >> > >
> >> > > Zhitao Li