Do you have logs? Which acknowledgements did the agent receive? Which TASK_RUNNING in the sequence was it re-sending?
On Tue, Apr 10, 2018 at 6:41 PM, Benjamin Mahler <bmah...@apache.org> wrote:

> > Issue is that the *old executor reference is held by the slave*
> > (assuming it did not receive an acknowledgement, whereas the master and
> > scheduler have processed the status updates), so it continues to retry
> > TASK_RUNNING infinitely.
>
> The agent only retries so long as it does not get an acknowledgement. Is
> the scheduler acknowledging the duplicate updates or ignoring them?
>
> On Mon, Apr 9, 2018 at 12:10 PM, Varun Gupta <var...@uber.com> wrote:
>
>> Hi,
>>
>> We are running into an issue with the slave status update manager. Below
>> is the behavior I am seeing.
>>
>> Our use case is that we run a stateful container (a Cassandra process).
>> The executor polls the JMX port at a 60-second interval to get the
>> Cassandra state and sends that state to agent -> master -> framework.
>>
>> *A RUNNING Cassandra process translates to TASK_RUNNING.*
>> *A CRASHED or DRAINED Cassandra process translates to TASK_FAILED.*
>>
>> At some point the slave has multiple TASK_RUNNING status updates in the
>> stream, followed by TASK_FAILED, while acknowledgements are still
>> pending. We use explicit acknowledgements, and I see that the Mesos
>> master receives all TASK_RUNNING updates and then TASK_FAILED, and the
>> framework likewise receives all TASK_RUNNING updates followed by
>> TASK_FAILED. After receiving TASK_FAILED, the framework restarts a
>> different executor on the same machine using the old persistent volume.
>>
>> The issue is that the *old executor reference is held by the slave*
>> (assuming it did not receive an acknowledgement, whereas the master and
>> scheduler have processed the status updates), so it continues to retry
>> TASK_RUNNING infinitely. Here the old executor process is not running,
>> while the new executor process is running and continues to work as-is.
>> This makes me believe there is a bug in the slave status update manager.
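To make the retry behavior under discussion concrete, here is a toy model (plain Python, not Mesos code; `AgentUpdateStream` and all other names are hypothetical) of a per-task status update stream with explicit acknowledgements. It illustrates why silently *ignoring* what looks like a duplicate TASK_RUNNING on the scheduler side leaves the agent retrying forever, while acknowledging it lets the stream advance:

```python
from collections import deque

class AgentUpdateStream:
    """Toy model of an agent's per-task status update stream.

    Updates are forwarded one at a time: the head of the queue is
    retried until the scheduler explicitly acknowledges that exact
    update; later updates wait behind it.
    """
    def __init__(self):
        self.pending = deque()   # unacknowledged updates, in order
        self.sends = 0           # how many (re)sends have happened

    def enqueue(self, state, uuid):
        self.pending.append((state, uuid))

    def next_to_send(self):
        # Only the head of the stream is (re)sent.
        if self.pending:
            self.sends += 1
            return self.pending[0]
        return None

    def acknowledge(self, uuid):
        # An ack only matches the head; anything else is dropped
        # (the real agent logs a warning on out-of-order acks).
        if self.pending and self.pending[0][1] == uuid:
            self.pending.popleft()
            return True
        return False

stream = AgentUpdateStream()
stream.enqueue("TASK_RUNNING", "u1")
stream.enqueue("TASK_RUNNING", "u2")
stream.enqueue("TASK_FAILED", "u3")

# If the scheduler ignores what it thinks is a duplicate TASK_RUNNING,
# the agent just keeps re-sending the same head update:
for _ in range(3):
    assert stream.next_to_send() == ("TASK_RUNNING", "u1")

# Only an explicit ack of the head lets the stream advance.
assert stream.acknowledge("u1")
assert stream.next_to_send() == ("TASK_RUNNING", "u2")
```

This is a sketch of the protocol shape only; the real agent persists the stream to disk and retries on a backoff timer.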
>> I read the slave status update manager code. recover
>> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L203>
>> has a constraint
>> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L239>
>> to ignore status updates from the stream if the last executor run has
>> completed.
>>
>> I think a similar constraint should be applicable for status update
>> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L318>
>> and acknowledge
>> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L760>.
>>
>> Thanks,
>> Varun
>>
>> On Fri, Mar 16, 2018 at 7:47 PM, Benjamin Mahler <bmah...@apache.org>
>> wrote:
>>
>>> (1) Assuming you're referring to the scheduler's acknowledgement of a
>>> status update, the agent will not forward TS2 until TS1 has been
>>> acknowledged. So, TS2 will not be acknowledged before TS1 is
>>> acknowledged. FWICT, we'll ignore any violation of this ordering and
>>> log a warning.
>>>
>>> (2) To reverse the question: why would it make sense to ignore them?
>>> Assuming you're looking to reduce the number of round trips needed for
>>> schedulers to see the terminal update, I would point you to:
>>> https://issues.apache.org/jira/browse/MESOS-6941
>>>
>>> (3) When the agent sees an executor terminate, it will transition all
>>> non-terminal tasks assigned to that executor to TASK_GONE
>>> (partition-aware framework), TASK_LOST (non-partition-aware framework),
>>> or TASK_FAILED (if the container OOMed). There may be other cases; it
>>> looks a bit convoluted to me.
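The constraint Varun proposes (mirror the recover-time "ignore updates for completed executor runs" check in the update and acknowledge paths) could be sketched as follows. This is hypothetical illustration, not the Mesos implementation; `completed_runs`, `streams`, and both function names are made up for the example:

```python
# Toy sketch of the proposed guard: before processing an update or
# an acknowledgement, check whether the executor run the stream
# belongs to has already completed, and drop the operation if so.

completed_runs = set()   # (executor_id, run_id) pairs known to be done
streams = {}             # run -> list of pending (unacked) updates

def update(run, status):
    if run in completed_runs:
        # Proposed: ignore updates from a completed executor run,
        # mirroring the existing check in recover().
        return False
    streams.setdefault(run, []).append(status)
    return True

def acknowledge(run, status):
    if run in completed_runs:
        # Proposed: likewise drop acks for completed runs.
        return False
    pending = streams.get(run, [])
    if pending and pending[0] == status:
        pending.pop(0)
        return True
    return False

completed_runs.add(("executor-1", "run-1"))
assert update(("executor-1", "run-1"), "TASK_RUNNING") is False  # dropped
assert update(("executor-1", "run-2"), "TASK_RUNNING") is True   # accepted
```

The point of the sketch is only the placement of the guard: the same "is this run already completed?" predicate applied in three places instead of one.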
>>> On Thu, Mar 15, 2018 at 10:35 AM, Zhitao Li <zhitaoli...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> While designing the correct behavior with one of our frameworks, we
>>>> encountered some questions about the behavior of status updates:
>>>>
>>>> The executor continuously polls the workload probe to get the current
>>>> mode of the workload (a Cassandra server), and sends various status
>>>> update states (STARTING, RUNNING, FAILED, etc.).
>>>>
>>>> The executor polls every 30 seconds and sends a status update. Here we
>>>> are seeing congestion on task update acknowledgements somewhere (still
>>>> unknown).
>>>>
>>>> There are three scenarios that we want to understand:
>>>>
>>>> 1. The agent queue has task updates TS1, TS2 & TS3 (in this order)
>>>> waiting on acknowledgement. Suppose TS2 receives an acknowledgement;
>>>> what will happen to the TS1 update in the queue?
>>>>
>>>> 2. The agent queue has task updates TS1, TS2, TS3 & TASK_FAILED. Here
>>>> TS1, TS2, TS3 are non-terminal updates. Once the agent has received a
>>>> terminal status update, does it make sense to ignore the non-terminal
>>>> updates in the queue?
>>>>
>>>> 3. As per the ExecutorDriver code comment
>>>> <https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/ExecutorDriver.java#L86>,
>>>> if the executor is terminated, does the agent send TASK_LOST? If so,
>>>> does it send it once or for each unacknowledged status update?
>>>>
>>>> I'll study the code in the status update manager and agent separately,
>>>> but some official answer will definitely help.
>>>>
>>>> Many thanks!
>>>>
>>>> --
>>>> Cheers,
>>>>
>>>> Zhitao Li
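Benjamin's answer (3) earlier in this thread (executor termination transitions all non-terminal tasks to TASK_GONE, TASK_LOST, or TASK_FAILED depending on context) can be sketched as a toy state transition, purely for illustration; the helper name, its parameters, and the decision order are assumptions, not the actual agent logic:

```python
# Hypothetical sketch: when an executor terminates, every
# non-terminal task it owned moves to one terminal state chosen
# by context; already-terminal tasks are left alone.

TERMINAL = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED",
            "TASK_LOST", "TASK_GONE"}

def on_executor_terminated(tasks, partition_aware, oom_killed):
    """tasks: dict of task_id -> current state; returns new states."""
    if oom_killed:
        new_state = "TASK_FAILED"   # container OOMed
    elif partition_aware:
        new_state = "TASK_GONE"     # partition-aware framework
    else:
        new_state = "TASK_LOST"     # non-partition-aware framework
    return {tid: (new_state if state not in TERMINAL else state)
            for tid, state in tasks.items()}

tasks = {"t1": "TASK_RUNNING", "t2": "TASK_FINISHED"}
assert on_executor_terminated(tasks, partition_aware=True,
                              oom_killed=False) == \
    {"t1": "TASK_GONE", "t2": "TASK_FINISHED"}
```

As noted in the thread, the real code has more cases than this; the sketch only captures the three outcomes named in answer (3).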