> Issue is that, *old executor reference is hold by slave* (assuming it did not receive acknowledgement, whereas master and scheduler have processed the status updates), so it continues to retry TASK_RUNNING infinitely.
The agent only retries so long as it does not get an acknowledgement, is the scheduler acknowledging the duplicates updates or ignoring them? On Mon, Apr 9, 2018 at 12:10 PM, Varun Gupta <var...@uber.com> wrote: > Hi, > > We are running into an issue with slave status update manager. Below is the > behavior I am seeing. > > Our use case is, we run Stateful container (Cassandra process), here > Executor polls JMX port at 60 second interval to get Cassandra State and > sends the state to agent -> master -> framework. > > *RUNNING Cassandra Process translates to TASK_RUNNING.* > *CRASHED or DRAINED Cassandra Process translates to TASK_FAILED.* > > At some point slave has multiple TASK_RUNNING status updates in stream and > then followed by TASK_FAILED if acknowledgements are pending. We use > explicit acknowledgements, and I see Mesos Master receives, all > TASK_RUNNING and then TASK_FAILED as well as Framework also receives all > TASK_RUNNING updates followed up TASK_FAILED. After receiving TASK_FAILED, > Framework restarts different executor on same machine using old persistent > volume. > > Issue is that, *old executor reference is hold by slave* (assuming it did > not receive acknowledgement, whereas master and scheduler have processed > the status updates), so it continues to retry TASK_RUNNING infinitely. > Here, old executor process is not running. As well as new executor process > is running, and continues to work as-is. This makes be believe, some bug > with slave status update manager. > > I read slave status update manager code, recover > <https://github.com/apache/mesos/blob/master/src/slave/ > task_status_update_manager.cpp#L203> > has a constraint > <https://github.com/apache/mesos/blob/master/src/slave/ > task_status_update_manager.cpp#L239> > to ignore status updates from stream if the last executor run is > completed. > > I think, similar constraint should be applicable for status update > <https://github.com/apache/mesos/blob/master/src/slave/ > task_status_update_manager.cpp#L318> > and acknowledge > <https://github.com/apache/mesos/blob/master/src/slave/ > task_status_update_manager.cpp#L760> > . > > > Thanks, > Varun > > > > > > > > > > > > > > > > > > > On Fri, Mar 16, 2018 at 7:47 PM, Benjamin Mahler <bmah...@apache.org> > wrote: > > > (1) Assuming you're referring to the scheduler's acknowledgement of a > > status update, the agent will not forward TS2 until TS1 has been > > acknowledged. So, TS2 will not be acknowledged before TS1 is > acknowledged. > > FWICT, we'll ignore any violation of this ordering and log a warning. > > > > (2) To reverse the question, why would it make sense to ignore them? > > Assuming you're looking to reduce the number of round trips needed for > > schedulers to see the terminal update, I would point you to: > > https://issues.apache.org/jira/browse/MESOS-6941 > > > > (3) When the agent sees an executor terminate, it will transition all > > non-terminal tasks assigned to that executor to TASK_GONE (partition > aware > > framework) or TASK_LOST (non partition aware framework) or TASK_FAILED > (if > > the container OOMed). There may be other cases, it looks a bit convoluted > > to me. > > > > On Thu, Mar 15, 2018 at 10:35 AM, Zhitao Li <zhitaoli...@gmail.com> > wrote: > > > > > Hi, > > > > > > While designing the correct behavior with one of our framework, we > > > encounters some questions about behavior of status update: > > > > > > The executor continuously polls the workload probe to get current mode > of > > > workload (a Cassandra server), and send various status update states > > > (STARTING, RUNNING, FAILED, etc). > > > > > > Executor polls every 30 seconds, and sends status update. Here, we are > > > seeing congestion on task update acknowledgements somewhere (still > > > unknown). > > > > > > There are three scenarios that we want to understand. > > > > > > 1. Agent queue has task update TS1, TS2 & TS3 (in this order) > waiting > > on > > > acknowledgement. Suppose if TS2 receives an acknowledgement, then > what > > > will > > > happen to TS1 update in the queue. > > > > > > > > > 1. Agent queue has task update TS1, TS2, TS3 & TASK_FAILED. Here, > TS1, > > > TS2, TS3 are non-terminial updates. Once the agent has received a > > > terminal > > > status update, does it makes sense to ignore non-terminal updates in > > the > > > queue? > > > > > > > > > 1. As per Executor Driver code comment > > > <https://github.com/apache/mesos/blob/master/src/java/src/ > > > org/apache/mesos/ExecutorDriver.java#L86>, > > > if the executor is terminated, does agent send TASK_LOST? If so, > does > > it > > > send once or for each unacknowledged status update? > > > > > > > > > I'll study the code in status update manager and agent separately but > > some > > > official answer will definitely help. > > > > > > Many thanks! > > > > > > -- > > > Cheers, > > > > > > Zhitao Li > > > > > >