We are running into an issue with the slave status update manager. Below is
the behavior I am seeing.

Our use case: we run a stateful container (a Cassandra process), where the
executor polls the JMX port at a 60-second interval to get the Cassandra
state and sends that state to the agent -> master -> framework.

*A RUNNING Cassandra process translates to TASK_RUNNING.*
*A CRASHED or DRAINED Cassandra process translates to TASK_FAILED.*
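The executor's poll-and-translate loop is roughly the following (a minimal
sketch in Python for illustration only; `probe`, `send_status_update`, and
the mode names are stand-ins, not our actual executor code):

```python
import time

# Hypothetical modes reported by the JMX probe.
RUNNING, CRASHED, DRAINED = "RUNNING", "CRASHED", "DRAINED"

def to_task_state(cassandra_mode):
    """Map the Cassandra process mode to a Mesos task state."""
    if cassandra_mode == RUNNING:
        return "TASK_RUNNING"
    if cassandra_mode in (CRASHED, DRAINED):
        return "TASK_FAILED"
    raise ValueError("unknown mode: %s" % cassandra_mode)

def poll_loop(probe, send_status_update, interval=60):
    """Poll the probe once per interval and forward a status update."""
    while True:
        send_status_update(to_task_state(probe()))
        time.sleep(interval)
```

Because the loop sends an update every cycle regardless of whether the
previous one was acknowledged, several TASK_RUNNING updates can pile up in
the agent's stream before a TASK_FAILED arrives.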

At some point the slave has multiple TASK_RUNNING status updates in the
stream, followed by a TASK_FAILED, while acknowledgements are still pending.
We use explicit acknowledgements, and I see that the Mesos master receives
all the TASK_RUNNING updates and then the TASK_FAILED; the framework
likewise receives all the TASK_RUNNING updates followed by the TASK_FAILED.
After receiving TASK_FAILED, the framework restarts a different executor on
the same machine using the old persistent volume.

The issue is that a *reference to the old executor is held by the slave*
(presumably because it did not receive the acknowledgements, whereas the
master and scheduler have processed the status updates), so it continues to
retry TASK_RUNNING indefinitely. At this point the old executor process is
not running, while the new executor process is running and continues to work
as-is. This makes me believe there is a bug in the slave status update
manager.

I read the slave status update manager code: recover() has a constraint to
ignore status updates from the stream if the last executor run has
completed.

I think a similar constraint should apply when processing status updates
and acknowledgements.
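To make the suggestion concrete, here is a sketch of the constraint I have
in mind (illustrative Python only; the real status update manager is C++,
and the data structures and names below are hypothetical stand-ins for the
agent's bookkeeping):

```python
def handle_retry(stream, executor_runs):
    """Sketch: drop pending retries for an executor run that has completed.

    `stream` is a per-task record of unacknowledged updates, and
    `executor_runs` records whether the run that produced them is still
    active -- both are made-up stand-ins, not real agent structures.
    """
    run = executor_runs.get(stream["executor_run_id"])
    if run is None or run["completed"]:
        # The same constraint recover() applies on agent restart: the run
        # is gone, so stop retrying its unacknowledged updates instead of
        # retrying TASK_RUNNING forever.
        stream["pending"].clear()
        return False  # nothing left to retry
    return True  # keep retrying the oldest unacknowledged update
```

With a check like this, the stale TASK_RUNNING retries from the old
(terminated) executor run would be dropped rather than retried infinitely.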


On Fri, Mar 16, 2018 at 7:47 PM, Benjamin Mahler <bmah...@apache.org> wrote:

> (1) Assuming you're referring to the scheduler's acknowledgement of a
> status update, the agent will not forward TS2 until TS1 has been
> acknowledged. So, TS2 will not be acknowledged before TS1 is acknowledged.
> FWICT, we'll ignore any violation of this ordering and log a warning.
> (2) To reverse the question, why would it make sense to ignore them?
> Assuming you're looking to reduce the number of round trips needed for
> schedulers to see the terminal update, I would point you to:
> https://issues.apache.org/jira/browse/MESOS-6941
> (3) When the agent sees an executor terminate, it will transition all
> non-terminal tasks assigned to that executor to TASK_GONE (partition-aware
> framework), TASK_LOST (non-partition-aware framework), or TASK_FAILED (if
> the container OOMed). There may be other cases; it looks a bit convoluted
> to me.
> On Thu, Mar 15, 2018 at 10:35 AM, Zhitao Li <zhitaoli...@gmail.com> wrote:
> > Hi,
> >
> > While designing the correct behavior with one of our frameworks, we
> > encountered some questions about the behavior of status updates:
> >
> > The executor continuously polls the workload probe to get the current
> > mode of the workload (a Cassandra server) and sends various status
> > update states.
> >
> > The executor polls every 30 seconds and sends a status update. Here, we
> > are seeing congestion on task update acknowledgements somewhere (cause
> > still unknown).
> >
> > There are three scenarios that we want to understand.
> >
> >    1. The agent queue has task updates TS1, TS2 & TS3 (in this order)
> >    waiting on acknowledgement. Suppose TS2 receives an acknowledgement;
> >    what will happen to the TS1 update in the queue?
> >
> >    2. The agent queue has task updates TS1, TS2, TS3 & TASK_FAILED.
> >    Here, TS1, TS2, TS3 are non-terminal updates. Once the agent has
> >    received a terminal status update, does it make sense to ignore the
> >    non-terminal updates in the queue?
> >
> >    3. As per the ExecutorDriver code comment
> >    <https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/ExecutorDriver.java#L86>,
> >    if the executor is terminated, does the agent send TASK_LOST? If so,
> >    does it send it once or for each unacknowledged status update?
> >
> > I'll study the code in the status update manager and the agent
> > separately, but an official answer will definitely help.
> >
> > Many thanks!
> >
> > --
> > Cheers,
> >
> > Zhitao Li
> >
