I'm not following what the bug is.

The code you pointed to is called from here:
https://github.com/apache/mesos/blob/1.4.0/src/slave/status_update_manager.cpp#L762-L776

There we ignore duplicates and also ensure that the ack matches the latest
update we've sent, so from the code you pointed to I don't see an issue yet.
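
For reference, the shape of that check is roughly as follows (a simplified
standalone sketch with illustrative names, not the actual Mesos classes):
duplicate acks are rejected, and an ack that doesn't match the update at the
front of the pending queue is rejected as well.

// Simplified sketch of the acknowledgement check (illustrative only).
#include <deque>
#include <iostream>
#include <set>
#include <string>

struct Update {
  std::string uuid;
  std::string state;  // e.g. "TASK_RUNNING"
};

struct Stream {
  std::deque<Update> pending;          // unacknowledged updates, in order
  std::set<std::string> acknowledged;  // UUIDs of acks already handled

  // Returns true if the ack is accepted and the front update is popped.
  bool acknowledge(const std::string& ackUuid) {
    if (acknowledged.count(ackUuid) > 0) {
      std::cerr << "Duplicate acknowledgement " << ackUuid << "\n";
      return false;
    }
    if (pending.empty() || pending.front().uuid != ackUuid) {
      std::cerr << "Ack " << ackUuid << " does not match the latest update\n";
      return false;
    }
    acknowledged.insert(ackUuid);
    pending.pop_front();
    return true;
  }
};

int main() {
  Stream stream;
  stream.pending.push_back({"uuid-1", "TASK_RUNNING"});
  stream.acknowledge("uuid-1");  // accepted, pops uuid-1
  stream.acknowledge("uuid-1");  // rejected: duplicate
  return 0;
}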

Which update does the duplicate ack for 67f548b4-96cb-4b57-8720-2c8a4ba347e8
refer to?
The logs you provided don't show when the status updates arrived for these;
can you include that?

On Mon, Apr 16, 2018 at 2:08 PM, Varun Gupta <var...@uber.com> wrote:

> We use explicit acks from the scheduler.
>
> Here is a snippet of the logs. Please see the log lines for status update UUID:
> a918f5ed-a604-415a-ad62-5a34fb6334ef
>
> W0416 00:41:25.843505 124530 status_update_manager.cpp:761] Duplicate
> status update acknowledgment (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8)
> for update TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef) for
> task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
>
> E0416 00:41:25.843559 124542 slave.cpp:2951] Failed to handle status update
> acknowledgement (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8) for task
> node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190: Duplicate acknowledgement
>
> W0416 00:41:28.416702 124539 status_update_manager.cpp:478] Resending
> status update TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e)
> for
> task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
>
> I0416 00:41:28.416910 124530 slave.cpp:4051] Forwarding the update
> TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e) for task
> node-12__50d38098-f026-4720-b374-a94e7171c830 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190 to master@10.160.41.62:5050
>
> W0416 00:41:28.425237 124537 status_update_manager.cpp:478] Resending
> status update TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef)
> for
> task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
>
> I0416 00:41:28.425361 124525 slave.cpp:4051] Forwarding the update
> TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef) for task
> node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190 to master@10.160.41.62:5050
>
> I0416 00:41:36.723466 124518 status_update_manager.cpp:395] Received status
> update acknowledgement (UUID: b99b15d6-d9af-4395-9c46-709f91db90e1) for
> task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
>
> W0416 00:41:36.723588 124518 status_update_manager.cpp:761] Duplicate
> status update acknowledgment (UUID: b99b15d6-d9af-4395-9c46-709f91db90e1)
> for update TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e) for
> task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
>
> E0416 00:41:36.723650 124518 slave.cpp:2951] Failed to handle status update
> acknowledgement (UUID: b99b15d6-d9af-4395-9c46-709f91db90e1) for task
> node-12__50d38098-f026-4720-b374-a94e7171c830 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190: Duplicate acknowledgement
>
> I0416 00:41:36.730357 124527 status_update_manager.cpp:395] Received status
> update acknowledgement (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8) for
> task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
>
> W0416 00:41:36.730417 124527 status_update_manager.cpp:761] Duplicate
> status update acknowledgment (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8)
> for update TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef) for
> task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
>
> E0416 00:41:36.730465 124518 slave.cpp:2951] Failed to handle status update
> acknowledgement (UUID: 67f548b4-96cb-4b57-8720-2c8a4ba347e8) for task
> node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190: Duplicate acknowledgement
>
> I0416 00:41:43.769918 124532 http.cpp:277] HTTP GET for
> /slave(1)/state.json from 10.65.5.13:42600 with
> User-Agent='filebundle-agent'
>
> W0416 00:41:48.417913 124539 status_update_manager.cpp:478] Resending
> status update TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e)
> for
> task node-12__50d38098-f026-4720-b374-a94e7171c830 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
>
> I0416 00:41:48.418166 124538 slave.cpp:4051] Forwarding the update
> TASK_RUNNING (UUID: f320b354-067d-4421-a650-aba7a17b219e) for task
> node-12__50d38098-f026-4720-b374-a94e7171c830 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190 to master@10.160.41.62:5050
>
> W0416 00:41:48.425532 124539 status_update_manager.cpp:478] Resending
> status update TASK_RUNNING (UUID: a918f5ed-a604-415a-ad62-5a34fb6334ef)
> for
> task node-12__aea9be06-3d26-42c1-ac00-e7b840d47699 of framework
> 5f606c54-8caa-4e69-afdf-7f2c79acc7e6-1190
>
>
> I discussed with @zhitao. Here is a plausible explanation of the bug:
> the executor sends updates to the agent every 60 seconds, and the agent
> maintains them in the pending queue
> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.hpp#L173>.
> Now when the acks come, they can be in any order relative to the status
> updates, but the _handle section pops
> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L888>
> the last update from the queue without making sure the ack was for that
> status update.
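>
> To make this concrete, here is a tiny standalone sketch of what we suspect
> happens (illustrative names only, not the actual Mesos code or types): an
> ack that belongs to a later update removes the front of the queue
> unconditionally, so the wrong entry is dropped and the already-acked update
> keeps getting resent.
>
> // Illustrative sketch of the suspected behavior (not the real Mesos code).
> #include <deque>
> #include <iostream>
> #include <string>
>
> int main() {
>   // Unacknowledged status updates, oldest first.
>   std::deque<std::string> pending = {"uuid-A", "uuid-B"};
>
>   // An acknowledgement arrives, but it is for uuid-B, not the front.
>   std::string ack = "uuid-B";
>   std::cout << "ack received for " << ack << "\n";
>
>   // Suspected bug: the front is popped without checking the ack's UUID,
>   // so uuid-A is dropped even though it was never acknowledged...
>   pending.pop_front();
>
>   // ...and uuid-B stays at the front, so it keeps being resent and every
>   // further ack for it is rejected as a duplicate.
>   std::cout << "still pending: " << pending.front() << "\n";
>   return 0;
> }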
>
>
>
>
> On Tue, Apr 10, 2018 at 6:43 PM, Benjamin Mahler <bmah...@apache.org>
> wrote:
>
> > Do you have logs? Which acknowledgements did the agent receive? Which
> > TASK_RUNNING in the sequence was it re-sending?
> >
> > On Tue, Apr 10, 2018 at 6:41 PM, Benjamin Mahler <bmah...@apache.org>
> > wrote:
> >
> > > > Issue is that the *old executor reference is held by the slave* (assuming
> > > > it did not receive the acknowledgement, whereas the master and scheduler
> > > > have processed the status updates), so it continues to retry TASK_RUNNING
> > > > infinitely.
> > >
> > > The agent only retries so long as it does not get an acknowledgement. Is
> > > the scheduler acknowledging the duplicate updates or ignoring them?
> > >
> > > On Mon, Apr 9, 2018 at 12:10 PM, Varun Gupta <var...@uber.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> We are running into an issue with the slave status update manager. Below
> > >> is the behavior I am seeing.
> > >>
> > >> Our use case is that we run a stateful container (a Cassandra process);
> > >> the executor polls the JMX port at a 60 second interval to get the
> > >> Cassandra state and sends that state to the agent -> master -> framework.
> > >>
> > >> *A RUNNING Cassandra process translates to TASK_RUNNING.*
> > >> *A CRASHED or DRAINED Cassandra process translates to TASK_FAILED.*
> > >>
> > >> At some point the slave has multiple TASK_RUNNING status updates in the
> > >> stream, followed by TASK_FAILED, if acknowledgements are pending. We use
> > >> explicit acknowledgements, and I see that the Mesos master receives all
> > >> of the TASK_RUNNING updates and then the TASK_FAILED, and the framework
> > >> also receives all of the TASK_RUNNING updates followed by the
> > >> TASK_FAILED. After receiving TASK_FAILED, the framework restarts a
> > >> different executor on the same machine using the old persistent volume.
> > >>
> > >> Issue is that the *old executor reference is held by the slave* (assuming
> > >> it did not receive the acknowledgement, whereas the master and scheduler
> > >> have processed the status updates), so it continues to retry TASK_RUNNING
> > >> infinitely. Here, the old executor process is not running, while the new
> > >> executor process is running and continues to work as-is. This makes me
> > >> believe there is a bug in the slave status update manager.
> > >>
> > >> I read the slave status update manager code; recover
> > >> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L203>
> > >> has a constraint
> > >> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L239>
> > >> to ignore status updates from the stream if the last executor run is
> > >> completed.
> > >>
> > >> I think a similar constraint should be applicable for status update
> > >> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L318>
> > >> and acknowledge
> > >> <https://github.com/apache/mesos/blob/master/src/slave/task_status_update_manager.cpp#L760>.
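> > >>
> > >> To illustrate the kind of guard I mean, here is a minimal standalone
> > >> sketch (illustrative names only, not a patch against the actual Mesos
> > >> code): once the executor run is marked completed, further updates and
> > >> acknowledgements for it would simply be ignored, mirroring the recover()
> > >> constraint.
> > >>
> > >> // Illustrative sketch of the proposed guard (not actual Mesos code).
> > >> #include <iostream>
> > >> #include <string>
> > >>
> > >> struct ExecutorRun {
> > >>   std::string id;
> > >>   bool completed;
> > >> };
> > >>
> > >> // Same idea as the recover() constraint: ignore traffic for a run
> > >> // that has already completed.
> > >> bool shouldIgnore(const ExecutorRun& run) {
> > >>   return run.completed;
> > >> }
> > >>
> > >> int main() {
> > >>   ExecutorRun oldRun{"executor-run-1", true};
> > >>   if (shouldIgnore(oldRun)) {
> > >>     std::cout << "ignoring update/ack for completed run "
> > >>               << oldRun.id << "\n";
> > >>   }
> > >>   return 0;
> > >> }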
> > >>
> > >>
> > >> Thanks,
> > >> Varun
> > >>
> > >> On Fri, Mar 16, 2018 at 7:47 PM, Benjamin Mahler <bmah...@apache.org>
> > >> wrote:
> > >>
> > >> > (1) Assuming you're referring to the scheduler's acknowledgement of a
> > >> > status update, the agent will not forward TS2 until TS1 has been
> > >> > acknowledged. So, TS2 will not be acknowledged before TS1 is
> > >> > acknowledged. FWICT, we'll ignore any violation of this ordering and
> > >> > log a warning.
> > >> >
> > >> > (2) To reverse the question, why would it make sense to ignore them?
> > >> > Assuming you're looking to reduce the number of round trips needed for
> > >> > schedulers to see the terminal update, I would point you to:
> > >> > https://issues.apache.org/jira/browse/MESOS-6941
> > >> >
> > >> > (3) When the agent sees an executor terminate, it will transition all
> > >> > non-terminal tasks assigned to that executor to TASK_GONE
> > >> > (partition-aware framework) or TASK_LOST (non-partition-aware
> > >> > framework) or TASK_FAILED (if the container OOMed). There may be other
> > >> > cases; it looks a bit convoluted to me.
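> > >> >
> > >> > Roughly (a simplified standalone sketch, not the actual agent code,
> > >> > which has more cases):
> > >> >
> > >> > // Illustrative sketch of the state chosen for non-terminal tasks when
> > >> > // the executor terminates (simplified; not the actual agent code).
> > >> > #include <iostream>
> > >> > #include <string>
> > >> >
> > >> > std::string terminalState(bool partitionAware, bool containerOomed) {
> > >> >   if (containerOomed) {
> > >> >     return "TASK_FAILED";
> > >> >   }
> > >> >   return partitionAware ? "TASK_GONE" : "TASK_LOST";
> > >> > }
> > >> >
> > >> > int main() {
> > >> >   std::cout << terminalState(true, false) << "\n";   // TASK_GONE
> > >> >   std::cout << terminalState(false, false) << "\n";  // TASK_LOST
> > >> >   std::cout << terminalState(true, true) << "\n";    // TASK_FAILED
> > >> >   return 0;
> > >> > }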
> > >> >
> > >> > On Thu, Mar 15, 2018 at 10:35 AM, Zhitao Li <zhitaoli...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > While designing the correct behavior for one of our frameworks, we
> > >> > > encountered some questions about the behavior of status updates:
> > >> > >
> > >> > > The executor continuously polls the workload probe to get the current
> > >> > > mode of the workload (a Cassandra server), and sends various status
> > >> > > update states (STARTING, RUNNING, FAILED, etc.).
> > >> > >
> > >> > > The executor polls every 30 seconds and sends a status update. Here,
> > >> > > we are seeing congestion on task update acknowledgements somewhere
> > >> > > (still unknown).
> > >> > >
> > >> > > There are three scenarios that we want to understand.
> > >> > >
> > >> > >    1. The agent queue has task updates TS1, TS2 & TS3 (in this order)
> > >> > >    waiting on acknowledgement. Suppose TS2 receives an
> > >> > >    acknowledgement; then what will happen to the TS1 update in the
> > >> > >    queue?
> > >> > >
> > >> > >
> > >> > >    2. The agent queue has task updates TS1, TS2, TS3 & TASK_FAILED.
> > >> > >    Here, TS1, TS2, TS3 are non-terminal updates. Once the agent has
> > >> > >    received a terminal status update, does it make sense to ignore
> > >> > >    the non-terminal updates in the queue?
> > >> > >
> > >> > >
> > >> > >    3. As per the ExecutorDriver code comment
> > >> > >    <https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/ExecutorDriver.java#L86>,
> > >> > >    if the executor is terminated, does the agent send TASK_LOST? If
> > >> > >    so, does it send it once or for each unacknowledged status update?
> > >> > >
> > >> > >
> > >> > > I'll study the code in the status update manager and the agent
> > >> > > separately, but an official answer will definitely help.
> > >> > >
> > >> > > Many thanks!
> > >> > >
> > >> > > --
> > >> > > Cheers,
> > >> > >
> > >> > > Zhitao Li
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
