-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71343/#review217422
-----------------------------------------------------------

Patch looks great!

Reviews applied: [71361, 71343]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Aug. 21, 2019, 5:53 p.m., Andrei Budnik wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71343/
> -----------------------------------------------------------
>
> (Updated Aug. 21, 2019, 5:53 p.m.)
>
>
> Review request for mesos, Gilbert Song, Greg Mann, and Qian Zhang.
>
>
> Bugs: MESOS-9887
>     https://issues.apache.org/jira/browse/MESOS-9887
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Previously, Mesos agent could send TASK_FAILED status update on
> executor termination while processing of TASK_FINISHED status update
> was in progress. Processing of task status updates involves sending
> requests to the containerizer, which might finish processing of these
> requests out-of-order, e.g. `MesosContainerizer::status`. Also,
> the agent does not overwrite status of the terminal status update once
> it's stored in the `terminatedTasks`. Hence, there was a race condition
> between two terminal status updates.
>
> Note that V1 Executors are not affected by this problem because they
> wait for an acknowledgement of the terminal status update by the agent
> before terminating.
>
> This patch introduces a new data structure `pendingStatusUpdates`,
> which holds a list of status updates that are being processed. This
> data structure allows validating the order of processing of status
> updates by the agent.
>
>
> Diffs
> -----
>
>   src/slave/slave.hpp a17bbee13cb8291ad694f1520b613764b57b046b
>   src/slave/slave.cpp 1d0ec9d2428c3ffa28ad3e960b74f171013cf0c2
>
>
> Diff: https://reviews.apache.org/r/71343/diff/2/
>
>
> Testing
> -------
>
> 1. manual testing described in MESOS-9887
> 2. internal CI
>
>
> Thanks,
>
> Andrei Budnik
>
>
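
For readers following the quoted description, here is a rough sketch of how a list of in-flight status updates could be used to validate processing order. It is not the code from the patch: `PendingUpdates`, `Update`, `State`, and the integer ids are hypothetical stand-ins chosen for illustration, and the real `pendingStatusUpdates` in `src/slave/slave.hpp` may be shaped quite differently.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>

// Hypothetical, simplified stand-ins; these are NOT the Mesos types.
enum class State { RUNNING, FINISHED, FAILED };

bool isTerminal(State s) {
  return s == State::FINISHED || s == State::FAILED;
}

struct Update {
  int id;       // Stand-in for a status update UUID.
  State state;
};

// Tracks, per task, the updates whose processing has started but not
// yet completed, in arrival order.
class PendingUpdates {
public:
  // Record an update when processing of it begins.
  void add(const std::string& taskId, const Update& update) {
    assert(!taskId.empty());
    pending_[taskId].push_back(update);
  }

  // True if a terminal update for the task is still being processed.
  // A caller could consult this before generating another terminal
  // update (e.g. on executor termination).
  bool hasPendingTerminal(const std::string& taskId) const {
    auto it = pending_.find(taskId);
    if (it == pending_.end()) {
      return false;
    }
    for (const Update& u : it->second) {
      if (isTerminal(u.state)) {
        return true;
      }
    }
    return false;
  }

  // Remove a completed update. Returns true only if it was the oldest
  // pending update for the task, i.e. processing finished in order.
  bool complete(const std::string& taskId, int id) {
    auto it = pending_.find(taskId);
    if (it == pending_.end()) {
      return false;
    }
    std::deque<Update>& queue = it->second;
    const bool inOrder = !queue.empty() && queue.front().id == id;
    for (auto u = queue.begin(); u != queue.end(); ++u) {
      if (u->id == id) {
        queue.erase(u);
        break;
      }
    }
    if (queue.empty()) {
      pending_.erase(it);
    }
    return inOrder;
  }

private:
  std::map<std::string, std::deque<Update>> pending_;
};
```

In this sketch, a caller would `add` each update before handing it to the containerizer and call `complete` from the continuation. A `false` return signals out-of-order completion, and `hasPendingTerminal` lets the caller notice that a terminal update such as TASK_FINISHED is still in flight.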
