-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69980/#review213007
-----------------------------------------------------------

src/master/master.cpp
Lines 8753 (patched)
<https://reviews.apache.org/r/69980/#comment298859>

    Do we really want to check `!operation->info().has_id()` here? Should it be `operation->info().has_id()` instead? I think we want to wait until the timeout elapses to ACK orphans of non-completed frameworks when they DO request feedback, to ensure that the framework receives the requested update after it reregisters?


- Greg Mann


On Feb. 20, 2019, 12:47 a.m., Joseph Wu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69980/
> -----------------------------------------------------------
>
> (Updated Feb. 20, 2019, 12:47 a.m.)
>
>
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
>
>
> Bugs: MESOS-9542
>     https://issues.apache.org/jira/browse/MESOS-9542
>
>
> Repository: mesos
>
>
> Description
> -------
>
> When dealing with orphaned operation status updates, there are two
> cases the master must deal with:
>   - The simple case is when the master knows the framework is completed.
>     These status updates can be acknowledged by the master.
>   - However, a completed framework can be rotated out of the master's
>     memory. In addition, after master failover, if an agent reregisters
>     before the framework, an operation can appear to be orphaned until
>     the framework reregisters.
>
> This adds a fixed delay between agent reregistration and when the
> master acknowledges operation status updates from unknown frameworks.
> The delay should give frameworks ample time to reregister.
>
> The delay is based on agent reregistration in order to mitigate the
> delay of acknowledging status updates of frameworks rotated out of
> the completed frameworks buffer.
>
>
> Diffs
> -----
>
>   src/master/constants.hpp b0ab9187b8c672180e2ffb8b63cb7349dbe43ac4
>   src/master/master.cpp 106d924bf16231b3bda3fb719db68c01d73644ee
>
>
> Diff: https://reviews.apache.org/r/69980/diff/2/
>
>
> Testing
> -------
>
> TODO: This case needs unit tests.
>
>
> Thanks,
>
> Joseph Wu
>
>
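
For anyone following the thread, the acknowledgement rule in the quoted description might be sketched roughly as below. This is a simplified illustration, not the code under review: `FrameworkState`, `masterShouldAcknowledge`, and `ASSUMED_ORPHAN_ACK_DELAY` (including its value) are hypothetical names made up for the example and do not come from the patch.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical, simplified stand-in for what the master knows about the
// framework that owns an orphaned operation. Not an actual Mesos type.
enum class FrameworkState {
  Completed,  // master still remembers the framework as completed
  Unknown     // not in memory: rotated out, or not yet reregistered
};

// Assumed value for illustration only; the delay chosen in the patch is
// not stated in this thread.
constexpr std::chrono::seconds ASSUMED_ORPHAN_ACK_DELAY{600};

// Returns true if the master itself should acknowledge an orphaned
// operation status update, given how long ago the agent reregistered.
bool masterShouldAcknowledge(
    FrameworkState state,
    std::chrono::seconds sinceAgentReregistered,
    std::chrono::seconds delay = ASSUMED_ORPHAN_ACK_DELAY)
{
  assert(sinceAgentReregistered.count() >= 0);

  switch (state) {
    case FrameworkState::Completed:
      // Simple case: the framework is known to be gone for good.
      return true;
    case FrameworkState::Unknown:
      // Give the framework time to reregister before acknowledging.
      return sinceAgentReregistered >= delay;
  }
  return false;
}
```

The sketch does not model whether the operation requested feedback (`has_id()`), which is the open question in the comment above; it only covers the completed-versus-unknown split and the delay measured from agent reregistration.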