> On Feb. 15, 2019, 3:50 p.m., Greg Mann wrote: > > src/slave/slave.cpp > > Line 4711 (original), 4735 (patched) > > <https://reviews.apache.org/r/69978/diff/2/?file=2125182#file2125182line4735> > > > > By moving this line into the `.then()` continuation, we will no longer > > remove the operation if there is an error when calling > > `operationStatusUpdateManager.acknowledgement()`; is that what we want? > > Perhaps we should add a `removeOperation()` call in the `err()` helper as > > well?
I think the patch is good as-is — if we add a removeOperation() call in the err() helper, the framework won't be able to ack the update in the case that there is an error calling operationStatusUpdateManager.acknowledgement() which prevents the SUM from checkpointing the ack and cleaning up the stream, causing it to resend the status update. That means that the agent would keep resending the update until the stream is garbage collected the next time the agent recovers. Your comment however made me notice another corner case that the SLRP handles and I had missed: the agent could fail over right before executing this lambda. It should notice that during recovery and garbage collect the stream — I'm adding a new patch to this chain that does this. - Gastón ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69978/#review212881 ----------------------------------------------------------- On Feb. 13, 2019, 3:15 p.m., Gastón Kleiman wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69978/ > ----------------------------------------------------------- > > (Updated Feb. 13, 2019, 3:15 p.m.) > > > Review request for mesos, Chun-Hung Hsiao and Greg Mann. > > > Bugs: MESOS-9574 > https://issues.apache.org/jira/browse/MESOS-9574 > > > Repository: mesos > > > Description > ------- > > Make the agent garbage collect an operation status update stream once > a terminal status update is acknowledged. > > This patch also improves the logging of acknowledgement failures and the > readability of the `Slave::operationStatusAcknowledgement` method. > > > Diffs > ----- > > src/slave/slave.cpp e3c2c005d865b5c333e92e50e49ef398fe06ad79 > > > Diff: https://reviews.apache.org/r/69978/diff/2/ > > > Testing > ------- > > Manual testing + existing tests still pass. > > > Thanks, > > Gastón Kleiman > >