> On Feb. 15, 2019, 3:50 p.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Line 4711 (original), 4735 (patched)
> > <https://reviews.apache.org/r/69978/diff/2/?file=2125182#file2125182line4735>
> >
> >     By moving this line into the `.then()` continuation, we will no longer 
> > remove the operation if there is an error when calling 
> > `operationStatusUpdateManager.acknowledgement()`; is that what we want? 
> > Perhaps we should add a `removeOperation()` call in the `err()` helper as 
> > well?

I think the patch is good as-is — if we add a removeOperation() call in the 
err() helper, the framework won't be able to ack the update in the case that 
there is an error calling operationStatusUpdateManager.acknowledgement() which 
prevents the SUM from checkpointing the ack and cleaning up the stream, causing 
it to resend the status update.


That means that the agent would keep resending the update until the stream is 
garbage collected the next time the agent recovers.


Your comment however made me notice another corner case that the SLRP handles 
and I had missed: the agent could fail over right before executing this lambda. 
It should notice that during recovery and garbage collect the stream — I'm 
adding a new patch to this chain that does this.


- Gastón


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69978/#review212881
-----------------------------------------------------------


On Feb. 13, 2019, 3:15 p.m., Gastón Kleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69978/
> -----------------------------------------------------------
> 
> (Updated Feb. 13, 2019, 3:15 p.m.)
> 
> 
> Review request for mesos, Chun-Hung Hsiao and Greg Mann.
> 
> 
> Bugs: MESOS-9574
>     https://issues.apache.org/jira/browse/MESOS-9574
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Make the agent garbage collect an operation status update stream once
> a terminal status update is acknowledged.
> 
> This patch also improves the logging of acknowledgement failures and the
> readability of the `Slave::operationStatusAcknowledgement` method.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.cpp e3c2c005d865b5c333e92e50e49ef398fe06ad79 
> 
> 
> Diff: https://reviews.apache.org/r/69978/diff/2/
> 
> 
> Testing
> -------
> 
> Manual testing + existing tests still pass.
> 
> 
> Thanks,
> 
> Gastón Kleiman
> 
>

Reply via email to