Re: Review Request 48313: Creation and deletion of persistent volumes across agent restart.

Anindya Sinha Thu, 09 Jun 2016 10:34:10 -0700


> On June 8, 2016, 1:28 p.m., Neil Conway wrote:
> > Overall seems like a reasonable approach.
> > 
> > One thing that isn't clear to me: what is the advantage of updating the 
> > checkpoint to reflect any partial work that was done before exiting? It 
> > seems that adds a bunch of complexity and room for error. Why not only 
> > update the checkpoint if all changes were made successfully?
> 
> Anindya Sinha wrote:
>     We would need to maintain what was actually successful in any case since 
> in a DESTROY, a failed rmdir does not lead to the agent exiting. So, if we 
> were to do it at one place, we would still need to keep account of the 
> successful operations so as to not update the checkpoint based on a failed 
> rmdir as an example (and hence can be a partial update).
>     
>     Since we are keeping track of result of the operations anyway, I think it 
> is a good idea to update before exiting (the only place we do that is when CREATE 
> fails and the agent exits) so that the subsequent handling of 
> CheckpointResources does not need to redo such operations when the agent 
> reregisters.
> 
> Neil Conway wrote:
>     On reflection, I wonder whether we should be handling `CREATE` errors 
> differently from `DESTROY` errors. In both cases, the user has asked the 
> agent to do something it wasn't able to do. A failed `DESTROY` has the 
> additional concern that we might have destroyed some but not all of the data 
> on the volume.
>     
>     Do you think handling `CREATE` vs. `DESTROY` errors differently is a good 
> idea?

Good point. Here is what I think are the use cases:
Say we have checkpointed resources (CR) as {V1,V2} where V1, V2 are 2 
persistent volumes. So, CR(master) = {V1,V2}, and CR(agent) = {V1,V2}.
If we now receive a DESTROY(V2): CR(master) = {V1} [since master's view is not 
dependent on what happened on the agent]. Suppose that fails on the agent, so 
CR(agent) = {V1,V2} [since we do not update checkpoint resources on agent on 
failure in DESTROY, which results in inconsistency between master and agent at 
this point of time].

Case 1 (current implementation): Agent does not restart on failure in DESTROY. 
Hence, CR(agent) = {V1,V2}. When the next CheckpointResources is received, i.e. 
on a subsequent CREATE/DESTROY/RESERVE/UNRESERVE on a different resource, 
DESTROY(V2) will be reattempted, and if that is successful, agent and master 
will be back in sync. However, if the next CheckpointResources is due to a 
CREATE(V2) [which can happen since V2 is available as a resource based on the 
offer from master], that would be a no-op on the agent: at that point 
CR(master) = {V1,V2} and CR(agent) = {V1,V2}, so the agent does not treat V2 as 
a new resource based on its checkpoint. That would be a problem, since the 
earlier DESTROY(V2) never completed on the agent.

Case 2 (if we exit agent on failure): The agent restarts, which triggers a 
CheckpointResources from master->agent on ReregisterSlave. That would force a 
reattempt of DESTROY(V2), since the current view is CR(master) = {V1} and 
CR(agent) = {V1,V2}, and a successful reattempt brings the checkpointed 
resources back in sync between master and agent.
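
To make the two cases concrete, here is a toy model in Python. It is only a 
sketch, not Mesos code: the `reconcile` function is hypothetical, and it 
assumes the agent derives its work by simply diffing the resources in the 
CheckpointResources message against its own checkpoint.

```python
def reconcile(master_cr, agent_cr):
    """Return (to_create, to_destroy) as plain set differences.

    Hypothetical helper: assumes the agent diffs the master's view
    against its own checkpoint and nothing else.
    """
    to_create = master_cr - agent_cr
    to_destroy = agent_cr - master_cr
    return to_create, to_destroy


# State after a failed DESTROY(V2): master dropped V2, agent did not.
master = {"V1"}
agent = {"V1", "V2"}

# Case 2: the agent restarts and reregisters, so the diff retries DESTROY(V2).
case2 = reconcile(master, agent)

# Case 1: no restart, and the next message follows a CREATE(V2) on the master.
# Both views are now {V1, V2}, so the diff is empty and nothing is redone.
case1 = reconcile(master | {"V2"}, agent)

print(case2, case1)
```

In this model Case 2 yields a pending destroy of V2, while Case 1 yields no 
work at all, which is the no-op described above.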

So, I think it might be a better option to exit the agent on failure in DESTROY 
as well. However, I think we should still update the checkpoint on failure 
(when the agent exits) to reflect the other CREATE/DESTROY operations that did 
succeed, so that they are not repeated on a subsequent CheckpointResources 
message. Does that sound reasonable to you?
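
A rough sketch of that proposal, again as a toy Python model rather than the 
actual slave.cpp logic. The `apply_checkpoint` function, the `fails` parameter, 
and the destroy-before-create ordering are all assumptions made for 
illustration.

```python
def apply_checkpoint(agent_cr, master_cr, fails=frozenset()):
    """Apply the diff one volume at a time and stop at the first failure.

    Returns (checkpoint, ok). On failure the checkpoint still records the
    operations that succeeded, and the caller is expected to exit the agent.
    `fails` is a hypothetical set of volumes whose operation fails.
    """
    checkpoint = set(agent_cr)
    for vol in sorted(agent_cr - master_cr):  # DESTROY (ordering is assumed)
        if vol in fails:
            return checkpoint, False
        checkpoint.discard(vol)
    for vol in sorted(master_cr - agent_cr):  # CREATE
        if vol in fails:
            return checkpoint, False
        checkpoint.add(vol)
    return checkpoint, True


# DESTROY(V2) succeeds, DESTROY(V3) fails: checkpoint the partial result.
partial, ok = apply_checkpoint({"V1", "V2", "V3"}, {"V1", "V4"}, fails={"V3"})

# After the restart, only the remaining work (V3, V4) is attempted.
final, ok_after_restart = apply_checkpoint(partial, {"V1", "V4"})

print(partial, ok, final, ok_after_restart)
```

Here the first pass leaves {V1,V3} checkpointed, so the retry after 
reregistration does not touch V2 again.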

Note: I think this use case is probably a good example for considering 
StatusUpdates (or something similar) for operations on reserved resources, viz. 
CREATE/DESTROY/RESERVE/UNRESERVE, so that master and agent stay in sync and 
frameworks get a reliable view of reserved resources in offers.

- Anindya

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48313/#review136638
-----------------------------------------------------------

On June 9, 2016, 12:22 a.m., Anindya Sinha wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48313/
> -----------------------------------------------------------
> 
> (Updated June 9, 2016, 12:22 a.m.)
> 
> 
> Review request for mesos, Neil Conway and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-5448
>     https://issues.apache.org/jira/browse/MESOS-5448
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> o Checkpoints on the agent are updated only after successful handling
>   of persistent volume creation and deletion to maintain consistency.
> o If volume creation or deletion fails, checkpoint is updated up until
>   that point, and the agent exits.
> o This ensures that after an agent restart, checkpoints are in sync
>   between the master and the agent after the reregistration workflow.
> 
> 
> Diffs
> -----
> 
>   include/mesos/resources.hpp a557e97c65194d4aad879fb88d8edefd1c95b8d8 
>   include/mesos/v1/resources.hpp a5ba8fec4c9c3643646308f75a4b28cefe0b3df3 
>   src/common/resources.cpp f6ff92b591c15bc8e93fd85e1896349c3a7bb968 
>   src/slave/slave.cpp d635dd2c6f6fce5a9eeefc5dcdf84e00cdc833b6 
>   src/v1/resources.cpp 8c3f2d1c1529915a59d47fe37bb3fc7a3267079a 
> 
> Diff: https://reviews.apache.org/r/48313/diff/
> 
> 
> Testing
> -------
> 
> All tests passed.
> 
> 
> Thanks,
> 
> Anindya Sinha
> 
>
