-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61946/#review183988
-----------------------------------------------------------




src/master/validation.cpp
Lines 2205 (patched)
<https://reviews.apache.org/r/61946/#comment260020>

    I think `checkpointedResources` should not be used for resources provided
by a resource provider (RP). It should only apply to the agent's default
resources. For RP-provided resources, the checkpointing should be done by the
corresponding resource provider, not by the agent.
    
    As a result, for operations like RESERVE/UNRESERVE/CREATE/DESTROY, we need
to send the operation to the corresponding resource provider as well. This
does make sense: if we asked the agent to persist that information, what would
the semantics be if the resource provider is marked as gone?
    
    However, this gets complicated if we want to guarantee ordering for
operations in one `acceptOffers` call (for backwards compatibility) and still
allow frameworks to launch a task right after a reserve operation (the
current semantics).
    
    To support that, I think we need to speculatively assume the operation will
be successful (and thus allow a subsequent launch immediately on the master
side). However, when the checkpointing fails, we need a way to abort the
subsequent launch on the agent side. This is essentially why the agent
currently CHECK-fails if checkpointing of `checkpointedResources` fails.
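
    To make the ordering concern concrete, here is a rough sketch of the idea.
It is not Mesos code: `Operation`, `masterAccepts`, `agentApply` and the
`checkpoint` callback are hypothetical names, and resources are reduced to
plain strings. The master validates a launch against a speculative view in
which earlier reserves are assumed to have succeeded, and the agent drops
everything after the first failed checkpoint.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins; none of these are Mesos types.
enum class OpType { RESERVE, LAUNCH };

struct Operation {
  OpType type;
  std::string resource;
};

// Master side: walk the operations of one accept call in order. A RESERVE is
// speculatively assumed to succeed, so a later LAUNCH on the same resource
// passes validation without waiting for any checkpoint.
bool masterAccepts(const std::vector<Operation>& ops) {
  std::set<std::string> speculativelyReserved;
  for (const Operation& op : ops) {
    if (op.type == OpType::RESERVE) {
      speculativelyReserved.insert(op.resource);
    } else if (speculativelyReserved.count(op.resource) == 0) {
      return false;  // LAUNCH on a resource that was never reserved.
    }
  }
  return true;
}

// Agent side: apply the operations in the same order. If checkpointing a
// RESERVE fails, stop processing so that no later LAUNCH goes through.
// Returns the resources for which a launch actually happened.
std::vector<std::string> agentApply(
    const std::vector<Operation>& ops,
    const std::function<bool(const Operation&)>& checkpoint) {
  assert(checkpoint && "checkpoint callback must be set");
  std::vector<std::string> launched;
  for (const Operation& op : ops) {
    if (op.type == OpType::RESERVE) {
      if (!checkpoint(op)) {
        break;  // Abort everything that was ordered after the failure.
      }
    } else {
      launched.push_back(op.resource);
    }
  }
  return launched;
}
```

    In the real agent the "abort" is a CHECK failure of the whole process
rather than a `break`; the sketch only shows why the ordering guarantee needs
some abort mechanism at all.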
    
    For the resource provider case, we should do the same thing. We can abort
the agent if checkpointing fails. However, this only applies to a local
resource provider that lives in the agent process. If an LRP is outside of the
agent process, we should think about how to abort the subsequent task launch
when a previous operation fails. For instance, should we always reject
operations from the agent's RP manager if the operation is for a stale stream
ID?
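
    One possible reading of the stale stream ID idea, as a sketch only. The
class `ResourceProviderSketch`, the integer stream IDs and the
`checkpointSucceeds` flag are all made up for illustration and do not reflect
any existing Mesos interface. The assumption here is that a failed checkpoint
makes the provider move to a new stream ID, so anything the RP manager already
queued for the old stream gets rejected.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical operation envelope: the RP manager tags each operation with
// the stream ID it believes is current for the provider.
struct TaggedOperation {
  std::uint64_t streamId;
  std::string description;
};

// Hypothetical out-of-process resource provider.
class ResourceProviderSketch {
public:
  std::uint64_t currentStreamId() const { return currentStreamId_; }

  // Returns true if the operation was applied. `checkpointSucceeds` stands in
  // for the outcome of the provider's own checkpointing.
  bool handle(const TaggedOperation& op, bool checkpointSucceeds) {
    if (op.streamId != currentStreamId_) {
      return false;  // Stale stream ID: reject without applying.
    }
    if (!checkpointSucceeds) {
      // Assumed behavior: start a new stream so that operations ordered
      // after the failed one, still tagged with the old ID, are rejected.
      ++currentStreamId_;
      assert(op.streamId != currentStreamId_);
      return false;
    }
    return true;
  }

private:
  std::uint64_t currentStreamId_ = 1;
};
```

    With this shape, a launch that was speculatively allowed after a failed
reserve would carry the old stream ID and be rejected, and the RP manager
would have to resubscribe before sending anything new.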


- Jie Yu


On Aug. 28, 2017, 3:28 p.m., Jan Schlicht wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61946/
> -----------------------------------------------------------
> 
> (Updated Aug. 28, 2017, 3:28 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Jie Yu.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Added validation of resource provider operations.
> 
> 
> Diffs
> -----
> 
>   src/master/validation.hpp f4925752f20ae8ca4de1d9b4a3d5ffc394db9585 
>   src/master/validation.cpp 7c3247d407c9e6aa8cce457d6c6be0c39f4b532f 
> 
> 
> Diff: https://reviews.apache.org/r/61946/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jan Schlicht
> 
>
