-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63732/#review191135
-----------------------------------------------------------




src/master/master.cpp
Line 7044 (original), 7044 (patched)
<https://reviews.apache.org/r/63732/#comment268748>

    Do you need to do that for resources in the operations?



src/master/master.cpp
Lines 7103-7120 (patched)
<https://reviews.apache.org/r/63732/#comment268746>

    I feel that, in the long term, this approach of removing everything and 
then adding it back will probably not work.
    
    Removing an offer operation means we'll need to send a status update (if 
the current state is not terminal). I'd suggest we only remove those that are 
not in the new list, and add those that are not in the old list.
    
    The same comment applies to the agent change.
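    
    A minimal sketch of the diff-based update suggested above, assuming 
operations are keyed by ID. `Operation`, `OperationDiff` and `diffOperations` 
are hypothetical names for illustration, not existing Mesos types or functions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an offer operation; not the Mesos type.
struct Operation {
  std::string id;
  bool terminal;  // True if the operation's latest state is terminal.
};

struct OperationDiff {
  std::vector<std::string> toRemove;          // In the old list only.
  std::vector<std::string> toAdd;             // In the new list only.
  std::vector<std::string> needStatusUpdate;  // Removed while non-terminal.
};

// Computes the minimal set of removals and additions. Operations present
// in both maps are left untouched, so no status update is sent for them.
OperationDiff diffOperations(
    const std::map<std::string, Operation>& known,
    const std::map<std::string, Operation>& updated)
{
  OperationDiff diff;

  for (const auto& entry : known) {
    assert(entry.first == entry.second.id);  // Keys must match IDs.
    if (updated.count(entry.first) == 0) {
      diff.toRemove.push_back(entry.first);
      if (!entry.second.terminal) {
        diff.needStatusUpdate.push_back(entry.first);
      }
    }
  }

  for (const auto& entry : updated) {
    if (known.count(entry.first) == 0) {
      diff.toAdd.push_back(entry.first);
    }
  }

  return diff;
}
```

    With this shape, the number of status updates scales with what actually 
changed rather than with the total number of operations.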



src/master/master.cpp
Lines 7107-7121 (patched)
<https://reviews.apache.org/r/63732/#comment268751>

    Do you also need to update the allocator for added or removed operations?
    
    For instance, the allocator currently thinks the new operation A uses 2 
cpus. Now, if A is removed (because it's dropped), do we need to tell the 
allocator that the 2 cpus are no longer used and can be allocated to others?
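    
    To illustrate the accounting concern, here is a toy model that tracks 
CPUs only. `ToyAllocator` and its methods are hypothetical and are not the 
Mesos allocator interface; the point is just that removing an operation has 
to be mirrored in whatever tracks used resources.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy allocator tracking CPUs only; not the Mesos allocator API.
class ToyAllocator {
public:
  explicit ToyAllocator(double totalCpus) : total(totalCpus) {}

  // Records that the operation consumes `cpus`.
  void addOperation(const std::string& operationId, double cpus) {
    assert(cpus <= available());  // Cannot use more than what is free.
    used[operationId] = cpus;
  }

  // Forgets the operation so that its CPUs become allocatable again.
  void removeOperation(const std::string& operationId) {
    used.erase(operationId);
  }

  // CPUs not currently attributed to any operation.
  double available() const {
    double sum = 0;
    for (const auto& entry : used) {
      sum += entry.second;
    }
    return total - sum;
  }

private:
  double total;
  std::map<std::string, double> used;
};
```

    If the master dropped operation A without the equivalent of 
`removeOperation`, the 2 cpus would stay accounted as used indefinitely.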



src/master/master.cpp
Lines 7108 (patched)
<https://reviews.apache.org/r/63732/#comment268753>

    Think about the case where the agent crashes and restarts, and not all RPs 
have re-registered yet. In that case, operations from RPs that have not yet 
re-registered will not be part of this operation list.
    
    I don't think we want to remove those operations just yet. I think we 
should remove those operations only if the corresponding RP has re-registered 
with the agent and shows up in the offer operation list (or total resources).
    
    This is similar to how we don't remove tasks when an agent disconnects. In 
fact, you should follow a pattern similar to reconcileKnownSlave here. If an 
operation is unknown, instead of calling `removeOfferOperation` directly, we 
should probably send a `reconcileOfferOperationMessage` to the agent to ask the 
RP to generate a status update, and rely on the status update handler to 
properly handle the resource accounting.
    
    I also realized that we should probably do the same in the agent code. 
Instead of directly calling removeOfferOperation and addOfferOperation, send a 
`RECONCILE` message to the RP asking it to generate a status update. If the 
operation is unknown to the RP, the RP will send OFFER_OPERATION_DROPPED, which 
is terminal.
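    
    A rough sketch of the decision logic described in this comment, under the 
assumption that the master knows which RPs have re-registered. All names 
(`Action`, `ReportedState`, `TrackedOperation`, `decide`, `answerReconcile`) 
are hypothetical and only model the flow; they are not Mesos APIs.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class Action {
  KEEP,       // Operation was reported, or its RP has not re-registered yet.
  RECONCILE   // Ask the RP for a status update instead of removing directly.
};

// Simplified states an RP could report back; DROPPED is terminal.
enum class ReportedState { PENDING, DROPPED };

// Hypothetical stand-in for an operation tracked by the master.
struct TrackedOperation {
  std::string id;
  std::string providerId;
};

// Master side: never removes an operation directly. An operation missing
// from the agent's list is only reconciled once its RP has re-registered.
Action decide(
    const TrackedOperation& operation,
    const std::set<std::string>& reportedOperationIds,
    const std::set<std::string>& reregisteredProviderIds)
{
  assert(!operation.id.empty());

  if (reportedOperationIds.count(operation.id) > 0) {
    return Action::KEEP;
  }

  if (reregisteredProviderIds.count(operation.providerId) == 0) {
    return Action::KEEP;  // RP not back yet; too early to judge.
  }

  return Action::RECONCILE;
}

// RP side: an operation unknown to the RP is reported as dropped, which
// lets the status update handler do the removal and resource accounting.
ReportedState answerReconcile(
    const std::set<std::string>& knownToProvider,
    const std::string& operationId)
{
  return knownToProvider.count(operationId) > 0
    ? ReportedState::PENDING
    : ReportedState::DROPPED;
}
```

    The removal itself then happens in one place, the status update handler, 
once a terminal state arrives.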


- Jie Yu


On Nov. 15, 2017, 5:31 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63732/
> -----------------------------------------------------------
> 
> (Updated Nov. 15, 2017, 5:31 p.m.)
> 
> 
> Review request for mesos, Jie Yu and Jan Schlicht.
> 
> 
> Bugs: MESOS-8207
>     https://issues.apache.org/jira/browse/MESOS-8207
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> See summary.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 59a533940736f5cfd5ec31e0ed924f0b2ab13f9c 
> 
> 
> Diff: https://reviews.apache.org/r/63732/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`, still need to implement dedicated tests.
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>
