[ 
https://issues.apache.org/jira/browse/MESOS-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16311275#comment-16311275
 ] 

Benjamin Bannier commented on MESOS-8350:
-----------------------------------------

{noformat}
commit c29b7cd7b5627964ca75001dd3195656816f870c
Author: Benjamin Bannier <benjamin.bann...@mesosphere.io>
Date:   Thu Jan 4 11:12:27 2018 +0100

    Fixed handling of checkpointed resources for RP-capable agents.
    
    The master will not resend checkpointed resources when a resource
    provider-capable agent reregisters. Instead the checkpointed resources
    sent as part of the agent reregistration should be evaluated by the
    master and be used to update its state.
    
    This patch fixes the handling of checkpointed resources sent as part
    of the agent reregistration so that the resources are used to update
    the master state.
    
    Review: https://reviews.apache.org/r/64889/

commit 6f134c93e52120ad6f29ef9057e2045ad8f24c7c
Author: Benjamin Bannier <benjamin.bann...@mesosphere.io>
Date:   Thu Jan 4 11:12:17 2018 +0100

    Added test of handling of checkpointed resources in reregistration.
    
    This patch adds a test that confirms that the master resends
    checkpointed resources to the agent on reregistration.
    
    Review: https://reviews.apache.org/r/64888/

commit 4e5b8cfe86061451747899ef516aa0c4bea12bca
Author: Benjamin Bannier <benjamin.bann...@mesosphere.io>
Date:   Thu Jan 4 11:12:01 2018 +0100

    Future-proofed use of agent capabilities in tests.
    
    Even though currently the resource provider capability is the only
    capability which can be toggled by users, when examining agent flags
    we expect either no capabilities at all or a full set of capabilities
    (including both toggleable and required capabilities).
    
    This patch cleans up test code agent capabilities by delegating the
    bulk of the work to a helper function providing a set of
    default-enabled capabilities and only adds a single capability to that
    set. This not only makes it clearer which exact capability a test
    cares about, but also future-proofs the code for the case where we
    extend the set of required capabilities in the future.
    
    Review: https://reviews.apache.org/r/64891/
{noformat}
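
As a rough illustration of the test cleanup described in the last commit above, the helper pattern could look like the sketch below. It is not the Mesos test code: `default_capabilities`, `capabilities_with`, and the capability names in the default set are assumptions chosen for illustration.

```python
from typing import Set

# Assumed, illustrative default set; the real set is defined by the Mesos
# test helpers and may differ.
_DEFAULT_CAPABILITIES = frozenset(
    {"MULTI_ROLE", "HIERARCHICAL_ROLE", "RESERVATION_REFINEMENT"}
)


def default_capabilities() -> Set[str]:
    """Return a fresh copy of the default-enabled capabilities (hypothetical helper)."""
    return set(_DEFAULT_CAPABILITIES)


def capabilities_with(extra: str) -> Set[str]:
    """Return the default capabilities plus the one capability a test cares about."""
    capabilities = default_capabilities()
    capabilities.add(extra)
    return capabilities
```

A test would then call `capabilities_with("RESOURCE_PROVIDER")` instead of listing every capability by hand, so extending the default set later only requires changing the helper.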

> Resource provider-capable agents not correctly synchronizing checkpointed 
> agent resources on reregistration
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-8350
>                 URL: https://issues.apache.org/jira/browse/MESOS-8350
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Benjamin Bannier
>            Assignee: Benjamin Bannier
>            Priority: Critical
>             Fix For: 1.5.0, 1.6.0
>
>
> For resource provider-capable agents the master does not re-send checkpointed 
> resources on agent reregistration; instead the checkpointed resources sent as 
> part of the {{ReregisterSlaveMessage}} should be used.
> This is not what happens in reality. If, e.g., checkpointing of an offer 
> operation fails and the agent fails over, the checkpointed resources would, as 
> expected, not be reflected in the agent, but would still be assumed in the 
> master.
> A workaround is to fail over the master, which would lead to the newly elected 
> master bootstrapping agent state from {{ReregisterSlaveMessage}}.
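
The intended behavior described in the issue can be modeled with a small sketch. This is a toy model, not the Mesos master implementation: the `Master` and `ReregisterMessage` classes, their fields, and the resource strings are hypothetical, and the branch for agents without the resource provider capability is an assumption inferred from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReregisterMessage:
    """Hypothetical stand-in for the relevant parts of ReregisterSlaveMessage."""
    agent_id: str
    checkpointed_resources: List[str]
    resource_provider_capable: bool


@dataclass
class Master:
    # The master's view of each agent's checkpointed resources.
    checkpointed: Dict[str, List[str]] = field(default_factory=dict)

    def reregister(self, message: ReregisterMessage) -> List[str]:
        """Handle a reregistration and return the resources to resend to the agent."""
        if message.resource_provider_capable:
            # Adopt what the agent reports and resend nothing.
            self.checkpointed[message.agent_id] = list(
                message.checkpointed_resources
            )
            return []
        # Assumed behavior for other agents: the master's view is sent back.
        return list(self.checkpointed.get(message.agent_id, []))
```

In the failure scenario from the description, the master still assumes a resource the agent never checkpointed. With the handling sketched here, a reregistration that reports no checkpointed resources clears the master's stale entry instead of leaving it in place.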



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
