[ 
https://issues.apache.org/jira/browse/MESOS-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Hindman resolved MESOS-109.
------------------------------------

    Resolution: Fixed
      Assignee: Benjamin Hindman
    
> Master::failoverFramework should remove existing framework offers last
> ----------------------------------------------------------------------
>
>                 Key: MESOS-109
>                 URL: https://issues.apache.org/jira/browse/MESOS-109
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Benjamin Hindman
>            Assignee: Benjamin Hindman
>            Priority: Critical
>
> It looks like there is a bug in failing over the framework. As the master 
> goes to remove existing offers for the framework it invokes the allocator's 
> "resourcesRecovered" callback. The current implementation of that callback is 
> to make new offers for any of those recovered resources to existing 
> frameworks. However, in this case, the only existing framework is currently 
> being failed over and has a bogus PID. Thus, when the allocator calls back 
> into the master to send an offer for the framework it uses said bogus PID, 
> and those offers get sent into oblivion.
> The short-term fix is to remove the existing offers after all of the failover 
> logic has been performed (see Master::failoverFramework). The long-term fix 
> is to actually get the allocator running independently of the master (as its 
> own libprocess process) so that we don't have to think about complicated 
> control flow interactions between the two.
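
The ordering problem described above can be illustrated with a minimal C++ sketch. This is not the actual Mesos code: the Framework and Master structs, the members pid, offers and sent, and the helpers removeOffers, failoverBuggy and failoverFixed are all hypothetical stand-ins, and the allocator is collapsed into a single synchronous resourcesRecovered method on the master. It only assumes what the description states, namely that removing offers triggers an immediate re-offer to the one existing framework.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified stand-ins; not the real Mesos classes.
struct Framework {
  std::string pid;          // address that offers are sent to
  std::vector<int> offers;  // outstanding offers, as plain resource counts
};

struct Master {
  Framework framework;  // the only registered framework
  std::vector<std::pair<std::string, int>> sent;  // (destination PID, resources)

  // Stand-in for the allocator's "resourcesRecovered" callback: it
  // immediately re-offers the recovered resources to the only framework,
  // using whatever PID the master currently holds for it.
  void resourcesRecovered(int resources) {
    sent.emplace_back(framework.pid, resources);
  }

  // Rescind every outstanding offer, notifying the allocator for each one.
  void removeOffers() {
    std::vector<int> rescinded;
    rescinded.swap(framework.offers);
    assert(framework.offers.empty());
    for (int resources : rescinded) {
      resourcesRecovered(resources);
    }
  }

  // Ordering described in the bug report: offers are removed while the
  // framework still carries its stale PID, so re-offers go nowhere useful.
  void failoverBuggy(const std::string& newPid) {
    removeOffers();
    framework.pid = newPid;
  }

  // Short-term fix: finish the failover logic first, remove offers last.
  void failoverFixed(const std::string& newPid) {
    framework.pid = newPid;
    removeOffers();
  }
};
```

With a framework that starts with PID "stale-pid" and two outstanding offers, failoverBuggy("new-pid") records both re-offers against "stale-pid", while failoverFixed("new-pid") records them against "new-pid". The sketch does not cover the long-term fix, where the allocator would run as a separate process and the callback would no longer be a synchronous call into the master.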

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
