[
https://issues.apache.org/jira/browse/MESOS-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Hindman resolved MESOS-109.
------------------------------------
Resolution: Fixed
Assignee: Benjamin Hindman
> Master::failoverFramework should remove existing framework offers last
> ----------------------------------------------------------------------
>
> Key: MESOS-109
> URL: https://issues.apache.org/jira/browse/MESOS-109
> Project: Mesos
> Issue Type: Bug
> Reporter: Benjamin Hindman
> Assignee: Benjamin Hindman
> Priority: Critical
>
> It looks like there is a bug in failing over the framework. As the master
> goes to remove existing offers for the framework it invokes the allocator's
> "resourcesRecovered" callback. The current implementation of that callback is
> to make new offers for any of those recovered resources to existing
> frameworks. However, in this case, the only existing framework is currently
> being failed over and has a bogus PID. Thus, when the allocator calls back
> into the master to send an offer for the framework it uses said bogus PID,
> and those offers get sent into oblivion.
> The short term fix is to remove the existing offers after all of the failover
> logic has been performed (see Master::failoverFramework). The long term fix
> is to actually get the allocator running independently of the master (as its
> own libprocess process) so that we don't have to think about complicated
> control flow interactions between the two.
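
The ordering problem described above can be sketched with a toy model. This is
not the Mesos source: ToyMaster, ToyFramework, runFailover and the PID strings
are hypothetical names invented for illustration, and the "bogus PID" is
modeled, as an assumption, as the stale PID the framework holds until the
failover logic has run.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only; these are not the real Mesos classes or signatures.
struct ToyFramework {
  std::string pid;          // address that offers are sent to
  std::vector<int> offers;  // ids of outstanding offers
};

struct ToyMaster {
  ToyFramework framework;
  std::vector<std::string> sentTo;  // PID used for each re-offer

  // Stand-in for the allocator's "resourcesRecovered" callback: it calls
  // straight back into the master to re-offer the recovered resources.
  void resourcesRecovered() { sentTo.push_back(framework.pid); }

  // Removing an offer recovers its resources, which triggers the callback.
  void removeOffers() {
    const std::size_t count = framework.offers.size();
    framework.offers.clear();
    for (std::size_t i = 0; i < count; ++i) {
      resourcesRecovered();
    }
  }

  void failover(const std::string& newPid, bool removeOffersLast) {
    if (!removeOffersLast) {
      removeOffers();  // buggy order: framework.pid is still stale here
    }
    framework.pid = newPid;  // stands in for "all of the failover logic"
    if (removeOffersLast) {
      removeOffers();  // short-term fix: re-offers now use the new PID
    }
  }
};

// Runs one failover on a framework with two outstanding offers and returns
// the PID that each re-offer was addressed to.
std::vector<std::string> runFailover(bool removeOffersLast) {
  ToyMaster master;
  master.framework.pid = "scheduler-old";
  master.framework.offers = {1, 2};
  master.failover("scheduler-new", removeOffersLast);
  return master.sentTo;
}
```

In this sketch, runFailover(false) addresses both re-offers to the stale
"scheduler-old" PID, while runFailover(true) addresses them to
"scheduler-new". The synchronous callback is what makes the ordering matter,
which is presumably why the long-term fix is to decouple the allocator from
the master.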