Correct, Sharma. I don't think this is documented anywhere yet, but it
would be a good Mesos FAQ topic.

When the master notices that the framework has exited or is deactivated, it
disables the framework in the allocator so no new offers will be made to
that framework, and removes any outstanding offers (but does not send a
RescindResourceOfferMessage to the framework, since the framework is
presumably failing over). When a framework reregisters, it is reactivated
in the allocator and will start receiving new offers again.

If you were to persist the 'ephemeral' offers, hand them to another framework
instance, and have that instance call launchTasks with one of the old offers,
the master would respond with TASK_LOST ("Task launched with invalid offers"),
since the master no longer knows about that offer. So don't bother trying. :)
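
To make the sequence above concrete, here is a rough toy model of the bookkeeping in Python. It is not Mesos code and does not use any Mesos API: the ToyMaster class, its method names, the offer id format, and the "accepted" return value are all made up for illustration. Only the TASK_LOST message text comes from the behavior described above.

```python
# Toy model only: NOT Mesos source code. Every name here is hypothetical.

class ToyMaster:
    def __init__(self):
        self.active = {}          # framework_id -> is the framework active?
        self.offers = {}          # outstanding offer_id -> framework_id
        self.next_offer_id = 0

    def register(self, framework_id):
        # Registration and reregistration both (re)activate the framework.
        self.active[framework_id] = True

    def make_offer(self, framework_id):
        # No new offers are made to a deactivated framework.
        if not self.active.get(framework_id, False):
            return None
        offer_id = f"offer-{self.next_offer_id}"
        self.next_offer_id += 1
        self.offers[offer_id] = framework_id
        return offer_id

    def deactivate(self, framework_id):
        # Framework exited or disconnected: stop offering to it and drop its
        # outstanding offers, without sending any rescind message.
        self.active[framework_id] = False
        self.offers = {
            oid: fid for oid, fid in self.offers.items() if fid != framework_id
        }

    def launch_tasks(self, framework_id, offer_id):
        # An offer the master no longer knows about is invalid.
        if self.offers.get(offer_id) != framework_id:
            return 'TASK_LOST ("Task launched with invalid offers")'
        del self.offers[offer_id]  # the offer is now used
        return "accepted"          # placeholder, not a real Mesos task state


master = ToyMaster()
master.register("fw")
old_offer = master.make_offer("fw")

master.deactivate("fw")            # old scheduler instance goes away
master.register("fw")              # new instance reregisters

print(master.launch_tasks("fw", old_offer))   # stale, persisted offer
fresh_offer = master.make_offer("fw")
print(master.launch_tasks("fw", fresh_offer)) # fresh offer after failover
```

The point of the sketch is just that the stale offer is rejected while a fresh one works, so there is nothing to gain from persisting offers across a failover.
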
Already running tasks (used offers) continue running, unless the framework
failover timeout is exceeded.

On Mon, May 12, 2014 at 5:38 PM, Sharma Podila <[email protected]> wrote:

> My understanding is that when a framework fails over (either new instance
> starts after previous one fails, or the same instance restarts), Mesos
> master would automatically cancel any unused offers it had given to the
> previous framework instance. This is a good thing. Can someone confirm this
> to be the case? Is such an expectation documented somewhere? I did look at
> master.cpp and I hope I interpreted it right.
>
> Effectively then, the offers are 'ephemeral' and don't need to be
> persisted by the framework scheduler to pass along to another of its
> instance that may failover as the leader.
>
> Thank you.
>
> Sharma
