[ 
https://issues.apache.org/jira/browse/MESOS-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Qiao Wang updated MESOS-3403:
----------------------------------
    Description: 
An external Mesos allocator does not run in the same OS process as the Mesos 
master and may even be deployed on a different host. In that case, the Mesos 
allocator module should be implemented as a proxy that delegates calls to the 
actual allocator.
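
The proxy arrangement described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `Allocator` is a reduced stand-in interface rather than the real Mesos allocator API, `ExternalAllocator` and `ProxyAllocator` are hypothetical names, and the forwarding is done in-process where a real deployment would use some form of RPC.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for the allocator interface; NOT the real Mesos API.
struct Allocator {
  virtual ~Allocator() = default;
  virtual void addSlave(const std::string& slaveId, double cpus) = 0;
  virtual void removeSlave(const std::string& slaveId) = 0;
};

// Hypothetical "actual" allocator. In a real deployment it would live in
// another process or on another host; it is in-process here so that the
// sketch compiles on its own.
struct ExternalAllocator : Allocator {
  std::map<std::string, double> total;  // slave id -> total cpus

  void addSlave(const std::string& slaveId, double cpus) override {
    total[slaveId] = cpus;
  }

  void removeSlave(const std::string& slaveId) override {
    total.erase(slaveId);
  }
};

// The module loaded by the master: it keeps no resource state of its own
// and only forwards each call to the backend allocator.
class ProxyAllocator : public Allocator {
public:
  explicit ProxyAllocator(std::shared_ptr<Allocator> backend)
    : backend_(std::move(backend)) {}

  void addSlave(const std::string& slaveId, double cpus) override {
    backend_->addSlave(slaveId, cpus);  // would be a remote call in practice
  }

  void removeSlave(const std::string& slaveId) override {
    backend_->removeSlave(slaveId);     // would be a remote call in practice
  }

private:
  std::shared_ptr<Allocator> backend_;
};
```

Because the proxy is stateless, all resource bookkeeping ends up in the external allocator, which is why it has to be told about every slave that is added or removed.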

The external allocator stores the total and allocated resources itself, so 
after a Mesos master recovery (such as a failover) it needs to sync up with the 
Mesos master. Under normal circumstances, all slaves reregister after the 
master recovers, so the total and used resources of each slave can be synced in 
the allocator->addSlave call. In the abnormal case, however, a slave does not 
reregister after the master recovers. The master then calls 
Master::removeSlave(const Registry::Slave& slave) to remove the slave from the 
Registry once the timeout (--slave_reregister_timeout) expires, but this 
function does not ask the allocator to remove the related resources. To keep 
the external allocator in sync in this abnormal case, 
Master::removeSlave(const Registry::Slave& slave) should be enhanced to call 
allocator->removeSlave so that the related resources are also removed from the 
external allocator.
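
The proposed change can be illustrated with a toy model. This is a minimal sketch, not Mesos code: `ToyAllocator`, `ToyMaster`, `recovered`, and `demoFailover` are hypothetical names, the real `Master::removeSlave` takes a `Registry::Slave` and does considerably more, and resources are reduced to a single cpu count.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Toy allocator that records per-slave resources; NOT the real Mesos API.
struct ToyAllocator {
  std::map<std::string, double> total;  // slave id -> total cpus

  void addSlave(const std::string& id, double cpus) { total[id] = cpus; }
  void removeSlave(const std::string& id) { total.erase(id); }
};

// Toy master that models only the recovery path discussed above.
struct ToyMaster {
  ToyAllocator* allocator;
  std::set<std::string> recovered;  // in the registry, not yet reregistered

  // Normal case: the slave comes back and the allocator is synced.
  void reregisterSlave(const std::string& id, double cpus) {
    recovered.erase(id);
    allocator->addSlave(id, cpus);
  }

  // Abnormal case: the reregistration timeout expired for this slave.
  void removeSlave(const std::string& id) {
    recovered.erase(id);         // stands in for removal from the registry
    allocator->removeSlave(id);  // the proposed addition
  }
};

// Scenario: the external allocator still knows S1 and S2 from before the
// failover; only S1 reregisters. Returns how many slaves the allocator
// tracks afterwards.
inline std::size_t demoFailover() {
  ToyAllocator allocator;
  allocator.addSlave("S1", 4.0);
  allocator.addSlave("S2", 8.0);

  ToyMaster master{&allocator, {"S1", "S2"}};
  master.reregisterSlave("S1", 4.0);
  master.removeSlave("S2");  // timed out

  return allocator.total.size();
}
```

Without the `allocator->removeSlave(id)` line, the toy allocator would keep counting the resources of S2 even though the master no longer knows that slave, which is the inconsistency this issue describes.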

  was:
For an external Mesos allocator which does not run with Mesos master in the 
same OS process, and maybe this allocator can be deployed in the different host 
with Mesos master, then the Mesos allocator module should be implemented as a 
proxy, which delegates calls to an actual allocator.

For this external allocator, the total resources and allocated resources will 
be stored in it. After Mesos master recovery (such as fail-over), it needs to 
sync up with Mesos master. Under normal circumstances, all slaves will 
reregister after Mesos master recovery, so we can sync up the total resources 
and used resource of each slave in allocator->addSlave function call. But for 
the abnormal case, some slaves do not reregister after Mesos master recovery, 
then master will remove those slaves from Registry after timeout ()


> Add support for removing no re-registered slaves with 
> timeout(--slave_reregister_timeout) from an external allocator
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-3403
>                 URL: https://issues.apache.org/jira/browse/MESOS-3403
>             Project: Mesos
>          Issue Type: Improvement
>          Components: master
>            Reporter: Yong Qiao Wang
>            Assignee: Yong Qiao Wang
>
> An external Mesos allocator does not run in the same OS process as the Mesos 
> master and may even be deployed on a different host. In that case, the Mesos 
> allocator module should be implemented as a proxy that delegates calls to the 
> actual allocator.
> The external allocator stores the total and allocated resources itself, so 
> after a Mesos master recovery (such as a failover) it needs to sync up with 
> the Mesos master. Under normal circumstances, all slaves reregister after the 
> master recovers, so the total and used resources of each slave can be synced 
> in the allocator->addSlave call. In the abnormal case, however, a slave does 
> not reregister after the master recovers. The master then calls 
> Master::removeSlave(const Registry::Slave& slave) to remove the slave from 
> the Registry once the timeout (--slave_reregister_timeout) expires, but this 
> function does not ask the allocator to remove the related resources. To keep 
> the external allocator in sync in this abnormal case, 
> Master::removeSlave(const Registry::Slave& slave) should be enhanced to call 
> allocator->removeSlave so that the related resources are also removed from 
> the external allocator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
