[ https://issues.apache.org/jira/browse/MESOS-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Jura updated MESOS-6586:
-------------------------------
Description:
The Mesos {{/teardown}} endpoint currently:
- Removes the framework on the mesos-master. As a result, the framework is in
state {{removed}}.
- Shuts down all executors and tasks running on the Mesos agents.
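For reference, a minimal sketch of how such a teardown request can be built. It assumes the master's documented {{/master/teardown}} path with a {{frameworkId}} form field; the master address and the framework ID are placeholders, and the helper name {{build_teardown_request}} is only illustrative. The request is constructed but not sent.

```python
import urllib.parse
import urllib.request


def build_teardown_request(master_url, framework_id):
    """Build (but do not send) a POST request to the master's teardown endpoint."""
    # The framework to tear down is passed as a form-encoded body.
    body = urllib.parse.urlencode({"frameworkId": framework_id}).encode("ascii")
    return urllib.request.Request(
        master_url.rstrip("/") + "/master/teardown",
        data=body,
        method="POST",
    )


# Placeholder master address and framework ID, for illustration only.
request = build_teardown_request("http://mesos-master.example:5050", "abc-123-0001")
```

Passing the request to {{urllib.request.urlopen}} would actually trigger the teardown, so the sketch stops short of that.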
However, I'd also expect that a message from the mesos-master is sent to the
framework (Scheduler API) so that the framework processes can initiate a
shutdown as well. This is not the case. As a result, it is necessary to
manually {{suspend}} the framework, e.g. by using the DC/OS UI.
A possible solution would be to provide an additional {{teardown}} callback in
the scheduler API that notifies the framework that the mesos-master has
initiated a teardown. Mesos-master should only mark the framework as removed
once the framework has confirmed its termination, e.g. the framework could send
a message to mesos-master indicating that the termination was successful / has
been started.
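To illustrate the proposed flow, here is a small sketch in plain Python. Nothing in it is an existing Mesos API: the {{Master}} and {{Scheduler}} classes, the intermediate {{TEARING_DOWN}} state and the {{acknowledge_teardown}} method are all hypothetical names, used only to model the handshake described above.

```python
from enum import Enum


class FrameworkState(Enum):
    ACTIVE = "active"
    TEARING_DOWN = "tearing_down"  # hypothetical intermediate state
    REMOVED = "removed"


class Scheduler:
    """Stand-in for a framework scheduler; `teardown` is the proposed callback."""

    def __init__(self):
        self.running = True

    def teardown(self, master, framework_id):
        # Stop the framework's own processes, then report back to the master.
        self.running = False
        master.acknowledge_teardown(framework_id)


class Master:
    """Toy model of the master's bookkeeping, not the real mesos-master."""

    def __init__(self):
        self.states = {}
        self.schedulers = {}

    def register(self, framework_id, scheduler):
        self.states[framework_id] = FrameworkState.ACTIVE
        self.schedulers[framework_id] = scheduler

    def teardown(self, framework_id):
        # Notify the scheduler first instead of removing the framework outright.
        self.states[framework_id] = FrameworkState.TEARING_DOWN
        self.schedulers[framework_id].teardown(self, framework_id)

    def acknowledge_teardown(self, framework_id):
        # Only a confirmed termination moves the framework to `removed`.
        if self.states.get(framework_id) is FrameworkState.TEARING_DOWN:
            self.states[framework_id] = FrameworkState.REMOVED
```

In this model, a scheduler that never acknowledges leaves the framework in the intermediate state, so a real design would presumably also need a timeout or a forced removal.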
This change will also affect the {{dcos service shutdown}} command which uses
the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the
{{dcos service shutdown service-id}} command shuts down all components of the
framework, not only the executors and tasks.
Also, for consistency reasons I'd expect that this shutdown action can also be
taken by using the DC/OS UI. So far on DC/OS, you can only {{Suspend}} a
service / framework which will stop the framework instances, but will not
remove the framework from mesos-master and terminate its executors.
Tested on DC/OS with the frameworks conductr and elasticsearch.
was:
The Mesos {{/teardown}} endpoint currently:
- Removes the framework on the mesos-master. As a result, the framework is in
state {{removed}}.
- Shuts down all executors and tasks running on the Mesos agents.
However, I'd also expect that a message from the mesos-master is sent to the
framework (Scheduler API) so that the framework processes can initiate a
shutdown as well. This is not the case. As a result, it is necessary to
manually {{suspend}} the framework, e.g. by using the DC/OS UI.
A possible solution would be to provide an additional callback {{teardown}} at
the scheduler API that will notify the framework that the mesos-master has
initiated a teardown. Mesos-master should only mark the framework as removed if
the framework has been successfully terminated, e.g. the framework could send a
message to mesos-master indicating that the termination was successful / has
been started.
This change will also affect the {{dcos service shutdown}} command which uses
the {{/teardown}} endpoint. From a DC/OS CLI perspective, I'd expect that the
{{dcos service shutdown service-id}} command shuts down all components of the
framework, not only the executors and tasks.
Tested on DC/OS with the frameworks conductr and elasticsearch.
> Teardown endpoint should remove framework
> -----------------------------------------
>
> Key: MESOS-6586
> URL: https://issues.apache.org/jira/browse/MESOS-6586
> Project: Mesos
> Issue Type: Improvement
> Components: cli, framework api, HTTP API
> Affects Versions: 1.0.1
> Reporter: Markus Jura
> Labels: features