[jira] [Commented] (MESOS-544) Mesos-slave support for "node drain"

Benjamin Mahler (JIRA) Thu, 06 Mar 2014 12:46:20 -0800

    [ https://issues.apache.org/jira/browse/MESOS-544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923031#comment-13923031 ]


Benjamin Mahler commented on MESOS-544:
---------------------------------------

To break this down into increments for GSOC students:

(1) Implement a signal handler that triggers a clean shutdown of the slave and 
kills all running tasks. This can likely re-use the existing slave shutdown 
mechanism.

(2) The problem with (1) is that the master may wait up to the health-check 
delay (~75 seconds) before notifying the framework that the tasks were lost. 
To improve this, we should weigh sending an unregistration request against 
sending status updates.

(3) Stretch: longer term, it may be beneficial to make frameworks aware of 
operator-induced drains. This may introduce unnecessary complexity, so the 
point of (3) is to explore the tradeoffs of exposing explicit draining 
information to frameworks.
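
To make (1) and (2) concrete, here is a rough sketch in Python. It is not Mesos 
code. DrainableAgent, its method names, the message tuples and the "TASK_LOST" 
string are all made up for illustration. It only shows the shape of the idea: 
the signal handler just records the request, the main loop performs the drain 
once, and the notification to the master is either a single unregistration or 
one status update per task.

```python
import signal


class DrainableAgent:
    """Toy stand-in for a slave process (hypothetical, not the Mesos slave)."""

    def __init__(self, task_ids, notify="unregister"):
        if notify not in ("unregister", "status_updates"):
            raise ValueError("unknown notify mode: %r" % (notify,))
        self.running = set(task_ids)
        self.notify = notify
        self.drain_requested = False
        self.drained = False
        self.outbox = []  # messages that would be sent to the master

    def on_drain_signal(self, signum, frame):
        # Keep the handler minimal: record the request and return.
        self.drain_requested = True

    def install(self, signum=None):
        # SIGUSR1 matches the pkill example in the issue; POSIX only.
        if signum is None:
            signum = signal.SIGUSR1
        signal.signal(signum, self.on_drain_signal)

    def poll(self):
        """Called from the main loop; performs the drain at most once."""
        if not self.drain_requested or self.drained:
            return False
        killed = sorted(self.running)
        self.running.clear()  # stands in for actually killing the tasks
        if self.notify == "status_updates":
            # Option A: one update per task (state name is an assumption).
            for task_id in killed:
                self.outbox.append(("status_update", task_id, "TASK_LOST"))
        else:
            # Option B: a single unregistration covering every task.
            self.outbox.append(("unregister", killed))
        self.drained = True
        return True
```

In a real process, install() would be called at startup and poll() from the 
event loop. Doing the work outside the handler keeps the handler itself trivial.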

> Mesos-slave support for "node drain"
> ------------------------------------
>
>                 Key: MESOS-544
>                 URL: https://issues.apache.org/jira/browse/MESOS-544
>             Project: Mesos
>          Issue Type: Story
>          Components: framework, master, slave
>            Reporter: Tobias Weingartner
>              Labels: gsoc2014
>             Fix For: 0.19.0
>
>
> Given that multiple frameworks can be present on a machine at a time, and 
> writing "node drain" for each possible framework is an intractable task, it 
> would be nice if the slave-master core had a means to tell frameworks that tasks 
> were killed to drain a host.  Or possibly that the slave was told to drain 
> the host of all tasks (graceful shutdown, etc).
> {noformat}
> # drain current host
> pkill -USR1 mesos-slave
> {noformat}
> This would make writing scripts for site-ops to do node maintenance much 
> easier... :)



--
This message was sent by Atlassian JIRA
(v6.2#6252)
