[ 
https://issues.apache.org/jira/browse/SPARK-9008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesper Lundgren updated SPARK-9008:
-----------------------------------
    Description: 
The cluster will automatically restart failing drivers when launched in 
supervised cluster mode. However, there is no official way for an operations 
team to stop a malfunctioning driver and keep it from being restarted. 

I know there is "bin/spark-class org.apache.spark.deploy.Client kill", but this 
is undocumented and does not always work reliably.

It would be great if there were a way to take a driver out of supervised mode 
so that kill -9 works on the driver program.

The documentation surrounding this could also be improved. It would be nice to 
have some best-practice examples of how to work with supervised mode, how to 
manage graceful shutdown, and how to catch TERM signals. (A TERM signal ends 
the driver with an exit code that triggers a restart in supervised mode unless 
you change the exit code in the application logic.)
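
As a rough illustration of the "change the exit code in the application 
logic" idea, here is a minimal Java sketch. It assumes, as the description 
states, that a non-zero exit code is what triggers the restart. The class and 
method names (SupervisedDriverSketch, runJob, shouldForceCleanExit) are 
hypothetical and are not Spark APIs. The sketch also assumes the application 
never calls System.exit itself, and that skipping any other shutdown hooks 
(which halt() does) is acceptable.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical driver skeleton; nothing here is a Spark API.
final class SupervisedDriverSketch {

    // Set once main() has finished on its own, successfully or not.
    static final AtomicBoolean mainFinished = new AtomicBoolean(false);
    // Set by the shutdown hook so the work loop can wind down.
    static final AtomicBoolean stopRequested = new AtomicBoolean(false);

    // Force exit code 0 only when shutdown started while main() was still
    // running (for example after a TERM signal). A driver that failed on
    // its own keeps its non-zero exit code.
    static boolean shouldForceCleanExit(boolean mainHasFinished) {
        return !mainHasFinished;
    }

    // Placeholder for the real application logic.
    static void runJob() throws InterruptedException {
        while (!stopRequested.get()) {
            Thread.sleep(100);
        }
    }

    public static void main(String[] args) throws Exception {
        final Thread mainThread = Thread.currentThread();
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            if (shouldForceCleanExit(mainFinished.get())) {
                stopRequested.set(true);
                try {
                    // Give the job up to 10 seconds to stop gracefully.
                    mainThread.join(10_000);
                } catch (InterruptedException ignored) {
                    // Fall through and terminate anyway.
                }
                // End the JVM with status 0; halt() does not wait for
                // other shutdown hooks to complete.
                Runtime.getRuntime().halt(0);
            }
        }));
        try {
            runJob();
        } finally {
            mainFinished.set(true);
        }
    }
}
```

With this structure, an uncaught exception in runJob still ends the JVM with 
a non-zero status, while a TERM signal received during the job ends it with 
status 0.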

  was:
The cluster will automatically restart failing drivers when launched in 
supervised cluster mode. However, there is no official way for an operations 
team to stop a malfunctioning driver and keep it from being restarted. 

I know there is "bin/spark-class org.apache.spark.deploy.Client kill", but this 
is undocumented and does not always work reliably.

It would be great if there were a way to take a driver out of supervised mode 
so that kill -9 works on the driver program.

The documentation surrounding this could also be improved. It would be nice to 
have some best-practice examples of how to work with supervised mode, how to 
manage graceful shutdown, and how to catch TERM signals. (A TERM signal ends 
the driver with the wrong exit code and triggers a restart in supervised mode 
unless you change the exit code in the application logic.)


> Stop and remove driver from supervised mode in spark-master interface
> ---------------------------------------------------------------------
>
>                 Key: SPARK-9008
>                 URL: https://issues.apache.org/jira/browse/SPARK-9008
>             Project: Spark
>          Issue Type: New Feature
>            Reporter: Jesper Lundgren
>
> The cluster will automatically restart failing drivers when launched in 
> supervised cluster mode. However, there is no official way for an operations 
> team to stop a malfunctioning driver and keep it from being restarted. 
> I know there is "bin/spark-class org.apache.spark.deploy.Client kill", but 
> this is undocumented and does not always work reliably.
> It would be great if there were a way to take a driver out of supervised 
> mode so that kill -9 works on the driver program.
> The documentation surrounding this could also be improved. It would be nice 
> to have some best-practice examples of how to work with supervised mode, how 
> to manage graceful shutdown, and how to catch TERM signals. (A TERM signal 
> ends the driver with an exit code that triggers a restart in supervised mode 
> unless you change the exit code in the application logic.)


