[
https://issues.apache.org/jira/browse/SPARK-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017776#comment-14017776
]
Mark Hamstra commented on SPARK-2019:
-------------------------------------
Please don't leave the Affects Version/s selector on None. As with the SO
question, is this an issue that you are seeing with Spark 0.9.0? If so, then
the version of Spark that you are using is significantly out of date even on
the 0.9 branch. Several bug fixes are present in the 0.9.1 release of Spark,
which has been available for almost two months. There are a few more in the
current 0.9.2-SNAPSHOT code, and many more in the recent 1.0.0 release.
> Spark workers die/disappear when job fails for nearly any reason
> ----------------------------------------------------------------
>
> Key: SPARK-2019
> URL: https://issues.apache.org/jira/browse/SPARK-2019
> Project: Spark
> Issue Type: Bug
> Reporter: sam
>
> Whenever a job fails, we either have to reboot all the nodes or run 'sudo
> service spark-worker restart' across our cluster. I don't think this should
> happen; the job failures are often not even that serious. There is an SO
> question with 5 upvotes here:
> http://stackoverflow.com/questions/22031006/spark-0-9-0-worker-keeps-dying-in-standalone-mode-when-job-fails
> We shouldn't be giving restart privileges to our devs, so our sysadmin has to
> restart the workers frequently. When the sysadmin is not around, there is
> nothing our devs can do.
> Many thanks
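[Editor's note: the restart workaround described above is typically scripted over
SSH. The following is a minimal sketch only, assuming a hypothetical hosts file
(workers.txt) listing the worker machines and that the 'spark-worker' service
name from the report applies on each host; it is not part of the original issue.]

    # Sketch: restart the standalone worker service on every host listed in
    # workers.txt (hypothetical file, one hostname per line).
    # Assumes the operator account has SSH access and sudo rights on each host.
    while read -r host; do
      ssh "$host" "sudo service spark-worker restart"
    done < workers.txt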
--
This message was sent by Atlassian JIRA
(v6.2#6252)