Aaron Davidson created SPARK-1572:
-------------------------------------
Summary: Uncaught IO exceptions in Pyspark kill Executor
Key: SPARK-1572
URL: https://issues.apache.org/jira/browse/SPARK-1572
Project: Spark
Issue Type: Bug
Components: PySpark
Affects Versions: 1.0.0, 0.9.1
Reporter: Aaron Davidson
Assignee: Aaron Davidson
If an exception is thrown in the Python "stdin writer" thread during this line:
{code}
PythonRDD.writeIteratorToStream(parent.iterator(split, context), dataOut)
{code}
(e.g., while reading from an HDFS source), then the exception will be handled by
the default uncaught-exception handler, which Executor installs for all threads.
That handler's behavior is, unfortunately, to call System.exit().
Ideally, normal exceptions while running a task should not bring down all the
executors of a Spark cluster.
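A minimal sketch of the failure mode and one possible fix direction (this is illustrative Java, not Spark's actual code): catch Throwable inside the writer thread's run body and hand the error back to the task thread, so nothing ever escapes to the process-wide handler that calls System.exit(). The thread name and variable names below are hypothetical.
{code}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

public class WriterThreadSketch {
    public static void main(String[] args) throws InterruptedException {
        // Holds any exception the writer thread hits, for the task thread to inspect.
        AtomicReference<Throwable> writerException = new AtomicReference<>();

        Thread stdinWriter = new Thread(() -> {
            try {
                // Stand-in for PythonRDD.writeIteratorToStream(...), which can
                // throw, e.g., an IOException while reading from HDFS.
                throw new IOException("HDFS read failed");
            } catch (Throwable t) {
                // Catch everything locally so the exception never reaches the
                // default uncaught-exception handler (which would exit the JVM).
                writerException.set(t);
            }
        }, "stdin writer");

        stdinWriter.start();
        stdinWriter.join();

        // The task thread can now re-throw and fail only this task,
        // leaving the executor process alive.
        Throwable t = writerException.get();
        System.out.println(t == null ? "no error" : "task failed: " + t.getMessage());
    }
}
{code}
With this pattern the IOException fails the single task (which Spark can retry) instead of terminating the executor JVM.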
--
This message was sent by Atlassian JIRA
(v6.2#6252)