[ https://issues.apache.org/jira/browse/SPARK-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044981#comment-14044981 ]
Josh Rosen commented on SPARK-1740:
-----------------------------------
The "Python daemon -> multiple workers" architecture was motivated by the high
cost of forking a JVM with a large heap, so this issue could also lead to
performance problems if we attempt to re-launch the daemon once a huge amount
of data has been cached in the Spark worker JVM.
> Pyspark cancellation kills unrelated pyspark workers
> ----------------------------------------------------
>
> Key: SPARK-1740
> URL: https://issues.apache.org/jira/browse/SPARK-1740
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.0.0
> Reporter: Aaron Davidson
> Priority: Critical
>
> PySpark cancellation calls SparkEnv#destroyPythonWorker. Since there is one
> Python worker per process, this would seem like a sensible thing to do.
> Unfortunately, this method actually destroys the Python daemon and all of
> its associated workers, so cancelling one job can cause failures in
> unrelated PySpark jobs.
> The severity of this bug is limited by the fact that the PySpark daemon is
> easily recreated, so the tasks will succeed after being restarted.
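
The failure mode described in the issue can be sketched with a toy in-process
model. This is only an illustration under stated assumptions, not Spark's
actual code: `ToyDaemon` and its methods are hypothetical names, and real
PySpark workers are forked OS processes rather than dictionary entries.

```python
class ToyDaemon:
    """Hypothetical stand-in for the Python daemon: one daemon owns many workers."""

    def __init__(self):
        self._alive = {}   # worker id -> True while the worker is running
        self._next_id = 0

    def spawn_worker(self):
        """Register a new worker and return its id."""
        wid = self._next_id
        self._next_id += 1
        self._alive[wid] = True
        return wid

    def kill_worker(self, wid):
        """Stop a single worker; its siblings are left untouched."""
        self._alive[wid] = False

    def destroy(self):
        """Stop the daemon, which takes every worker down with it."""
        for wid in self._alive:
            self._alive[wid] = False

    def alive_workers(self):
        """Return the ids of the workers that are still running, in order."""
        return sorted(wid for wid, ok in self._alive.items() if ok)


def cancel_by_destroying_daemon(daemon, cancelled_wid):
    """Models the reported behaviour: cancelling one task destroys the daemon."""
    daemon.destroy()
    return daemon.alive_workers()


def cancel_single_worker(daemon, cancelled_wid):
    """Models the narrower alternative: only the cancelled task's worker stops."""
    daemon.kill_worker(cancelled_wid)
    return daemon.alive_workers()
```

In this model, cancelling one of three workers through
`cancel_by_destroying_daemon` leaves no workers alive, while
`cancel_single_worker` leaves the other two running.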