[ https://issues.apache.org/jira/browse/SPARK-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080470#comment-14080470 ]
Apache Spark commented on SPARK-2764:
-------------------------------------
User 'JoshRosen' has created a pull request for this issue:
https://github.com/apache/spark/pull/1680
> Simplify process structure of PySpark daemon / worker launching process
> -----------------------------------------------------------------------
>
> Key: SPARK-2764
> URL: https://issues.apache.org/jira/browse/SPARK-2764
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Reporter: Josh Rosen
> Assignee: Josh Rosen
>
> PySpark's daemon-based worker factory has a very complicated process
> structure that I've always found confusing. The per-java-worker daemon.py
> process launches a numCores-sized pool of subprocesses, and those
> subprocesses in turn launch the actual worker processes that process the data.
> I think we can simplify this by having daemon.py launch the workers directly
> without this extra layer of indirection. See my comments on the pull request
> that introduced daemon.py: https://github.com/mesos/spark/pull/563
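A minimal sketch of the proposed structure (this is not Spark's actual daemon.py; the socket protocol, function names, and port handshake here are illustrative assumptions): the daemon accepts one connection per requested worker and forks that worker directly, with no intermediate pool layer.
{code:python}
import os
import socket
import sys

def worker(conn):
    # Stand-in for the real PySpark worker loop, which would read
    # serialized tasks from `conn` and stream results back to the JVM.
    conn.sendall(b"ready\n")
    conn.close()

def main():
    # The daemon listens on a loopback socket; the JVM connects once for
    # each worker it wants launched. (os.fork is Unix-only, as is the
    # daemon-based launcher itself.)
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(128)
    # Report the chosen port to the parent process on stdout.
    sys.stdout.write("%d\n" % listener.getsockname()[1])
    sys.stdout.flush()

    while True:
        conn, _ = listener.accept()
        pid = os.fork()  # one fork per worker; no intermediate pool layer
        if pid == 0:
            listener.close()  # the child keeps only its own connection
            worker(conn)
            os._exit(0)
        conn.close()  # the parent hands the socket off to the child
        # Reap any workers that have already exited (non-blocking).
        try:
            while os.waitpid(-1, os.WNOHANG)[0] > 0:
                pass
        except OSError:
            pass

if __name__ == "__main__":
    main()
{code}
The point of the sketch is only the shape of the process tree: one daemon with one fork per worker, instead of daemon, pool subprocess, then worker.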
--
This message was sent by Atlassian JIRA
(v6.2#6252)