[ https://issues.apache.org/jira/browse/SPARK-32187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177765#comment-17177765 ]
Fabian Höring commented on SPARK-32187:
---------------------------------------

About this ticket:
https://issues.apache.org/jira/browse/SPARK-13587

and those settings:

spark-submit --deploy-mode cluster --master yarn \
  --py-files parallelisation_hack-0.1-py2.7.egg \
  --conf spark.pyspark.virtualenv.enabled=true \
  --conf spark.pyspark.virtualenv.type=native \
  --conf spark.pyspark.virtualenv.requirements=requirements.txt \
  --conf spark.pyspark.virtualenv.bin.path=virtualenv \
  --conf spark.pyspark.python=python3 \
  pyspark_poc_runner.py

I don't know whether they still work, but personally I would close the ticket and not put this in the doc. I think it is not the right way to do it, as it doesn't scale to 100 executors and can produce race conditions between tasks running on the same executor (multiple pip installs at the same time on the same node).

> User Guide - Shipping Python Package
> ------------------------------------
>
>                 Key: SPARK-32187
>                 URL: https://issues.apache.org/jira/browse/SPARK-32187
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Documentation, PySpark
>    Affects Versions: 3.1.0
>            Reporter: Hyukjin Kwon
>            Priority: Major
>
> - Zipped file
> - Python files
> - PEX (?) (see also SPARK-25433)
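
For contrast with per-executor pip installs, here is a minimal sketch of the "Zipped file" option listed in the ticket: the dependency is packaged once, up front, and only needs to be placed on the Python path. The package name `shipped_pkg`, its contents, the archive name `deps.zip`, and the helper `build_package_zip` are all hypothetical and exist only for illustration; the sketch runs locally and does not call Spark itself.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_package_zip(zip_path, package_name, source):
    """Write a one-module package into a zip archive (hypothetical helper)."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr(f"{package_name}/__init__.py", source)
    return zip_path


# Hypothetical package name and contents, for illustration only.
workdir = tempfile.mkdtemp()
archive = build_package_zip(
    os.path.join(workdir, "deps.zip"),
    "shipped_pkg",
    "def double(x):\n    return 2 * x\n",
)

# Putting the archive on sys.path makes its packages importable via zipimport.
sys.path.insert(0, archive)
importlib.invalidate_caches()
shipped_pkg = importlib.import_module("shipped_pkg")

result = shipped_pkg.double(21)
print(result)
```

An archive built this way could then be passed to `spark-submit` with `--py-files deps.zip`, so nothing has to be installed on the nodes at task start. Note that importing from a zip only works for pure-Python code; packages with compiled extensions need another mechanism.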