yetanotherlogonfail opened a new pull request #34422:
URL: https://github.com/apache/spark/pull/34422


   Proposed change to the User Guide documentation page "Python Package Management"
   URL: https://spark.apache.org/docs/latest/api/python/user_guide/python_packaging.html
   
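   Reason: The paragraph lists three ways to upload files without making clear 
that they are alternatives and that any one of them is sufficient.
   User-facing change: Yes
   
   Change
   Paragraph: Using PySpark Native Features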
   
   From:
   PySpark allows to upload Python files (.py), zipped Python packages (.zip), 
and Egg files (.egg) to the executors by:
   
   Setting the configuration setting spark.submit.pyFiles
   
   Setting --py-files option in Spark scripts
   
   Directly calling pyspark.SparkContext.addPyFile() in applications
   
   This is a straightforward method to ship additional custom Python code to 
the cluster. You can just add individual files or zip whole packages and upload 
them. Using pyspark.SparkContext.addPyFile() allows to upload code even after 
having started your job.
   
   However, it does not allow to add packages built as Wheels and therefore 
does not allow to include dependencies with native code.
   
   To:
   PySpark allows to upload Python files (.py), zipped Python packages (.zip), 
and Egg files (.egg) to the executors by:
   
   Setting the configuration setting spark.submit.pyFiles
   **OR**
   Setting --py-files option in Spark scripts
   **OR**
   Directly calling pyspark.SparkContext.addPyFile() in applications
   
   This is a straightforward method to ship additional custom Python code to 
the cluster. You can just add individual files or zip whole packages and upload 
them. Using pyspark.SparkContext.addPyFile() allows to upload code even after 
having started your job.
   
   However, it does not allow to add packages built as Wheels and therefore 
does not allow to include dependencies with native code.
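
The three alternatives in the paragraph above can be sketched as follows. This 
is only an illustration, not text proposed for the documentation: the package 
name `mypkg`, the application file `app.py`, and the helper functions are 
hypothetical, and the runtime variant assumes PySpark is installed and is 
therefore defined but not called here.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_package(package_dir: Path, zip_path: Path) -> Path:
    """Zip a package directory so the package name is the top-level entry."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        for file in sorted(package_dir.rglob("*.py")):
            # Store paths relative to the package's parent, e.g. "mypkg/util.py".
            zf.write(file, file.relative_to(package_dir.parent).as_posix())
    return zip_path


def submit_args(zip_path: Path, app: str, use_conf: bool = False) -> list:
    """Build a spark-submit command using --py-files OR spark.submit.pyFiles."""
    if use_conf:
        return ["spark-submit", "--conf", f"spark.submit.pyFiles={zip_path}", app]
    return ["spark-submit", "--py-files", str(zip_path), app]


def add_at_runtime(zip_path: Path) -> None:
    """Third alternative: ship the archive from inside a running application."""
    from pyspark import SparkContext  # imported lazily; requires PySpark

    SparkContext.getOrCreate().addPyFile(str(zip_path))


# Demo with a throwaway package (hypothetical name "mypkg").
workdir = Path(tempfile.mkdtemp())
pkg = workdir / "mypkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "util.py").write_text("def double(x):\n    return 2 * x\n")

archive = zip_package(pkg, workdir / "mypkg.zip")
names = zipfile.ZipFile(archive).namelist()
cmd_flag = submit_args(archive, "app.py")
cmd_conf = submit_args(archive, "app.py", use_conf=True)
```

Only one of the three routes is needed for a given job. The two `spark-submit` 
forms fix the file list at launch, while `addPyFile()` can be called after the 
application has started.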
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


