Hyukjin Kwon created SPARK-33782:
------------------------------------
Summary: Place spark.files, spark.jars and spark.files under the
current working directory on the driver in K8S
Key: SPARK-33782
URL: https://issues.apache.org/jira/browse/SPARK-33782
Project: Spark
Issue Type: Bug
Components: Kubernetes
Affects Versions: 3.2.0
Reporter: Hyukjin Kwon
In YARN cluster mode, files passed to the application can be accessed from the
current working directory. This does not appear to be the case in Kubernetes
cluster mode. If the files were placed there as well, users could, for example,
leverage PEX to manage Python dependencies in Apache Spark:
{code}
pex pyspark==3.0.1 pyarrow==0.15.1 pandas==0.25.3 -o myarchive.pex
PYSPARK_PYTHON=./myarchive.pex spark-submit --files myarchive.pex
{code}
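One way to observe the difference between the two cluster modes is to check, from inside the application, whether the shipped file is present in the current working directory. The following is only a minimal sketch: the helper name find_in_cwd is hypothetical and not part of Spark, and it uses nothing but the Python standard library.

```python
import os


def find_in_cwd(filename):
    """Return the absolute path of `filename` if it sits directly in the
    current working directory, else None.

    Hypothetical helper for illustration; not a Spark API.
    """
    candidate = os.path.join(os.getcwd(), os.path.basename(filename))
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    # "myarchive.pex" matches the --files argument in the example above.
    path = find_in_cwd("myarchive.pex")
    if path is None:
        print("myarchive.pex is not in", os.getcwd())
    else:
        print("myarchive.pex found at", path)
```

Run as part of the driver program, this would be expected to find the file in YARN cluster mode and, per this report, not to find it in Kubernetes cluster mode.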
See also https://github.com/apache/spark/pull/30735/files#r540935585.