[ https://issues.apache.org/jira/browse/SPARK-26404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17463384#comment-17463384 ]
jingxiong zhong edited comment on SPARK-26404 at 12/21/21, 5:52 PM:
--------------------------------------------------------------------

@gollum999 (Tim Sanders), hi, I have a question: how do I add my Python dependencies to a Spark job? I submit as follows:
{code:sh}
spark-submit \
  --archives s3a://path/python3.6.9.tgz#python3.6.9 \
  --conf "spark.pyspark.driver.python=python3.6.9/bin/python3" \
  --conf "spark.pyspark.python=python3.6.9/bin/python3" \
  --name "piroottest" \
  ./examples/src/main/python/pi.py 10
{code}
This does not run the job successfully; it throws the following error:
{code:sh}
Traceback (most recent call last):
  File "/tmp/spark-63b77184-6e89-4121-bc32-6a1b793e0c85/pi.py", line 21, in <module>
    from pyspark.sql import SparkSession
  File "/opt/spark/python/lib/pyspark.zip/pyspark/__init__.py", line 121, in <module>
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/__init__.py", line 42, in <module>
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 27, in <module>
    async def _ag():
  File "/opt/spark/work-dir/python3.6.9/lib/python3.6/ctypes/__init__.py", line 7, in <module>
    from _ctypes import Union, Structure, Array
ImportError: libffi.so.6: cannot open shared object file: No such file or directory
{code}
Or is there another way to add Python dependencies?
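For reference, the `ImportError` above usually means the packed Python was built on a host whose `libffi.so.6` is absent from the executor container image. A commonly suggested alternative (a sketch, not from this ticket; the archive name and alias are placeholders) is to build and pack a virtualenv inside the same base image the Spark pods run, so native libraries match, then ship it with `--archives`:
{code:sh}
# Build the venv inside the same base image the Spark executors use, so
# native dependencies such as libffi match, then pack it with venv-pack.
python3 -m venv pyspark_env
source pyspark_env/bin/activate
pip install pyspark venv-pack
venv-pack -o pyspark_env.tar.gz

# Ship the archive; "environment" is the unpack alias on the executors.
export PYSPARK_DRIVER_PYTHON=python              # driver interpreter (client mode)
export PYSPARK_PYTHON=./environment/bin/python   # executor interpreter inside the archive
spark-submit \
  --archives pyspark_env.tar.gz#environment \
  ./examples/src/main/python/pi.py 10
{code}
conda-pack works the same way for Conda environments; the key point is that the packed interpreter's shared-library dependencies must exist in the image that unpacks it.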
> set spark.pyspark.python or PYSPARK_PYTHON doesn't work in k8s client-cluster
> mode.
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-26404
>                 URL: https://issues.apache.org/jira/browse/SPARK-26404
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Dongqing Liu
>            Priority: Major
>
> Neither
> conf.set("spark.executorEnv.PYSPARK_PYTHON", "/opt/pythonenvs/bin/python")
> nor
> conf.set("spark.pyspark.python", "/opt/pythonenvs/bin/python")
> works.
> Looks like the executor always picks python from PATH.
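The configuration the reporter expected to work can be sketched on the spark-submit command line as follows (a hypothetical example, not from the ticket; the API-server address, image name, and interpreter path are placeholders), using the per-pod environment-variable configs that Spark on Kubernetes exposes:
{code:sh}
# Sketch: pin the interpreter through pod environment variables in k8s mode.
# The ticket reports that spark.pyspark.python / PYSPARK_PYTHON are ignored
# and executors fall back to the python found on PATH.
spark-submit \
  --master k8s://https://<api-server>:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<image-with-python> \
  --conf spark.kubernetes.driverEnv.PYSPARK_PYTHON=/opt/pythonenvs/bin/python \
  --conf spark.executorEnv.PYSPARK_PYTHON=/opt/pythonenvs/bin/python \
  local:///opt/spark/examples/src/main/python/pi.py 10
{code}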
--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org