GitHub user AzureQ opened a pull request:
https://github.com/apache/spark/pull/23037
[MINOR][k8s] Add Copy pyspark into corresponding dir cmd in pyspark Dockerfile
When I run `./bin/pyspark` in a Kubernetes pod (using an image built from
the unmodified pyspark Dockerfile), I get the following error:
```
$SPARK_HOME/bin/pyspark --deploy-mode client --master
k8s://https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT_HTTPS ...
Python 2.7.15 (default, Aug 22 2018, 13:24:18)
[GCC 6.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Could not open PYTHONSTARTUP
IOError: [Errno 2] No such file or directory:
'/opt/spark/python/pyspark/shell.py'
```
This happens because the `pyspark` folder does not exist under `/opt/spark/python/`.
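The diagnosis above can be checked from inside the pod with a small shell helper. This is only a sketch: `check_pyspark_shell` is a hypothetical function, not part of Spark, and the `/opt/spark` fallback is assumed from the path in the error message.

```shell
# Hypothetical helper (not part of Spark): report whether the startup script
# that the error message points at exists under a given Spark home.
check_pyspark_shell() {
    # Use the first argument, else $SPARK_HOME, else /opt/spark
    # (the fallback is an assumption taken from the error output).
    spark_home="${1:-${SPARK_HOME:-/opt/spark}}"
    startup="$spark_home/python/pyspark/shell.py"
    if [ -f "$startup" ]; then
        echo "found: $startup"
    else
        echo "missing: $startup"
        return 1
    fi
}
```

Running `check_pyspark_shell` with no arguments in an affected container should report the file as missing; after the fix it should report it as found.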
## What changes were proposed in this pull request?
Added `COPY python/pyspark ${SPARK_HOME}/python/pyspark` to the pyspark
Dockerfile to resolve the issue above.
## How was this patch tested?
Tested on Google Kubernetes Engine.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/AzureQ/spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23037.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23037
----
commit c2f782b5d9da68a7a207269d48b897a1f482e48b
Author: Qi Shao <qi.shao.nyu@...>
Date: 2018-11-14T20:04:37Z
Copy python/pyspark to ${SPARK_HOME}/python/pyspark to make bin/pyspark
work properly in Docker container
----