[ 
https://issues.apache.org/jira/browse/SPARK-31165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikolay Dimolarov updated SPARK-31165:
--------------------------------------
    Description: 
I am currently trying to follow the Kubernetes instructions for Spark: 
[https://spark.apache.org/docs/latest/running-on-kubernetes.html]. After cloning 
apache/spark from GitHub on the master branch, I found multiple wrong folder 
references while trying to build the Docker image:

 

*Issue 1: The comments in the Dockerfile reference the wrong folder for the 
Dockerfile:*
{code:java}
# If this docker file is being used in the context of building your images from a Spark
# distribution, the docker build command should be invoked from the top level directory
# of the Spark distribution. E.g.:
# docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .{code}
Well, that docker build command simply won't run from the repository root. I only 
got the following to run:
{code:java}
docker build -t spark:latest -f resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile .{code}
which is the actual path to the Dockerfile.
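
For context, that comment seems to assume a built Spark *distribution* rather than a source checkout; in a distribution the Dockerfile does sit under kubernetes/dockerfiles/. A rough sketch of that flow (assuming the standard distribution layout - I have not verified every flag on master):
{code:java}
# Build a distribution from the source checkout first:
./dev/make-distribution.sh --name custom-spark --tgz -Pkubernetes

# The resulting dist/ directory contains kubernetes/dockerfiles/, so the
# documented command should work from there:
cd dist
docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .{code}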

 

*Issue 2: jars folder does not exist*

After reading the tutorial I of course built Spark first, as per the instructions, 
with:
{code:java}
./build/mvn -Pkubernetes -DskipTests clean package{code}
Nonetheless, I get this error when building from the Dockerfile:
{code:java}
Step 5/18 : COPY jars /opt/spark/jars
COPY failed: stat /var/lib/docker/tmp/docker-builder402673637/jars: no such file or directory{code}
I may have found a related issue here: 
[https://stackoverflow.com/questions/52451538/spark-for-kubernetes-test-on-mac]

I am new to Spark, but I assumed that this jars folder - given that the build step 
actually produces it and I ran the Maven build of the master branch successfully 
with the command above - would end up in the root folder of the project. It turns 
out it actually lives here:

spark/assembly/target/scala-2.12/jars
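
A possible workaround for the missing jars folder (this is an assumption on my part - I have not verified it against master): the repo ships a helper script that appears to assemble the right build context from a source checkout:
{code:java}
# Hedged sketch: build the image via the helper script instead of plain
# docker build; -r sets the image repo and -t the tag.
./bin/docker-image-tool.sh -r myrepo -t latest build{code}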

 

*Issue 3: missing entrypoint.sh and decom.sh due to wrong reference*

While Issue 2 remained unresolved, since I could not wrap my head around the 
missing jars folder (bin and sbin were copied successfully after I created a dummy 
jars folder), I then got stuck on these two steps:
{code:java}
COPY kubernetes/dockerfiles/spark/entrypoint.sh /opt/
COPY kubernetes/dockerfiles/spark/decom.sh /opt/{code}
 
with:
{code:java}
Step 8/18 : COPY kubernetes/dockerfiles/spark/entrypoint.sh /opt/
COPY failed: stat /var/lib/docker/tmp/docker-builder638219776/kubernetes/dockerfiles/spark/entrypoint.sh: no such file or directory{code}
 
which makes sense, since the paths should actually be:

resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh
resource-managers/kubernetes/docker/src/main/dockerfiles/spark/decom.sh
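
If one insists on plain docker build from the repo root, a manual workaround sketch (the paths are assumptions derived from the errors above) would be to stage the scripts where the COPY steps expect them:
{code:java}
# Mirror the distribution layout inside the source checkout so the COPY
# paths in the Dockerfile resolve:
mkdir -p kubernetes/dockerfiles/spark
cp resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh kubernetes/dockerfiles/spark/
cp resource-managers/kubernetes/docker/src/main/dockerfiles/spark/decom.sh kubernetes/dockerfiles/spark/{code}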
  
*Remark*
  
 I only created one issue since it seems like somebody cleaned up the repo and 
forgot to update these references. Am I missing something here? If I am, I 
apologise in advance, since I am new to the Spark project. I also saw that some of 
these references were handled through build args in previous branches, e.g. 2.4: 
[https://github.com/apache/spark/blob/branch-2.4/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile]
 but that also does not run out of the box.
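
From what I can tell from that 2.4 Dockerfile, the pattern looked roughly like this (an abbreviated sketch, not an exact quote):
{code:java}
ARG spark_jars=jars
ARG img_path=kubernetes/dockerfiles

# The defaults match a distribution layout; a source checkout could override
# them with --build-arg at docker build time:
COPY ${spark_jars} /opt/spark/jars
COPY ${img_path}/spark/entrypoint.sh /opt/{code}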
  
I am also really not sure about the affected versions, since that was not 
transparent enough for me on GitHub - feel free to edit that field :)
  
 Thanks in advance!
  
  
  


> Multiple wrong references in Dockerfile for k8s 
> ------------------------------------------------
>
>                 Key: SPARK-31165
>                 URL: https://issues.apache.org/jira/browse/SPARK-31165
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core
>    Affects Versions: 2.4.5, 3.0.0
>            Reporter: Nikolay Dimolarov
>            Priority: Minor
>


