Repository: spark
Updated Branches:
  refs/heads/master 3f4bda728 -> 15747cfd3


[SPARK-24547][K8S] Allow for building spark on k8s docker images without cache and don't forget to push spark-py container.

## What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-24547

TL;DR from the JIRA issue:

- The first time I generated images for 2.4.0, Docker was using its cache, so when running jobs the old jars were still in the Docker image. This produces errors like the following in the executors:

`java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = 6155820641931972169, local class serialVersionUID = -3720498261147521051`

- The second problem was that the spark container was pushed, but the spark-py container was not; it had simply been forgotten in the initial PR. Both fixes are illustrated in the example after this list.

- A third problem I ran into, because I had an older Docker version, is already covered by https://github.com/apache/spark/pull/21551, so I have not included a fix for it in this ticket.
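
For illustration, a build-and-push cycle that exercises both fixes might look like this (the repository address and tag are placeholders, not values from this commit):

```bash
# Rebuild both the spark and spark-py images from scratch;
# the new -n flag passes --no-cache to docker build, so stale
# layers with old jars cannot leak into the 2.4.0 images.
./bin/docker-image-tool.sh -r docker.io/myrepo -t v2.4.0 -n build

# With this patch, push uploads the spark-py image as well,
# not just the spark image.
./bin/docker-image-tool.sh -r docker.io/myrepo -t v2.4.0 push
```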

## How was this patch tested?

I've tested it on my own Spark on k8s deployment.
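
For reference, a minimal end-to-end check along these lines would verify the fix for the original `InvalidClassException` scenario; the API server address, repository, and example jar path are assumptions about a particular deployment, not part of this patch:

```bash
# Submit the SparkPi example against the freshly rebuilt image.
# If stale jars were still baked in, the executors would fail with
# the serialVersionUID mismatch quoted above.
bin/spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=docker.io/myrepo/spark:v2.4.0 \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar
```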

Author: Ray Burgemeestre <ray.burgemees...@brightcomputing.com>

Closes #21555 from rayburgemeestre/SPARK-24547.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/15747cfd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/15747cfd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/15747cfd

Branch: refs/heads/master
Commit: 15747cfd3246385ffb23e19e28d2e4effa710bf6
Parents: 3f4bda7
Author: Ray Burgemeestre <ray.burgemees...@brightcomputing.com>
Authored: Wed Jun 20 17:09:37 2018 -0700
Committer: Anirudh Ramanathan <anir...@rockset.com>
Committed: Wed Jun 20 17:09:37 2018 -0700

----------------------------------------------------------------------
 bin/docker-image-tool.sh | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/15747cfd/bin/docker-image-tool.sh
----------------------------------------------------------------------
diff --git a/bin/docker-image-tool.sh b/bin/docker-image-tool.sh
index a871ab5..a3f1bcf 100755
--- a/bin/docker-image-tool.sh
+++ b/bin/docker-image-tool.sh
@@ -70,17 +70,18 @@ function build {
   local BASEDOCKERFILE=${BASEDOCKERFILE:-"$IMG_PATH/spark/Dockerfile"}
  local PYDOCKERFILE=${PYDOCKERFILE:-"$IMG_PATH/spark/bindings/python/Dockerfile"}
 
-  docker build "${BUILD_ARGS[@]}" \
+  docker build $NOCACHEARG "${BUILD_ARGS[@]}" \
     -t $(image_ref spark) \
     -f "$BASEDOCKERFILE" .
 
-    docker build "${BINDING_BUILD_ARGS[@]}" \
+  docker build $NOCACHEARG "${BINDING_BUILD_ARGS[@]}" \
     -t $(image_ref spark-py) \
     -f "$PYDOCKERFILE" .
 }
 
 function push {
   docker push "$(image_ref spark)"
+  docker push "$(image_ref spark-py)"
 }
 
 function usage {
@@ -99,6 +100,7 @@ Options:
   -r repo     Repository address.
  -t tag      Tag to apply to the built image, or to identify the image to be pushed.
   -m          Use minikube's Docker daemon.
+  -n          Build docker image with --no-cache
 
 Using minikube when building images will do so directly into minikube's Docker daemon.
 There is no need to push the images into minikube in that case, they'll be automatically
@@ -127,7 +129,8 @@ REPO=
 TAG=
 BASEDOCKERFILE=
 PYDOCKERFILE=
-while getopts f:mr:t: option
+NOCACHEARG=
+while getopts f:mr:t:n option
 do
  case "${option}"
  in
@@ -135,6 +138,7 @@ do
  p) PYDOCKERFILE=${OPTARG};;
  r) REPO=${OPTARG};;
  t) TAG=${OPTARG};;
+ n) NOCACHEARG="--no-cache";;
  m)
    if ! which minikube 1>/dev/null; then
      error "Cannot find minikube."

