[ https://issues.apache.org/jira/browse/SPARK-24547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anirudh Ramanathan resolved SPARK-24547.
----------------------------------------
    Resolution: Fixed
    Fix Version/s: 2.4.0

> Spark on K8s docker-image-tool.sh improvements
> ----------------------------------------------
>
>                 Key: SPARK-24547
>                 URL: https://issues.apache.org/jira/browse/SPARK-24547
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 2.4.0
>            Reporter: Ray Burgemeestre
>            Priority: Minor
>              Labels: docker, kubernetes, spark
>             Fix For: 2.4.0
>
> *Context*
> PySpark support for Spark on k8s was merged with
> [https://github.com/apache/spark/pull/21092/files] a few days ago.
> There is a helper script that can be used to create Docker images to run
> Java and now also Python jobs. It works like this:
> {{/path/to/docker-image-tool.sh -r node001:5000/brightcomputing -t v2.4.0 build}}
> {{/path/to/docker-image-tool.sh -r node001:5000/brightcomputing -t v2.4.0 push}}
> *Problem*
> I ran into three issues. The first time I generated images for 2.4.0, Docker
> was using its cache, so when running jobs, old jars were still in the Docker image.
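The two-step workflow above can be sketched as a dry run; `image_tool_cmd` is a hypothetical wrapper that only echoes the ticket's example commands, since actually running them needs a Spark distribution, a Docker daemon, and a reachable registry:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the build/push invocations from the ticket
# instead of executing them, so the sketch stays self-contained.
image_tool_cmd() {
  # $1 is the subcommand: build or push
  echo "/path/to/docker-image-tool.sh -r node001:5000/brightcomputing -t v2.4.0 $1"
}

image_tool_cmd build
image_tool_cmd push
```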
> Running jobs with such a stale image produces errors like this in the executors:
> {code:java}
> 2018-06-13 10:27:52 INFO NettyBlockTransferService:54 - Server created on 172.29.3.4:44877
> 2018-06-13 10:27:52 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
> 2018-06-13 10:27:52 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(1, 172.29.3.4, 44877, None)
> 2018-06-13 10:27:52 ERROR CoarseGrainedExecutorBackend:91 - Executor self-exiting due to : Unable to create executor due to Exception thrown in awaitResult:
> org.apache.spark.SparkException: Exception thrown in awaitResult:
> 	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
> 	at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
> 	at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
> 	at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76)
> 	at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
> 	at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:241)
> 	at org.apache.spark.executor.Executor.<init>(Executor.scala:116)
> 	at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:83)
> 	at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:117)
> 	at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)
> 	at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
> 	at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: java.io.InvalidClassException:
> org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = 6155820641931972169, local class serialVersionUID = -3720498261147521051
> 	at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:687)
> 	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1880)
> 	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1746)
> {code}
> To avoid this, Docker has to build without its cache, but only if you have built images for an older version in the past.
> The second problem was that the spark container was pushed, but the spark-py container wasn't. This was simply forgotten in the initial PR.
> (A third problem I also ran into, because I had an older Docker, was [https://github.com/apache/spark/pull/21551], so I have not included a fix for that in this ticket.)
> Other than that it works great!
> *Solution*
> I've added an extra flag so it's possible to call build with {{-n}} for {{--no-cache}}.
> And I've added the extra push for the spark-py container.
> *Example*
> {{./bin/docker-image-tool.sh -r docker.io/myrepo -t v2.3.0 -n build}}
> Snippet from the help output:
> {code:java}
> Options:
>   -f file   Dockerfile to build for JVM based Jobs. By default builds the Dockerfile shipped with Spark.
>   -p file   Dockerfile with Python baked in. By default builds the Dockerfile shipped with Spark.
>   -r repo   Repository address.
>   -t tag    Tag to apply to the built image, or to identify the image to be pushed.
>   -m        Use minikube's Docker daemon.
>   -n        Build docker image with --no-cache
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
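How such a {{-n}} flag can be wired through to `docker build --no-cache` can be sketched as below. This is a hedged illustration only, not the actual docker-image-tool.sh source; `build_cmd` and the repository/tag values are made up for the example, and the resulting command is echoed rather than executed:

```shell
#!/usr/bin/env bash
# Sketch (not the real docker-image-tool.sh): translate a -n option into
# docker's --no-cache flag and print the build command that would run.
build_cmd() {
  local no_cache="" opt OPTIND
  while getopts "n" opt "$@"; do
    case "$opt" in
      n) no_cache="--no-cache " ;;
    esac
  done
  echo "docker build ${no_cache}-t myrepo/spark:v2.4.0 ."
}

build_cmd -n   # docker build --no-cache -t myrepo/spark:v2.4.0 .
build_cmd      # docker build -t myrepo/spark:v2.4.0 .
```

Passing `--no-cache` forces every layer to be rebuilt, which is what prevents the stale-jar problem described above at the cost of a slower build.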