I start Spark in Docker with these commands on the host machine:

curl -LO https://raw.githubusercontent.com/bitnami/bitnami-docker-spark/master/docker-compose.yml
# edit docker-compose.yml to expose port 7077 to the host machine
docker-compose up
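
For reference, the edit to docker-compose.yml is roughly the following, added under the Spark master service (indentation from memory, so treat it as a sketch rather than the exact file contents):

spark:
  ports:
    - '8080:8080'   # master web UI (already in the stock bitnami file, I believe)
    - '7077:7077'   # master RPC port, exposed so the job server on the host can reach it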


Then I run the Beam Spark job server on the host machine:

docker run --net=host apache/beam_spark_job_server:latest --spark-master-url=spark://localhost:7077
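
If it matters, this is the same command with the ports spelled out. My understanding is that --job-port defaults to 8099 (which is what the Python command further down points at), and I'm assuming the usual defaults for the other two:

docker run --net=host apache/beam_spark_job_server:latest \
    --spark-master-url=spark://localhost:7077 \
    --job-port=8099 \
    --artifact-port=8098 \
    --expansion-port=8097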

Lastly, I run this command on the host machine:

python -m apache_beam.examples.wordcount --input ./a_file_input \
                                         --output ./counts \
                                         --runner=PortableRunner \
                                         --job_endpoint=localhost:8099 \
                                         --environment_type=LOOPBACK
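
In case the module invocation hides anything, this is roughly the pipeline I understand that command to run, written out with explicit PipelineOptions. It is a simplified sketch (paths are placeholders and the real wordcount example tokenizes differently), just to show how the options are wired to the job server:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Submit through the portable job server started above.
options = PipelineOptions([
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",
    "--environment_type=LOOPBACK",
])

with beam.Pipeline(options=options) as p:
    (p
     | "Read" >> beam.io.ReadFromText("./a_file_input")
     | "Split" >> beam.FlatMap(lambda line: line.split())
     | "Count" >> beam.combiners.Count.PerElement()
     | "Format" >> beam.Map(lambda kv: "%s: %d" % kv)
     | "Write" >> beam.io.WriteToText("./counts"))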


I get the errors below from the beam_spark_job_server Docker container:

21/07/28 13:41:56 INFO org.apache.beam.runners.spark.SparkJobInvoker: Invoking job BeamApp-root-0728134156-a85a64c5_019bd04e-aabc-4deb-8b12-8c61f173cd5b
21/07/28 13:41:56 INFO org.apache.beam.runners.jobsubmission.JobInvocation: Starting job invocation BeamApp-root-0728134156-a85a64c5_019bd04e-aabc-4deb-8b12-8c61f173cd5b
21/07/28 13:41:56 INFO org.apache.beam.runners.core.construction.resources.PipelineResources: PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage 6 files. Enable logging at DEBUG level to see which files will be staged.
21/07/28 13:41:57 INFO org.apache.beam.runners.spark.translation.SparkContextFactory: Creating a brand new Spark Context.
21/07/28 13:41:57 WARN org.apache.spark.util.Utils: Your hostname, cometstrike resolves to a loopback address: 127.0.1.1; using 192.168.<real_ip> instead (on interface wlo1)
21/07/28 13:41:57 WARN org.apache.spark.util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
21/07/28 13:41:57 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/07/28 13:42:57 ERROR org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
21/07/28 13:42:57 WARN org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend: Application ID is not initialized yet.
21/07/28 13:42:57 WARN org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint: Drop UnregisterApplication(null) because has not yet connected to master
21/07/28 13:42:57 ERROR org.apache.spark.SparkContext: Error initializing SparkContext.
java.lang.NullPointerException
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:560)
    at 
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at 
org.apache.beam.runners.spark.translation.SparkContextFactory.createSparkContext(SparkContextFactory.java:101)
    at 
org.apache.beam.runners.spark.translation.SparkContextFactory.getSparkContext(SparkContextFactory.java:67)
    at 
org.apache.beam.runners.spark.SparkPipelineRunner.run(SparkPipelineRunner.java:118)
    at 
org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:86)
    at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
    at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
21/07/28 13:42:57 ERROR org.apache.spark.scheduler.AsyncEventQueue: Listener AppStatusListener threw an exception
java.lang.NullPointerException
    at 
org.apache.spark.status.AppStatusListener.onApplicationEnd(AppStatusListener.scala:167)
    at 
org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:57)
    at 
org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
    at 
org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:91)
    at 
org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:92)
    at 
org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply$mcJ$sp(AsyncEventQueue.scala:92)
    at 
org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:87)
    at 
org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply(AsyncEventQueue.scala:87)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
    at 
org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:87)
    at 
org.apache.spark.scheduler.AsyncEventQueue$$anon$1$$anonfun$run$1.apply$mcV$sp(AsyncEventQueue.scala:83)
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1302)
    at 
org.apache.spark.scheduler.AsyncEventQueue$$anon$1.run(AsyncEventQueue.scala:82)
21/07/28 13:42:57 ERROR org.apache.beam.runners.jobsubmission.JobInvocation: Error during job invocation BeamApp-root-0728134156-a85a64c5_019bd04e-aabc-4deb-8b12-8c61f173cd5b.
java.lang.NullPointerException
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:560)
    at 
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at 
org.apache.beam.runners.spark.translation.SparkContextFactory.createSparkContext(SparkContextFactory.java:101)
    at 
org.apache.beam.runners.spark.translation.SparkContextFactory.getSparkContext(SparkContextFactory.java:67)
    at 
org.apache.beam.runners.spark.SparkPipelineRunner.run(SparkPipelineRunner.java:118)
    at 
org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:86)
    at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
    at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)



And I get this error in the bitnami Spark container:

spark_1           | 21/07/28 13:46:20 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
spark_1           | java.io.InvalidClassException: org.apache.spark.deploy.ApplicationDescription; local class incompatible: stream classdesc serialVersionUID = 6543101073799644159, local class serialVersionUID = 1574364215946805297
spark_1           |     at 
java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
spark_1           |     at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2003)
spark_1           |     at 
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1850)
spark_1           |     at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2160)
spark_1           |     at 
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
spark_1           |     at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
spark_1           |     at 
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
spark_1           |     at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
spark_1           |     at 
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
spark_1           |     at 
java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
spark_1           |     at 
java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
spark_1           |     at 
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:76)
spark_1           |     at 
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:109)
spark_1           |     at 
org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$deserialize$2(NettyRpcEnv.scala:299)
spark_1           |     at 
scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
spark_1           |     at 
org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:352)
spark_1           |     at 
org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$deserialize$1(NettyRpcEnv.scala:298)
spark_1           |     at 
scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
spark_1           |     at 
org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:298)
spark_1           |     at 
org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:647)
spark_1           |     at 
org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:698)
spark_1           |     at 
org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:690)
spark_1           |     at 
org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:255)
spark_1           |     at 
org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:111)
spark_1           |     at 
org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:140)
spark_1           |     at 
org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:53)
spark_1           |     at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
spark_1           |     at 
io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
spark_1           |     at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
spark_1           |     at 
org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:102)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
spark_1           |     at 
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
spark_1           |     at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
spark_1           |     at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
spark_1           |     at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
spark_1           |     at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
spark_1           |     at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
spark_1           |     at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
spark_1           |     at 
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
spark_1           |     at 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
spark_1           |     at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
spark_1           |     at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
spark_1           |     at java.lang.Thread.run(Thread.java:748)


Can anyone advise on how to get this working? Is there a simple example of running Python code + beam_spark_job_server + Spark in Docker?

