Re: Standalone client failing with docker deployed cluster
(Trying to bubble up the issue again...) Any insights (based on the enclosed logs) into why the standalone client invocation might fail while issuing jobs through the spark console succeeds? Thanks, Bharath

On Thu, May 15, 2014 at 5:08 PM, Bharath Ravi Kumar wrote:

> Hi,
>
> I'm running the spark server with a single worker on a laptop using the
> docker images. The spark shell examples run fine with this setup. However,
> when a standalone Java client tries to run wordcount on a local file (1 MB
> in size), the execution fails with the following error on the stdout of
> the worker:
>
> 14/05/15 10:31:21 INFO Slf4jLogger: Slf4jLogger started
> 14/05/15 10:31:21 INFO Remoting: Starting remoting
> 14/05/15 10:31:22 INFO Remoting: Remoting started; listening on addresses: [akka.tcp://sparkExecutor@worker1:55924]
> 14/05/15 10:31:22 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkExecutor@worker1:55924]
> 14/05/15 10:31:22 INFO CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://spark@R9FX97h.local:56720/user/CoarseGrainedScheduler
> 14/05/15 10:31:22 INFO WorkerWatcher: Connecting to worker akka.tcp://sparkWorker@worker1:50040/user/Worker
> 14/05/15 10:31:22 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://spark@R9FX97h.local:56720]. Address is now gated for 6 ms, all messages to this address will be delivered to dead letters.
> 14/05/15 10:31:22 ERROR CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@worker1:55924] -> [akka.tcp://spark@R9FX97h.local:56720] disassociated! Shutting down.
> I noticed the following messages on the worker console when I attached
> through docker:
>
> 14/05/15 11:24:33 INFO Worker: Asked to launch executor app-20140515112408-0005/7 for billingLogProcessor
> 14/05/15 11:24:33 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@worker1:50040] -> [akka.tcp://sparkExecutor@worker1:42437]: Error [Association failed with [akka.tcp://sparkExecutor@worker1:42437]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@worker1:42437]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: worker1/172.17.0.4:42437
> ]
> 14/05/15 11:24:33 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@worker1:50040] -> [akka.tcp://sparkExecutor@worker1:42437]: Error [Association failed with [akka.tcp://sparkExecutor@worker1:42437]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@worker1:42437]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: worker1/172.17.0.4:42437
> ]
> 14/05/15 11:24:33 INFO ExecutorRunner: Launch command: "/usr/lib/jvm/java-7-openjdk-amd64/bin/java" "-cp" ":/opt/spark-0.9.0/conf:/opt/spark-0.9.0/assembly/target/scala-2.10/spark-assembly_2.10-0.9.0-incubating-hadoop1.0.4.jar" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://spark@R9FX97h.local:46986/user/CoarseGrainedScheduler" "7" "worker1" "1" "akka.tcp://sparkWorker@worker1:50040/user/Worker" "app-20140515112408-0005"
> 14/05/15 11:24:35 INFO Worker: Executor app-20140515112408-0005/7 finished with state FAILED message Command exited with code 1 exitStatus 1
> 14/05/15 11:24:35 INFO LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40172.17.0.4%3A33648-135#310170905] was not delivered. [34] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 14/05/15 11:24:35 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@worker1:50040] -> [akka.tcp://sparkExecutor@worker1:56594]: Error [Association failed with [akka.tcp://sparkExecutor@worker1:56594]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@worker1:56594]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: worker1/172.17.0.4:56594
> ]
> 14/05/15 11:24:35 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@worker1:50040] -> [akka.tcp://sparkExecutor@worker1:56594]: Error [Association failed with [akka.tcp://sparkExecutor@worker1:56594]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@worker1:56594]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: worker1/172.17.0.4:56594
> ]
>
> The significant code snippets from the standalone java client are as
> follows:
>
> JavaSparkContext ctx = new JavaSparkContext(masterAddr
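The two failure modes in the logs above are both plain TCP reachability problems: the executor cannot connect back to the driver at R9FX97h.local:56720, and the worker cannot connect to the executor it just launched. A minimal, hypothetical probe (the class name, default host/port taken from the log, and the 2-second timeout are my own choices, not part of Spark) that can be run from inside the worker container to check whether a given address resolves and accepts connections:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper: reports whether a host:port (e.g. the
// driver address an executor is told to connect back to) accepts TCP
// connections from wherever this is run.
public class DriverReachability {

    // Returns true iff a TCP connection to host:port succeeds within timeoutMs.
    // An unresolvable hostname, a refused connection, or a timeout all yield false.
    public static boolean isReachable(String host, int port, int timeoutMs) {
        Socket s = new Socket();
        try {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        } finally {
            try { s.close(); } catch (IOException ignored) { }
        }
    }

    public static void main(String[] args) {
        // Defaults taken from the driver address in the log above.
        String host = args.length > 0 ? args[0] : "R9FX97h.local";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 56720;
        System.out.println(host + ":" + port + " reachable? "
                + isReachable(host, port, 2000));
    }
}
```

If this prints `false` from inside the container for the driver's advertised host, the disassociation in the executor log follows directly: the laptop's mDNS-style hostname is simply not resolvable or routable from the Docker bridge network.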
Standalone client failing with docker deployed cluster
Hi,

I'm running the spark server with a single worker on a laptop using the docker images. The spark shell examples run fine with this setup. However, when a standalone Java client tries to run wordcount on a local file (1 MB in size), the execution fails with the following error on the stdout of the worker:

14/05/15 10:31:21 INFO Slf4jLogger: Slf4jLogger started
14/05/15 10:31:21 INFO Remoting: Starting remoting
14/05/15 10:31:22 INFO Remoting: Remoting started; listening on addresses: [akka.tcp://sparkExecutor@worker1:55924]
14/05/15 10:31:22 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkExecutor@worker1:55924]
14/05/15 10:31:22 INFO CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://spark@R9FX97h.local:56720/user/CoarseGrainedScheduler
14/05/15 10:31:22 INFO WorkerWatcher: Connecting to worker akka.tcp://sparkWorker@worker1:50040/user/Worker
14/05/15 10:31:22 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://spark@R9FX97h.local:56720]. Address is now gated for 6 ms, all messages to this address will be delivered to dead letters.
14/05/15 10:31:22 ERROR CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@worker1:55924] -> [akka.tcp://spark@R9FX97h.local:56720] disassociated! Shutting down.
I noticed the following messages on the worker console when I attached through docker:

14/05/15 11:24:33 INFO Worker: Asked to launch executor app-20140515112408-0005/7 for billingLogProcessor
14/05/15 11:24:33 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@worker1:50040] -> [akka.tcp://sparkExecutor@worker1:42437]: Error [Association failed with [akka.tcp://sparkExecutor@worker1:42437]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@worker1:42437]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: worker1/172.17.0.4:42437
]
14/05/15 11:24:33 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@worker1:50040] -> [akka.tcp://sparkExecutor@worker1:42437]: Error [Association failed with [akka.tcp://sparkExecutor@worker1:42437]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@worker1:42437]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: worker1/172.17.0.4:42437
]
14/05/15 11:24:33 INFO ExecutorRunner: Launch command: "/usr/lib/jvm/java-7-openjdk-amd64/bin/java" "-cp" ":/opt/spark-0.9.0/conf:/opt/spark-0.9.0/assembly/target/scala-2.10/spark-assembly_2.10-0.9.0-incubating-hadoop1.0.4.jar" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://spark@R9FX97h.local:46986/user/CoarseGrainedScheduler" "7" "worker1" "1" "akka.tcp://sparkWorker@worker1:50040/user/Worker" "app-20140515112408-0005"
14/05/15 11:24:35 INFO Worker: Executor app-20140515112408-0005/7 finished with state FAILED message Command exited with code 1 exitStatus 1
14/05/15 11:24:35 INFO LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40172.17.0.4%3A33648-135#310170905] was not delivered. [34] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/05/15 11:24:35 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@worker1:50040] -> [akka.tcp://sparkExecutor@worker1:56594]: Error [Association failed with [akka.tcp://sparkExecutor@worker1:56594]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@worker1:56594]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: worker1/172.17.0.4:56594
]
14/05/15 11:24:35 ERROR EndpointWriter: AssociationError [akka.tcp://sparkWorker@worker1:50040] -> [akka.tcp://sparkExecutor@worker1:56594]: Error [Association failed with [akka.tcp://sparkExecutor@worker1:56594]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@worker1:56594]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: worker1/172.17.0.4:56594
]

The significant code snippets from the standalone Java client are as follows:

JavaSparkContext ctx = new JavaSparkContext(masterAddr, "log_processor", sparkHome, jarFileLoc);
JavaRDD<String> rawLog = ctx.textFile("/tmp/some.log");
List> topRecords = rawLog.map(fieldSplitter).map(fieldExtractor).top(5, tupleComparator);

However, running the sample code provided on github (amplab docker page) over the spark shell went through fine with the following stdout message:

14/05/15 10:39:41 INFO Slf4jLogger: Slf4jLogger started
14/05/15 10:39:42 INFO Remoting: Starting remoting
14/05/15 1
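One difference between the two cases is where the driver runs: the shell runs its driver where the containers can reach it, while the standalone client's driver advertises the laptop hostname R9FX97h.local, which the containers evidently cannot resolve. In Spark 0.9 the advertised callback address can be pinned through the `spark.driver.host` / `spark.driver.port` system properties before the context is created. A sketch of a launch command along those lines; the jar names, main class, and the bridge IP 172.17.42.1 are assumptions for illustration, not values from this setup:

```
# Hypothetical: run the client JVM with the driver's advertised address
# pinned to an IP routable from the Docker bridge network.
java -cp log-processor.jar:spark-assembly_2.10-0.9.0-incubating-hadoop1.0.4.jar \
  -Dspark.driver.host=172.17.42.1 \
  -Dspark.driver.port=56720 \
  LogProcessor
```

Equivalently, the properties could be set with System.setProperty(...) in the client before constructing the JavaSparkContext; either way the executors then connect back to an address that exists on their network rather than to the laptop's local hostname.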