Hi Paolo,
The custom classes and jars are distributed across the Spark cluster via an
HTTP server on the master when the absolute path of the application fat jar is
specified in the spark-submit script. The Advanced Dependency Management
section at https://spark.apache.org/docs/latest/submitting-applications.html
explains this.
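For illustration, a submission along these lines (the class name, master URL, and jar paths below are placeholders, not taken from your setup) makes the master serve the jars to the workers:

```shell
# Placeholders throughout: substitute your own main class, master URL,
# and jar locations. Passing the fat jar by absolute path lets the
# master's HTTP server distribute it to every worker; extra jars such
# as the Cassandra connector can be listed with --jars (comma-separated).
spark-submit \
  --class com.example.MyApp \
  --master spark://master-host:7077 \
  --jars /path/to/spark-cassandra-connector.jar \
  /absolute/path/to/my-app-assembly.jar
```

If you are working from the spark-shell instead of spark-submit, I believe the same --jars option can be passed to spark-shell so the dependency reaches the executors' classpath.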
Could that be the reason the worker accesses the master? However, I don’t know
what is causing the error.
Thanks,
Saket
On 27 Oct 2014, at 19:39, Paolo Platter wrote:
> Hi all,
>
> I’m submitting a simple task using the spark shell against a CassandraRDD
> (DataStax environment).
> I’m getting the following exception from one of the workers:
>
> INFO 2014-10-27 14:08:03 akka.event.slf4j.Slf4jLogger: Slf4jLogger started
> INFO 2014-10-27 14:08:03 Remoting: Starting remoting
> INFO 2014-10-27 14:08:03 Remoting: Remoting started; listening on addresses
> :[akka.tcp://sparkExecutor@10.105.111.130:50234]
> INFO 2014-10-27 14:08:03 Remoting: Remoting now listens on addresses:
> [akka.tcp://sparkExecutor@10.105.111.130:50234]
> INFO 2014-10-27 14:08:03
> org.apache.spark.executor.CoarseGrainedExecutorBackend: Connecting to driver:
> akka.tcp://sp...@srv02.pocbgsia.ats-online.it:39797/user/CoarseGrainedScheduler
> INFO 2014-10-27 14:08:03 org.apache.spark.deploy.worker.WorkerWatcher:
> Connecting to worker akka.tcp://sparkWorker@10.105.111.130:34467/user/Worker
> INFO 2014-10-27 14:08:04 org.apache.spark.deploy.worker.WorkerWatcher:
> Successfully connected to
> akka.tcp://sparkWorker@10.105.111.130:34467/user/Worker
> INFO 2014-10-27 14:08:04
> org.apache.spark.executor.CoarseGrainedExecutorBackend: Successfully
> registered with driver
> INFO 2014-10-27 14:08:04 org.apache.spark.executor.Executor: Using REPL class
> URI: http://159.8.18.11:51705
> INFO 2014-10-27 14:08:04 akka.event.slf4j.Slf4jLogger: Slf4jLogger started
> INFO 2014-10-27 14:08:04 Remoting: Starting remoting
> INFO 2014-10-27 14:08:04 Remoting: Remoting started; listening on addresses
> :[akka.tcp://spark@10.105.111.130:49243]
> INFO 2014-10-27 14:08:04 Remoting: Remoting now listens on addresses:
> [akka.tcp://spark@10.105.111.130:49243]
> INFO 2014-10-27 14:08:04 org.apache.spark.SparkEnv: Connecting to
> BlockManagerMaster:
> akka.tcp://sp...@srv02.pocbgsia.ats-online.it:39797/user/BlockManagerMaster
> INFO 2014-10-27 14:08:04 org.apache.spark.storage.DiskBlockManager: Created
> local directory at
> /usr/share/dse/spark/tmp/executor/spark-local-20141027140804-4d84
> INFO 2014-10-27 14:08:04 org.apache.spark.storage.MemoryStore: MemoryStore
> started with capacity 23.0 GB.
> INFO 2014-10-27 14:08:04 org.apache.spark.network.ConnectionManager: Bound
> socket to port 50542 with id = ConnectionManagerId(10.105.111.130,50542)
> INFO 2014-10-27 14:08:04 org.apache.spark.storage.BlockManagerMaster: Trying
> to register BlockManager
> INFO 2014-10-27 14:08:04 org.apache.spark.storage.BlockManagerMaster:
> Registered BlockManager
> INFO 2014-10-27 14:08:04 org.apache.spark.SparkEnv: Connecting to
> MapOutputTracker:
> akka.tcp://sp...@srv02.pocbgsia.ats-online.it:39797/user/MapOutputTracker
> INFO 2014-10-27 14:08:04 org.apache.spark.HttpFileServer: HTTP File server
> directory is
> /usr/share/dse/spark/tmp/executor/spark-a23656dc-efce-494b-875a-a1cf092c3230
> INFO 2014-10-27 14:08:04 org.apache.spark.HttpServer: Starting HTTP Server
> INFO 2014-10-27 14:08:27
> org.apache.spark.executor.CoarseGrainedExecutorBackend: Got assigned task 0
> INFO 2014-10-27 14:08:28 org.apache.spark.executor.Executor: Running task ID 0
> ERROR 2014-10-27 14:08:28 org.apache.spark.executor.Executor: Exception in
> task ID 0
> java.lang.ClassNotFoundException: com.datastax.bdp.spark.CassandraRDD
> at
> org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:49)
> at java.lang.ClassLoader.loadClass(Unknown Source)
> at java.lang.ClassLoader.loadClass(Unknown Source)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Unknown Source)
> at
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:37)
> at java.io.ObjectInputStream.readNonProxyDesc(Unknown Source)
> at java.io.ObjectInputStream.readClassDesc(Unknown Source)
> at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
> at java.io.ObjectInputStream.readObject0(Unknown Source)
> at java.io.ObjectInputStream.readObject(Unknown Source)
> at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
> at
> org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
> at
> org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
> at java.io.ObjectInputStream.readExternalData(Unknown Source)
> at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
> at java.io.ObjectInputStream.readObjec