Hi All, I'm new to Spark. Before I describe the problem, I'd like to let you know the roles of the machines that make up the cluster and the purpose of my work. By reading and following the instructions and tutorials, I successfully built a cluster of 7 CentOS 6.5 machines, on which I installed Hadoop 2.7.1, Spark 1.5.1, Scala 2.10.4 and ZooKeeper 3.4.5. The details are listed below:
Host Name | IP Address  | Hadoop 2.7.1              | Spark 1.5.1      | ZooKeeper
hadoop00  | 10.20.17.70 | NameNode (Active)         | Master (Active)  | none
hadoop01  | 10.20.17.71 | NameNode (Standby)        | Master (Standby) | none
hadoop02  | 10.20.17.72 | ResourceManager (Active)  | none             | none
hadoop03  | 10.20.17.73 | ResourceManager (Standby) | none             | none
hadoop04  | 10.20.17.74 | DataNode                  | Worker           | JournalNode
hadoop05  | 10.20.17.75 | DataNode                  | Worker           | JournalNode
hadoop06  | 10.20.17.76 | DataNode                  | Worker           | JournalNode

Now my *purpose* is to develop Hadoop/Spark applications on my own computer (IP: 10.20.6.23) and submit them to the remote cluster. Since all the other guys in our group are used to Eclipse on Windows, I'm trying to work with that setup. I have successfully submitted the WordCount MapReduce job to YARN, and it ran smoothly from Eclipse on Windows. But when I tried to run the Spark WordCount, it gave me the following error in the Eclipse console:

15/12/23 11:15:30 INFO AppClient$ClientEndpoint: Connecting to master spark://10.20.17.70:7077...
15/12/23 11:15:50 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@29ed85e7 rejected from java.util.concurrent.ThreadPoolExecutor@28f21632[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
    at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:96)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:95)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint.tryRegisterAllMasters(AppClient.scala:95)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint.org$apache$spark$deploy$client$AppClient$ClientEndpoint$$registerWithMaster(AppClient.scala:121)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:132)
    at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1119)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:124)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
15/12/23 11:15:50 INFO DiskBlockManager: Shutdown hook called
15/12/23 11:15:50 INFO ShutdownHookManager: Shutdown hook called

Then I checked the Spark Master log and found the following critical statements:

15/12/23 11:15:33 ERROR ErrorMonitor: dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://sparkMaster@10.20.17.70:7077/]] arriving at [akka.tcp://sparkMaster@10.20.17.70:7077] inbound addresses are [akka.tcp://sparkMaster@hadoop00:7077]
akka.event.Logging$Error$NoCause$
15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing it.
15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing it.
15/12/23 11:15:53 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@10.20.6.23:56374] has failed, address is now gated for [5000] ms. Reason: [Disassociated]

Here's my Scala code:

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("Scala WordCount")
      .setMaster("spark://10.20.17.70:7077")
      .setJars(List("C:\\Temp\\test.jar"))
    val sc = new SparkContext(conf)
    val textFile = sc.textFile("hdfs://10.20.17.70:9000/wc/indata/wht.txt")
    textFile.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect().foreach(println)
  }
}

To solve the problem, I tried the following:
(1) Ran spark-shell to check the Scala version, which proved to be 2.10.4 and compatible with the eclipse-scala plugin.
(2) Ran spark-submit on the SparkPi example, specifying the --master param as "10.20.17.70:7077"; it successfully worked out the result, and I was also able to see the application history on the Master's Web UI.
(3) Turned off the firewall on my Windows machine.

Unfortunately, the error message remains. Could anybody give me some suggestions? Thanks very much!

Yours Sincerely,
Yefeng

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-of-submitting-Spark-task-to-cluster-from-eclipse-IDE-on-Windows-tp25778.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
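P.S. One thing I notice in the Master log above: the Master's inbound address is akka.tcp://sparkMaster@hadoop00:7077 (the hostname), while my driver connects to spark://10.20.17.70:7077 (the IP). I haven't verified this yet, but I plan to try setMaster("spark://hadoop00:7077") after adding the cluster hostnames to my Windows hosts file (C:\Windows\System32\drivers\etc\hosts) — a sketch of the entries, taken from the table above:

```
10.20.17.70  hadoop00
10.20.17.71  hadoop01
10.20.17.72  hadoop02
10.20.17.73  hadoop03
10.20.17.74  hadoop04
10.20.17.75  hadoop05
10.20.17.76  hadoop06
```

If anyone can confirm whether the hostname/IP mismatch is the actual cause here, that would help a lot.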