I am using Apache Spark 0.8.0 to process a large data file, performing some
basic .map and .reduceByKey operations on the RDD.
Since I am running on a single machine with multiple processors, I pass
local[8] as the master URL when creating the SparkContext:
val sc = new SparkContext("local[8]", "Tower-Aggs", SPARK_HOME )
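For context, the job only maps each line into a (key, value) pair and then does a reduceByKey(_ + _). Semantically it is equivalent to this plain-Scala sketch on a local collection (the field names and parsing are made up for illustration, not my actual code):

```scala
// Hypothetical parser: "towerA 10" -> ("towerA", 10)
def parse(line: String): (String, Int) = {
  val Array(tower, count) = line.split(" ")
  (tower, count.toInt)
}

// Local-collection model of Spark's reduceByKey(_ + _):
// group pairs by key, then sum the values for each key.
def reduceByKeyLocal(pairs: Seq[(String, Int)]): Map[String, Int] =
  pairs.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }
```

So the logic itself is simple; the only thing that changes between the working and hanging runs is the master URL.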
But whenever I specify multiple processors, the job randomly gets stuck
(pauses/halts). There is no definite place where it hangs, and sometimes it
doesn't happen at all. It stays stuck for a long time, after which I abort the
job, so I can't tell whether it would eventually have continued.
But when I use just local in place of local[8], the job runs seamlessly and
never gets stuck.
val sc = new SparkContext("local", "Tower-Aggs", SPARK_HOME )
I am not able to understand where the problem is.
I am using Scala 2.9.3 and sbt to build and run the application.
-
http://stackoverflow.com/questions/20187048/apache-spark-localk-master-url-job-gets-stuck
Thx
Vijay Gaikwad
University of Washington MSIM
[email protected]
(206) 261-5828