Thanks for your reply. The SparkContext is configured as below:

sparkConf.setAppName("WikipediaPageRank")
sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
sparkConf.set("spark.kryo.registrator", classOf[PRKryoRegistrator].getName)

val inputFile = args(0)
val threshold = args(1).toDouble
val numPartitions = args(2).toInt
val usePartitioner = args(3).toBoolean
sparkConf.setAppName("WikipediaPageRank")
sparkConf.set("spark.executor.memory", "60g")
sparkConf.set("spark.cores.max", "48")
sparkConf.set("spark.kryoserializer.buffer.mb", "24")

val sc = new SparkContext(sparkConf)
sc.addJar("~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar")

And I use spark-submit to run the application:

./bin/spark-submit --master spark://sing12:7077 --total-executor-cores 40 --executor-memory 40g --class org.apache.spark.examples.bagel.WikipediaPageRank ~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar hdfs://192.168.1.12:9000/freebase-26G 1 200 True

Regards,
Wang Hao(王灏)

CloudTeam | School of Software Engineering
Shanghai Jiao Tong University
Address: 800 Dongchuan Road, Minhang District, Shanghai, 200240
Email: wh.s...@gmail.com


On Wed, Jul 16, 2014 at 1:41 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:

> Are you using classes from external libraries that have not been added to
> the SparkContext using sparkContext.addJar()?
>
> TD
>
>
> On Tue, Jul 15, 2014 at 8:36 PM, Hao Wang <wh.s...@gmail.com> wrote:
>
>> I am running the WikipediaPageRank example in Spark and share the same
>> problem with you:
>>
>> 14/07/16 11:31:06 DEBUG DAGScheduler: submitStage(Stage 6)
>> 14/07/16 11:31:06 ERROR TaskSetManager: Task 6.0:450 failed 4 times;
>> aborting job
>> 14/07/16 11:31:06 INFO DAGScheduler: Failed to run foreach at
>> Bagel.scala:251
>> Exception in thread "main" 14/07/16 11:31:06 INFO TaskSchedulerImpl:
>> Cancelling stage 6
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>> 6.0:450 failed 4 times, most recent failure: Exception failure in TID 1330
>> on host sing11: com.esotericsoftware.kryo.KryoException: Unable to find
>> class: arl Fridtjof Rode
>> com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>> com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>> com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
>> com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
>> com.twitter.chill.TraversableSerializer.read(Traversable.scala:44)
>> com.twitter.chill.TraversableSerializer.read(Traversable.scala:21)
>> com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>> org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:115)
>> org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:125)
>> org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
>> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>> scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>> org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
>> org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96)
>> org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95)
>> org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
>>
>> Can anyone help?
>>
>> Regards,
>> Wang Hao(王灏)
>>
>> CloudTeam | School of Software Engineering
>> Shanghai Jiao Tong University
>> Address: 800 Dongchuan Road, Minhang District, Shanghai, 200240
>> Email: wh.s...@gmail.com
>>
>>
>> On Tue, Jun 3, 2014 at 8:02 PM, Denes <te...@outlook.com> wrote:
>>
>>> I tried to use Kryo as a serialiser in Spark Streaming. I did everything
>>> according to the guide posted on the Spark website, i.e.
added the following lines:
>>>
>>> conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
>>> conf.set("spark.kryo.registrator", "MyKryoRegistrator");
>>>
>>> I also added the necessary classes to MyKryoRegistrator.
>>>
>>> However, I get the following strange error. Can someone help me out
>>> with where to look for a solution?
>>>
>>> 14/06/03 09:00:49 ERROR scheduler.JobScheduler: Error running job
>>> streaming job 1401778800000 ms.0
>>> org.apache.spark.SparkException: Job aborted due to stage failure:
>>> Exception while deserializing and fetching task:
>>> com.esotericsoftware.kryo.KryoException: Unable to find class: J
>>> Serialization trace:
>>> id (org.apache.spark.storage.GetBlock)
>>>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
>>>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>         at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>>>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>>>         at scala.Option.foreach(Option.scala:236)
>>>         at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
>>>         at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
>>>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>>         at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>>         at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
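
For anyone who lands on this thread: below is a minimal sketch of the kind of registrator setup described above, against the Spark 1.x KryoRegistrator API. The names MyRegisteredClass, MyKryoRegistrator and KryoSetupExample are made up for illustration; register whatever classes your own job actually shuffles or caches.

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.serializer.KryoRegistrator

// Hypothetical record type standing in for whatever your job serializes.
case class MyRegisteredClass(id: String, links: Array[String])

class MyKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    // Register every class that can appear inside a shuffled, cached,
    // or broadcast record, including element types of collections.
    kryo.register(classOf[MyRegisteredClass])
    kryo.register(classOf[Array[String]])
  }
}

object KryoSetupExample {
  def main(args: Array[String]) {
    val conf = new SparkConf()
      .setAppName("KryoSetupExample")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // The registrator is looked up by (fully qualified) name on each
      // executor, so it must ship in the application jar.
      .set("spark.kryo.registrator", classOf[MyKryoRegistrator].getName)
    val sc = new SparkContext(conf)
    // ... job body ...
    sc.stop()
  }
}

Note that the registrator and every class it registers have to be on the executor classpath as well as the driver's, which is what TD's sparkContext.addJar() suggestion above is getting at.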