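I vaguely remember running into this same error. The log says "java.io.NotSerializableException: org.apache.spark.streaming.examples.clickstream.PageView", which means Spark tried to serialize a PageView object (apparently while putting block rdd_9_0 into the cache) and the class is not marked as serializable. Can you check the PageView class in the examples and make sure it has the @serializable annotation, or extends Serializable? I seem to remember having to add it.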
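To illustrate what that exception means, here is a minimal plain-Java sketch with no Spark involved. The two classes and their fields are hypothetical stand-ins, not the real PageView from the examples. The first lacks the Serializable marker and is rejected by Java serialization; the second has it and is accepted. In Scala the equivalent would be the @serializable annotation or "extends Serializable" on the class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializationCheck {

    // Hypothetical stand-in for a record class that is NOT marked serializable.
    static class PlainPageView {
        final String url;
        final int status;

        PlainPageView(String url, int status) {
            this.url = url;
            this.status = status;
        }
    }

    // Same fields, but the class implements the java.io.Serializable marker interface.
    static class SerializablePageView implements Serializable {
        private static final long serialVersionUID = 1L;
        final String url;
        final int status;

        SerializablePageView(String url, int status) {
            this.url = url;
            this.status = status;
        }
    }

    // Returns true if Java serialization accepts the object,
    // false if it rejects it with NotSerializableException.
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            // Any other I/O failure is unexpected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plain: " + canSerialize(new PlainPageView("/index", 200)));
        System.out.println("serializable: " + canSerialize(new SerializablePageView("/index", 200)));
    }
}
```

If the PageView class in your checkout looks like the first case, adding the marker and rebuilding the examples is what I would try first.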
good luck,
Thunder

On Tue, Oct 29, 2013 at 6:54 AM, dachuan wrote:
> Hi,
>
> I have tried the clickstream example, and it runs into an exception. Has
> anybody met this before?
>
> Since the program mentioned "local[2]", I ran it on my local machine.
>
> thanks in advance,
> dachuan.
>
> Log Snippet 1:
>
> 13/10/29 08:50:25 INFO scheduler.DAGScheduler: Submitting 46 missing tasks from Stage 12 (MapPartitionsRDD[63] at combineByKey at ShuffledDStream.scala:41)
> 13/10/29 08:50:25 INFO local.LocalTaskSetManager: Size of task 75 is 4230 bytes
> 13/10/29 08:50:25 INFO local.LocalScheduler: Running 75
> 13/10/29 08:50:25 INFO spark.CacheManager: Cache key is rdd_9_0
> 13/10/29 08:50:25 INFO spark.CacheManager: Computing partition org.apache.spark.rdd.BlockRDDPartition@0
> 13/10/29 08:50:25 WARN storage.BlockManager: Putting block rdd_9_0 failed
> 13/10/29 08:50:25 INFO local.LocalTaskSetManager: Loss was due to java.io.NotSerializableException
> java.io.NotSerializableException: org.apache.spark.streaming.examples.clickstream.PageView
>
> Log Snippet 2:
>
> org.apache.spark.SparkException: Job failed: Task 12.0:0 failed more than 4 times; aborting job java.io.NotSerializableException: org.apache.spark.streaming.examples.clickstream.PageView
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
>     at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:379)
>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
>     at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)
>
> Two commands that run this app:
>
> ./run-example org.apache.spark.streaming.examples.clickstream.PageViewGenerator 44444 10
> ./run-example org.apache.spark.streaming.examples.clickstream.PageViewStream errorRatePerZipCode localhost 44444
