Hi,
I tried the clickstream example and it runs into an exception. Has anybody
seen this before?
Since the program mentions "local[2]", I ran it on my local machine.
Thanks in advance,
dachuan.
Log Snippet 1:
13/10/29 08:50:25 INFO scheduler.DAGScheduler: Submitting 46 missing tasks from Stage 12 (MapPartitionsRDD[63] at combineByKey at ShuffledDStream.scala:41)
13/10/29 08:50:25 INFO local.LocalTaskSetManager: Size of task 75 is 4230 bytes
13/10/29 08:50:25 INFO local.LocalScheduler: Running 75
13/10/29 08:50:25 INFO spark.CacheManager: Cache key is rdd_9_0
13/10/29 08:50:25 INFO spark.CacheManager: Computing partition org.apache.spark.rdd.BlockRDDPartition@0
13/10/29 08:50:25 WARN storage.BlockManager: Putting block rdd_9_0 failed
13/10/29 08:50:25 INFO local.LocalTaskSetManager: Loss was due to java.io.NotSerializableException
java.io.NotSerializableException: org.apache.spark.streaming.examples.clickstream.PageView
Log Snippet 2:
org.apache.spark.SparkException: Job failed: Task 12.0:0 failed more than 4 times; aborting job java.io.NotSerializableException: org.apache.spark.streaming.examples.clickstream.PageView
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
    at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:379)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
    at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)
The two commands I used to run this app:
./run-example org.apache.spark.streaming.examples.clickstream.PageViewGenerator 44444 10
./run-example org.apache.spark.streaming.examples.clickstream.PageViewStream errorRatePerZipCode localhost 44444