It turns out that this happens when the checkpoint directory is set to a local directory path. I have opened a JIRA, SPARK-4862, for Spark Streaming to output a better error message.
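For anyone hitting the same exception, here is a minimal sketch of pointing the checkpoint directory at a shared, Hadoop-compatible filesystem instead of a local path. The HDFS URL, application name, socket source, and batch interval are all hypothetical placeholders, not taken from my job, and I am assuming the job runs on a cluster where a local path is not visible to every executor.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
// Needed on Spark 1.x for the pair-DStream implicits (updateStateByKey).
import org.apache.spark.streaming.StreamingContext._

object CheckpointDirSketch {
  def main(args: Array[String]): Unit = {
    // Master is expected to be supplied by spark-submit.
    val conf = new SparkConf().setAppName("CheckpointDirSketch") // hypothetical name
    val ssc = new StreamingContext(conf, Seconds(10))            // hypothetical batch interval

    // Assumption: a placeholder HDFS location reachable from the driver and
    // all executors, used here instead of a local directory path.
    ssc.checkpoint("hdfs://namenode:8020/user/spark/checkpoints")

    // Hypothetical input source, only here to give updateStateByKey some data.
    val counts = ssc.socketTextStream("localhost", 9999)
      .map(word => (word, 1))
      .updateStateByKey[Int] { (values: Seq[Int], state: Option[Int]) =>
        // Running count per key: new values plus the previous state, if any.
        Some(values.sum + state.getOrElse(0))
      }

    counts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

updateStateByKey requires a checkpoint directory to be set, which is why the checkpoint location matters for this kind of job.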
Thanks,
Aniket

On Tue Dec 16 2014 at 20:08:13 Aniket Bhatnagar <aniket.bhatna...@gmail.com> wrote:

> I am using spark 1.1.0 running a streaming job that uses updateStateByKey
> and then (after a bunch of maps/flatMaps) does a foreachRDD to save data in
> each RDD by making HTTP calls. The issue is that each time I attempt to
> save the RDD (using foreach on RDD), it gives me the following exception:
>
> org.apache.spark.SparkException: Checkpoint RDD CheckpointRDD[467] at
> apply at List.scala:318(0) has different number of partitions than original
> RDD MapPartitionsRDD[461] at mapPartitions at StateDStream.scala:71(56)
>
> I read through the code but couldn't understand why this exception is
> happening. Any help would be appreciated!
>
> Thanks,
> Aniket