Looking at the source code of DStream.scala:
> /**
>  * Return a new DStream in which each RDD has a single element generated
>  * by counting each RDD of this DStream.
>  */
> def count(): DStream[Long] = {
>   this.map(_ => (null, 1L))
>       .transform(_.union(context.sparkContext.makeRDD(Seq((null, 0L)), 1)))
>       .reduceByKey(_ + _)
>       .map(_._2)
> }
transform is the line throwing the NullPointerException. Can anyone give
some hints as to what could cause "_" to be null? (It is indeed null.) This
only happens when there is no data to process; when there is data, no
NullPointerException is thrown and all the processing/counting proceeds
correctly.
I am using a custom InputDStream. Could that be the source of the problem?
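To illustrate the suspicion (this is a sketch of the general pattern, not my
actual class): if a custom InputDStream's compute() returns null, or wraps a
null RDD in Some, for an empty batch, that null would be exactly the "_" that
count()'s transform tries to call union on. Returning None or an empty RDD
should avoid that:

    import scala.reflect.ClassTag

    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{StreamingContext, Time}
    import org.apache.spark.streaming.dstream.InputDStream

    // Hypothetical InputDStream, for illustration only; "fetchBatch" stands
    // in for whatever pulls the next batch of records from the source.
    class SketchInputDStream[T: ClassTag](ssc: StreamingContext, fetchBatch: () => Seq[T])
      extends InputDStream[T](ssc) {

      override def start(): Unit = {}
      override def stop(): Unit = {}

      override def compute(validTime: Time): Option[RDD[T]] = {
        val records = fetchBatch()
        if (records.isEmpty) {
          // Returning null here (or Some(null)) would surface later as the
          // NPE inside count()'s transform; None or an empty RDD keeps
          // empty batches safe.
          Some(ssc.sparkContext.emptyRDD[T])
        } else {
          Some(ssc.sparkContext.parallelize(records))
        }
      }
    }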