Hi,

Is there any specific scenario in which you need to know the number of RDDs in the DStream? As far as I know, a DStream generates one RDD per batchDuration; some old RDDs are remembered for windowing-like operations and are removed when no longer needed. The HashMap generatedRDDs in DStream.scala holds the RDDs you want, though you cannot access it from application code.
Besides, the count() API returns the number of records in each of this DStream's RDDs, not the number of RDDs; as I understand it, the number of RDDs per batch should always be 1.

Thanks
Jerry

-----Original Message-----
From: julyfire [mailto:[email protected]]
Sent: Tuesday, September 09, 2014 2:42 PM
To: [email protected]
Subject: Spark streaming: size of DStream

I want to implement the following logic:

val stream = getFlumeStream() // a DStream
if (size_of_stream > 0)       // if the DStream contains some RDDs
  stream.someTransformation

stream.count() can figure out the number of RDDs in a DStream, but it returns a DStream[Long] and can't be compared with a number. Does anyone know how to get the number of RDDs in a DStream? Thanks!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-streaming-size-of-DStream-tp13769.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
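For what the original poster seems to want -- skipping work when a batch is empty -- the usual approach is foreachRDD, which hands you each batch's single RDD so you can call RDD actions (whose results are plain values, not DStreams). A minimal sketch, assuming `stream` is the DStream from getFlumeStream() in the question and the per-record work is just a placeholder:

```scala
import org.apache.spark.streaming.dstream.DStream

// Sketch: act on each batch only when it contains records.
def processNonEmptyBatches(stream: DStream[String]): Unit = {
  stream.foreachRDD { rdd =>
    // RDD.count() is an action returning a Long, so it CAN be compared
    // with a number -- unlike DStream.count(), which yields a DStream[Long].
    if (rdd.count() > 0) {
      // placeholder for the desired transformation/output on this batch
      rdd.foreach(record => println(record))
    }
  }
}
```

Note that rdd.count() scans the whole batch; rdd.take(1).nonEmpty is a cheaper emptiness check if the batches can be large.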
