Did you consider the updateStateByKey operation?
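A minimal sketch of that pattern: updateStateByKey carries a running total per key across batches, so the cumulative word count survives from one micro-batch to the next. The update function below is plain Python; the commented PySpark lines (stream name and checkpoint path are hypothetical) show where it would plug in.

```python
# Sketch of a cumulative word count via updateStateByKey.
# The update function is called per key with the new counts from the
# current batch and the running total from previous batches.

def update_count(new_values, running_count):
    # new_values: list of counts for this key in the current batch
    # running_count: previous total, or None the first time the key appears
    return sum(new_values) + (running_count or 0)

# In an actual PySpark Streaming job (hypothetical names/paths) you would
# enable checkpointing, which stateful operations require, and apply the
# function to the (word, count) pairs DStream:
#
#   ssc.checkpoint("hdfs:///tmp/checkpoint")
#   counts = words.map(lambda w: (w, 1)).updateStateByKey(update_count)
#   counts.pprint()  # publishes the cumulative counts every batch
```

Because the state is checkpointed rather than rebuilt by joining RDDs manually, it does not get reset between batches the way a hand-maintained StreamRDD can.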
From: Sandeep Giri [mailto:[email protected]]
Sent: Thursday, October 29, 2015 3:09 PM
To: user <[email protected]>; dev <[email protected]>
Subject: Maintaining overall cumulative data in Spark Streaming

Dear All,

If a continuous stream of text is coming in and you have to keep publishing the overall word count so far since 0:00 today, what would you do?

Publishing the results for a window is easy, but if we have to keep aggregating the results, how do we go about it?

I have tried keeping a StreamRDD with the aggregated count and repeatedly doing a fullOuterJoin, but it didn't work. It seems the StreamRDD gets reset.

Kindly help.

Regards,
Sandeep Giri
