Re: Maintaining overall cumulative data in Spark Streaming

2015-10-30 Thread Silvio Fiorito
…ber 30, 2015 at 9:29 AM. To: skaarthik oss <skaarthik@gmail.com>; Cc: dev <d...@spark.apache.org>, user <user@spark.apache.org>. Subject: Re: Maintaining overall cumulative data in Spark Streaming. How do we reset the aggregated statistics to null? Regards,
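
A note on what "reset to null" can mean here: the update function passed to updateStateByKey (the operation suggested further down this thread) returns an Option, and returning None removes that key's state entirely, which is the closest the API comes to clearing the aggregate. A minimal sketch, assuming a hypothetical sentinel count of -1L injected into the stream as the reset signal:

    // Sketch only: the -1L reset sentinel is an assumption, not from the thread.
    val updateFunc = (newCounts: Seq[Long], state: Option[Long]) => {
      if (newCounts.contains(-1L)) {
        None  // returning None drops this key's accumulated state
      } else {
        Some(state.getOrElse(0L) + newCounts.sum)
      }
    }
    // Applied as: wordCounts.updateStateByKey[Long](updateFunc)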

Re: Maintaining overall cumulative data in Spark Streaming

2015-10-30 Thread Sandeep Giri
…worked. > Though there are some more complications. > On Oct 30, 2015 8:27 AM, "skaarthik oss" wrote: >> Did you consider the updateStateByKey operation? >> *From:* Sandeep Giri [mailto:sand...@knowbigdata.com] >> *Sent:* Thursday, October…
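
For reference, a minimal runnable sketch of the updateStateByKey approach applied to the thread's running word count; the checkpoint directory, socket source, and batch interval are assumptions for illustration:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object CumulativeWordCount {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("CumulativeWordCount")
        val ssc = new StreamingContext(conf, Seconds(10))
        // updateStateByKey persists per-key state across batches, so a
        // checkpoint directory is required (path assumed for illustration).
        ssc.checkpoint("/tmp/spark-checkpoint")

        val lines = ssc.socketTextStream("localhost", 9999)  // assumed source
        val totals = lines
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1L))
          .updateStateByKey[Long] { (newCounts: Seq[Long], state: Option[Long]) =>
            // Add this batch's counts to whatever was accumulated so far.
            Some(state.getOrElse(0L) + newCounts.sum)
          }

        totals.print()  // publishes the cumulative count every batch
        ssc.start()
        ssc.awaitTermination()
      }
    }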

RE: Maintaining overall cumulative data in Spark Streaming

2015-10-29 Thread Sandeep Giri
…5 3:09 PM > *To:* user; dev > *Subject:* Maintaining overall cumulative data in Spark Streaming > Dear All, > If a continuous stream of text is coming in and you have to keep publishing the overall word count so far since 0:00 today, what would you…

RE: Maintaining overall cumulative data in Spark Streaming

2015-10-29 Thread Silvio Fiorito
…sand...@knowbigdata.com Sent: 10/29/2015 6:08 PM To: user <user@spark.apache.org>; dev <d...@spark.apache.org> Subject: Maintaining overall cumulative data in Spark Streaming Dear All, If a continuous stream of text is coming in and you have to keep publishing the overall word…

Maintaining overall cumulative data in Spark Streaming

2015-10-29 Thread Sandeep Giri
Dear All, If a continuous stream of text is coming in and you have to keep publishing the overall word count so far since 0:00 today, what would you do? Publishing the results for a window is easy, but if we have to keep aggregating the results, how do we go about it? I have tried to keep a StreamR…
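
For contrast with the cumulative case, a minimal sketch of the windowed count the post calls easy; the socket source, batch interval, and the 60-second window with a 10-second slide are assumptions for illustration:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("WindowedWordCount")
    val ssc = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("localhost", 9999)  // assumed source

    // Counts per word over the last 60 seconds, recomputed every 10 seconds.
    // This answers "count within a window", not "count since 0:00 today".
    val windowedCounts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKeyAndWindow((a: Long, b: Long) => a + b, Seconds(60), Seconds(10))

    windowedCounts.print()
    ssc.start()
    ssc.awaitTermination()

A window alone cannot cover "since 0:00 today", because the window duration is fixed when the job starts; that limitation is what pushes the thread toward updateStateByKey above.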