Suggested Method for Execution of Periodic Actions

2015-09-16 Thread Bryan Jeffrey
Hello. I have a streaming job that is processing data. I process a stream of events, taking actions when I see anomalous events. I also keep a count of events observed, using updateStateByKey to maintain a map of type to count. I would like to periodically (every 5 minutes) write the results of my

Re: Suggested Method for Execution of Periodic Actions

2015-09-16 Thread Ted Yu
bq. and check if 5 minutes have passed What if the duration for the window is longer than 5 minutes? Cheers On Wed, Sep 16, 2015 at 1:25 PM, Adrian Tanase wrote: > If you don't need the counts in between the DB writes, you could simply > use a 5 min window for the

Re: Suggested Method for Execution of Periodic Actions

2015-09-16 Thread Adrian Tanase
If you don't need the counts in between the DB writes, you could simply use a 5 min window for the updateStateByKey and use foreachRDD on the resulting DStream. Even simpler, you could use reduceByKeyAndWindow directly. Lastly, you could keep a variable on the driver and check if 5 minutes have
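
The last suggestion above (a driver-side variable that checks whether 5 minutes have passed) could be sketched roughly as follows. This is only an illustration in plain Python, not code from the thread: `PeriodicGate`, `maybe_write`, and the `write` callback are hypothetical names, and the sketch assumes the batch time is available as a `datetime`.

```python
from datetime import datetime, timedelta


class PeriodicGate:
    """Hypothetical driver-side gate that opens at most once per interval."""

    def __init__(self, interval):
        self.interval = interval
        self.last_fired = None  # batch time of the most recent write

    def should_fire(self, batch_time):
        # First batch seen: start the clock, do not write yet.
        if self.last_fired is None:
            self.last_fired = batch_time
            return False
        # Open the gate once a full interval has elapsed since the last write.
        if batch_time - self.last_fired >= self.interval:
            self.last_fired = batch_time
            return True
        return False


# Assumption: one gate lives on the driver for the lifetime of the job.
gate = PeriodicGate(timedelta(minutes=5))


def maybe_write(batch_time, counts, write):
    """Call write(counts) only when the gate opens; return whether it fired."""
    if gate.should_fire(batch_time):
        write(counts)
        return True
    return False
```

In a streaming job, something like `maybe_write` would presumably be called once per batch from inside foreachRDD, with `write` being whatever persists the counts to the database. Using the batch time rather than the wall clock keeps the check independent of how long each batch takes to process.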

Re: Suggested Method for Execution of Periodic Actions

2015-09-16 Thread Adrian Tanase
The window can be larger; the batch/slide interval has to be smaller (assuming every 5-10 secs?). You have a separate parameter on most default functions and you can override it as long as it's a multiple of the streaming context batch interval. On 16 Sep 2015, at 23:30, Ted Yu
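
The multiple-of-batch-interval constraint mentioned above can be checked with a few lines of arithmetic. The helper below is a hypothetical sketch (the name `check_window` and the use of whole seconds are assumptions, not part of any Spark API); it only illustrates the rule that window and slide durations should be multiples of the batch interval.

```python
def check_window(batch_s, window_s, slide_s):
    """Return a list of problems; an empty list means the durations fit.

    All arguments are whole seconds (an assumption made for this sketch).
    """
    problems = []
    if window_s % batch_s != 0:
        problems.append("window is not a multiple of the batch interval")
    if slide_s % batch_s != 0:
        problems.append("slide is not a multiple of the batch interval")
    return problems


def batches_per_window(batch_s, window_s):
    """Number of batches that one window spans."""
    return window_s // batch_s
```

For example, with a 10-second batch interval, a 5-minute window sliding every 5 minutes passes the check and spans 30 batches, while a 7-second batch interval would fail for both the window and the slide.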