bq. and check if 5 minutes have passed

What if the duration for the window is longer than 5 minutes ?

Cheers

On Wed, Sep 16, 2015 at 1:25 PM, Adrian Tanase <atan...@adobe.com> wrote:

> If you don't need the counts in betweem the DB writes, you could simply
> use a 5 min window for the updateStateByKey and use foreachRdd on the
> resulting DStream.
>
> Even simpler, you could use reduceByKeyAndWindow directly.
>
> Lastly, you could keep a variable on the driver and check if 5 minutes
> have passed
> in foreachRdd on the original DStream, even if the batch duration is
> shorter.
>
> Also, remember to cleanup the state in your updateStateByKey function or
> it will grow unbounded. I still believe one of the builtin ByKey functions
> are a simpler strategy.
>
> hope this helps.
>
> -adrian
>
> Sent from my iPhone
>
> > On 16 Sep 2015, at 22:33, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:
> >
> > Hello.
> >
> > I have a streaming job that is processing data.  I process a stream of
> events, taking actions when I see anomalous events.  I also keep a count
> events observed using updateStateByKey to maintain a map of type to count.
> I would like to periodically (every 5 minutes) write the results of my
> counts to a database.  Is there a built in mechanism or established pattern
> to execute periodic jobs in spark streaming?
> >
> > Regards,
> >
> > Bryan Jeffrey
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>

Reply via email to