Hey everyone,

I've been thinking about an upcoming problem in my current project that I can't see an obvious solution for, so I was hoping someone could point me in the right direction. I'm trying to window live Twitter data over a few different batch periods (1 hour, 4 hours, and 24 hours). The windowed periods need to start at specific, human-readable times (for example, right on the hour) rather than in the middle of a 1 hour period. Also, whenever the system is restarted, the first period of each size should be a partial period, so that data arriving before the next boundary isn't ignored.

So is there a good way to create a DStream window that will create jobs around specific time intervals, and can create a job for the initial time interval as well? Something like dstream.window(windowDuration, slideDuration, timeLeftInCurrentDuration, shouldIgnoreFirstPartialJob)?
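To make the alignment I have in mind concrete, here is a rough sketch in plain Python. It is not Spark code and does not use any DStream API; the function names (window_start, time_left_in_window, bucket_by_window) are made up for illustration, and it assumes that windows are aligned to the UTC epoch, so 1 hour windows start on the hour and 24 hour windows start at UTC midnight.

```python
from datetime import datetime, timedelta, timezone

# Assumption: all windows are aligned to the UTC epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts, window):
    """Return the start of the epoch-aligned window that contains ts."""
    elapsed = ts - EPOCH
    # timedelta % timedelta gives how far ts is past the last boundary.
    return ts - (elapsed % window)


def time_left_in_window(ts, window):
    """Return how long until the window containing ts ends."""
    return window_start(ts, window) + window - ts


def bucket_by_window(records, window):
    """Group (timestamp, value) pairs by their aligned window start.

    The first bucket after a restart simply holds fewer records,
    which is the "partial first period" behaviour.
    """
    buckets = {}
    for ts, value in records:
        buckets.setdefault(window_start(ts, window), []).append(value)
    return buckets


if __name__ == "__main__":
    # Arbitrary example timestamp standing in for "system restarted now".
    now = datetime(2014, 3, 5, 10, 37, 20, tzinfo=timezone.utc)
    for hours in (1, 4, 24):
        w = timedelta(hours=hours)
        print(hours, window_start(now, w), time_left_in_window(now, w))
```

The value returned by time_left_in_window is what I would want to pass as the timeLeftInCurrentDuration argument in the hypothetical call above, so that the first job covers only the remainder of the current period.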

Thanks,
Chris Regnier
-------------------------
Visualization Developer
Oculus Info Inc.
