Spark streaming: Fixed time aggregation & handling driver failures

ffarozan Fri, 15 Jan 2016 19:14:34 -0800

I am implementing aggregation using Spark Streaming and Kafka. My batch interval
and window size are the same, and the aggregated data is persisted in Cassandra.


I want to aggregate over fixed, clock-aligned time windows: 5:00, 5:05, 5:10, ...

But we cannot control when the streaming job starts; we only get to specify the
batch interval.

So the problem is: if the streaming job starts at 5:02, then I will
get results at 5:07, 5:12, etc., which is not what I want.
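
To make the mismatch concrete, here is a small sketch in plain Python (not
Spark API; the helper names window_start and batch_end_times are hypothetical
and only for illustration). It shows where the batch boundaries land when the
job starts at 5:02, and how a timestamp could instead be floored to the
clock-aligned 5-minute window it belongs to. The 5-minute length is assumed
from the example above.

```python
from datetime import datetime, timedelta

# Assumed window length, taken from the 5:00, 5:05, 5:10 example.
WINDOW = timedelta(minutes=5)

def window_start(ts, window=WINDOW):
    """Floor a timestamp to the start of its clock-aligned window."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = ts - midnight
    # timedelta // timedelta gives the number of whole windows since midnight.
    return midnight + (elapsed // window) * window

def batch_end_times(job_start, interval=WINDOW, count=3):
    """Times at which batches complete when the job starts at job_start."""
    return [job_start + interval * (i + 1) for i in range(count)]

job_start = datetime(2016, 1, 15, 5, 2)

# Batch boundaries follow the job start time, not the wall clock.
for end in batch_end_times(job_start):
    print("batch ends at", end.strftime("%H:%M"),
          "| aligned window starts at", window_start(end).strftime("%H:%M"))
```

If each record carries its own timestamp, one possible direction (an
assumption, not something confirmed here) is to key the aggregation by
window_start(record_time) instead of relying on batch boundaries, so the
bucket a record falls into does not depend on when the job was started.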

Any suggestions?

thanks,
Firdousi



