Hi,

Is there any way in Spark Streaming to keep data across multiple
micro-batches, e.g., in a HashMap or something similar?
Can anyone suggest how to keep data across iterations, where each
iteration is an RDD being processed in a JavaDStream?
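
For example, I have been trying something along these lines (a rough
sketch only; the shared HashMap and the per-event count are just
placeholders for the kind of cross-batch state I mean):

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.spark.streaming.api.java.JavaDStream;

    public class CrossBatchState {
        // Driver-side state, shared across micro-batches.
        private static final Map<String, Long> history = new HashMap<>();

        public static void attach(JavaDStream<String> events) {
            events.foreachRDD(rdd -> {
                // collect() pulls each micro-batch back to the driver;
                // only reasonable for small summaries, not the full data.
                for (String event : rdd.collect()) {
                    history.merge(event, 1L, Long::sum);
                }
            });
        }
    }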

This is especially relevant when I am trying to update a model, compare
two sets of RDDs, or keep a global history of certain events, all of
which will impact operations in future iterations.
I would like to keep some accumulated history for calculations: not the
entire dataset, but certain events persisted so they can be used when
processing future JavaDStream RDDs.
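
I also came across updateStateByKey on JavaPairDStream, which looks like
it might be meant for exactly this. Is something like the sketch below
the recommended approach, or is there a better pattern? (The per-key
running count and the checkpoint path are just placeholders; the
Optional type is Guava's in older Spark versions.)

    import java.util.List;
    import org.apache.spark.api.java.Optional;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class KeyedHistory {
        public static JavaPairDStream<String, Long> track(
                JavaStreamingContext ssc,
                JavaPairDStream<String, Long> events) {
            // updateStateByKey needs a checkpoint dir to store the state.
            ssc.checkpoint("/tmp/spark-checkpoint");
            return events.updateStateByKey(
                (List<Long> newValues, Optional<Long> state) -> {
                    long total = state.isPresent() ? state.get() : 0L;
                    for (Long v : newValues) {
                        total += v;
                    }
                    // The returned Optional is the state seen by the
                    // next micro-batch for this key.
                    return Optional.of(total);
                });
        }
    }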

Thanks
Nipun
