Hi,

On Sat, Aug 16, 2014 at 3:29 AM, Yan Fang <yanfang...@gmail.com> wrote:
>
> If all consecutive data points are in one batch, it's not complicated
> except that the order of data points is not guaranteed in the batch and so
> I have to use the timestamp in the data point to reach my goal. However,
> when the consecutive data points spread in two or more batches, how can I
> do this?
>
You *could* use window operations. If there is an upper limit on how many
batches you might need to look at, you can use a window that is large
enough to cover them and thereby avoid using updateStateByKey.

Tobias
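To illustrate the windowing idea, here is a minimal plain-Python sketch. It is not Spark code: it only simulates what a sliding window over batches gives you. In Spark Streaming the corresponding operation would be DStream.window(windowDuration, slideDuration). The function name sliding_windows, the (timestamp, value) layout, and the sample data are assumptions made up for this example.

```python
from collections import deque

def sliding_windows(batches, window_size):
    """After each incoming batch, yield the points from the most recent
    `window_size` batches, sorted by timestamp.

    Each point is assumed to be a (timestamp, value) tuple; order inside
    a batch is not relied upon.
    """
    recent = deque(maxlen=window_size)  # drops the oldest batch automatically
    for batch in batches:
        recent.append(batch)
        merged = [point for b in recent for point in b]
        # Sort by the timestamp carried in each point, since arrival
        # order within (and across) batches is not guaranteed.
        yield sorted(merged, key=lambda point: point[0])

# Hypothetical input: three batches with out-of-order points.
batches = [
    [(2, "b"), (1, "a")],
    [(4, "d"), (3, "c")],
    [(5, "e")],
]

windows = list(sliding_windows(batches, window_size=2))
for w in windows:
    print(w)
```

With a window of two batches, consecutive points that straddle a batch boundary, such as timestamps 2 and 3 above, end up in the same window and can be processed together after sorting.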