Thanks for the help. I use a Fold and a WindowFunction in conjunction now and it works fine. Though I wish there would be a less complicated way to do this.
cheers Martin On Thu, May 12, 2016 at 11:59 AM, Fabian Hueske <fhue...@gmail.com> wrote: > Hi Martin, > > You can use a FoldFunction and a WindowFunction to process the same! > window. The FoldFunction is eagerly applied, so the window state is only > one element. When the window is closed, the aggregated element is given to > the WindowFunction where you can add start and end time. The iterator of > the WindowFunction will provide only one (the aggregated) element. > > See the apply method on WindowedStream with the following signature: > apply(initialValue: R, foldFunction: FoldFunction[T, R], function: > WindowFunction[R, R, K, W]): DataStream[R] > > Cheers, Fabian > > 2016-05-11 20:16 GMT+02:00 Martin Neumann <mneum...@sics.se>: > >> Hej, >> >> I have a windowed stream and I want to run a (generic) fold function on >> it. The result should have the start and the end time stamp of the window >> as fields (so I can relate it to the original data). *Is there a simple >> way to get the timestamps from within the fold function?* >> >> I could find the lowest and the highest ts as part of the fold function >> but that would not be very accurate especially when I the number of events >> in the window is low. Also, I want to write in a generic way so I can use >> it even if the data itself does not contain a time stamp field (running on >> processing time). >> >> I have looked into using a WindowFunction where I would have access to >> the start and end timestamp. I have not quite figured out how I would >> implement a fold function using this. Also, from my understanding this >> approach would require holding the whole window in memory which is not a >> good option since the window data can get very large. >> >> Is there a better way of doing this >> >> >> cheers Martin >> > >