Re: Merging N parallel/partitioned WindowedStreams together, one-to-one, into a global window stream
Thank you Fabian, I think that solves it. I'll need to rig up some tests to verify, but it looks good. I used a RichMapFunction to assign ids incrementally to windows (mapping STREAM_OBJECT to Tuple2using a private long value in the mapper that increments on every map call). It works, but by any chance is there a more succinct way to do it? On Thu, Oct 6, 2016 at 1:50 PM, Fabian Hueske wrote: > Maybe this can be done by assigning the same window id to each of the N > local windows, and do a > > .keyBy(windowId) > .countWindow(N) > > This should create a new global window for each window id and collect all > N windows. > > Best, Fabian > > 2016-10-06 22:39 GMT+02:00 AJ Heller : > >> The goal is: >> * to split data, random-uniformly, across N nodes, >> * window the data identically on each node, >> * transform the windows locally on each node, and >> * merge the N parallel windows into a global window stream, such that >> one window from each parallel process is merged into a "global window" >> aggregate >> >> I've achieved all but the last bullet point, merging one window from each >> partition into a globally-aggregated window output stream. >> >> To be clear, a rolling reduce won't work because it would aggregate over >> all previous windows in all partitioned streams, and I only need to >> aggregate over one window from each partition at a time. >> >> Similarly for a fold. >> >> The closest I have found is ParallelMerge for ConnectedStreams, but I >> have not found a way to apply it to this problem. Can flink achieve this? >> If so, I'd greatly appreciate a point in the right direction. >> >> Cheers, >> -aj >> > >
Re: DataStream csv reading
Hi Fabian, The program runs with no exceptions in an IDE but I can't see the datastream print messagesThanks -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DataStream-csv-reading-tp9376p9382.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.
Re: Merging N parallel/partitioned WindowedStreams together, one-to-one, into a global window stream
Maybe this can be done by assigning the same window id to each of the N local windows, and do a .keyBy(windowId) .countWindow(N) This should create a new global window for each window id and collect all N windows. Best, Fabian 2016-10-06 22:39 GMT+02:00 AJ Heller: > The goal is: > * to split data, random-uniformly, across N nodes, > * window the data identically on each node, > * transform the windows locally on each node, and > * merge the N parallel windows into a global window stream, such that one > window from each parallel process is merged into a "global window" aggregate > > I've achieved all but the last bullet point, merging one window from each > partition into a globally-aggregated window output stream. > > To be clear, a rolling reduce won't work because it would aggregate over > all previous windows in all partitioned streams, and I only need to > aggregate over one window from each partition at a time. > > Similarly for a fold. > > The closest I have found is ParallelMerge for ConnectedStreams, but I have > not found a way to apply it to this problem. Can flink achieve this? If so, > I'd greatly appreciate a point in the right direction. > > Cheers, > -aj >