Re: windowAll and AggregateFunction

2019-01-10 Thread CPC
I converted to this SingleOutputStreamOperator> tuple2Stream = sourceStream.map(new RichMapFunction>() { @Override public Tuple2 map(XMPP value) throws Exception { return Tuple2.of(getRuntimeContext().getIndexOfThisSubtask(), value); } });

Re: windowAll and AggregateFunction

2019-01-09 Thread CPC
Hi Ken, I am doing a global distinct. What i want to achive is someting like below. With windowAll it sends all data to single operator which means shuffle all data and calculate with par 1. I dont want to shuffle data since i just want to feed it to hll instance and shuffle just hll instances at

Re: windowAll and AggregateFunction

2019-01-09 Thread Ken Krugler
> On Jan 9, 2019, at 3:10 PM, CPC wrote: > > Hi Ken, > > From regular time-based windows do you mean keyed windows? Correct. Without doing a keyBy() you would have a parallelism of 1. I think you want to key on whatever you’re counting for unique values, so that each window operator gets a

Re: windowAll and AggregateFunction

2019-01-09 Thread Ken Krugler
Hi there, You should be able to use a regular time-based window(), and emit the HyperLogLog binary data as your result, which then would get merged in your custom function (which you set a parallelism of 1 on). Note that if you are generating unique counts per non-overlapping time window,

Re: windowAll and AggregateFunction

2019-01-09 Thread CPC
Hi Stefan, Could i use "Reinterpreting a pre-partitioned data stream as keyed stream" feature for this? On Wed, 9 Jan 2019 at 17:50, Stefan Richter wrote: > Hi, > > I think your expectation about windowAll is wrong, from the method > documentation: “Note: This operation is inherently

Re: windowAll and AggregateFunction

2019-01-09 Thread Stefan Richter
Hi, I think your expectation about windowAll is wrong, from the method documentation: “Note: This operation is inherently non-parallel since all elements have to pass through the same operator instance” and I also cannot think of a way in which the windowing API would support your use case

windowAll and AggregateFunction

2019-01-09 Thread CPC
Hi all, In our implementation,we are consuming from kafka and calculating distinct with hyperloglog. We are using windowAll function with a custom AggregateFunction but flink runtime shows a little bit unexpected behavior at runtime. Our sources running with parallelism 4 and i expect add