I converted it to this:
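SingleOutputStreamOperator<Tuple2<Integer, XMPP>> tuple2Stream =
    sourceStream.map(new RichMapFunction<XMPP, Tuple2<Integer, XMPP>>() {
        @Override
        public Tuple2<Integer, XMPP> map(XMPP value) throws Exception {
            // Tag each record with the index of the subtask that processed it.
            return Tuple2.of(getRuntimeContext().getIndexOfThisSubtask(), value);
        }
    });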
Hi Ken,
I am doing a global distinct. What I want to achieve is something like below.
With windowAll, it sends all data to a single operator, which means shuffling
all data and calculating with parallelism 1. I don't want to shuffle the data,
since I just want to feed it to an hll instance and shuffle just the hll instances at
> On Jan 9, 2019, at 3:10 PM, CPC wrote:
>
> Hi Ken,
>
> By regular time-based windows, do you mean keyed windows?
Correct. Without doing a keyBy() you would have a parallelism of 1.
I think you want to key on whatever you’re counting for unique values, so that
each window operator gets a
Hi there,
You should be able to use a regular time-based window(), and emit the
HyperLogLog binary data as your result, which then would get merged in your
custom function (which you set a parallelism of 1 on).
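
To make the idea concrete, here is a rough sketch of the two-phase pattern in plain Java, without any Flink dependency. Everything in it is an assumption for illustration: HllSketch is a toy HyperLogLog-style sketch (not a library class), and TwoPhaseDistinct, preAggregate and mergeAll are hypothetical names. In a real job, preAggregate would correspond to the per-window aggregation running in parallel, and mergeAll to the custom function with parallelism 1; you would also want a tested HyperLogLog library rather than this toy.

```java
import java.util.Arrays;
import java.util.List;

/** Toy HyperLogLog-style sketch (illustrative only): merge is a register-wise max. */
class HllSketch {
    static final int P = 10;        // assumed precision: 2^10 = 1024 registers
    static final int M = 1 << P;
    final byte[] registers = new byte[M];

    // 64-bit mixer (splitmix64 finalizer) applied to the object's hashCode.
    static long hash(Object o) {
        long z = o.hashCode() + 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    void add(Object value) {
        long h = hash(value);
        int idx = (int) (h >>> (64 - P));   // top P bits select the register
        long rest = h << P;                 // remaining bits determine the rank
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - P + 1);
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    /** Merging two sketches keeps the larger rank in every register. */
    void merge(HllSketch other) {
        for (int i = 0; i < M; i++) {
            registers[i] = (byte) Math.max(registers[i], other.registers[i]);
        }
    }

    double estimate() {
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) zeros++;
        }
        double alpha = 0.7213 / (1 + 1.079 / M);
        double raw = alpha * M * M / sum;
        // Small-range correction: fall back to linear counting.
        if (raw <= 2.5 * M && zeros > 0) {
            return M * Math.log((double) M / zeros);
        }
        return raw;
    }
}

class TwoPhaseDistinct {
    /** Phase 1: one sketch per partition (stands in for one parallel subtask). */
    static HllSketch preAggregate(List<String> partition) {
        HllSketch sketch = new HllSketch();
        for (String value : partition) {
            sketch.add(value);
        }
        return sketch;
    }

    /** Phase 2: only the small sketches are gathered and merged in one place. */
    static HllSketch mergeAll(List<HllSketch> partials) {
        HllSketch result = new HllSketch();
        for (HllSketch partial : partials) {
            result.merge(partial);
        }
        return result;
    }

    public static void main(String[] args) {
        HllSketch merged = mergeAll(Arrays.asList(
                preAggregate(Arrays.asList("a", "b", "c", "a")),
                preAggregate(Arrays.asList("c", "d", "e"))));
        System.out.println("approximate distinct count: " + merged.estimate());
    }
}
```

The point of the design is that the merged registers are identical to what a single sketch over all the data would contain, so only the fixed-size register arrays need to cross the network, not the raw records.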
Note that if you are generating unique counts per non-overlapping time window,
Hi Stefan,
Could I use the "Reinterpreting a pre-partitioned data stream as keyed stream"
feature for this?
On Wed, 9 Jan 2019 at 17:50, Stefan Richter wrote:
> Hi,
>
> I think your expectation about windowAll is wrong, from the method
> documentation: “Note: This operation is inherently
Hi,
I think your expectation about windowAll is wrong. From the method
documentation: “Note: This operation is inherently non-parallel since all
elements have to pass through the same operator instance” and I also cannot
think of a way in which the windowing API would support your use case
Hi all,
In our implementation, we are consuming from Kafka and calculating distinct
counts with HyperLogLog. We are using the windowAll function with a custom
AggregateFunction, but the Flink runtime shows somewhat unexpected behavior.
Our sources are running with parallelism 4 and I expect add