I was trying to implement this (force Flink to handle all values from the
input) but had no success...
Probably I am not getting something about Flink's windowing mechanism.
I've created my own 'finishing' trigger, which is basically a copy of the
purging trigger, but I was not able to make it work:
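Roughly what I am trying to build, as a sketch rather than tested code
(assuming a recent DataStream API with event time enabled, so the final
watermark is emitted at end of input; the class name FinishingCountTrigger,
the count-in-ReducingState approach, and the Long.MAX_VALUE-timer trick are
my assumptions):

    import org.apache.flink.api.common.functions.ReduceFunction;
    import org.apache.flink.api.common.state.ReducingState;
    import org.apache.flink.api.common.state.ReducingStateDescriptor;
    import org.apache.flink.streaming.api.windowing.triggers.Trigger;
    import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
    import org.apache.flink.streaming.api.windowing.windows.Window;

    // Sketch: a count trigger that also fires for the last, partial window
    // when the final watermark of a finite stream arrives.
    public class FinishingCountTrigger<W extends Window> extends Trigger<Object, W> {

        private final long maxCount;

        private final ReducingStateDescriptor<Long> countDesc =
                new ReducingStateDescriptor<>("count", new Sum(), Long.class);

        private FinishingCountTrigger(long maxCount) {
            this.maxCount = maxCount;
        }

        public static <W extends Window> FinishingCountTrigger<W> of(long maxCount) {
            return new FinishingCountTrigger<>(maxCount);
        }

        @Override
        public TriggerResult onElement(Object element, long timestamp, W window,
                                       TriggerContext ctx) throws Exception {
            // For GlobalWindow, maxTimestamp() is Long.MAX_VALUE, so this
            // timer fires exactly once: when the finite source finishes and
            // Flink emits the final watermark.
            ctx.registerEventTimeTimer(window.maxTimestamp());
            ReducingState<Long> count = ctx.getPartitionedState(countDesc);
            count.add(1L);
            if (count.get() >= maxCount) {
                count.clear();
                return TriggerResult.FIRE_AND_PURGE;
            }
            return TriggerResult.CONTINUE;
        }

        @Override
        public TriggerResult onEventTime(long time, W window, TriggerContext ctx) {
            // End of input: flush whatever is still buffered in the window.
            return TriggerResult.FIRE_AND_PURGE;
        }

        @Override
        public TriggerResult onProcessingTime(long time, W window, TriggerContext ctx) {
            return TriggerResult.CONTINUE;
        }

        @Override
        public void clear(W window, TriggerContext ctx) throws Exception {
            ctx.getPartitionedState(countDesc).clear();
            ctx.deleteEventTimeTimer(window.maxTimestamp());
        }

        private static class Sum implements ReduceFunction<Long> {
            @Override
            public Long reduce(Long a, Long b) {
                return a + b;
            }
        }
    }

It would be attached with something like
stream.windowAll(GlobalWindows.create()).trigger(FinishingCountTrigger.of(1000)).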
No problem at all; there are not many Flink people and a lot of people
asking questions - it must be hard to understand each person's issues :)
Yes, it is not as easy as a 'contains' operator: I need to collect some
amount of tuples in order to create an in-memory Lucene index. After that I
will filter
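To make it concrete, something like the following is what I have in mind
for one batch (just a sketch; the field names "id"/"text", the
ByteBuffersDirectory, and Lucene 8+ are my assumptions):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field.Store;
    import org.apache.lucene.document.StoredField;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.queryparser.classic.QueryParser;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.ScoreDoc;
    import org.apache.lucene.store.ByteBuffersDirectory;

    public class LuceneBatchFilter {

        // Index one batch of (id, text) tuples in memory and return the
        // ids of the tuples that match the query.
        static List<Long> matchingIds(Map<Long, String> batch, String queryString)
                throws Exception {
            ByteBuffersDirectory dir = new ByteBuffersDirectory();
            StandardAnalyzer analyzer = new StandardAnalyzer();
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
                for (Map.Entry<Long, String> e : batch.entrySet()) {
                    Document doc = new Document();
                    doc.add(new StoredField("id", e.getKey()));             // key kept retrievable
                    doc.add(new TextField("text", e.getValue(), Store.NO)); // indexed for search only
                    writer.addDocument(doc);
                }
            }
            List<Long> ids = new ArrayList<>();
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                Query query = new QueryParser("text", analyzer).parse(queryString);
                for (ScoreDoc hit : searcher.search(query, Math.max(1, batch.size())).scoreDocs) {
                    ids.add(searcher.doc(hit.doc).getField("id").numericValue().longValue());
                }
            }
            return ids;
        }
    }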
Thanks for the reply.
Maybe I need some advice in this case. My situation: we have a stream
of data, generally speaking tuples where the long is a unique key
(i.e. there are no two tuples with the same key).
I need to filter out all tuples that do not match a certain Lucene query.
Creating...
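The wiring I have in mind looks roughly like this (a sketch only; the
Tuple2<Long, String> type, the window size of 10,000 and the sample
elements are my assumptions):

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.windowing.AllWindowFunction;
    import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;
    import org.apache.flink.util.Collector;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    DataStream<Tuple2<Long, String>> tuples = env.fromElements(
            Tuple2.of(1L, "apache flink"),
            Tuple2.of(2L, "apache lucene"));

    tuples.countWindowAll(10_000)  // collect a batch of tuples
          .apply(new AllWindowFunction<Tuple2<Long, String>, Tuple2<Long, String>, GlobalWindow>() {
              @Override
              public void apply(GlobalWindow window,
                                Iterable<Tuple2<Long, String>> batch,
                                Collector<Tuple2<Long, String>> out) throws Exception {
                  // build an in-memory Lucene index over `batch`, run the
                  // query, and emit only the matching tuples
              }
          })
          .print();

Note that with a finite stream this is exactly where the last partial batch
can be lost, which is the problem described below.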
Hi,
if you are not using the windows for their actual semantics, I would suggest
not using count-based windows and also not using the *All windows. The *All
windows are non-parallel, i.e. you only ever get one parallel instance of
your window operator, even if you have a huge cluster.
Also...
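If I understand the advice correctly, one way to keep the operator parallel
would be to spread the unique keys over a fixed number of groups and use
keyed count windows instead; this is my own sketch, not a quote from the
list (the group count of 8 is arbitrary):

    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
    import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;
    import org.apache.flink.util.Collector;

    final long groups = 8;  // arbitrary; roughly the desired parallelism

    tuples.keyBy(new KeySelector<Tuple2<Long, String>, Long>() {
              @Override
              public Long getKey(Tuple2<Long, String> t) {
                  return t.f0 % groups;  // unique keys hashed into a few groups
              }
          })
          .countWindow(10_000)  // parallel count windows, one per group
          .apply(new WindowFunction<Tuple2<Long, String>, Tuple2<Long, String>, Long, GlobalWindow>() {
              @Override
              public void apply(Long key, GlobalWindow window,
                                Iterable<Tuple2<Long, String>> batch,
                                Collector<Tuple2<Long, String>> out) throws Exception {
                  // index and filter this batch as in the non-parallel version
              }
          });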
Maybe if this is not the first time it has come up, it is worth considering
adding this as an option? ;-)
My use case: I have a pretty large amount of data, basically for ETL. It is
finite, but it is big. I see it more as a stream than as a dataset. Also, I
would reuse the same code for an infinite stream later...
And...
People have wondered about that a few times, yes. My opinion is that a
stream is potentially infinite and processing only stops for anomalous
reasons: when the job crashes, or when stopping a job to later redeploy it.
In those cases you would not want to flush out your data but keep it and
restart...
I have a pretty big but finite stream and I need to be able to window it by
number of elements.
In this case, from my observations, Flink can 'skip' the last chunk of data
if it has fewer elements than the window size:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
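For example (a minimal sketch of the behaviour; the element values and the
window size of 3 are just for illustration):

    env.fromElements(1, 2, 3, 4, 5, 6, 7)  // finite stream of 7 elements
       .countWindowAll(3)                  // tumbling count windows of 3
       .reduce((a, b) -> a + b)
       .print();                           // prints 6 and 15; the last chunk {7} is never emitted
    env.execute();

The count trigger never fires for the last window because it never reaches
3 elements, so that chunk is silently dropped when the job finishes.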