Thank you for the quick and informative reply, Piotrek!

On Thu, Oct 15, 2020 at 2:09 AM Piotr Nowojski <[email protected]> wrote:

> Hi Pankay,
>
> Yes, you can trigger a window per each element, take a look at the Window
> Triggers [1].
>
> Flink is always processing all records immediately. The only things that
> can delay processing elements are:
> - buffering elements on the operator's state (vide WindowOperator)
> - buffer-timeout (but that's on the output, so it's not delaying
> processing per se)
> - back pressure
> - exactly-once checkpointing (especially under the back pressure)
>
> > Also, is there any way I can change the execution.buffer-timeout or
> setbuffertimeout(milliseconds) dynamically while the job is running?
>
> No, sorry it's not possible :(
>
> Best,
> Piotrek
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#triggers
>
> czw., 15 paź 2020 o 01:55 Pankaj Chand <[email protected]>
> napisał(a):
>
>> Hi Piotrek,
>>
>> Thank you for replying! I want to process each record as soon as it is
>> ingested (or reaches an operator) without waiting for a window for records
>> to arrive. However, by not using windows, I am not sure if each record gets
>> emitted immediately upon processing.
>>
>> > You still can use windowing, but you might want to emit updated value
>> of the window per every processed record.
>>
>> How do I do this?
>>
>> Also, is there any way I can change the execution.buffer-timeout or
>> setbuffertimeout(milliseconds) dynamically while the job is running?
>>
>> Thank you,
>>
>> Pankaj
>>
>> On Wed, Oct 14, 2020 at 9:42 AM Piotr Nowojski <[email protected]>
>> wrote:
>>
>>> Hi Pankaj,
>>>
>>> I'm not entirely sure if I understand your question.
>>>
>>> If you want to minimize latency, you should avoid using windows or any
>>> other operators, that are buffering data for long periods of time. You
>>> still can use windowing, but you might want to emit updated value of the
>>> window per every processed record.
>>>
>>> Other than that, you might also want to reduce
>>> `execution.buffer-timeout` from the default value of 100ms down to 1ms, or
>>> 0ms [1]. Is this what you are looking for?
>>>
>>> Piotrek
>>>
>>> [1]
>>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html
>>>
>>> śr., 14 paź 2020 o 12:38 Pankaj Chand <[email protected]>
>>> napisał(a):
>>>
>>>> Hi all,
>>>>
>>>> What is the recommended way to make a Flink job that processes each
>>>> event individually as soon as it comes and without waiting for a
>>>> window, in order to minimize latency in the entire DAG of operators?
>>>>
>>>> For example, here is some sample WordCount code (without windws),
>>>> followed by some known ways:
>>>>
>>>>  DataStream<WordWithCount> wordCounts = text
>>>>             .flatMap(new FlatMapFunction<String, WordWithCount>())
>>>>             .keyBy("word")
>>>>             .reduce(new ReduceFunction<WordWithCount>());
>>>>
>>>>
>>>>    1. Don't include any TimeWindow/CountWindow function (does this
>>>>    actually achieve what I want?)
>>>>    2. Use a CountWindow with a count of 1
>>>>    3. Make a Trigger that fires to process each event when it comes in
>>>>
>>>> I think the above methods only work at the processing level and latency
>>>> with respect to a single operator, but does not affect the latency of an
>>>> event in the entire Flink job's DAG since those ways do not affect the
>>>> buffertimeout value.
>>>>
>>>> Thanks,
>>>>
>>>> Pankaj
>>>>
>>>

Reply via email to