No problem :)

Piotrek

czw., 15 paź 2020 o 08:18 Pankaj Chand <[email protected]>
napisał(a):

> Thank you for the quick and informative reply, Piotrek!
>
> On Thu, Oct 15, 2020 at 2:09 AM Piotr Nowojski <[email protected]>
> wrote:
>
>> Hi Pankay,
>>
>> Yes, you can trigger a window per each element, take a look at the Window
>> Triggers [1].
>>
>> Flink is always processing all records immediately. The only things that
>> can delay processing elements are:
>> - buffering elements on the operator's state (vide WindowOperator)
>> - buffer-timeout (but that's on the output, so it's not delaying
>> processing per se)
>> - back pressure
>> - exactly-once checkpointing (especially under the back pressure)
>>
>> > Also, is there any way I can change the execution.buffer-timeout or
>> setbuffertimeout(milliseconds) dynamically while the job is running?
>>
>> No, sorry it's not possible :(
>>
>> Best,
>> Piotrek
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#triggers
>>
>> czw., 15 paź 2020 o 01:55 Pankaj Chand <[email protected]>
>> napisał(a):
>>
>>> Hi Piotrek,
>>>
>>> Thank you for replying! I want to process each record as soon as it is
>>> ingested (or reaches an operator) without waiting for a window for records
>>> to arrive. However, by not using windows, I am not sure if each record gets
>>> emitted immediately upon processing.
>>>
>>> > You still can use windowing, but you might want to emit updated value
>>> of the window per every processed record.
>>>
>>> How do I do this?
>>>
>>> Also, is there any way I can change the execution.buffer-timeout or
>>> setbuffertimeout(milliseconds) dynamically while the job is running?
>>>
>>> Thank you,
>>>
>>> Pankaj
>>>
>>> On Wed, Oct 14, 2020 at 9:42 AM Piotr Nowojski <[email protected]>
>>> wrote:
>>>
>>>> Hi Pankaj,
>>>>
>>>> I'm not entirely sure if I understand your question.
>>>>
>>>> If you want to minimize latency, you should avoid using windows or any
>>>> other operators, that are buffering data for long periods of time. You
>>>> still can use windowing, but you might want to emit updated value of the
>>>> window per every processed record.
>>>>
>>>> Other than that, you might also want to reduce
>>>> `execution.buffer-timeout` from the default value of 100ms down to 1ms, or
>>>> 0ms [1]. Is this what you are looking for?
>>>>
>>>> Piotrek
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html
>>>>
>>>> śr., 14 paź 2020 o 12:38 Pankaj Chand <[email protected]>
>>>> napisał(a):
>>>>
>>>>> Hi all,
>>>>>
>>>>> What is the recommended way to make a Flink job that processes each
>>>>> event individually as soon as it comes and without waiting for a
>>>>> window, in order to minimize latency in the entire DAG of operators?
>>>>>
>>>>> For example, here is some sample WordCount code (without windws),
>>>>> followed by some known ways:
>>>>>
>>>>>  DataStream<WordWithCount> wordCounts = text
>>>>>             .flatMap(new FlatMapFunction<String, WordWithCount>())
>>>>>             .keyBy("word")
>>>>>             .reduce(new ReduceFunction<WordWithCount>());
>>>>>
>>>>>
>>>>>    1. Don't include any TimeWindow/CountWindow function (does this
>>>>>    actually achieve what I want?)
>>>>>    2. Use a CountWindow with a count of 1
>>>>>    3. Make a Trigger that fires to process each event when it comes in
>>>>>
>>>>> I think the above methods only work at the processing level and
>>>>> latency with respect to a single operator, but does not affect the latency
>>>>> of an event in the entire Flink job's DAG since those ways do not affect
>>>>> the buffertimeout value.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Pankaj
>>>>>
>>>>

Reply via email to