Hi Martijn,

Yes, that's what I meant: the throughput of the process function(s) didn't
change, so even though they were busy 100% of the time with parallelism=2,
they were still processing data quickly enough.

Regards,
Alexis.

On Fri, Dec 16, 2022 at 2:20 PM Martijn Visser <
martijnvis...@apache.org> wrote:

> Hi,
>
> Backpressure implies that it's actually a later operator that is busy. So
> in this case, that would be your process function that can't handle the
> incoming load from your Kafka source.
>
> Best regards,
>
> Martijn
>
> On Tue, Dec 13, 2022 at 7:46 PM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
>> Hello,
>>
>> I have a Kafka source (the new KafkaSource) in Flink 1.15 that's followed by a
>> process function with parallelism=2. Some days, I see long periods of
>> backpressure in the source. During those times, the pool-usage metrics of
>> all tasks stay between 0 and 1%, but the process function appears 100% busy.
>>
>> To try to avoid backpressure, I increased parallelism to 3. It seems to
>> help, and busy-time decreased to around 80%, but something that caught my
>> attention is that throughput remained unchanged. Concretely, if X is the
>> number of events being written to the Kafka topic every second, each
>> instance of the process function receives roughly X/2 events/s with
>> parallelism=2, and X/3 with parallelism=3.
>>
>> I'm wondering about a couple of things:
>>
>> 1. Is it possible that backpressure in this case is essentially a "false
>> positive" because the function is busy 100% of the time even though it's
>> processing enough data?
>> 2. Does Flink expose any way to tune this type of backpressure mechanism?
>>
>> Regards,
>> Alexis.
>>
>

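For reference, here is a minimal sketch of the kind of topology described in
this thread (Flink 1.15, Java DataStream API). The bootstrap servers, topic,
group id, and per-event logic below are placeholders rather than details from
the actual job; the only point is the shape: the new KafkaSource followed by a
ProcessFunction whose parallelism is set explicitly.

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;

    public class BackpressureSketch {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // The "new" Kafka source (KafkaSource), as opposed to the legacy FlinkKafkaConsumer.
            KafkaSource<String> source = KafkaSource.<String>builder()
                    .setBootstrapServers("kafka:9092")      // placeholder
                    .setTopics("events")                    // placeholder
                    .setGroupId("backpressure-sketch")      // placeholder
                    .setStartingOffsets(OffsetsInitializer.latest())
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source")
                    // The downstream operator whose busy time drives backpressure on the source.
                    .process(new ProcessFunction<String, String>() {
                        @Override
                        public void processElement(String value, Context ctx, Collector<String> out) {
                            out.collect(value);             // placeholder for the real per-event work
                        }
                    })
                    .setParallelism(2);                     // 2 showed backpressure; 3 reduced busy time to ~80%

            env.execute("backpressure-sketch");
        }
    }

Assuming the default round-robin (rebalance) distribution between the source
and a process operator running at a different parallelism, each of the N
process subtasks receives roughly X/N events/s, which matches the observation
that the per-instance rate went from X/2 to X/3 while total throughput stayed
at X.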