Hi,

Backpressure on the source means that a downstream operator is the busy one. In this case, that is your process function, which can't keep up with the incoming load from your Kafka source.
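To illustrate why the busy time drops while the overall throughput stays the same, here is a rough back-of-the-envelope sketch. It is a simplified model, not something Flink computes in this form: it assumes events are split evenly across subtasks and that each subtask has a fixed processing capacity. The function name and the numbers (1200 events/s written to the topic, 500 events/s per subtask) are hypothetical and only chosen to resemble the 100% and 80% busy figures you describe.

```python
def busy_fraction(total_rate, parallelism, capacity_per_subtask):
    """Estimate the fraction of time each subtask is busy, capped at 1.0.

    total_rate: events/s written to the topic (X in the original question).
    capacity_per_subtask: events/s one subtask can process (an assumed value).
    """
    per_subtask_rate = total_rate / parallelism  # roughly X/2 or X/3
    return min(1.0, per_subtask_rate / capacity_per_subtask)


# Hypothetical numbers, for illustration only.
X = 1200.0
CAPACITY = 500.0

for p in (2, 3):
    print(p, busy_fraction(X, p, CAPACITY))
# prints:
# 2 1.0
# 3 0.8
```

Under these assumptions, parallelism=2 asks each subtask for more than it can handle, so it is permanently busy and pushes back on the source. With parallelism=3 each subtask has headroom, and the total throughput is simply bounded by the rate X at which events arrive.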
Best regards,

Martijn

On Tue, Dec 13, 2022 at 7:46 PM Alexis Sarda-Espinosa <sarda.espin...@gmail.com> wrote:

> Hello,
>
> I have a Kafka source (the new one) in Flink 1.15 that's followed by a
> process function with parallelism=2. Some days, I see long periods of
> backpressure in the source. During those times, the pool-usage metrics of
> all tasks stay between 0 and 1%, but the process function appears 100% busy.
>
> To try to avoid backpressure, I increased parallelism to 3. It seems to
> help, and busy-time decreased to around 80%, but something that caught my
> attention is that throughput remained unchanged. Concretely, if X is the
> number of events being written to the Kafka topic every second, each
> instance of the process function receives roughly X/2 events/s with
> parallelism=2, and X/3 with parallelism=3.
>
> I'm wondering a couple of things.
>
> 1. Is it possible that backpressure in this case is essentially a "false
> positive" because the function is busy 100% of the time even though it's
> processing enough data?
> 2. Does Flink expose any way to tune this type of backpressure mechanism?
>
> Regards,
> Alexis.