[FlinkRunner] KinesisIO stage report 100% busy time even with no load

Edgar H Thu, 20 Jun 2024 00:45:28 -0700

Hi all!

Recently I've been moving my jobs that use FlinkRunner from version
2.49.0 to the latest 2.56.0. I've noticed the following:


- Busy time with 2.49.0: a literal "N/A". Probably the 100% load wasn't
happening before, or wasn't being reported due to a lack of metrics?

- Busy time with 2.56.0: 100% even with no load. I'm using a single-shard
configuration and the simplest Kinesis configuration, and it still
shows 100% load.

I've got the same pipelines using Kafka as a source, and there is no
issue on that side: the load reports normal numbers and updates
accordingly.

This has become an issue when implementing autoscaling with
flink-operator in the jobs: with that stage constantly reported as
under load, I can't get it to scale down the parallelism and thus the
number of Task Managers.
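
To illustrate why a constant 100% busy time blocks downscaling, here is
a rough sketch of a utilization-threshold rule. It is only an
assumption about the general idea, not the actual flink-operator
autoscaler logic; the function name and the 0.3 threshold are made up
for illustration.

```python
def should_scale_down(busy_ms_per_second: float,
                      low_utilization: float = 0.3) -> bool:
    """Hypothetical rule: scale down only when the stage is mostly idle.

    busy_ms_per_second is the busy time per second in milliseconds
    (0 to 1000). The 0.3 threshold is an arbitrary value chosen for
    illustration.
    """
    utilization = busy_ms_per_second / 1000.0
    return utilization < low_utilization


# A stage that always reports 100% busy never qualifies for downscaling,
# regardless of how little data it actually processes.
print(should_scale_down(1000.0))
# A stage reporting 10% busy would qualify under this rule.
print(should_scale_down(100.0))
```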

How are those metrics being reported within the IO classes? Is this
related to how the pipeline is translated into Flink, or to the source
connector itself?
