[ https://issues.apache.org/jira/browse/FLINK-36939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17943940#comment-17943940 ]
Keith Lee commented on FLINK-36939:
-----------------------------------
Refactored the changes for https://issues.apache.org/jira/browse/FLINK-36947,
making changes to KinesisShardSplitReaderBase so that both problems are
addressed: the high CPU utilisation in EFO mode reported in this issue, and
the GetRecords throttling in polling mode.
See PR: https://github.com/apache/flink-connector-aws/pull/195
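
For readers unfamiliar with the busy-wait pattern described in the issue, here is a minimal sketch of the general idea. It is not taken from the PR or from the connector: the class IdlePollSketch and both method names are hypothetical, and a plain java.util.concurrent.BlockingQueue stands in for whatever hands records to the split reader. It only illustrates why a loop that returns immediately on an empty queue burns CPU, while a poll with a timeout lets the thread sleep until data arrives.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class IdlePollSketch {

    /** Busy-wait style: non-blocking poll() returns at once, so an empty queue makes the loop spin. */
    static long spinUntilDeadline(BlockingQueue<String> queue, long durationMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        long iterations = 0;
        while (System.nanoTime() < deadline) {
            queue.poll(); // returns null immediately when the queue is empty
            iterations++;
        }
        return iterations;
    }

    /** Blocking style: poll(timeout) parks the thread until a record arrives or the timeout elapses. */
    static long blockUntilDeadline(BlockingQueue<String> queue, long durationMillis, long pollTimeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        long iterations = 0;
        while (System.nanoTime() < deadline) {
            try {
                queue.poll(pollTimeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        BlockingQueue<String> empty = new LinkedBlockingQueue<>();
        // Both loops watch an empty queue for the same wall-clock time.
        System.out.println("spin iterations:     " + spinUntilDeadline(empty, 200));
        System.out.println("blocking iterations: " + blockUntilDeadline(empty, 200, 50));
    }
}
```

On an idle queue the first loop typically runs many thousands of iterations in the same window in which the second runs only a handful, which matches the kind of elevated baseline CPU reported here. The 200 ms window and 50 ms timeout are arbitrary values chosen for the illustration.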
> High CPU Utilization with Flink Kinesis EFO Consumer
> ----------------------------------------------------
>
> Key: FLINK-36939
> URL: https://issues.apache.org/jira/browse/FLINK-36939
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Kinesis
> Affects Versions: 1.20.0, aws-connector-5.0.0
> Reporter: Keith Lee
> Priority: Major
> Attachments: Main.kt, Screenshot 1734584639640.png, Screenshot
> 1734584781285.png, image-2025-01-10-12-43-29-262.png,
> image-2025-01-10-12-44-48-869.png, image-2025-01-10-12-51-04-104.png,
> image-2025-01-10-12-51-36-141.png, image.png
>
>
> Observation: When EFO is enabled, the CPU usage spikes and stays elevated,
> regardless of record volume. If we switch back to the standard polling
> consumer (disabling EFO), CPU utilization returns to normal levels.
> Profiling Results: Local profiling and flamegraphs suggest the connector may
> be engaged in a busy-wait loop, continuously parking and un-parking threads
> even when no data is available. This behavior consumes CPU cycles
> unnecessarily.
> Performance Impact: While the job still processes records correctly when they
> arrive, the high baseline CPU consumption is concerning. It wastes resources
> and triggers unnecessary scaling, which doesn’t resolve the issue since new
> instances also experience the same CPU pattern.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)