[ https://issues.apache.org/jira/browse/FLINK-40038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-40038:
-----------------------------------
Labels: pull-request-available (was: )
> [mysql][pipeline] Incremental sync throughput is low in hotspot UPDATE
> workloads due to deserialization overhead
> ----------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-40038
> URL: https://issues.apache.org/jira/browse/FLINK-40038
> Project: Flink
> Issue Type: Improvement
> Components: Flink CDC
> Affects Versions: cdc-3.7.0
> Reporter: 牛一凡
> Priority: Minor
> Labels: pull-request-available
> Attachments: image-2026-07-01-18-11-43-636.png
>
>
> *Motivation*
> We observed low incremental sync throughput on a MySQL-to-Doris Flink CDC
> pipeline under a hotspot UPDATE workload against a large table. In this
> scenario, the downstream fell behind the upstream and the job built up an
> obvious backlog during incremental synchronization.
> The job's flame graph shows that a significant portion of CPU time is spent
> in the MySQL pipeline deserialization path, especially in repeated
> schema/data type inference during row deserialization. This overhead becomes
> more noticeable when a table receives frequent UPDATE events.
> A related performance concern was raised in FLINK-35715, but in our workload
> the bottleneck still exists and is severe enough to cause persistent
> catch-up lag in production-like environments.
> It would be great to investigate and optimize this further.
> *Flame Graph*
> !image-2026-07-01-18-11-43-636.png!
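
The repeated schema/data type inference described under Motivation could, in principle, be avoided by memoizing the inferred type per schema. The following is only a minimal sketch of that general idea, not the Flink CDC implementation and not the fix proposed in the linked pull request. The class `CachedTypeInference` and all of its members are hypothetical names introduced for illustration, and the sketch assumes that the schema object has value-based `equals`/`hashCode` and that the cache is used from a single thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical illustration only: none of these names come from Flink CDC.
 * Caches the result of an expensive per-schema type inference so that it
 * runs once per distinct schema instead of once per row.
 * Not thread-safe; assumes one instance per deserializer thread.
 */
final class CachedTypeInference<S, T> {
    private final Map<S, T> cache = new HashMap<>();
    private final Function<S, T> inferType;
    private int inferenceCalls = 0;

    CachedTypeInference(Function<S, T> inferType) {
        this.inferType = inferType;
    }

    /** Returns the cached type for this schema, inferring it on first use. */
    T typeFor(S schema) {
        T cached = cache.get(schema);
        if (cached == null) {
            // Cache miss: pay the inference cost once for this schema.
            inferenceCalls++;
            cached = inferType.apply(schema);
            cache.put(schema, cached);
        }
        return cached;
    }

    /** Number of times the underlying inference function was invoked. */
    int inferenceCalls() {
        return inferenceCalls;
    }
}
```

With such a cache, a burst of UPDATE events on one table would trigger inference once for that table's schema and reuse the result afterwards. A real implementation would also need to invalidate or replace entries when a schema change event arrives, which this sketch does not cover.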