[ 
https://issues.apache.org/jira/browse/FLINK-40038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

牛一凡 updated FLINK-40038:
------------------------
    Issue Type: Improvement  (was: Bug)

> [mysql][pipeline] Incremental sync throughput is low in hotspot UPDATE 
> workloads due to deserialization overhead
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-40038
>                 URL: https://issues.apache.org/jira/browse/FLINK-40038
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>    Affects Versions: cdc-3.7.0
>            Reporter: 牛一凡
>            Priority: Minor
>         Attachments: image-2026-07-01-18-11-43-636.png
>
>
> ## Motivation
> We observed low incremental sync throughput on a MySQL-to-Doris Flink CDC 
> pipeline under a hotspot UPDATE workload on a large table. In this scenario, 
> the downstream fell behind the upstream and the job accumulated an obvious 
> backlog during incremental synchronization.
> After collecting and analyzing the job flame graph, we found that a 
> significant portion of the CPU time was spent in the MySQL pipeline 
> deserialization path, especially around repeated schema/data type inference 
> during row deserialization. This overhead becomes more noticeable when a 
> table receives frequent UPDATE events.
> A related performance concern was raised in 
> [FLINK-35715|https://issues.apache.org/jira/browse/FLINK-35715], but in our 
> workload the bottleneck persists and is severe enough to cause persistent 
> catch-up lag in production-like environments.
> It would be great to investigate and optimize this path further.
> ## Flame Graph
> !image-2026-07-01-18-11-43-636.png!
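
One possible direction for the repeated schema/data type inference described under Motivation is to memoize the inferred types per table and schema version, so that a hot table pays the inference cost once instead of once per UPDATE event. The following is a minimal sketch of that idea only. It does not use the Flink CDC or Debezium APIs; `SchemaInferenceCache`, `inferTypes`, `typesFor`, and the string-based type mapping are hypothetical names and simplifications introduced for illustration.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class SchemaInferenceCache {
    /** Counts how often the expensive path runs; for demonstration only. */
    static final AtomicInteger INFERENCE_CALLS = new AtomicInteger();

    /** Inferred types keyed by "tableId@schemaVersion" (hypothetical key format). */
    private static final Map<String, List<String>> CACHE = new ConcurrentHashMap<>();

    /** Hypothetical stand-in for the costly per-row schema/data type inference. */
    static List<String> inferTypes(List<String> mysqlColumnTypes) {
        INFERENCE_CALLS.incrementAndGet();
        return mysqlColumnTypes.stream()
                .map(t -> t.startsWith("varchar") ? "STRING" : t.toUpperCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    /** Returns cached types; runs inference only on the first call for a given key. */
    static List<String> typesFor(String tableId, long schemaVersion, List<String> mysqlColumnTypes) {
        String key = tableId + "@" + schemaVersion;
        return CACHE.computeIfAbsent(key, k -> inferTypes(mysqlColumnTypes));
    }

    public static void main(String[] args) {
        List<String> columns = List.of("bigint", "varchar(32)");
        // Simulate a burst of UPDATE events on the same table and schema.
        for (int i = 0; i < 1000; i++) {
            typesFor("db.orders", 1L, columns);
        }
        System.out.println("inference calls after 1000 rows: " + INFERENCE_CALLS.get());
        // A schema change (modeled here as a new version) triggers one fresh inference.
        typesFor("db.orders", 2L, columns);
        System.out.println("inference calls after schema change: " + INFERENCE_CALLS.get());
    }
}
```

In this sketch the cache key must change whenever the table schema changes. A real deserializer would need to invalidate or re-key the entry on schema change events, which is assumed here rather than shown.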



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
