[ 
https://issues.apache.org/jira/browse/FLINK-40038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-40038:
-----------------------------------
    Labels: pull-request-available  (was: )

> [mysql][pipeline] Incremental sync throughput is low in hotspot UPDATE 
> workloads due to deserialization overhead
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-40038
>                 URL: https://issues.apache.org/jira/browse/FLINK-40038
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>    Affects Versions: cdc-3.7.0
>            Reporter: 牛一凡
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: image-2026-07-01-18-11-43-636.png
>
>
> *Motivation*
> We observed low incremental sync throughput on a MySQL-to-Doris Flink CDC 
> pipeline under a hotspot UPDATE workload on a large table. In this scenario, 
> the downstream fell behind the upstream and the job built up a visible 
> backlog during incremental synchronization.
> After collecting and analyzing the job flame graph, we found that a 
> significant portion of the CPU time was spent in the MySQL pipeline 
> deserialization path, especially around repeated schema/data type inference 
> during row deserialization. This overhead becomes more noticeable when a 
> table receives frequent UPDATE events.
> A related performance concern was raised in FLINK-35715, but in our 
> workload the bottleneck persists and is severe enough that the job 
> struggles to catch up in production-like environments.
> It would be great to investigate and optimize this path further.
> *Flame Graph*
> !image-2026-07-01-18-11-43-636.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
