[
https://issues.apache.org/jira/browse/FLINK-36081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ruan Hang resolved FLINK-36081.
-------------------------------
Resolution: Fixed
Fixed in master(3.3-SNAPSHOT): 77c63385d947f3bb8e726561a7f01cd383941a96
> Flink CDC MySQL source connector missing some columns data of newly added
> tables
> --------------------------------------------------------------------------------
>
> Key: FLINK-36081
> URL: https://issues.apache.org/jira/browse/FLINK-36081
> Project: Flink
> Issue Type: Bug
> Components: Flink CDC
> Affects Versions: cdc-3.1.1
> Environment: jdk 11
> flink 1.17
> flinkcdc 3.0.0
> Reporter: Mingya Wang
> Assignee: Mingya Wang
> Priority: Major
> Labels: mysql-cdc-connector, pull-request-available
> Fix For: cdc-3.3.0
>
>
> *Problem Description:*
> When a table is added to a job, the Flink CDC MySQL source connector emits
> records that are missing data for some columns of the newly added table.
> *Reproduction Scenario:*
> # Remove a table from a CDC job that is running normally, then restart the
> job so that it resumes from its previous state.
> # Add a column to the removed table.
> # Add the table back to the job. The job keeps running after the table is
> added, but the synchronized data contains no values for the newly added
> columns.
> *Cause Analysis:*
> The MySQL CDC Source keeps table schemas in its state. When a table is
> added, the source recovers that table's schema from the previous state if
> one exists there. Because the recovered schema describes the table as it was
> before the column addition, the source emits records based on this stale
> schema, and the records sent downstream lack the fields corresponding to the
> newly added columns.
> *Proposed Solution:*
> When a table is removed from the CDC job, the table must also be removed
> from the MySQLBinlogSplit.
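
The cause and the proposed fix can be illustrated with a small standalone
model. This is only a sketch under stated assumptions, not the connector's
code: the class StaleSchemaSketch, its methods, and the table names are all
hypothetical, and a plain map from table name to column list stands in for
the schemas held in the binlog split state.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a schema cache kept in source state.
// None of these names come from Flink CDC.
public class StaleSchemaSketch {

    /** Builds the outgoing record from the columns of the cached schema only. */
    static Map<String, Object> emit(Map<String, List<String>> schemas,
                                    String table,
                                    Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String column : schemas.get(table)) {
            out.put(column, row.get(column));
        }
        return out;
    }

    /** Proposed fix: drop cached schemas of tables that are no longer captured. */
    static Map<String, List<String>> pruneRemovedTables(
            Map<String, List<String>> restored, Set<String> captured) {
        Map<String, List<String>> kept = new HashMap<>(restored);
        kept.keySet().retainAll(captured);
        return kept;
    }

    /** Uses the cached schema if present, otherwise reads the live one. */
    static List<String> resolveSchema(Map<String, List<String>> cache,
                                      String table,
                                      Map<String, List<String>> live) {
        return cache.computeIfAbsent(table, live::get);
    }

    public static void main(String[] args) {
        // Schemas restored from state, captured before the column was added.
        Map<String, List<String>> restored = new HashMap<>();
        restored.put("db.orders", List.of("id", "amount"));
        restored.put("db.users", List.of("id", "name"));

        // Live schema after a column "note" was added to db.orders.
        Map<String, List<String>> live = new HashMap<>();
        live.put("db.orders", List.of("id", "amount", "note"));

        Map<String, Object> row = new HashMap<>();
        row.put("id", 1);
        row.put("amount", 9.5);
        row.put("note", "gift");

        // Without pruning, the stale cached schema wins and "note" is dropped.
        resolveSchema(restored, "db.orders", live);
        System.out.println("stale: " + emit(restored, "db.orders", row));

        // With pruning, db.orders is forgotten while it is not captured,
        // so re-adding it picks up the live schema including "note".
        Map<String, List<String>> pruned =
                pruneRemovedTables(restored, Set.of("db.users"));
        resolveSchema(pruned, "db.orders", live);
        System.out.println("fixed: " + emit(pruned, "db.orders", row));
    }
}
```

In this model the stale entry survives only because nothing evicts it when
the table leaves the captured set; pruning at that point forces a fresh
schema lookup when the table is added back.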