[
https://issues.apache.org/jira/browse/FLINK-37065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-37065:
-----------------------------------
Labels: Flink-CDC pull-request-available (was: Flink-CDC)
> MySQL cdc can lose/skip data during recovering from the checkpoint
> ------------------------------------------------------------------
>
> Key: FLINK-37065
> URL: https://issues.apache.org/jira/browse/FLINK-37065
> Project: Flink
> Issue Type: Bug
> Affects Versions: cdc-3.2.0, cdc-3.2.1
> Reporter: Ihor Mielientiev
> Priority: Major
> Labels: Flink-CDC, pull-request-available
>
> During each pipeline start (e.g., failover or restart), the Flink CDC
> connector retrieves the current GTID sets from the MySQL server and merges it
> with the pipeline's current state. This merged GTID set is then sent back to
> the MySQL server to indicate which transactions the Flink pipeline has
> already processed, ensuring that the server doesn’t resend processed
> transactions.
>
> Flink CDC MySQL Connector uses the
> [fixRestoredGtidSet|https://github.com/apache/flink-cdc/blob/master/flink-cdc-connect/flink-cdc-source-connectors/flink-connector-mysql-cdc/src/main/java/io/debezium/connector/mysql/GtidUtils.java#L36]
> method to merge the GTID sets from the server with the GTID sets from the
> checkpoint. The method ensures that Flink will "tell" MySQL to skip over
> transactions it has already processed, avoiding duplication. However, the
> current implementation of this method doesn’t handle gaps caused by MySQL
> parallel execution. For example, if the restored GTID set is 1-80:83-90:92-98
> and the server GTID set is 1-100, the method will merge gaps together and
> result will be 1-98, since it selects the highest gtid from checkpoint
> So in case if the pipeline has been restarted during any “gaps”, Flink CDC
> won’t process “gapped” transactions and will lose the data
--
This message was sent by Atlassian Jira
(v8.20.10#820010)