lvyanquan commented on code in PR #3874:
URL: https://github.com/apache/flink-cdc/pull/3874#discussion_r2043424004
##########
flink-cdc-connect/flink-cdc-source-connectors/flink-connector-mysql-cdc/src/main/java/org/apache/flink/cdc/connectors/mysql/debezium/reader/BinlogSplitReader.java:
##########
@@ -257,6 +259,10 @@ private boolean shouldEmit(SourceRecord sourceRecord) {
// only tables that captured snapshot splits need to be filtered
if (finishedSplitsInfo.containsKey(tableId)) {
+ // if backfill is skipped, no filtering is needed
+ if (isBackfillSkipped) {
+ return true;
Review Comment:
Currently, there are relatively few tests covering skip backfill. Could you add the
necessary tests to verify its correctness?
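
For readers following the thread, the emit decision being discussed roughly has the shape below. This is a simplified, self-contained sketch rather than the actual BinlogSplitReader code; the types and the watermark comparison are stand-ins, and only the isBackfillSkipped short-circuit mirrors the diff.

    // Simplified sketch of the emit decision discussed above. The surrounding
    // types are stand-ins; only the skip-backfill short-circuit mirrors the diff.
    import java.util.Map;

    class EmitDecisionSketch {
        private final boolean isBackfillSkipped;
        // tableId -> high watermark offset recorded when the snapshot split finished
        private final Map<String, Long> finishedSplitHighWatermark;

        EmitDecisionSketch(boolean isBackfillSkipped, Map<String, Long> finishedSplitHighWatermark) {
            this.isBackfillSkipped = isBackfillSkipped;
            this.finishedSplitHighWatermark = finishedSplitHighWatermark;
        }

        boolean shouldEmit(String tableId, long recordOffset) {
            // Only tables that were captured by snapshot splits need de-duplication.
            if (finishedSplitHighWatermark.containsKey(tableId)) {
                // If backfill was skipped, no change events were folded into the
                // snapshot output, so there is nothing to de-duplicate: emit everything.
                if (isBackfillSkipped) {
                    return true;
                }
                // Otherwise emit only records beyond the split's high watermark.
                return recordOffset > finishedSplitHighWatermark.get(tableId);
            }
            return true;
        }
    }

A test along the lines the reviewer asks for would exercise both branches: with skip backfill enabled, records at or below the high watermark must still be emitted; with it disabled, they must be filtered out.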
##########
flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-mysql/src/main/java/org/apache/flink/cdc/connectors/mysql/source/MySqlDataSourceOptions.java:
##########
@@ -322,4 +322,12 @@ public class MySqlDataSourceOptions {
.defaultValue(false)
.withDescription(
"Whether to assign the unbounded chunks
first during snapshot reading phase. This might help reduce the risk of the
TaskManager experiencing an out-of-memory (OOM) error when taking a snapshot of
the largest unbounded chunk. Defaults to false.");
+
+ @Experimental
+    public static final ConfigOption<Boolean> SCAN_INCREMENTAL_SNAPSHOT_BACKFILL_SKIP =
+ ConfigOptions.key("scan.incremental.snapshot.backfill.skip")
Review Comment:
Also, please update the documentation to include this new config option.
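
For context, the truncated option definition above would presumably complete the usual ConfigOptions builder chain. A hedged sketch follows; the description wording is illustrative and not taken from the PR.

    // Hypothetical completion of the new option, following the builder pattern used
    // elsewhere in MySqlDataSourceOptions. The description text is illustrative only.
    @Experimental
    public static final ConfigOption<Boolean> SCAN_INCREMENTAL_SNAPSHOT_BACKFILL_SKIP =
            ConfigOptions.key("scan.incremental.snapshot.backfill.skip")
                    .booleanType()
                    .defaultValue(false)
                    .withDescription(
                            "Whether to skip reading the binlog between the low and high"
                                    + " watermark of each snapshot split (the backfill phase)."
                                    + " Skipping backfill relaxes the snapshot phase from"
                                    + " exactly-once to at-least-once semantics. Defaults to false.");

The documentation update the reviewer asks for would add this key, its default value, and the semantics trade-off to the MySQL connector's option table.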