lvyanquan commented on code in PR #3874:
URL: https://github.com/apache/flink-cdc/pull/3874#discussion_r2043424004
##########
flink-cdc-connect/flink-cdc-source-connectors/flink-connector-mysql-cdc/src/main/java/org/apache/flink/cdc/connectors/mysql/debezium/reader/BinlogSplitReader.java:
##########
@@ -257,6 +259,10 @@ private boolean shouldEmit(SourceRecord sourceRecord) {
// only tables that captured snapshot splits need to be filtered
if (finishedSplitsInfo.containsKey(tableId)) {
+ // if backfill is skipped, no filtering is needed
+ if (isBackfillSkipped) {
+ return true;
Review Comment:
Currently, there are relatively few tests covering skip backfill. Could you add the
necessary tests to verify its correctness?
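
For readers following the thread, the emit decision being discussed roughly has the shape below. This is a simplified, self-contained sketch rather than the actual BinlogSplitReader code; the types and the watermark comparison are stand-ins, and only the isBackfillSkipped short-circuit mirrors the diff.

    // Simplified sketch of the emit decision discussed above. The surrounding
    // types are stand-ins; only the skip-backfill short-circuit mirrors the diff.
    import java.util.Map;

    class EmitDecisionSketch {
        private final boolean isBackfillSkipped;
        // tableId -> high watermark offset recorded when the snapshot split finished
        private final Map<String, Long> finishedSplitHighWatermark;

        EmitDecisionSketch(boolean isBackfillSkipped, Map<String, Long> finishedSplitHighWatermark) {
            this.isBackfillSkipped = isBackfillSkipped;
            this.finishedSplitHighWatermark = finishedSplitHighWatermark;
        }

        boolean shouldEmit(String tableId, long recordOffset) {
            // Only tables that were captured by snapshot splits need de-duplication.
            if (finishedSplitHighWatermark.containsKey(tableId)) {
                // If backfill was skipped, no change events were folded into the
                // snapshot output, so there is nothing to de-duplicate: emit everything.
                if (isBackfillSkipped) {
                    return true;
                }
                // Otherwise emit only records beyond the split's high watermark.
                return recordOffset > finishedSplitHighWatermark.get(tableId);
            }
            return true;
        }
    }

A test along the lines the reviewer asks for would exercise both branches: with skip backfill enabled, records at or below the high watermark must still be emitted; with it disabled, they must be filtered out.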
##########
flink-cdc-connect/flink-cdc-pipeline-connectors/flink-cdc-pipeline-connector-mysql/src/main/java/org/apache/flink/cdc/connectors/mysql/source/MySqlDataSourceOptions.java:
##########
@@ -322,4 +322,12 @@ public class MySqlDataSourceOptions {
.defaultValue(false)
.withDescription(
"Whether to assign the unbounded chunks
first during snapshot reading phase. This might help reduce the risk of the
TaskManager experiencing an out-of-memory (OOM) error when taking a snapshot of
the largest unbounded chunk. Defaults to false.");
+
+ @Experimental
+    public static final ConfigOption<Boolean> SCAN_INCREMENTAL_SNAPSHOT_BACKFILL_SKIP =
+ ConfigOptions.key("scan.incremental.snapshot.backfill.skip")
Review Comment:
Also, please update the documentation to include this new config option.
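
For context, the truncated option definition above would presumably complete the usual ConfigOptions builder chain. A hedged sketch follows; the description wording is illustrative and not taken from the PR.

    // Hypothetical completion of the new option, following the builder pattern used
    // elsewhere in MySqlDataSourceOptions. The description text is illustrative only.
    @Experimental
    public static final ConfigOption<Boolean> SCAN_INCREMENTAL_SNAPSHOT_BACKFILL_SKIP =
            ConfigOptions.key("scan.incremental.snapshot.backfill.skip")
                    .booleanType()
                    .defaultValue(false)
                    .withDescription(
                            "Whether to skip reading the binlog between the low and high"
                                    + " watermark of each snapshot split (the backfill phase)."
                                    + " Skipping backfill relaxes the snapshot phase from"
                                    + " exactly-once to at-least-once semantics. Defaults to false.");

The documentation update the reviewer asks for would add this key, its default value, and the semantics trade-off to the MySQL connector's option table.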