[GitHub] [hudi] LinMingQiang commented on a diff in pull request #6520: [HUDI-4726] Incremental input splits result is not as expected when f…

GitBox Tue, 30 Aug 2022 06:00:43 -0700


LinMingQiang commented on code in PR #6520:
URL: https://github.com/apache/hudi/pull/6520#discussion_r958448326



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/IncrementalInputSplits.java:
##########
@@ -128,12 +128,14 @@ public Result inputSplits(
       return Result.EMPTY;
     }
 
+    // The value may be 'earliest' or Null or outOfRange.
     final String startCommit = this.conf.getString(FlinkOptions.READ_START_COMMIT);
-    final String endCommit = this.conf.getString(FlinkOptions.READ_END_COMMIT);

Review Comment:
   The problem now is that when read.start-commit is out of range, the whole 
table is scanned no matter what read.end-commit is set to. This scans 
redundant data and ultimately produces incorrect results. Therefore, I think 
the entire table should be scanned only when 
(start-commit is null / out of range / earliest) && (end-commit is null / out of range). 
(The current logic is: startFromEarliest || startOutOfRange || endOutOfRange)
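
   To make the difference between the two conditions concrete, here is a 
minimal sketch. It is not the actual IncrementalInputSplits code: the class 
name, method names, and boolean parameters are hypothetical and stand in for 
however the real implementation derives these flags from the timeline.

```java
// Illustrative only: hypothetical helpers comparing the two full-scan conditions.
class FullScanCondition {

    // Condition as described for the current logic:
    // any single flag is enough to trigger a full table scan.
    static boolean currentFullScan(boolean startFromEarliest,
                                   boolean startOutOfRange,
                                   boolean endOutOfRange) {
        return startFromEarliest || startOutOfRange || endOutOfRange;
    }

    // Proposed condition: scan the whole table only when BOTH bounds
    // fail to restrict the range.
    static boolean proposedFullScan(boolean startNull,
                                    boolean startFromEarliest,
                                    boolean startOutOfRange,
                                    boolean endNull,
                                    boolean endOutOfRange) {
        boolean startUnbounded = startNull || startFromEarliest || startOutOfRange;
        boolean endUnbounded = endNull || endOutOfRange;
        return startUnbounded && endUnbounded;
    }

    public static void main(String[] args) {
        // Start commit is out of range, but a valid end commit is set.
        System.out.println(currentFullScan(false, true, false));
        System.out.println(proposedFullScan(false, false, true, false, false));
    }
}
```

   With an out-of-range start commit and a valid end commit, the first 
condition asks for a full scan while the second does not, which is the case 
that currently yields redundant data.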



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
