stevenzwu commented on code in PR #7338:
URL: https://github.com/apache/iceberg/pull/7338#discussion_r1165658208
##########
flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/source/FlinkSplitPlanner.java:
##########
@@ -136,12 +137,21 @@ static CloseableIterable<CombinedScanTask> planTasks(
}
}
- private enum ScanMode {
+ @VisibleForTesting
+ enum ScanMode {
BATCH,
INCREMENTAL_APPEND_SCAN
}
- private static ScanMode checkScanMode(ScanContext context) {
+ @VisibleForTesting
+ static ScanMode checkScanMode(ScanContext context) {
Review Comment:
@chenjunjiedada thanks for catching the bug and creating the PR with the fix.
For the conditions here, is there a simpler way to express the logic? E.g., would it
be enough to just remove the `context.isStreaming()` condition from the original if
clause? A rough sketch of what I mean is below.
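A minimal sketch of that idea, assuming `checkScanMode` currently branches on
`isStreaming()` plus the snapshot/tag range options (the accessor names here are from
memory, not copied from the PR):
```
// Hypothetical simplification: decide the scan mode purely from the presence of an
// incremental range, without consulting context.isStreaming().
@VisibleForTesting
static ScanMode checkScanMode(ScanContext context) {
  if (context.startSnapshotId() != null
      || context.endSnapshotId() != null
      || context.startTag() != null
      || context.endTag() != null) {
    return ScanMode.INCREMENTAL_APPEND_SCAN;
  } else {
    return ScanMode.BATCH;
  }
}
```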
Also, I think it would be safer/clearer to construct a new `ScanContext`
object and set the `useSnapshotId` on it, e.g. where the initial batch table scan is planned:
```
if (scanContext.streamingStartingStrategy()
    == StreamingStartingStrategy.TABLE_SCAN_THEN_INCREMENTAL) {
  // do a batch table scan first
  splits = FlinkSplitPlanner.planIcebergSourceSplits(table, scanContext, workerPool);
  LOG.info(
      "Discovered {} splits from initial batch table scan with snapshot Id {}",
      splits.size(),
      startSnapshot.snapshotId());
  // For TABLE_SCAN_THEN_INCREMENTAL, incremental mode starts exclusive from the startSnapshot
  toPosition =
      IcebergEnumeratorPosition.of(startSnapshot.snapshotId(), startSnapshot.timestampMillis());
```
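And a hedged sketch of the `ScanContext` suggestion, assuming a copy-style helper is
added for this purpose (`copyWithSnapshotId` below is a hypothetical name, not an
existing method in the PR):
```
// Hypothetical sketch only: build a dedicated batch-scan context with the starting
// snapshot pinned, instead of reusing the streaming scanContext directly.
// copyWithSnapshotId(...) is an assumed helper that would set useSnapshotId on a copy.
ScanContext initialBatchScanContext = scanContext.copyWithSnapshotId(startSnapshot.snapshotId());
splits =
    FlinkSplitPlanner.planIcebergSourceSplits(table, initialBatchScanContext, workerPool);
```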