aokolnychyi commented on code in PR #4382:
URL: https://github.com/apache/iceberg/pull/4382#discussion_r842005427


##########
core/src/main/java/org/apache/iceberg/BaseMetadataTableScan.java:
##########
@@ -31,11 +37,30 @@ protected BaseMetadataTableScan(TableOperations ops, Table table, Schema schema,
     super(ops, table, schema, context);
   }
 
+  /**
+   * @return if metadata table scan is for all snapshots, ie 'all_x' metadata tables
+   */
+  protected boolean allScan() {
+    return false;
+  }
+
   @Override
   public long targetSplitSize() {
     long tableValue = tableOps().current().propertyAsLong(
         TableProperties.METADATA_SPLIT_SIZE,
         TableProperties.METADATA_SPLIT_SIZE_DEFAULT);
     return PropertyUtil.propertyAsLong(options(), TableProperties.SPLIT_SIZE, tableValue);
   }
+
+  @Override
+  public CloseableIterable<FileScanTask> planFiles() {

Review Comment:
   I think we should override `planFiles` in `BaseAllMetadataTableScan` instead.



##########
core/src/main/java/org/apache/iceberg/AllDataFilesTable.java:
##########
@@ -108,30 +85,15 @@ public TableScan asOfTime(long timestampMillis) {
     }
 
     @Override
-    protected CloseableIterable<FileScanTask> planFiles(
-        TableOperations ops, Snapshot snapshot, Expression rowFilter,
-        boolean ignoreResiduals, boolean caseSensitive, boolean colStats) {
-      CloseableIterable<ManifestFile> manifests = allDataManifestFiles(
-          ops.current().snapshots(), context().planExecutor());
-      String schemaString = SchemaParser.toJson(schema());
-      String specString = PartitionSpecParser.toJson(PartitionSpec.unpartitioned());
-      Expression filter = ignoreResiduals ? Expressions.alwaysTrue() : rowFilter;
-      ResidualEvaluator residuals = ResidualEvaluator.unpartitioned(filter);
-
-      return CloseableIterable.transform(manifests, manifest ->
-          new ManifestReadTask(ops.io(), ops.current().specsById(), manifest, schema(),
-              schemaString, specString, residuals));
-    }
-  }
-
-  private static CloseableIterable<ManifestFile> allDataManifestFiles(
-      List<Snapshot> snapshots, ExecutorService workerPool) {
-    try (CloseableIterable<ManifestFile> iterable = new ParallelIterable<>(
-        Iterables.transform(snapshots, snapshot -> (Iterable<ManifestFile>) () -> snapshot.dataManifests().iterator()),
-        workerPool)) {
-      return CloseableIterable.withNoopClose(Sets.newHashSet(iterable));
-    } catch (IOException e) {
-      throw new RuntimeIOException(e, "Failed to close parallel iterable");
+    protected CloseableIterable<ManifestFile> manifests() {
+      try (CloseableIterable<ManifestFile> iterable = new ParallelIterable<>(
+          Iterables.transform(table().snapshots(),
+              snapshot -> (Iterable<ManifestFile>) () -> snapshot.dataManifests().iterator()),

Review Comment:
   nit: I think you can drop the cast (which seems redundant) and move the 
entire call to `Iterables` on one line.
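
   For reference, a minimal stand-alone sketch of why the cast may be unnecessary: when the call has a target type, Java can infer the type of the nested lambda from it. The `LazyIterables` class and its `transform` helper are hypothetical stand-ins, not Guava's `Iterables.transform` or Iceberg code, so whether the cast can really be dropped in `manifests()` still depends on how inference flows through the `ParallelIterable` constructor there.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class LazyIterables {
  // Hypothetical, eager stand-in for a transform helper (assumption: not Guava's API).
  public static <F, T> List<T> transform(List<F> input, Function<? super F, ? extends T> fn) {
    List<T> result = new ArrayList<>();
    for (F item : input) {
      result.add(fn.apply(item));
    }
    return result;
  }

  // Wraps each group in a lazy Iterable, then flattens the groups in order.
  public static List<String> flatten(List<List<String>> groups) {
    // No cast: T is inferred as Iterable<String> from the assignment target,
    // which gives the inner lambda its functional interface type.
    List<Iterable<String>> lazy = transform(groups, group -> () -> group.iterator());

    List<String> flat = new ArrayList<>();
    for (Iterable<String> iterable : lazy) {
      for (String value : iterable) {
        flat.add(value);
      }
    }
    return flat;
  }

  public static void main(String[] args) {
    System.out.println(flatten(Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"))));
  }
}
```

   If the compiler cannot see a target type at the call site, the cast (or an explicit type witness) would still be needed, so this is worth checking by compiling the change.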



##########
core/src/main/java/org/apache/iceberg/BaseFilesTable.java:
##########
@@ -108,23 +108,21 @@ public TableScan appendsAfter(long fromSnapshotId) {
     }
 
     /**
-     * @return list of manifest files to explore for this files metadata table scan
+     * @return iterable of manifest files to explore for this files metadata table scan
      */
-    protected abstract List<ManifestFile> manifests();
+    protected abstract CloseableIterable<ManifestFile> manifests();
 
-    private CloseableIterable<ManifestFile> filterManifests(List<ManifestFile> manifests,
+    private CloseableIterable<ManifestFile> filterManifests(CloseableIterable<ManifestFile> manifests,

Review Comment:
   nit: not necessarily related but could you fix the arg alignment in 
`planFiles` above?



##########
core/src/main/java/org/apache/iceberg/BaseMetadataTableScan.java:
##########
@@ -31,11 +37,30 @@ protected BaseMetadataTableScan(TableOperations ops, Table table, Schema schema,
     super(ops, table, schema, context);
   }
 
+  /**
+   * @return if metadata table scan is for all snapshots, ie 'all_x' metadata tables
+   */
+  protected boolean allScan() {

Review Comment:
   +1, I think `allScan` is not very descriptive



##########
core/src/main/java/org/apache/iceberg/BaseMetadataTableScan.java:
##########
@@ -31,11 +37,30 @@ protected BaseMetadataTableScan(TableOperations ops, Table table, Schema schema,
     super(ops, table, schema, context);
   }
 
+  /**
+   * @return if metadata table scan is for all snapshots, ie 'all_x' metadata tables
+   */
+  protected boolean allScan() {

Review Comment:
   In fact, what about removing this flag completely and overriding `planFiles` 
in `BaseAllMetadataTableScan`?
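
   To illustrate the shape I have in mind, here is a rough sketch with simplified, hypothetical classes (`MetadataScan` and `AllMetadataScan` stand in for `BaseMetadataTableScan` and `BaseAllMetadataTableScan`; snapshots are modeled as plain lists of manifest names, and the base behavior shown is an assumption). The subclass changes planning by overriding `planFiles`, so the base class needs no `allScan()` flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ScanOverrideSketch {
  // Hypothetical stand-in for a base metadata scan: plans the current snapshot only.
  public static class MetadataScan {
    // Each inner list holds the manifest names of one snapshot; the last is current.
    protected final List<List<String>> snapshots;

    public MetadataScan(List<List<String>> snapshots) {
      this.snapshots = snapshots;
    }

    public List<String> planFiles() {
      if (snapshots.isEmpty()) {
        return new ArrayList<>();
      }
      return new ArrayList<>(snapshots.get(snapshots.size() - 1));
    }
  }

  // Hypothetical stand-in for an "all" scan: overrides planFiles, so no flag is needed.
  public static class AllMetadataScan extends MetadataScan {
    public AllMetadataScan(List<List<String>> snapshots) {
      super(snapshots);
    }

    @Override
    public List<String> planFiles() {
      // Distinct manifests across every snapshot, in first-seen order.
      Set<String> distinct = new LinkedHashSet<>();
      for (List<String> manifests : snapshots) {
        distinct.addAll(manifests);
      }
      return new ArrayList<>(distinct);
    }
  }

  public static void main(String[] args) {
    List<List<String>> snapshots = Arrays.asList(
        Arrays.asList("m1", "m2"),
        Arrays.asList("m2", "m3"));
    System.out.println(new MetadataScan(snapshots).planFiles());
    System.out.println(new AllMetadataScan(snapshots).planFiles());
  }
}
```

   The all-snapshots behavior then lives entirely in the subclass, and the other metadata scans never have to know that such a mode exists.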



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]