rdblue commented on a change in pull request #1508:
URL: https://github.com/apache/iceberg/pull/1508#discussion_r717078453
##########
File path: spark2/src/main/java/org/apache/iceberg/spark/source/Reader.java
##########
@@ -157,22 +128,88 @@
this.localityPreferred = false;
}
- this.schema = table.schema();
- this.caseSensitive = caseSensitive;
    this.batchSize = options.get(SparkReadOptions.VECTORIZATION_BATCH_SIZE)
        .map(Integer::parseInt)
        .orElseGet(() -> PropertyUtil.propertyAsInt(table.properties(),
            TableProperties.PARQUET_BATCH_SIZE,
            TableProperties.PARQUET_BATCH_SIZE_DEFAULT));
RuntimeConfig sessionConf = SparkSession.active().conf();
this.readTimestampWithoutZone =
SparkUtil.canHandleTimestampWithoutZone(options.asMap(), sessionConf);
}
+ private void validateOptions(
Review comment:
I guess so, since this is already done.
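
The batch-size resolution in the hunk above follows a common two-level fallback: a per-read option wins, otherwise the table property (or its default) is used. A minimal sketch of that pattern, with hypothetical key names standing in for `SparkReadOptions.VECTORIZATION_BATCH_SIZE` and `TableProperties.PARQUET_BATCH_SIZE`:

```java
import java.util.Map;
import java.util.Optional;

public class BatchSizeResolver {
    // Illustrative keys only; the real constants live in SparkReadOptions
    // and TableProperties.
    static final String READ_OPTION = "vectorization.batch-size";
    static final String TABLE_PROPERTY = "read.parquet.vectorization.batch-size";
    static final int DEFAULT = 5000;

    // Resolve batch size: per-read option first, then table property, then default.
    static int batchSize(Map<String, String> readOptions, Map<String, String> tableProps) {
        return Optional.ofNullable(readOptions.get(READ_OPTION))
            .map(Integer::parseInt)
            .orElseGet(() -> Optional.ofNullable(tableProps.get(TABLE_PROPERTY))
                .map(Integer::parseInt)
                .orElse(DEFAULT));
    }

    public static void main(String[] args) {
        System.out.println(batchSize(Map.of(READ_OPTION, "1024"), Map.of()));      // 1024
        System.out.println(batchSize(Map.of(), Map.of(TABLE_PROPERTY, "2048")));   // 2048
        System.out.println(batchSize(Map.of(), Map.of()));                         // 5000
    }
}
```

The read option is checked before the table property so a single scan can override the table-wide setting without mutating table metadata.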
##########
File path: core/src/main/java/org/apache/iceberg/BaseTableScan.java
##########
@@ -123,8 +124,14 @@ public TableScan useSnapshot(long scanSnapshotId) {
"Cannot override snapshot, already set to id=%s",
context.snapshotId());
Preconditions.checkArgument(ops.current().snapshot(scanSnapshotId) != null,
"Cannot find snapshot with ID %s", scanSnapshotId);
- return newRefinedScan(
- ops, table, schema, context.useSnapshotId(scanSnapshotId));
+ if (this instanceof DataTableScan) {
Review comment:
The purpose of calling `newScan` was to reuse the argument validation in
`BaseTableScan`. I'd rather not have that validation change later without
anyone realizing it also needs to be updated here.
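
The validation being discussed is the pair of `checkArgument` calls quoted in the hunk: reject overriding an already-pinned snapshot, and reject an unknown snapshot ID. A simplified, self-contained sketch of that behavior (the class and `checkArgument` helper here are illustrative stand-ins, not Iceberg's actual `BaseTableScan` or Guava's `Preconditions`):

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotScan {
    private final Map<Long, String> snapshots = new HashMap<>();
    private Long snapshotId;  // null until a snapshot is pinned

    SnapshotScan addSnapshot(long id, String manifestLocation) {
        snapshots.put(id, manifestLocation);
        return this;
    }

    // Mirrors the two checkArgument calls in the quoted useSnapshot hunk:
    // fail if a snapshot is already set, or if the requested ID does not exist.
    SnapshotScan useSnapshot(long scanSnapshotId) {
        checkArgument(snapshotId == null,
            "Cannot override snapshot, already set to id=" + snapshotId);
        checkArgument(snapshots.containsKey(scanSnapshotId),
            "Cannot find snapshot with ID " + scanSnapshotId);
        this.snapshotId = scanSnapshotId;
        return this;
    }

    // Minimal stand-in for Guava's Preconditions.checkArgument.
    private static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }
}
```

Keeping both checks in one place is exactly the reviewer's point: if subclasses re-implement the refinement path, a future change to this validation would silently miss them.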
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]