[GitHub] [iceberg] wypoon commented on a change in pull request #1508: Use schema at the time of the snapshot when reading a snapshot.

GitBox Mon, 27 Sep 2021 12:20:57 -0700


wypoon commented on a change in pull request #1508:
URL: https://github.com/apache/iceberg/pull/1508#discussion_r716976453




##########
File path: spark2/src/main/java/org/apache/iceberg/spark/source/Reader.java
##########
@@ -157,22 +128,88 @@
       this.localityPreferred = false;
     }
 
-    this.schema = table.schema();
-    this.caseSensitive = caseSensitive;
     this.batchSize = 
options.get(SparkReadOptions.VECTORIZATION_BATCH_SIZE).map(Integer::parseInt).orElseGet(()
 ->
         PropertyUtil.propertyAsInt(table.properties(),
           TableProperties.PARQUET_BATCH_SIZE, 
TableProperties.PARQUET_BATCH_SIZE_DEFAULT));
     RuntimeConfig sessionConf = SparkSession.active().conf();
     this.readTimestampWithoutZone = 
SparkUtil.canHandleTimestampWithoutZone(options.asMap(), sessionConf);
   }
 
+  private void validateOptions(

Review comment:
       There are two checks here. The first check does not seem to be tested by 
any existing unit test. The second check is tested by an existing unit test. 
Both checks also appear in the spark3 `SparkBatchQueryScan`. For that reason, I 
am preserving them here. I commented in 
https://github.com/wypoon/iceberg/pull/1 that I'd like to defer the clean up of 
duplicate checks to a separate PR, as it should be done for both spark2 and 
spark3 (and here we're only refactoring the spark2 `Reader`). Can that cleanup 
be done after this is merged, since as you remarked before, the change is 
already quite large?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

[GitHub] [iceberg] wypoon commented on a change in pull request #1508: Use schema at the time of the snapshot when reading a snapshot.

Reply via email to