wypoon commented on a change in pull request #1508: URL: https://github.com/apache/iceberg/pull/1508#discussion_r716976453
########## File path: spark2/src/main/java/org/apache/iceberg/spark/source/Reader.java ########## @@ -157,22 +128,88 @@ this.localityPreferred = false; } - this.schema = table.schema(); - this.caseSensitive = caseSensitive; this.batchSize = options.get(SparkReadOptions.VECTORIZATION_BATCH_SIZE).map(Integer::parseInt).orElseGet(() -> PropertyUtil.propertyAsInt(table.properties(), TableProperties.PARQUET_BATCH_SIZE, TableProperties.PARQUET_BATCH_SIZE_DEFAULT)); RuntimeConfig sessionConf = SparkSession.active().conf(); this.readTimestampWithoutZone = SparkUtil.canHandleTimestampWithoutZone(options.asMap(), sessionConf); } + private void validateOptions( Review comment: There are two checks here. The first check does not seem to be tested by any existing unit test. The second check is tested by an existing unit test. Both checks also appear in the spark3 `SparkBatchQueryScan`. For that reason, I am preserving them here. I commented in https://github.com/wypoon/iceberg/pull/1 that I'd like to defer the clean up of duplicate checks to a separate PR, as it should be done for both spark2 and spark3 (and here we're only refactoring the spark2 `Reader`). Can that cleanup be done after this is merged, since as you remarked before, the change is already quite large? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org