rdblue commented on a change in pull request #3722:
URL: https://github.com/apache/iceberg/pull/3722#discussion_r768101609
##########
File path:
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkScanBuilder.java
##########
@@ -65,16 +68,22 @@
this.spark = spark;
this.table = table;
this.readConf = new SparkReadConf(spark, table, options);
+ this.snapshotId = readConf.snapshotId();
+ this.asOfTimestamp = readConf.asOfTimestamp();
this.caseSensitive = readConf.caseSensitive();
}
+ private Schema snapshotSchema() {
Review comment:
I think the schema should be passed into this builder, not resolved
here. The problem with resolving it here is that Spark has already analyzed
the query using the schema returned by `SparkTable`. Whatever `SparkTable`
reported as the schema must be what this class uses as the basis for
projection, or else Iceberg could break resolution -- and that's worse than
using a different projection schema.
For cases where `snapshot-id` or `as-of-timestamp` are passed through read
options, I think we have to use the current table schema.
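To illustrate the suggestion, here is a minimal sketch in plain Java (hypothetical simplified `Schema` and `ScanBuilder` classes, not Iceberg's actual API): the caller, which knows what schema `SparkTable` reported to Spark's analyzer, injects that schema through the constructor, rather than the builder re-resolving it from a snapshot internally.

```java
// Hedged sketch with hypothetical classes; the real SparkScanBuilder has a
// different constructor signature and uses Iceberg's Schema type.
class Schema {
    final java.util.List<String> columns;

    Schema(java.util.List<String> columns) {
        this.columns = columns;
    }
}

class ScanBuilder {
    // Injected by the caller: the same schema SparkTable reported, so the
    // analyzed query and the projection base can never diverge, even when
    // snapshot-id / as-of-timestamp read options select an older snapshot.
    private final Schema schema;

    ScanBuilder(Schema schema) {
        this.schema = schema;
    }

    Schema projectionBase() {
        // No snapshot lookup here; projection uses the injected schema.
        return this.schema;
    }
}
```

The design point is constructor injection: the component that performed analysis owns the schema choice, and the builder stays consistent with it by construction.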
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]