wypoon commented on a change in pull request #1508: URL: https://github.com/apache/iceberg/pull/1508#discussion_r717215294
##########
File path: spark3/src/main/java/org/apache/iceberg/spark/source/IcebergSource.java
##########
@@ -101,24 +103,35 @@ public Table getTable(StructType schema, Transform[] partitioning, Map<String, S
     SparkSession spark = SparkSession.active();
     setupDefaultSparkCatalog(spark);
     String path = options.get("path");
+    Long snapshotId = Spark3Util.propertyAsLong(options, SparkReadOptions.SNAPSHOT_ID, null);
+    Long asOfTimestamp = Spark3Util.propertyAsLong(options, SparkReadOptions.AS_OF_TIMESTAMP, null);

Review comment:
It turns out that you are mistaken. The [`DataSourceV2Relation`](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Utils.scala#L131) is [created](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala#L175-L176) with its output attributes taken from the schema returned by `SparkTable#schema()`. The `SparkTable` is loaded using the catalog and identifier, which is why I need the `SnapshotAwareIdentifier` when loading it: that is what lets me return the snapshot schema from `SparkTable#schema()`.

Without it, the `SparkScanBuilder` still has the `snapshot-id` or `as-of-timestamp` option, but Spark calls its `pruneColumns()` with a `requestedSchema` that is a subset of the table schema rather than of the snapshot schema. Its `build()` then returns a `SparkBatchQueryScan` with an incorrect schema.

When I remove the modifications to `IcebergSource`, `SparkCatalog` and `SparkTable`, the unit tests I added all fail, as I suspected.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
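
The schema mismatch described in the review comment can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyTable`, `requestedColumns`, the column names and the snapshot ids are all hypothetical stand-ins, and none of this is Spark or Iceberg code. It assumes an older snapshot with fewer columns than the current table, and shows that the columns requested from the scan depend on which schema the table reported when the relation was created.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SnapshotSchemaSketch {

  // Hypothetical stand-in for a table that remembers one schema per snapshot.
  static final class ToyTable {
    private final Map<Long, List<String>> schemaBySnapshot;
    private final long currentSnapshotId;

    ToyTable(Map<Long, List<String>> schemaBySnapshot, long currentSnapshotId) {
      this.schemaBySnapshot = schemaBySnapshot;
      this.currentSnapshotId = currentSnapshotId;
    }

    // Returns the schema of the given snapshot, or the current schema
    // when no snapshot id is supplied (null).
    List<String> schema(Long snapshotId) {
      long id = snapshotId != null ? snapshotId : currentSnapshotId;
      return schemaBySnapshot.get(id);
    }
  }

  // Toy version of column pruning: the requested columns are always a
  // subset of whatever schema the table reported, in reported order.
  static List<String> requestedColumns(List<String> reportedSchema, List<String> projection) {
    List<String> requested = new ArrayList<>();
    for (String column : reportedSchema) {
      if (projection.contains(column)) {
        requested.add(column);
      }
    }
    return requested;
  }

  public static void main(String[] args) {
    // Assumption: snapshot 1 predates the addition of the "category" column.
    ToyTable table = new ToyTable(
        Map.of(
            1L, List.of("id", "data"),
            2L, List.of("id", "data", "category")),
        2L);

    // A "select all" projection is derived from the schema the table reported.
    List<String> fromCurrentSchema = table.schema(null);
    List<String> fromSnapshotSchema = table.schema(1L);

    // Table loaded without snapshot information: the request names a column
    // that snapshot 1 does not have.
    System.out.println(requestedColumns(fromCurrentSchema, fromCurrentSchema));

    // Table loaded with the snapshot id: the request matches snapshot 1.
    System.out.println(requestedColumns(fromSnapshotSchema, fromSnapshotSchema));
  }
}
```

In this toy model the scan options alone cannot repair the request, because the projection was already built from the reported schema; that mirrors the argument for carrying the snapshot information into the table load.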