syun64 opened a new issue, #6978: URL: https://github.com/apache/iceberg/issues/6978
### Apache Iceberg version

1.1.0 (latest release)

### Query engine

Spark

### Please describe the bug 🐞

Time travel (reading as of a specific snapshot ID) fails on metadata tables if the Iceberg table has ever undergone schema evolution.

This looks like an unintended side effect of #3722, the PR that allows us to use the snapshot schema when reading a snapshot.

Since schema evolution is not supported on metadata tables, we could patch this bug by checking whether the Iceberg table is an instance of [BaseMetadataTable](https://github.com/wypoon/iceberg/blob/03d80eb735f89c8318a7d83ec3baa1b3119642de/core/src/main/java/org/apache/iceberg/BaseMetadataTable.java#L163) before making the [snapshotSchema](https://github.com/apache/iceberg/blob/master/spark/v3.1/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java#L133) call.

Example query:

`spark.read.format("iceberg").option("snapshot-id", 10963874102873).load("db.table.files")`

Example error after schema evolution:

```
Py4JJavaError: An error occurred while calling o373.load.
: java.lang.IllegalStateException: Cannot find schema with schema id 1
    at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkState(Preconditions.java:590)
    at org.apache.iceberg.util.SnapshotUtil.schemaFor(SnapshotUtil.java:363)
    at org.apache.iceberg.util.SnapshotUtil.schemaFor(SnapshotUtil.java:388)
    at org.apache.iceberg.spark.source.SparkTable.snapshotSchema(SparkTable.java:127)
    at org.apache.iceberg.spark.source.SparkTable.schema(SparkTable.java:133)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.create(DataSourceV2Relation.scala:176)
    at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:303)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:265)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:239)
    at jdk.internal.reflect.GeneratedMethodAccessor210.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.base/java.lang.Thread.run(Thread.java:829)
```
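To make the proposed guard concrete, here is a minimal, self-contained Python model of the idea. It is not Iceberg code: `Table`, `MetadataTable`, `schema_for`, and `snapshot_schema` are hypothetical stand-ins that only loosely mirror `BaseMetadataTable`, `SnapshotUtil.schemaFor`, and `SparkTable.snapshotSchema`. It also assumes that a metadata table knows only its own fixed schema while the requested snapshot still carries the base table's schema ID, which is what the error message suggests but has not been confirmed here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Table:
    """Hypothetical stand-in for a table: schemas by ID, plus each snapshot's schema ID."""
    schemas: Dict[int, str]
    snapshot_schema_ids: Dict[int, int]
    current_schema_id: int = 0

    def schema(self) -> str:
        return self.schemas[self.current_schema_id]


class MetadataTable(Table):
    """Hypothetical stand-in for a metadata table such as `db.table.files`."""


def schema_for(table: Table, snapshot_id: int) -> str:
    """Look up the schema recorded for a snapshot; fails if the table lacks that schema ID."""
    schema_id = table.snapshot_schema_ids[snapshot_id]
    if schema_id not in table.schemas:
        raise LookupError(f"Cannot find schema with schema id {schema_id}")
    return table.schemas[schema_id]


def snapshot_schema(table: Table, snapshot_id: Optional[int]) -> str:
    """Proposed guard: metadata tables always report their own schema."""
    if snapshot_id is None or isinstance(table, MetadataTable):
        return table.schema()
    return schema_for(table, snapshot_id)


# Base table whose schema evolved from ID 0 to ID 1 between two snapshots.
base = Table(
    schemas={0: "id", 1: "id, name"},
    snapshot_schema_ids={100: 0, 200: 1},
    current_schema_id=1,
)
# Metadata table: one fixed schema, but the same snapshots as the base table.
files = MetadataTable(
    schemas={0: "files"},
    snapshot_schema_ids=base.snapshot_schema_ids,
)
```

In this model, `schema_for(files, 200)` raises the same kind of "Cannot find schema" error as the report, whereas `snapshot_schema(files, 200)` returns the metadata table's own schema and `snapshot_schema(base, 100)` still resolves the older data schema.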
