rdblue opened a new pull request #3089: URL: https://github.com/apache/iceberg/pull/3089
This updates Spark 3 to create metadata tables directly inside actions, instead of loading them through a `SparkCatalog`. This avoids a problem where metadata tables used by the expire snapshots action were created with `HadoopFileIO` instead of the table's custom `FileIO` implementation.

The problem occurred in the expire snapshots action, which uses metadata tables to build file reachability datasets. Those metadata tables are based on a `StaticTableOperations` that points directly to a metadata file path. `SparkTableUtil` would create the metadata table by translating the table back to an identifier and loading it; for static tables, the identifier is the location of the metadata file, so Spark loaded the metadata table through `HadoopTables` instead of a catalog, which in turn uses `HadoopFileIO`.

The solution is to construct the metadata table directly from the `Table` instance passed into `SparkTableUtil` for Spark 3.
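The failure mode and the fix can be sketched as follows. This is a simplified illustration, not Iceberg's actual classes: `loadByIdentifier`, `createFromTable`, `CustomFileIO`, and the example path are all hypothetical stand-ins showing why an identifier round trip loses the source table's `FileIO` while direct construction preserves it.

```java
// Illustrative sketch only; names below are not real Iceberg APIs.
interface FileIO {
    String name();
}

class HadoopFileIO implements FileIO {
    public String name() { return "HadoopFileIO"; }
}

class CustomFileIO implements FileIO {
    public String name() { return "CustomFileIO"; }
}

class Table {
    private final FileIO io;
    Table(FileIO io) { this.io = io; }
    FileIO io() { return io; }
}

class MetadataTableDemo {
    // Old path: round-trip through an identifier. A static table's identifier
    // is a metadata file path, so the load falls back to a Hadoop-based load
    // and the custom FileIO is lost.
    static Table loadByIdentifier(String identifier) {
        return new Table(new HadoopFileIO());
    }

    // New path: build the metadata table directly from the Table instance,
    // so the source table's FileIO carries over.
    static Table createFromTable(Table source) {
        return new Table(source.io());
    }

    public static void main(String[] args) {
        Table source = new Table(new CustomFileIO());
        // Identifier round trip drops the custom FileIO.
        System.out.println(loadByIdentifier("/warehouse/db/t/metadata/v3.metadata.json").io().name());
        // Direct construction keeps it.
        System.out.println(createFromTable(source).io().name());
    }
}
```

The design point mirrored here is that any path which rebuilds the table from a name or location must re-resolve its I/O layer, while passing the live `Table` object forward keeps its configuration intact.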
