manuzhang opened a new pull request, #5392: URL: https://github.com/apache/iceberg/pull/5392
Currently, when Spark's rewrite data files procedure runs with the bin pack strategy, `SparkSession` is cloned in each `rewriteFiles` call to disable AQE. Since a cloned `SparkSession` has its own state, `V2SessionCatalog` is reloaded every time and a separate table cache is created. Each file group therefore gets its own table cache, which effectively disables table caching. This PR fixes that by cloning the `SparkSession` once, when `SparkBinPackStrategy` is created.
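To illustrate why the placement of the clone matters, here is a minimal, dependency-free sketch. It does not use Spark or Iceberg code: `ToySession`, `loadTable`, `loadsWithClonePerGroup`, and `loadsWithSingleClone` are hypothetical names that only model a session whose clones start with an empty table cache. The table name `db.tbl` is also made up.

```java
import java.util.HashMap;
import java.util.Map;

public class SessionCacheSketch {
  // Counts how often a table is loaded from the simulated catalog (cache misses).
  static int tableLoads = 0;

  /** Toy stand-in for a session: every clone gets fresh state and an empty table cache. */
  static final class ToySession {
    private final Map<String, String> tableCache = new HashMap<>();
    boolean adaptiveEnabled = true;

    ToySession cloneSession() {
      ToySession copy = new ToySession();
      copy.adaptiveEnabled = this.adaptiveEnabled;
      return copy; // configuration is copied, the cache is not
    }

    String loadTable(String name) {
      // Only a cache miss counts as a catalog load.
      return tableCache.computeIfAbsent(name, n -> {
        tableLoads++;
        return "table:" + n;
      });
    }
  }

  /** Models the old behavior: clone the session inside every per-group rewrite. */
  static int loadsWithClonePerGroup(int fileGroups) {
    tableLoads = 0;
    ToySession base = new ToySession();
    for (int i = 0; i < fileGroups; i++) {
      ToySession cloned = base.cloneSession();
      cloned.adaptiveEnabled = false; // disable AQE on the clone only
      cloned.loadTable("db.tbl");     // always a miss: the cache is new
    }
    return tableLoads;
  }

  /** Models the fix: clone once up front and reuse the clone for all groups. */
  static int loadsWithSingleClone(int fileGroups) {
    tableLoads = 0;
    ToySession cloned = new ToySession().cloneSession();
    cloned.adaptiveEnabled = false;
    for (int i = 0; i < fileGroups; i++) {
      cloned.loadTable("db.tbl");     // a miss on the first group, hits afterwards
    }
    return tableLoads;
  }

  public static void main(String[] args) {
    System.out.println("clone per group: " + loadsWithClonePerGroup(4) + " loads");
    System.out.println("single clone:    " + loadsWithSingleClone(4) + " loads");
  }
}
```

In this toy model the per-group variant loads the table once per file group, while the single-clone variant loads it once in total. That is the effect the PR aims for by moving the clone into the strategy's construction.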
