HeartSaVioR opened a new issue #1431: URL: https://github.com/apache/iceberg/issues/1431
In the Spark doc page (https://iceberg.apache.org/spark/), there is a note about querying/writing with DataFrames: a DataFrame read initializes an isolated table reference that is not updated automatically when another query updates the table, whereas other use cases share a table reference that is updated automatically.

That is great, but it does not hold true for metadata tables, especially when those tables are cached in CachingCatalog. The Spark catalog leverages CachingCatalog by default, so the result of querying a metadata table is not updated even after the base table is updated. The result is only updated once you explicitly call `REFRESH TABLE` on the metadata table.

We can improve this by invalidating metadata tables in CachingCatalog when the base table is updated.
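A minimal sketch of the stale-result behavior described above, assuming a Spark session already configured with an Iceberg catalog (the catalog, database, and table names `local.db.t` here are illustrative, not from the report):

```scala
// Sketch only: assumes `spark` is a SparkSession with an Iceberg catalog
// named "local" (spark.sql.catalog.local = org.apache.iceberg.spark.SparkCatalog),
// which wraps loaded tables in CachingCatalog by default.
spark.sql("CREATE TABLE local.db.t (id BIGINT) USING iceberg")
spark.sql("INSERT INTO local.db.t VALUES (1)")

// First read of the metadata table puts its table reference in the cache.
spark.sql("SELECT * FROM local.db.t.snapshots").show()

// Update the base table. The base table's cache entry is refreshed,
// but the cached metadata-table entry is not.
spark.sql("INSERT INTO local.db.t VALUES (2)")

// Stale: still reflects the old snapshot list, because the cached
// metadata table reference was never invalidated.
spark.sql("SELECT * FROM local.db.t.snapshots").show()

// Workaround from the report: explicitly refresh the metadata table.
spark.sql("REFRESH TABLE local.db.t.snapshots")
spark.sql("SELECT * FROM local.db.t.snapshots").show()
```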
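One possible shape of the proposed improvement, sketched here as a hypothetical change inside CachingCatalog (this is not Iceberg's actual code; the cache field name `tableCache` and the use of `TableIdentifier.parse` to build derived identifiers are assumptions):

```scala
// Hypothetical sketch: when the base table's cache entry is invalidated,
// also evict every metadata table derived from it. Metadata tables are
// addressed as an extra name level on the base table, e.g. db.t.snapshots.
override def invalidateTable(ident: TableIdentifier): Unit = {
  tableCache.invalidate(ident)
  MetadataTableType.values.foreach { metadataType =>
    val metadataIdent =
      TableIdentifier.parse(s"$ident.${metadataType.name.toLowerCase}")
    tableCache.invalidate(metadataIdent)
  }
}
```

With something like this in place, an update to the base table would drop the cached metadata tables as well, so the next metadata query would reload them without an explicit `REFRESH TABLE`.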
