kbendick opened a new pull request #3543: URL: https://github.com/apache/iceberg/pull/3543
Users have occasionally reported a specific problem with table caching: if they read from a table that another program has written to, the cached table is not refreshed, and they have to call `.refresh` on the table manually. This can be a source of confusion.

In general, the cache makes things much more efficient, so it makes sense to keep it. But for long-lived session clusters, or scenarios like the one described above, it would be desirable to expire tables in the cache and have them reloaded the next time they are accessed via the catalog. This lets the catalog still benefit from caching while discarding cached entries after a configurable period of time.

Details:
- Adds a new catalog property, `cache.expiration-enabled`.
- Adds a new catalog property, `cache.expiration-interval-ms`. Defaults to 15 minutes.

Presently, an entry expires a configured time period after it was last written or accessed via the catalog. So if a user calls `catalog.loadTable(tableIdent)` many times, each call resets the expiration timer. Additionally, metadata tables are always expired when the main table is expired (so that they don't drift in the cache).
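To make the expiration semantics concrete, here is a minimal plain-JDK sketch of expire-after-access with cascading expiry of metadata tables. It is not the code in this PR: the class `ExpiringTableCache`, its methods, the injected clock, and the convention that a metadata table's identifier is the main table's identifier plus a `.suffix` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch of the behaviour described above; not Iceberg's implementation.
final class ExpiringTableCache<T> {
  private static final class Entry<T> {
    final T table;
    long lastAccessMs;

    Entry(T table, long lastAccessMs) {
      this.table = table;
      this.lastAccessMs = lastAccessMs;
    }
  }

  private final Map<String, Entry<T>> entries = new HashMap<>();
  private final long expirationIntervalMs; // plays the role of cache.expiration-interval-ms
  private final LongSupplier clockMs;      // injected so expiry can be tested without sleeping

  ExpiringTableCache(long expirationIntervalMs, LongSupplier clockMs) {
    this.expirationIntervalMs = expirationIntervalMs;
    this.clockMs = clockMs;
  }

  // Returns the cached table, or loads and caches it. Every access resets the timer.
  synchronized T load(String identifier, Function<String, T> loader) {
    long now = clockMs.getAsLong();
    expireStale(now);
    Entry<T> entry = entries.get(identifier);
    if (entry == null) {
      entry = new Entry<>(loader.apply(identifier), now);
      entries.put(identifier, entry);
    } else {
      entry.lastAccessMs = now;
    }
    return entry.table;
  }

  // True if the identifier is cached and has not expired.
  synchronized boolean contains(String identifier) {
    expireStale(clockMs.getAsLong());
    return entries.containsKey(identifier);
  }

  // Drops entries idle for at least the interval, together with their metadata tables.
  private void expireStale(long now) {
    List<String> expired = new ArrayList<>();
    for (Map.Entry<String, Entry<T>> e : entries.entrySet()) {
      if (now - e.getValue().lastAccessMs >= expirationIntervalMs) {
        expired.add(e.getKey());
      }
    }
    for (String key : expired) {
      entries.remove(key);
      // Assumed convention: metadata tables are keyed as "<main table>.<type>".
      String prefix = key + ".";
      entries.keySet().removeIf(k -> k.startsWith(prefix));
    }
  }

  public static void main(String[] args) {
    // 15 minutes, matching the default interval mentioned in the description.
    ExpiringTableCache<String> cache =
        new ExpiringTableCache<>(15 * 60 * 1000L, System::currentTimeMillis);
    System.out.println(cache.load("db.events", id -> "loaded " + id));
    System.out.println(cache.contains("db.events"));
  }
}
```

In this sketch a metadata table entry is dropped as soon as its main table expires, even if the metadata table itself was accessed more recently, which is one way to keep the two from drifting apart in the cache.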
