kbendick opened a new pull request #3543:
URL: https://github.com/apache/iceberg/pull/3543


   Users have occasionally reported a very specific problem with table 
caching.
   
   Namely, if they read from a table that another program has written to, 
the cached table is not refreshed and they need to manually call `.refresh` 
on the table. This can be a source of confusion.
   
   In general, the cache makes things much more efficient, so it makes sense 
to keep it.
   
   But for long-lived session clusters, or scenarios like the one described 
above, it would be desirable to expire tables in the cache so that they are 
reloaded the next time they're accessed via the catalog.
   
   This allows the catalog to still benefit from caching while discarding 
cached entries after a configurable period of time.
   
   Details:
   - Adds a new catalog property, `cache.expiration-enabled`.
   - Adds a new catalog property, `cache.expiration-interval-ms`. Defaults to 
15 minutes.
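
   As a rough illustration of how these two properties might be read, here 
is a minimal sketch using only a plain `Map`. The property names and the 
15 minute default come from this description; the class name, the helper 
methods, and the assumption that expiration is off unless explicitly enabled 
are hypothetical and are not taken from the PR's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

public class CacheExpirationConfigSketch {
  // Property names as given in this PR description.
  static final String EXPIRATION_ENABLED = "cache.expiration-enabled";
  static final String EXPIRATION_INTERVAL_MS = "cache.expiration-interval-ms";

  // 15 minutes, the default interval stated in this PR description.
  static final long EXPIRATION_INTERVAL_MS_DEFAULT = TimeUnit.MINUTES.toMillis(15);

  // Hypothetical helper. ASSUMPTION: expiration is disabled unless the
  // property is explicitly set to "true"; the description does not say.
  public static boolean expirationEnabled(Map<String, String> props) {
    return Boolean.parseBoolean(props.getOrDefault(EXPIRATION_ENABLED, "false"));
  }

  // Hypothetical helper: falls back to the 15 minute default when unset.
  public static long expirationIntervalMs(Map<String, String> props) {
    String value = props.get(EXPIRATION_INTERVAL_MS);
    return value == null ? EXPIRATION_INTERVAL_MS_DEFAULT : Long.parseLong(value);
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put(EXPIRATION_ENABLED, "true");
    props.put(EXPIRATION_INTERVAL_MS, "300000"); // 5 minutes

    System.out.println(expirationEnabled(props));
    System.out.println(expirationIntervalMs(props));
  }
}
```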
   
   Presently, an entry expires once the configured time period has elapsed 
since it was last written or accessed via the catalog. So if a user calls 
`catalog.loadTable(tableIdent)` repeatedly, each call resets that entry's 
expiration timer.
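
   To make the "each access resets the timer" behavior concrete, here is a 
small self-contained sketch of an expire-after-access cache. It is not the 
PR's implementation: the class name, the injected clock, and the loader 
function are all hypothetical and exist only to illustrate the semantics.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

public class ExpiringTableCacheSketch<K, V> {
  // A cached value plus the time it was last loaded or read.
  private static final class Entry<V> {
    final V value;
    long lastTouchedMs;

    Entry(V value, long nowMs) {
      this.value = value;
      this.lastTouchedMs = nowMs;
    }
  }

  private final Map<K, Entry<V>> entries = new HashMap<>();
  private final long expirationIntervalMs;
  private final LongSupplier clockMs;  // injected so the demo can control time
  private final Function<K, V> loader; // stands in for loading from the real catalog

  public ExpiringTableCacheSketch(
      long expirationIntervalMs, LongSupplier clockMs, Function<K, V> loader) {
    this.expirationIntervalMs = expirationIntervalMs;
    this.clockMs = clockMs;
    this.loader = loader;
  }

  public synchronized V load(K key) {
    long nowMs = clockMs.getAsLong();
    Entry<V> entry = entries.get(key);
    if (entry == null || nowMs - entry.lastTouchedMs >= expirationIntervalMs) {
      // Missing or expired: reload and start a fresh timer.
      entry = new Entry<>(loader.apply(key), nowMs);
      entries.put(key, entry);
    } else {
      // Cache hit: the access itself resets the expiration timer.
      entry.lastTouchedMs = nowMs;
    }
    return entry.value;
  }

  public static void main(String[] args) {
    long[] nowMs = {0L};
    int[] loads = {0};
    ExpiringTableCacheSketch<String, String> cache = new ExpiringTableCacheSketch<>(
        1_000L, () -> nowMs[0], ident -> ident + "@load" + (++loads[0]));

    cache.load("db.tbl");                     // first load at t=0
    nowMs[0] = 900L;
    cache.load("db.tbl");                     // hit at t=900, timer reset
    nowMs[0] = 1_800L;
    System.out.println(cache.load("db.tbl")); // db.tbl@load1 (900 ms since last access)
    nowMs[0] = 2_800L;
    System.out.println(cache.load("db.tbl")); // db.tbl@load2 (1000 ms idle, reloaded)
  }
}
```

   Note that in this sketch a table that is read more often than the 
interval is never reloaded, which matches the behavior described above.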
   
   Additionally, metadata tables are always expired when the main table is 
expired (so that they don't drift in the cache).
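
   The coupling between a table and its metadata tables could look roughly 
like the sketch below. Everything here is hypothetical: the class, the 
string keys, and the short list of metadata table names (an illustrative 
subset) are assumptions made for the example, not the PR's code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MetadataTableEvictionSketch {
  // Illustrative subset of metadata table names; the real set is larger.
  private static final List<String> METADATA_TABLE_TYPES =
      List.of("snapshots", "history", "files");

  // Keyed by a plain string identifier to keep the example small.
  private final Map<String, Object> cache = new HashMap<>();

  public void put(String tableName, Object table) {
    cache.put(tableName, table);
  }

  public boolean contains(String tableName) {
    return cache.containsKey(tableName);
  }

  // Called when the main table's entry expires: drop the table and every
  // cached metadata table derived from it, so none of them outlive it.
  public void expire(String tableName) {
    cache.remove(tableName);
    for (String type : METADATA_TABLE_TYPES) {
      cache.remove(tableName + "." + type);
    }
  }

  public static void main(String[] args) {
    MetadataTableEvictionSketch sketch = new MetadataTableEvictionSketch();
    sketch.put("db.tbl", new Object());
    sketch.put("db.tbl.snapshots", new Object());
    sketch.put("db.other", new Object());

    sketch.expire("db.tbl");

    System.out.println(sketch.contains("db.tbl"));           // false
    System.out.println(sketch.contains("db.tbl.snapshots")); // false
    System.out.println(sketch.contains("db.other"));         // true
  }
}
```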


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


