kbendick opened a new pull request #3413:
URL: https://github.com/apache/iceberg/pull/3413


   Users have occasionally reported a specific problem with table caching.
   
   Namely, if they read from a table that another program has written to, the 
cached copy of the table isn't refreshed.
   
   In general, the cache makes things much more efficient. But for things like 
long-lived session clusters or scenarios as described above, it would be 
desirable to expire the tables in the cache and force them to be reloaded after 
a user-configurable time period.
   
   This lets the catalog still benefit from caching while discarding cached 
entries once they are older than the configured period.
   
   Presently, entries are set to expire a configured time period after write. So 
if a user calls catalog.loadTable(tableIdent) many times, those reads won't 
necessarily reset the expiration timer. This lets users keep reading the cached 
table and still pick up updates from other writers after a period of time.
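
   To make the expire-after-write behavior concrete, here is a minimal sketch 
in plain Java. It is not the code in this PR: the class name 
ExpireAfterWriteCache, the loader function, and the injectable clock are all 
assumptions made for illustration. The point it shows is that a cache hit does 
not extend an entry's lifetime, so a reader eventually reloads the table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration only; this is not Iceberg's caching catalog.
class ExpireAfterWriteCache<K, V> {
  // A cached value plus the time at which it was loaded ("written").
  private static final class Entry<V> {
    final V value;
    final long writtenAtMillis;

    Entry(V value, long writtenAtMillis) {
      this.value = value;
      this.writtenAtMillis = writtenAtMillis;
    }
  }

  private final Map<K, Entry<V>> entries = new HashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock; // injectable so the behavior can be tested

  ExpireAfterWriteCache(long ttlMillis, LongSupplier clock) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  // Returns the cached value, reloading it when it is missing or when at
  // least ttlMillis have passed since it was last loaded.
  synchronized V get(K key, Function<K, V> loader) {
    long now = clock.getAsLong();
    Entry<V> entry = entries.get(key);
    if (entry == null || now - entry.writtenAtMillis >= ttlMillis) {
      entry = new Entry<>(loader.apply(key), now);
      entries.put(key, entry);
    }
    // A hit deliberately leaves writtenAtMillis untouched.
    return entry.value;
  }
}
```

   With a TTL of 1000 ms, a load at t=0 followed by a read at t=600 returns the 
same cached value, and a read at t=1000 triggers a reload even though the entry 
was read in between.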
   
   However, we could also expire tables after access instead (open to 
discussion on this front). There's also a much more fine-grained Expiry class 
that would give us full control over when tables are expired (depending on the 
type of access, etc.).
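
   The difference between the two options can be sketched with a pluggable 
policy. Everything below is hypothetical (PolicyCache, ExpiryPolicy, 
AFTER_WRITE, AFTER_ACCESS are illustrative names, not taken from this PR or 
from a cache library); it only shows how a per-event hook could decide whether 
a read extends an entry's lifetime.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch; names are illustrative and not from Iceberg.
class PolicyCache<K, V> {
  // Decides how long an entry may live after each kind of event.
  interface ExpiryPolicy {
    long afterLoad(long ttlMillis);
    long afterRead(long remainingMillis, long ttlMillis);
  }

  // Reads keep whatever lifetime is left: other writers' updates show up
  // at most ttlMillis after the entry was loaded.
  static final ExpiryPolicy AFTER_WRITE = new ExpiryPolicy() {
    @Override public long afterLoad(long ttlMillis) { return ttlMillis; }
    @Override public long afterRead(long remainingMillis, long ttlMillis) {
      return remainingMillis;
    }
  };

  // Every read grants a fresh ttlMillis: a frequently read table may stay
  // cached indefinitely.
  static final ExpiryPolicy AFTER_ACCESS = new ExpiryPolicy() {
    @Override public long afterLoad(long ttlMillis) { return ttlMillis; }
    @Override public long afterRead(long remainingMillis, long ttlMillis) {
      return ttlMillis;
    }
  };

  private static final class Entry<V> {
    final V value;
    long expiresAtMillis;

    Entry(V value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final Map<K, Entry<V>> entries = new HashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock;
  private final ExpiryPolicy policy;

  PolicyCache(long ttlMillis, LongSupplier clock, ExpiryPolicy policy) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
    this.policy = policy;
  }

  synchronized V get(K key, Function<K, V> loader) {
    long now = clock.getAsLong();
    Entry<V> entry = entries.get(key);
    if (entry == null || now >= entry.expiresAtMillis) {
      // Missing or expired: reload and start a new lifetime.
      entry = new Entry<>(loader.apply(key), now + policy.afterLoad(ttlMillis));
      entries.put(key, entry);
    } else {
      // Hit: let the policy decide whether the read extends the lifetime.
      long remaining = entry.expiresAtMillis - now;
      entry.expiresAtMillis = now + policy.afterRead(remaining, ttlMillis);
    }
    return entry.value;
  }
}
```

   In this sketch, with a 1000 ms TTL and reads at t=0, t=600, and t=1500, 
AFTER_WRITE reloads on the third read, while AFTER_ACCESS keeps serving the 
originally loaded value because each read resets the timer. That trade-off is 
the one under discussion: access-based expiry can delay visibility of other 
writers' changes for tables that are read continuously.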


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


