piotrrzysko opened a new issue, #15413:
URL: https://github.com/apache/iceberg/issues/15413

   ### Feature Request / Improvement
   
   I'm trying to understand what "loaded" means in the context of 
`TableOperations#current()`. The javadoc for `current()` says:
   
   > Return the currently loaded table metadata, without checking for updates.
   
   This raises the question: when is metadata considered "loaded" and where 
does the loading actually happen?
   
   ### Where loading happens
   
   I traced through the code and found three places where metadata gets loaded:
   
   #### 1. Initialization of `TableOperations`
   
   The first `BaseMetastoreTableOperations#current()` call triggers loading 
because the `shouldRefresh` flag starts as `true`.
   
   #### 2. `TableOperations#refresh()`
   
   This loads fresh metadata that affects what `current()` returns.
   
   #### 3. `TableOperations#commit()`
   
   `BaseMetastoreTableOperations#commit()` calls `requestRefresh()` so that the 
next call to `current()` will trigger a load of fresh metadata. This means that 
after a commit, we expect table metadata to be reloaded.
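
The three load points above can be modeled with a single flag. The following is a simplified sketch, not the Iceberg source: `MetadataStore` and `SketchTableOperations` are hypothetical names, table metadata is reduced to an integer version, and the real `BaseMetastoreTableOperations` handles more than this (for example, failures during refresh).

```java
// Hypothetical stand-in for the metastore: holds the latest committed version
// and counts how often it is read.
class MetadataStore {
    int version = 1;
    int loads = 0;

    int load() {
        loads++;
        return version;
    }
}

// Simplified model of the lazy-load pattern described in this issue.
class SketchTableOperations {
    private final MetadataStore store;
    private int currentVersion = -1;       // nothing loaded yet
    private boolean shouldRefresh = true;  // starts true, so the first current() loads

    SketchTableOperations(MetadataStore store) {
        this.store = store;
    }

    // Returns the loaded version; loads only if a refresh was requested.
    int current() {
        if (shouldRefresh) {
            return refresh();
        }
        return currentVersion;
    }

    // Always reads from the store and clears the pending-refresh flag.
    int refresh() {
        currentVersion = store.load();
        shouldRefresh = false;
        return currentVersion;
    }

    // Writes the new version, then marks the in-memory copy as outdated.
    void commit(int base, int updated) {
        if (base != current()) {
            throw new IllegalStateException("Cannot commit: stale metadata");
        }
        store.version = updated;
        requestRefresh();
    }

    void requestRefresh() {
        shouldRefresh = true;
    }
}

class LazyLoadDemo {
    public static void main(String[] args) {
        MetadataStore store = new MetadataStore();
        SketchTableOperations ops = new SketchTableOperations(store);

        System.out.println(store.loads);    // 0: constructing the operations loads nothing
        System.out.println(ops.current());  // 1: the first call loads
        System.out.println(ops.current());  // 1: served from memory
        System.out.println(store.loads);    // 1

        ops.commit(1, 2);
        System.out.println(ops.current());  // 2: reloaded because commit requested a refresh
        System.out.println(store.loads);    // 2
    }
}
```

In this model `commit()` itself does not reload anything; it only makes the next `current()` call do so.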
   
   ### When loading happens
   
   Based on the above, once `commit()` or `refresh()` has been called, 
subsequent calls to `current()` return metadata that reflects the table state 
at or after that operation.
   
   What is less clear is when the initial load happens. We might expect that 
`Catalog#loadTable()` would trigger loading, but if caching is enabled, 
`Catalog#loadTable()` might return a cached `Table` object that holds stale 
metadata, in which case the first call to `current()` after `loadTable()` 
would not trigger a load.
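
The caching scenario can be illustrated with a small model. This is a sketch under stated assumptions, not Iceberg code: `SharedMetastore`, `CachedTableHandle`, and `CachingCatalogSketch` are hypothetical names, metadata is again reduced to an integer version, and the cache here never expires or invalidates entries.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the metastore: the latest committed version.
class SharedMetastore {
    int version = 1;
}

// Hypothetical table handle that loads lazily on the first current() call.
class CachedTableHandle {
    private final SharedMetastore metastore;
    private int loadedVersion = -1;
    private boolean shouldRefresh = true;

    CachedTableHandle(SharedMetastore metastore) {
        this.metastore = metastore;
    }

    int current() {
        if (shouldRefresh) {
            refresh();
        }
        return loadedVersion;
    }

    void refresh() {
        loadedVersion = metastore.version;
        shouldRefresh = false;
    }
}

// Hypothetical caching catalog: returns the same handle for the same name.
class CachingCatalogSketch {
    private final SharedMetastore metastore;
    private final Map<String, CachedTableHandle> cache = new HashMap<>();

    CachingCatalogSketch(SharedMetastore metastore) {
        this.metastore = metastore;
    }

    CachedTableHandle loadTable(String name) {
        return cache.computeIfAbsent(name, n -> new CachedTableHandle(metastore));
    }
}

class StaleCacheDemo {
    public static void main(String[] args) {
        SharedMetastore metastore = new SharedMetastore();
        CachingCatalogSketch catalog = new CachingCatalogSketch(metastore);

        CachedTableHandle first = catalog.loadTable("db.t");
        System.out.println(first.current());   // 1: initial load

        metastore.version = 2;                  // another writer commits

        CachedTableHandle second = catalog.loadTable("db.t");
        System.out.println(second == first);    // true: served from the cache
        System.out.println(second.current());   // 1: stale, no load was triggered

        second.refresh();
        System.out.println(second.current());   // 2: fresh after an explicit refresh
    }
}
```

From the second caller's point of view, `loadTable()` followed by `current()` looks like a first use, yet no load happens because the handle was already loaded by an earlier caller.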
   
   ### What the contract seems to be
   
   Based on my research, the key insight is:
   
   **Only `refresh()` and `commit()` provide guarantees about metadata 
freshness.** After calling either of these methods, subsequent calls to 
`current()` will return metadata at least as fresh as what was loaded or 
committed. However, there is no guarantee about how fresh the metadata returned 
by `current()` is *before* calling `refresh()` or `commit()`: that depends on 
when the initial load happened, which is implementation-defined.
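
To make "implementation-defined" concrete, here is a sketch with two hypothetical implementations of the same simplified interface. `MetadataSource`, `VersionedOps`, `EagerOps`, and `LazyOps` are invented for illustration and do not exist in Iceberg. One loads at construction and the other on the first `current()` call; they disagree before `refresh()` and agree after it.

```java
// Hypothetical source of truth: the latest committed version.
class MetadataSource {
    int version = 1;
}

// Hypothetical, simplified view of the contract discussed in this issue.
interface VersionedOps {
    int current();  // last loaded version, no freshness guarantee
    int refresh();  // reloads and returns the latest version
}

// Loads once at construction time.
class EagerOps implements VersionedOps {
    private final MetadataSource source;
    private int loaded;

    EagerOps(MetadataSource source) {
        this.source = source;
        this.loaded = source.version;
    }

    public int current() {
        return loaded;
    }

    public int refresh() {
        loaded = source.version;
        return loaded;
    }
}

// Defers the initial load to the first current() call.
class LazyOps implements VersionedOps {
    private final MetadataSource source;
    private int loaded = -1;
    private boolean shouldRefresh = true;

    LazyOps(MetadataSource source) {
        this.source = source;
    }

    public int current() {
        return shouldRefresh ? refresh() : loaded;
    }

    public int refresh() {
        loaded = source.version;
        shouldRefresh = false;
        return loaded;
    }
}

class FreshnessDemo {
    public static void main(String[] args) {
        MetadataSource source = new MetadataSource();
        VersionedOps eager = new EagerOps(source);
        VersionedOps lazy = new LazyOps(source);

        source.version = 2;  // a commit lands before anyone calls current()

        System.out.println(eager.current());  // 1: loaded at construction
        System.out.println(lazy.current());   // 2: loaded on first use

        System.out.println(eager.refresh());  // 2
        System.out.println(lazy.refresh());   // 2: both agree after refresh()
    }
}
```

Under this reading, a caller that needs a freshness bound has to call `refresh()` explicitly rather than rely on when a given implementation performed its first load.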
   
   ---- 
   
   I'd appreciate feedback on whether my understanding is correct. Happy to 
contribute a documentation update if this helps.
   
   
   ### Query engine
   
   None
   
   ### Willingness to contribute
   
   - [x] I can contribute this improvement/feature independently
   - [ ] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

