CodingCat opened a new pull request, #4795:
URL: https://github.com/apache/iceberg/pull/4795

   This PR adds the ability to expose the latest snapshot ID within 
a thread. 
   
   The implementation is motivated by our internal use cases, where multiple 
threads submit jobs through a shared SparkContext and each thread needs to do 
some work with the latest snapshot from that thread as input. 
   
   I think this scenario is more common than our own case. For example, each 
notebook attached to a Databricks notebook cluster is basically handled by a 
thread. In such a scenario, users can hit a race condition when they try to 
get the snapshot ID committed by their own notebook with just 
`currentSnapshot().snapshotId`, because `currentSnapshot()` just triggers a 
refresh of the metadata and may return the snapshot ID committed by someone 
else in another thread.
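
   To make the race concrete, here is a minimal, self-contained sketch. It 
does not use the Iceberg API and is not the code in this PR: `FakeTable`, 
`commit()`, and `latestSnapshotIdInThread()` are hypothetical names invented 
for illustration, and keeping the per-thread ID in a `ThreadLocal` is only 
one assumed way to do it. Two latches force the interleaving where thread B 
commits between thread A's commit and thread A's read.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class ThreadSnapshotSketch {

    /** Hypothetical stand-in for a table shared by all threads; NOT the Iceberg API. */
    static final class FakeTable {
        // Shared "current snapshot" pointer, visible to every thread.
        private final AtomicLong currentSnapshotId = new AtomicLong(0);
        // Assumption: remember the last snapshot each thread committed.
        private final ThreadLocal<Long> lastCommittedByThread = new ThreadLocal<>();

        long commit() {
            long id = currentSnapshotId.incrementAndGet();
            lastCommittedByThread.set(id);
            return id;
        }

        long currentSnapshotId() {
            return currentSnapshotId.get();
        }

        Long latestSnapshotIdInThread() {
            return lastCommittedByThread.get();
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Returns {id committed by A, table-wide current id seen by A, per-thread id seen by A}. */
    static long[] demo() {
        FakeTable table = new FakeTable();
        CountDownLatch aCommitted = new CountDownLatch(1);
        CountDownLatch bCommitted = new CountDownLatch(1);
        long[] result = new long[3];

        Thread a = new Thread(() -> {
            result[0] = table.commit();                   // A commits first
            aCommitted.countDown();
            await(bCommitted);                            // B commits in between
            result[1] = table.currentSnapshotId();        // shared view: B's snapshot
            result[2] = table.latestSnapshotIdInThread(); // per-thread view: A's snapshot
        });
        Thread b = new Thread(() -> {
            await(aCommitted);
            table.commit();
            bCommitted.countDown();
        });

        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] r = demo();
        // prints "committed=1 current=2 thread-local=1"
        System.out.println("committed=" + r[0] + " current=" + r[1] + " thread-local=" + r[2]);
    }
}
```

   In this sketch thread A commits snapshot 1, but the shared pointer has 
already moved to snapshot 2 by the time A reads it. Only the per-thread 
value still matches what A committed.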


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

