CodingCat opened a new pull request, #4795: URL: https://github.com/apache/iceberg/pull/4795
This PR implements the functionality to expose the latest snapshot ID within a thread. The implementation is motivated by our internal use cases, where multiple threads submit jobs through a shared SparkContext and each thread needs to do some work with the latest snapshot within that thread as input.

I think the scenario is more pervasive than our own case, e.g. each notebook attached to a Databricks notebook cluster is basically handled by a thread. In such a scenario, users may hit a race condition when they try to get the snapshot ID committed by their own notebook with just `currentSnapshot().snapshotId`, because `currentSnapshot()` just triggers a refresh of the metadata and may return the snapshot ID committed by someone else in another thread.
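The race described above can be illustrated with a small, self-contained sketch. This is not the code from the PR and it does not use the Iceberg API: `ToyTable`, `commit()`, and `latestSnapshotIdInThread()` are hypothetical names invented for illustration, and the table is reduced to a shared counter. It only shows why a table-wide "current" snapshot ID can differ from the last snapshot ID committed by the calling thread, and how a `ThreadLocal` could keep the two apart.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ThreadLocalSnapshotDemo {

    /** Toy stand-in for a shared table. Hypothetical; NOT the Iceberg Table API. */
    static class ToyTable {
        // Table-wide state: the most recent snapshot ID committed by any thread.
        private final AtomicLong currentSnapshotId = new AtomicLong(0);
        // Per-thread state: the last snapshot ID committed by the calling thread.
        private final ThreadLocal<Long> lastCommittedInThread = new ThreadLocal<>();

        /** Simulates a commit: allocates a new snapshot ID and remembers it for this thread. */
        long commit() {
            long id = currentSnapshotId.incrementAndGet();
            lastCommittedInThread.set(id);
            return id;
        }

        /** Table-wide view; may reflect a commit made by another thread. */
        long currentSnapshotId() {
            return currentSnapshotId.get();
        }

        /** Thread-scoped view; null if the calling thread has not committed yet. */
        Long latestSnapshotIdInThread() {
            return lastCommittedInThread.get();
        }
    }

    /** Returns {ID committed here, table-wide current ID, thread-scoped latest ID}. */
    static long[] runScenario() {
        ToyTable table = new ToyTable();
        long mine = table.commit();                      // this thread commits snapshot 1

        Thread other = new Thread(() -> table.commit()); // another thread commits snapshot 2
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        return new long[] {mine, table.currentSnapshotId(), table.latestSnapshotIdInThread()};
    }

    public static void main(String[] args) {
        long[] r = runScenario();
        System.out.println("committed by this thread: " + r[0]); // 1
        System.out.println("table-wide current:       " + r[1]); // 2, from the other thread
        System.out.println("latest within this thread: " + r[2]); // 1
    }
}
```

In this sketch the table-wide lookup returns the other thread's snapshot, while the thread-scoped lookup still returns the ID the caller committed, which is the behavior the PR description asks for.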
