mahdibh commented on issue #6514: URL: https://github.com/apache/iceberg/issues/6514#issuecomment-1593653436
I think one way to approach this problem (which we are also running into) is to expose a new commit API that doesn't automatically refresh the table metadata. This would allow clients to rely on the atomic snapshot CAS to also synchronize the state changes to custom properties (whether it's on the table or the snapshot). This is a very desirable capability since it allows us to update in one single atomic operation both the current snapshot and the properties of that snapshot. In our particular case, we would like to store some kafka offsets as part of the snapshot properties. Even if we had a single writer, there is always a case where we may end up with two writer (for example due to a network partition) and in that case, we don't want the last writer's state to overwrite the first one. We would like to use the CAS nature of iceberg as the last resort to detect concurrent writes. Another way to think about this is that each iceberg committer is moving the state of a table from `state-A` to `state-B`. Most clients do not care about what `state-A` is. All they care about is that it gets moved to `state-B`. So if another client gets in between and moves `state-A` to `state-C`, the first client will happily commit by doing the the implicit refresh (which will pick up `state-C`) and then doing the commit effectively moving the system from `state-C` to `state-B`. There are cases where clients want to be explicit about that state transition. For example, because there are some attributes in `state-A` that they use to determine what `state-B` looks like (ie, the watermark example in the original description or kafka offsets in our case). Another argument for not doing the refresh is that it adds extra latency (due to the retrieval of the current snapshot). In the single writer case, it is not needed since the common case is that there would be no conflicts and the writer always has the latest snapshot. I think providing the caller the option to decide on whether they care about the original state or not when they do a commit is a very powerful capability. It also seems like a rather straightforward change that can be applied in a backward compatible way. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
