mahdibh commented on issue #6514:
URL: https://github.com/apache/iceberg/issues/6514#issuecomment-1593653436

   I think one way to approach this problem (which we are also running into) is 
to expose a new commit API that doesn't automatically refresh the table 
metadata. This would allow clients to rely on the atomic snapshot CAS to also 
synchronize the state changes to custom properties (whether it's on the table 
or the snapshot). This is a very desirable capability since it allows us to 
update in one single atomic operation both the current snapshot and the 
properties of that snapshot.
   
   In our particular case, we would like to store some Kafka offsets as part of 
the snapshot properties. Even if we had a single writer, there is always a case 
where we may end up with two writers (for example due to a network partition), 
and in that case, we don't want the last writer's state to overwrite the first 
one. We would like to use the CAS nature of Iceberg as the last resort to 
detect concurrent writes.
   
   Another way to think about this is that each Iceberg committer is moving the 
state of a table from `state-A` to `state-B`. Most clients do not care about 
what `state-A` is. All they care about is that it gets moved to `state-B`. So 
if another client gets in between and moves `state-A` to `state-C`, the first 
client will happily commit by doing the implicit refresh (which will pick 
up `state-C`) and then doing the commit, effectively moving the system from 
`state-C` to `state-B`.
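   
   To make the lost update concrete, here is a minimal single-threaded sketch. 
It is a toy in-memory model, not Iceberg's API: `ToyTable`, 
`compare_and_swap`, `commit_with_refresh` and the `kafka.offset` property name 
are all hypothetical names made up for illustration.

```python
# Toy in-memory model of "refresh, then commit". NOT Iceberg's API;
# every name here is hypothetical and exists only for illustration.

class ToyTable:
    def __init__(self):
        self.version = 0
        self.properties = {}

    def current(self):
        # Return the latest version and a copy of its properties.
        return self.version, dict(self.properties)

    def compare_and_swap(self, expected_version, new_properties):
        # Atomic swap: succeeds only if nobody committed since expected_version.
        if self.version != expected_version:
            return False
        self.version += 1
        self.properties = dict(new_properties)
        return True


def commit_with_refresh(table, updates):
    # Always re-read the latest state before committing, so the CAS
    # almost never fails, but the caller's original base is ignored.
    while True:
        version, props = table.current()  # the implicit refresh
        props.update(updates)
        if table.compare_and_swap(version, props):
            return version + 1


table = ToyTable()
table.compare_and_swap(0, {"kafka.offset": 100})       # state-A

# Writer 1 reads state-A and plans state-B (offset 150).
base_version, base_props = table.current()

# Writer 2 gets in between and moves state-A to state-C (offset 200).
table.compare_and_swap(base_version, {"kafka.offset": 200})

# Writer 1 commits anyway: the refresh picks up state-C and overwrites it.
final_version = commit_with_refresh(table, {"kafka.offset": 150})
final_offset = table.properties["kafka.offset"]
```

   In this sketch writer 1's commit succeeds even though its plan was based on 
`state-A`, and the offset recorded by writer 2 is silently replaced.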
   
   There are cases where clients want to be explicit about that state 
transition, for example because there are some attributes in `state-A` that 
they use to determine what `state-B` looks like (e.g., the watermark example in 
the original description or Kafka offsets in our case).
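   
   A rough sketch of what an explicit transition could look like, again as a 
toy model and not a proposal for the actual Iceberg interface: `ToyCatalog`, 
`commit_strict`, `CommitConflict` and the `kafka.offset` key are hypothetical. 
The commit takes the base snapshot the writer actually read and fails instead 
of refreshing when the table has moved on.

```python
# Toy model of a commit without implicit refresh. NOT Iceberg's API;
# all names are hypothetical. Single-threaded for simplicity.
from dataclasses import dataclass


class CommitConflict(Exception):
    """Raised when the table no longer points at the expected base."""


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    summary: dict  # snapshot properties, e.g. Kafka offsets


class ToyCatalog:
    def __init__(self):
        self.current = Snapshot(0, {"kafka.offset": 100})

    def commit_strict(self, base, summary):
        # Pure CAS: the caller states which snapshot it started from.
        if self.current.snapshot_id != base.snapshot_id:
            raise CommitConflict(
                f"expected base {base.snapshot_id}, "
                f"found {self.current.snapshot_id}"
            )
        self.current = Snapshot(base.snapshot_id + 1, dict(summary))
        return self.current


catalog = ToyCatalog()
base = catalog.current                      # both writers read state-A

# Writer 2 commits first (state-A -> state-C).
catalog.commit_strict(base, {"kafka.offset": 200})

# Writer 1 still holds state-A; its commit is rejected, not retried.
try:
    catalog.commit_strict(base, {"kafka.offset": 150})
    conflict_detected = False
except CommitConflict:
    conflict_detected = True

surviving_offset = catalog.current.summary["kafka.offset"]
```

   Here the second writer sees the conflict and can decide for itself whether 
to re-read the offsets and recompute, rather than overwriting them blindly.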
   
   Another argument for not doing the refresh is that it adds extra latency 
(due to the retrieval of the current snapshot). In the single writer case, it 
is not needed since the common case is that there would be no conflicts and the 
writer always has the latest snapshot.
   
   I think giving the caller the option to decide whether they care about 
the original state when they commit is a very powerful capability. 
It also seems like a rather straightforward change that can be applied in a 
backward-compatible way.

