stevenzwu commented on issue #2808:
URL: https://github.com/apache/iceberg/issues/2808#issuecomment-913751666


   We need to come up with a good hypothesis. Right now, none of them makes 
sense.
   
   Let me recap the two observations from the thread:
   1. Both transactions have the same `max-committed-checkpointid`.
   2. The two duplicate transactions have a linear relationship, as we can see 
from the parent snapshot ID.
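   The two observations above can be checked mechanically by walking the 
snapshot lineage from the current snapshot back through the parent IDs. This is 
only a minimal sketch over a hypothetical in-memory model: the dictionary 
layout, the `parent_id` and `summary` field names, and the function name are 
assumptions for illustration, not the Iceberg API. The summary key is the one 
quoted in this thread.

```python
from collections import defaultdict


def find_duplicate_checkpoints(snapshots, current_id,
                               key="max-committed-checkpointid"):
    """Return {checkpoint id: [snapshot ids]} for checkpoint ids that appear
    in more than one snapshot on the lineage ending at current_id.

    `snapshots` is a hypothetical model: snapshot id -> {"parent_id", "summary"}.
    """
    seen = defaultdict(list)
    snapshot_id = current_id
    # Follow parent pointers, so only snapshots with a linear
    # (ancestor/descendant) relationship are compared.
    while snapshot_id is not None:
        snap = snapshots[snapshot_id]
        checkpoint = snap["summary"].get(key)
        if checkpoint is not None:
            seen[checkpoint].append(snapshot_id)
        snapshot_id = snap["parent_id"]
    return {cp: ids for cp, ids in seen.items() if len(ids) > 1}
```

   A non-empty result means two snapshots on the same lineage recorded the same 
checkpoint ID, which is the pattern described in this issue.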
   
   > Just quit the flink streaming job when encountering 
   > CommitStateUnknownException and let people to check whether it's OK to 
   > restart the flink job.
   
   I thought this was already the case, since @coolderli configured 
`execution.checkpointing.tolerable-failed-checkpoints=0`.
   
   > Catch the CommitStateUnknownException in commitOperation, and retry to 
   > check the iceberg table whether it has been committed the stale txn.
   
   This is also the solution we used when we encountered this problem. Note 
that it is not bulletproof, as the re-check can also fail, but it should make 
the problem much less likely. However, this still doesn't explain why 
@coolderli was seeing the duplicate transactions.
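   To make the re-check idea concrete, here is a minimal sketch of the control 
flow. It is not the Flink sink's actual code: `CommitStateUnknown` is a 
stand-in for Iceberg's `CommitStateUnknownException`, and `commit`, 
`max_committed_checkpoint`, `attempts`, and `delay_s` are hypothetical names 
introduced for illustration. The sketch assumes the caller can read back the 
highest checkpoint ID already recorded in the table.

```python
import time


class CommitStateUnknown(Exception):
    """Stand-in for Iceberg's CommitStateUnknownException (assumption)."""


def commit_with_recheck(commit, max_committed_checkpoint, checkpoint_id,
                        attempts=3, delay_s=0.0):
    """Try to commit; if the outcome is unknown, re-check the table.

    commit: hypothetical callable that performs the table commit.
    max_committed_checkpoint: hypothetical callable that reloads the table
        and returns the highest checkpoint id recorded in it.
    """
    try:
        commit()
        return "committed"
    except CommitStateUnknown:
        pass  # The commit may or may not have landed; go check.

    last_error = None
    for _ in range(attempts):
        try:
            committed = max_committed_checkpoint()
        except Exception as err:  # The re-check itself can fail.
            last_error = err
            time.sleep(delay_s)
            continue
        if committed >= checkpoint_id:
            return "already-committed"  # Do not commit the same data again.
        return "not-committed"  # Safe for the caller to retry the commit.

    # Still unknown: fail loudly instead of risking a duplicate commit.
    raise RuntimeError("commit state still unknown after re-check") from last_error
```

   The last branch is the part that is not bulletproof: when every re-check 
fails, the sketch raises so that the job stops rather than guessing.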
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


