openinx commented on issue #2308:
URL: https://github.com/apache/iceberg/issues/2308#issuecomment-800847413


   > We may need to update sequence number of the files introduced from 
RowDelta commits to ensure they are larger than the rewrite operation's 
sequence number. 
   
   I think we are taking a similar approach to fix the semantic issue caused
by commit order. You are trying to increase the seqNum of the delete files
that were written by the previous RowDelta when committing the rewrite
action; I'm trying to commit the rewrite action with the stale seqNum that the
table had before the rewrite action started. The key point is the same: let
the RowDelta's equality deletes be applied to the rewritten data files that
are committed after it.
   
   The second approach seems simpler because we don't have to find all the
delete files that were introduced by the previous RowDelta txn. We could
just use the current sequence number as the newly introduced rewrite manifest's
sequence number.
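
   To make the two approaches concrete, here is a minimal sketch of the
sequence-number rule for equality deletes (an equality delete applies only to
data files with a strictly smaller data sequence number). The class names,
file names, and sequence numbers are hypothetical and are not Iceberg APIs;
this is only a toy model of the commit order discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    name: str
    seq: int  # data sequence number assigned to the file


@dataclass(frozen=True)
class EqualityDelete:
    name: str
    seq: int  # sequence number of the commit that added the delete file


def applies(delete: EqualityDelete, data: DataFile) -> bool:
    # An equality delete applies only to data files whose sequence number
    # is strictly smaller than the delete file's sequence number.
    return data.seq < delete.seq


# Hypothetical timeline: the rewrite starts when the table is at seq 1,
# a RowDelta commits an equality delete at seq 2, and the rewrite would
# normally commit afterwards at seq 3.
start_seq = 1
row_delta = EqualityDelete("eq-delete.parquet", seq=2)

# Default behaviour: the rewritten file gets the new commit's sequence number,
# so the earlier equality delete no longer applies to it.
rewritten_default = DataFile("rewritten.parquet", seq=3)

# Approach 1: raise the delete file's sequence number above the rewrite's.
bumped = EqualityDelete(row_delta.name, seq=rewritten_default.seq + 1)

# Approach 2: commit the rewritten file with the stale sequence number.
rewritten_stale = DataFile("rewritten.parquet", seq=start_seq)

print(applies(row_delta, rewritten_default))  # False: deleted rows reappear
print(applies(bumped, rewritten_default))     # True
print(applies(row_delta, rewritten_stale))    # True
```

   Approach 1 needs to locate and rewrite the metadata of every affected
delete file, while approach 2 only changes the sequence number given to the
rewritten files.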
   
   As I commented before, the equality write path doesn't introduce any
position delete files that would be applied to the old files, so we don't have
to fall back to redoing the whole rewrite action if conflicts happen. But for
a batch row-level delete job, we will have to fall back to redoing the rewrite
action, because we have to re-evaluate the set of rows that are not deleted by
the previous position deletes.
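
   The conflict rule described above could be sketched roughly as follows.
This is an illustrative model only: `ConcurrentDelete` and
`must_redo_rewrite` are hypothetical names, not Iceberg APIs, and the sketch
assumes the rewrite is committed in a way that keeps concurrent equality
deletes applicable to the rewritten files.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class ConcurrentDelete:
    kind: str                          # "equality" or "position"
    target_file: Optional[str] = None  # data file a position delete points at


def must_redo_rewrite(rewritten_inputs: Set[str],
                      deletes: Iterable[ConcurrentDelete]) -> bool:
    # A position delete names rows by (file, position) in an old data file.
    # If that file was one of the rewrite's inputs, the rewritten output may
    # still contain those rows, so the rewrite has to be re-evaluated.
    # Equality deletes match rows by value and do not force a redo here.
    return any(
        d.kind == "position" and d.target_file in rewritten_inputs
        for d in deletes
    )


inputs = {"data-a.parquet", "data-b.parquet"}

# Equality write path: only equality deletes were committed concurrently.
streaming = [ConcurrentDelete("equality")]

# Batch row-level delete job: a position delete targets a rewritten input.
batch = [ConcurrentDelete("position", "data-a.parquet")]

print(must_redo_rewrite(inputs, streaming))  # False
print(must_redo_rewrite(inputs, batch))      # True
```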
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


