rdblue commented on issue #2308:
URL: https://github.com/apache/iceberg/issues/2308#issuecomment-844349495


   I just read through this so I think I'm caught up. Sounds like we are 
discussing two problems, one short term and one long term.
   
   The short term issue is that we have a correctness problem when a row delete 
is committed while a rewrite is happening. I think we should fix this problem 
right away using the approach that @stevenzwu suggested: rewrite should 
validate that there are no new delete files for the partitions that are 
rewritten. If there are new delete files, then the rewrite should fail. This 
will prevent the rewrite from succeeding right now, but I think it is important 
to fix correctness issues as soon as possible.
   
   The long-term issue is how to modify the rewrite operation so that we can 
avoid the conflict with new delete files. The approach discussed above is what 
we originally intended to use when we were designing the v2 metadata: a rewrite 
should track data files with an _older_ sequence number so that new delete 
files are applied correctly. That works for equality deletes, but not for 
position deletes. Since we're primarily concerned about streaming use cases, I 
think that is okay. We can continue to fail the rewrite if new position deletes 
are added.
   
   Also, we can't update the sequence number of the new delete files. We have 
to change the sequence number of the rewritten data files. That's because there 
may be other data files that the new delete files would apply to if the delete 
file's sequence number was increased. For example, if the delete were an upsert 
instead -- a delete and an insert for the same row ID -- then moving the delete 
forward would cause Iceberg to drop the inserted replacement row.
   
   @yyanyy brings up a good point about how this limits the use of sequence 
numbers in other cases. I think that's okay. It is better to allow the table to 
be compacted. We'll figure out that problem as we design the secondary 
metadata. I think it would be okay because the new files would just not be 
covered by the secondary index and would need to be scanned. So it wouldn't 
introduce a correctness issue, just some potential staleness.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to