yyanyy commented on issue #2308:
URL: https://github.com/apache/iceberg/issues/2308#issuecomment-800782826


   My understanding of what @stevenzwu referred to is that, if the operation 
fails due to new RowDelta commits coming in, the rewrite operation could 
take whatever changes were introduced in the new RowDelta commits and make 
them part of the rewrite commit, along with the job output originally 
produced, so that we don't have to go through the expensive recompute while 
still ensuring data correctness? E.g. in your case from the email, it might be 
something like:
   
   Seq1:  (RowDelta 1)
   INSERT,  <1, A> 
   INSERT,  <2, B>
   DELETE, <1, A>
   
   Seq2: (RowDelta 2)
   DELETE, <2, B>
   
   Seq3: (Rewrite)
   INSERT,  <2, B>
   DELETE, <2, B>   <--- directly taking from RowDelta 2
   
   However, because equality deletes only apply to data files whose sequence 
numbers are less than their own, we may need to update the sequence numbers of 
the files introduced by the RowDelta commits so that they are larger than the 
rewrite operation's sequence number. That means a single commit would increment 
the sequence number by more than 1, which might not be ideal.
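
   To make the sequence-number concern concrete, here is a minimal Python sketch of the rule described above (an equality delete only removes rows from data files with a strictly smaller sequence number). The `DataFile`, `EqualityDelete`, and `live_rows` names are hypothetical and only model the rule; they are not Iceberg APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    seq: int      # sequence number assigned to the data file
    rows: tuple   # rows as (id, value) pairs


@dataclass(frozen=True)
class EqualityDelete:
    seq: int      # sequence number assigned to the delete
    row: tuple    # the (id, value) pair being deleted


def live_rows(data_files, eq_deletes):
    """Return rows that survive after applying equality deletes.

    A delete removes a row only from data files whose sequence number
    is strictly less than the delete's sequence number.
    """
    result = []
    for f in data_files:
        for row in f.rows:
            deleted = any(d.row == row and f.seq < d.seq for d in eq_deletes)
            if not deleted:
                result.append(row)
    return result


# Rewrite output committed at Seq3: only <2, B> was live when the job ran.
rewritten = DataFile(seq=3, rows=((2, "B"),))

# The delete from RowDelta 2 keeps its original sequence number (2).
stale = live_rows([rewritten], [EqualityDelete(seq=2, row=(2, "B"))])

# The delete is copied into the rewrite commit with the same sequence number (3).
same_seq = live_rows([rewritten], [EqualityDelete(seq=3, row=(2, "B"))])

# The delete is re-sequenced above the rewrite (4).
bumped = live_rows([rewritten], [EqualityDelete(seq=4, row=(2, "B"))])

print(stale, same_seq, bumped)
```

   In this model, <2, B> is only removed in the last case, which is why the carried-over delete would need a sequence number larger than that of the rewritten file.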
   
   The approach sounds good to me, with the caveat that we shouldn't blindly 
assume there are no conflicts when we do such a rewrite operation: as you 
mentioned, positional deletes with a sequence number greater than the one 
used for the rewritten files will break. A similar hypothetical case: if the 
rewrite only works on a given partition and there is a positional delete with 
a larger sequence number that affects multiple partitions, or if the rewrite 
only works against files with sequence numbers up to N and there's an 
equality write with sequence number N + 1, then the rewrite of the files 
pointed to by this positional delete will break.
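
   As a rough illustration of the kind of conflict check this implies, the sketch below flags positional deletes that are newer than the rewrite and point at files being rewritten. `PositionDelete` and `conflicting_position_deletes` are hypothetical names for illustration, not existing Iceberg classes or methods, and the file names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionDelete:
    seq: int         # sequence number of the delete
    data_file: str   # path of the data file the delete points at
    position: int    # row position within that data file


def conflicting_position_deletes(rewritten_files, rewrite_seq, position_deletes):
    """Return deletes newer than the rewrite that reference rewritten files."""
    rewritten = set(rewritten_files)
    return [
        d for d in position_deletes
        if d.seq > rewrite_seq and d.data_file in rewritten
    ]


deletes = [
    PositionDelete(seq=2, data_file="a.parquet", position=0),  # older than the rewrite
    PositionDelete(seq=5, data_file="a.parquet", position=7),  # newer, hits a rewritten file
    PositionDelete(seq=5, data_file="c.parquet", position=1),  # newer, unrelated file
]

conflicts = conflicting_position_deletes(["a.parquet", "b.parquet"], 3, deletes)
print(conflicts)
```

   If the returned list is non-empty, the rewrite could not safely be committed as-is under the assumptions of this sketch.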
   
   Meanwhile, I think this change does break the current assumption that a 
file is always written with the same sequence number as the commit/snapshot 
itself. People may be using the file's and manifest's sequence numbers to find 
out when they were introduced to the table, and we would now be breaking that 
behavior. I guess it might be fine, since there are other, more reliable 
sources, like metadata tables, that could provide the same information. I 
wonder if other people have thoughts on assumptions like this that could be 
broken by this change.
   
   




