jackye1995 commented on pull request #2354:
URL: https://github.com/apache/iceberg/pull/2354#issuecomment-809539515


   @openinx for your example, my understanding is that at t3 the equality 
delete file would contain a row like `account_id=xxx`, and at t5 the new 
equality delete file would contain a row like `account_id=yyy, profile_id=zzz`. 
In other words, the row key in effect at each point in time is already recorded 
in the delete file written at that time. When time traveling to t4, the delete 
files written up to that point still work without consulting the latest row key.
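
   To make the t3/t5 argument concrete, here is a minimal sketch in Python. It 
is a toy model, not Iceberg code: `EqualityDeleteFile`, `read_as_of`, the 
`written_at` field, and the sample values are all hypothetical names made up for 
illustration, and `written_at` stands in for the commit order. The only point is 
that each delete file carries its own equality fields, so a time-travel read 
never needs the table's current row key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EqualityDeleteFile:
    """Toy stand-in for an equality delete file (hypothetical, not the Iceberg API)."""
    written_at: int         # logical commit time, e.g. 3 for t3
    equality_fields: tuple  # columns this file matches on, stored with the file
    rows: tuple             # each row is a dict covering equality_fields

    def matches(self, record: dict) -> bool:
        # A record is deleted if it equals any delete row on this file's own fields.
        return any(
            all(record.get(field) == row[field] for field in self.equality_fields)
            for row in self.rows
        )


def read_as_of(records, delete_files, as_of):
    """Return records visible at `as_of`, using only deletes committed by then."""
    active = [d for d in delete_files if d.written_at <= as_of]
    visible = []
    for record in records:
        if record["written_at"] > as_of:
            continue  # not yet written at the requested time
        # Assumption: a delete only applies to records written before it.
        deleted = any(
            d.written_at > record["written_at"] and d.matches(record)
            for d in active
        )
        if not deleted:
            visible.append(record)
    return visible


RECORDS = [
    {"account_id": 1, "profile_id": 10, "written_at": 1},
    {"account_id": 2, "profile_id": 20, "written_at": 1},
    {"account_id": 2, "profile_id": 21, "written_at": 4},
]

DELETES = [
    # t3: the row key was (account_id) when this file was written
    EqualityDeleteFile(3, ("account_id",), ({"account_id": 1},)),
    # t5: the row key was (account_id, profile_id) when this file was written
    EqualityDeleteFile(5, ("account_id", "profile_id"),
                       ({"account_id": 2, "profile_id": 20},)),
]

at_t4 = [r["profile_id"] for r in read_as_of(RECORDS, DELETES, as_of=4)]
at_t5 = [r["profile_id"] for r in read_as_of(RECORDS, DELETES, as_of=5)]
```

   In this toy model, reading as of t4 applies only the t3 delete using the 
fields stored in that file, and reading as of t5 additionally applies the t5 
delete using its own fields.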
   
   @rdblue I mostly agree with what you mentioned: I don't see why the example 
from openinx would not work, though I may have missed something there. I still 
decided to go with the versioned approach because I think it could be used in 
the future to provide some uniqueness guarantee at read time by merging rows, 
given that we now basically have a primary key concept through RowKey and a 
sort key concept through SortOrder. At that point, we would need this 
information to be present in the specific snapshot that we time travel to.
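
   A rough sketch of why a read-time merge would need the row key of the 
snapshot being read, again in Python. Nothing here exists in Iceberg today: 
`merge_by_row_key`, `ROW_KEY_BY_SNAPSHOT`, and the `updated_at` sort column are 
hypothetical names, and keeping the row with the greatest sort value is just 
one possible merge rule.

```python
def merge_by_row_key(rows, row_key_fields, sort_field):
    """Keep one row per row key, preferring the greatest sort_field value."""
    latest = {}
    for row in rows:
        key = tuple(row[field] for field in row_key_fields)
        if key not in latest or row[sort_field] >= latest[key][sort_field]:
            latest[key] = row
    return list(latest.values())


# Hypothetical versioned row keys: snapshot id -> row key fields at that snapshot.
ROW_KEY_BY_SNAPSHOT = {
    4: ("account_id",),
    5: ("account_id", "profile_id"),
}

ROWS = [
    {"account_id": 1, "profile_id": 10, "updated_at": 1},
    {"account_id": 1, "profile_id": 11, "updated_at": 2},
]

# The same rows merge differently depending on which snapshot's row key applies.
merged_at_4 = merge_by_row_key(ROWS, ROW_KEY_BY_SNAPSHOT[4], "updated_at")
merged_at_5 = merge_by_row_key(ROWS, ROW_KEY_BY_SNAPSHOT[5], "updated_at")
```

   With the snapshot 4 key the two rows collapse into one, while with the 
snapshot 5 key both survive, so a time-travel read that merges rows would have 
to know which key was in effect for that snapshot.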


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


