openinx commented on pull request #2354:
URL: https://github.com/apache/iceberg/pull/2354#issuecomment-811609723


   Okay, I think everyone has reached a consensus on this issue: `Keeping table 
metadata and data separate (and only versioning data) is the right behavior`. 
Let's stick with that. @aokolnychyi's suggestion about `replacing 
the current pointer in the catalog to an old JSON file rather than by calling 
the table rollback API` looks good to me. I think that way we can also 
roll back the table metadata (for now, this does not sound like a high 
priority, because people can change the table state as they want by calling 
the table API).
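
To make the distinction concrete, here is a toy in-memory sketch. It is not Iceberg code: `TableMetadata`, `ToyCatalog`, `rollback_data` and `reset_pointer` are hypothetical names, and the model assumes each commit writes a new metadata JSON file while the catalog only stores a pointer to the current one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableMetadata:
    """Toy stand-in for one metadata JSON file (hypothetical, not Iceberg's class)."""
    location: str          # e.g. "v2.metadata.json"
    schema: tuple          # column names
    snapshots: tuple       # snapshot ids known to this metadata file
    current_snapshot: int  # snapshot id that readers see


class ToyCatalog:
    """Toy catalog: maps a table name to the location of its current metadata file."""

    def __init__(self):
        self.files = {}    # location -> TableMetadata, every file ever written
        self.pointer = {}  # table name -> current metadata location

    def commit(self, name, meta):
        self.files[meta.location] = meta
        self.pointer[name] = meta.location

    def load(self, name):
        return self.files[self.pointer[name]]


def rollback_data(catalog, name, snapshot_id, new_location):
    """Data-only rollback: write a NEW metadata file that keeps the current
    schema and only moves the current snapshot back."""
    cur = catalog.load(name)
    if snapshot_id not in cur.snapshots:
        raise ValueError(f"unknown snapshot {snapshot_id}")
    catalog.commit(
        name, TableMetadata(new_location, cur.schema, cur.snapshots, snapshot_id)
    )


def reset_pointer(catalog, name, old_location):
    """Metadata rollback: point the catalog back at an OLD metadata file,
    which restores schema and snapshot state together."""
    if old_location not in catalog.files:
        raise KeyError(old_location)
    catalog.pointer[name] = old_location


catalog = ToyCatalog()
catalog.commit("t", TableMetadata("v1.metadata.json", ("id",), (1,), 1))
catalog.commit("t", TableMetadata("v2.metadata.json", ("id", "name"), (1, 2), 2))

rollback_data(catalog, "t", 1, "v3.metadata.json")
after_data_rollback = catalog.load("t")  # old data, schema change kept

reset_pointer(catalog, "t", "v1.metadata.json")
after_pointer_reset = catalog.load("t")  # old data and old schema
```

In this sketch the data-only rollback keeps the `name` column added later, while resetting the pointer brings back the old schema as well, which is the behavior difference discussed above.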
   
   > I support the idea of a row identifier as long as Iceberg does not enforce 
it
   
   In a common Iceberg table specification, the row identifier doesn't have to 
be enforced. (I've left a comment 
[here](https://github.com/apache/iceberg/pull/2010#issuecomment-800769586).)
   
   > We plan to leverage it in some MERGE INTO use cases, where we can 
derive the delete column from the ON clause and merge columns can vary from 
operation to operation.
   
   I don't know much about this point. I guess you want to use the row 
identifier to achieve some optimizations at the Spark engine level? Can you 
provide more information?
   
   @jackye1995, I think we could update this PR now. Thanks for the great work!
   



