HonahX commented on issue #864:
URL: https://github.com/apache/iceberg-python/issues/864#issuecomment-2240476365

   I think this is related to the `CreateTableTransaction` use case and the 
current limitation of `update_table_metadata`, which every non-REST catalog 
uses in its `_commit_table`. 
   
   As @kevinjqliu mentioned in the issue description:
   
   > TableMetadata is initialized with the default Pydantic object for schema, 
partition_spec, and sort_order, which does not play well with table updates.
   
   This causes problems in the `CreateTableTransaction` case. Currently, our 
workaround is to initialize an "empty" table metadata
   
https://github.com/apache/iceberg-python/blob/846fc0886d0e5cdfea9448e7b769f6e660a5c786/pyiceberg/catalog/__init__.py#L913-L924
   
   and to flag some of the `TableUpdate`s as the `initial_change`
   
https://github.com/apache/iceberg-python/blob/fd2af567f225754bd3e4ece69fac05e0dd7dc7dc/pyiceberg/table/__init__.py#L723-L725
   
   so that `update_table_metadata` handles those updates separately, to 
accommodate the default values in the "empty" metadata.
   
   However, this won't work for a REST catalog server implementation, where 
the `TableUpdate`s come from an arbitrary REST client (for example, Spark) and 
do not carry the extra `initial_change` note.
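
   To make the failure mode concrete, here is a minimal, self-contained 
sketch. It does not use pyiceberg: `ToySchema`, `ToyMetadata`, `ToyAddSchema` 
and `apply_add_schema` are hypothetical stand-ins, and the branching on 
`initial_change` is only my assumption of how the special handling roughly 
behaves, not a copy of the real `update_table_metadata`.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ToySchema:
    # Hypothetical stand-in for a schema; the default is an empty placeholder.
    schema_id: int = 0
    fields: Tuple[str, ...] = ()


@dataclass(frozen=True)
class ToyMetadata:
    # "Empty" metadata still carries a default placeholder schema.
    schemas: Tuple[ToySchema, ...] = (ToySchema(),)


@dataclass(frozen=True)
class ToyAddSchema:
    # Hypothetical update; `initial_change` is a client-side hint only.
    schema: ToySchema
    initial_change: bool = False


def apply_add_schema(base: ToyMetadata, update: ToyAddSchema) -> ToyMetadata:
    if update.initial_change:
        # Special case: overwrite the placeholder instead of appending to it.
        return replace(base, schemas=(update.schema,))
    # Regular case: append, which keeps whatever the base already holds.
    return replace(base, schemas=base.schemas + (update.schema,))


empty = ToyMetadata()
real = ToySchema(schema_id=0, fields=("id", "name"))

# A local create-table transaction marks its first update explicitly.
local = apply_add_schema(empty, ToyAddSchema(real, initial_change=True))

# An update arriving from an arbitrary REST client has no such marker.
remote = apply_add_schema(empty, ToyAddSchema(real))

print(len(local.schemas), len(remote.schemas))
```

   In this toy model the unmarked update leaves the placeholder schema in the 
resulting metadata, which is the kind of leftover default that the "empty" 
metadata workaround cannot avoid on the server side.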
   
   So there is room for improvement in `update_table_metadata` or 
`TableMetadata` here. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

