jackye1995 opened a new pull request #2365: URL: https://github.com/apache/iceberg/pull/2365
I received feedback from users that the current Spark SQL extension cannot update a partition field directly. Today a user has to first drop the old field and then add the new one, which (1) is not straightforward for the common use case of changing the granularity of a timestamp or bucket transform, and (2) leaves a window between the two commits that is not locked, during which a writer might write data with the wrong partition spec.

This PR introduces the syntax `ALTER TABLE table CHANGE PARTITION FIELD transform TO transform`, which drops the old transform and adds the new transform in a single commit to solve both issues above.

I could not find a similar syntax in other systems to use as a reference. Delta Lake took the route of directly adding or dropping the entire partition spec, so I could not use that as a basis. I chose the current syntax for the following reasons:

1. The keyword `CHANGE` follows the Hive syntax `CHANGE COLUMN col ...`. I think we might be able to reuse this keyword in the future for column DDL extensions.
2. The keyword `TO` is consistent with the similar syntax `RENAME col TO col`.
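
For illustration, the current two-step workflow and the single statement proposed in this PR can be compared side by side. This is only a sketch: the table `prod.db.sample` and the columns `ts` and `id` are hypothetical, the first pair of statements assumes the extension's existing `ADD PARTITION FIELD` and `DROP PARTITION FIELD` syntax, and the `CHANGE ... TO` form is the syntax as proposed here, which may change during review.

```sql
-- Hypothetical table prod.db.sample, assumed to be partitioned by days(ts).

-- Current approach: two statements, hence two separate commits.
ALTER TABLE prod.db.sample DROP PARTITION FIELD days(ts);
-- A writer that starts between the two commits sees a spec without the ts field.
ALTER TABLE prod.db.sample ADD PARTITION FIELD hours(ts);

-- Proposed approach: one statement, one commit, no intermediate spec.
ALTER TABLE prod.db.sample CHANGE PARTITION FIELD days(ts) TO hours(ts);

-- The same form would apply to changing a bucket count (hypothetical column id).
ALTER TABLE prod.db.sample CHANGE PARTITION FIELD bucket(16, id) TO bucket(32, id);
```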
