jackye1995 commented on a change in pull request #2365:
URL: https://github.com/apache/iceberg/pull/2365#discussion_r601817722
##########
File path:
spark3-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4
##########
@@ -69,6 +69,7 @@ statement
: CALL multipartIdentifier '(' (callArgument (',' callArgument)*)? ')'
#call
| ALTER TABLE multipartIdentifier ADD PARTITION FIELD transform (AS
name=identifier)? #addPartitionField
| ALTER TABLE multipartIdentifier DROP PARTITION FIELD transform
#dropPartitionField
+ | ALTER TABLE multipartIdentifier REPLACE PARTITION FIELD transform TO
transform (AS name=identifier)? #replacePartitionField
Review comment:
I think there are 2 use cases with contradictory expected behaviors:
1. `ADD PARTITION FIELD bucket(id, 16) AS shard`, then `REPLACE PARTITION
FIELD shard WITH bucket(id, 32)`
2. `ADD PARTITION FIELD days(ts) AS days_col`, then `REPLACE PARTITION FIELD
days_col WITH hours(ts)`
For case 1, we do want `bucket(id, 32)` to also be called `shard`, but for
case 2 we don't really want the `hours(ts)` partition to be called `days_col`.
So here are a couple of observations for `REPLACE transformFrom WITH
transformTo`:
1. If `transformFrom` is an expression, the default partition field name has a
very specific meaning, such as `ts_days` or `id_bucket_16`, and the replacement
partition field `transformTo` should not inherit that name.
2. If there is a custom name for the `transformFrom` partition field, the
desired behavior depends on the case. The 2 examples above show these
contradictory expectations.
So I think the safest approach is not to infer the new field's name from a
custom partition name. If the caller wants to keep the same name, they can
specify it again with the `AS` clause, such as `REPLACE PARTITION FIELD shard
WITH bucket(id, 32) AS shard`.
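
To make the proposed rule concrete, here is a minimal sketch in Python. It is only a model of the naming decision, not Iceberg code: `Transform`, `default_name`, and `replaced_field_name` are hypothetical names, and the default-name scheme is an assumption that simply mirrors the `ts_days` / `id_bucket_16` examples above.

```python
from typing import NamedTuple, Optional


class Transform(NamedTuple):
    """Hypothetical stand-in for a partition transform such as bucket(id, 32)."""
    func: str                   # e.g. "bucket", "days", "hours"
    column: str                 # source column, e.g. "id", "ts"
    arg: Optional[int] = None   # e.g. the bucket count, if any


def default_name(t: Transform) -> str:
    # Assumed naming scheme, chosen only to match the examples in this
    # comment (ts_days, id_bucket_16); the real scheme may differ.
    base = f"{t.column}_{t.func}"
    return base if t.arg is None else f"{base}_{t.arg}"


def replaced_field_name(new: Transform, as_name: Optional[str] = None) -> str:
    # The old field's name is deliberately not an input: nothing is
    # inherited. The name is either the explicit AS name or the default
    # derived from the new transform.
    return as_name if as_name is not None else default_name(new)


# Case 1, with the name restated via AS
print(replaced_field_name(Transform("bucket", "id", 32), as_name="shard"))  # shard
# Case 2, no AS clause: days_col is not carried over
print(replaced_field_name(Transform("hours", "ts")))                        # ts_hours
# Case 1 without AS: the caller gets the default name, not shard
print(replaced_field_name(Transform("bucket", "id", 32)))                   # id_bucket_32
```

The point of the sketch is that the old name never appears in the signature, so both cases behave predictably and keeping a custom name is always an explicit choice.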
What do you think?