jackye1995 commented on a change in pull request #2365:
URL: https://github.com/apache/iceberg/pull/2365#discussion_r601817722



##########
File path: spark3-extensions/src/main/antlr/org.apache.spark.sql.catalyst.parser.extensions/IcebergSqlExtensions.g4
##########
@@ -69,6 +69,7 @@ statement
     : CALL multipartIdentifier '(' (callArgument (',' callArgument)*)? ')'                  #call
     | ALTER TABLE multipartIdentifier ADD PARTITION FIELD transform (AS name=identifier)?   #addPartitionField
     | ALTER TABLE multipartIdentifier DROP PARTITION FIELD transform                        #dropPartitionField
+    | ALTER TABLE multipartIdentifier REPLACE PARTITION FIELD transform TO transform (AS name=identifier)? #replacePartitionField

Review comment:
   I think there are two use cases with contradictory expected behaviors:
   1. `ADD PARTITION FIELD bucket(id, 16) AS shard`, then `REPLACE PARTITION FIELD shard WITH bucket(id, 32)`
   2. `ADD PARTITION FIELD days(ts) AS days_col`, then `REPLACE PARTITION FIELD days_col WITH hours(ts)`
   
   For case 1, we do want `bucket(id, 32)` to keep the name `shard`, but in case 2 we don't really want the `hours(ts)` partition field to be called `days_col`.
   
   So here are a couple of observations for `REPLACE transformFrom WITH transformTo`:
   1. If `transformFrom` is an expression, the default partition field name has a very specific meaning, such as `ts_days` or `id_bucket_16`, and the replacement partition field `transformTo` should not inherit that name.
   2. If the `transformFrom` partition field has a custom name, the desired behavior depends on the case. The two examples above show these contradictory expectations.
   
   So I think the safest approach is to not infer any behavior from the custom partition name. If the caller wants to keep the same name, they can specify it again with the `AS` clause, such as `REPLACE PARTITION FIELD shard WITH bucket(id, 32) AS shard`.
   
   What do you think?



