advancedxy opened a new issue, #8258:
URL: https://github.com/apache/iceberg/issues/8258

   ### Feature Request / Improvement
   
   As discussed in #5626, it would be nice to support multi-arg transforms in 
Iceberg, especially for the bucket transform.
   
   I wrote up a design doc for this improvement: 
https://docs.google.com/document/d/1aDoZqRgvDOOUVAGhvKZbp5vFstjsAMY4EFCyjlxpaaw/edit?usp=sharing
   
   Quoted background from the doc:
   > Iceberg uses a transform to produce partitioning value from a source 
value. Currently the supported transforms are: `Years`, `Months`, `Days`, 
`Hours`, `Identity`, `Void`, `Truncate`, `Bucket`. Since the current spec 
requires that each partitioning field consists of a source column id in the 
table’s schema, the above transforms only accept one argument as its input. 
However, it’s possible and quite common to use multiple arguments to produce a 
partitioning value, especially for the `Bucket` transform. Other transforms 
might require multiple arguments in the future. This document tries to add 
multi-arg transform support in Iceberg, especially for the bucket transform.
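
   To make the idea concrete, here is a minimal sketch of the difference between a single-arg and a multi-arg bucket. It is not taken from the design doc or the PoC: the function names are hypothetical, and CRC32 is only a stand-in for the 32-bit Murmur3 hash that the Iceberg spec requires for `Bucket`. The way the hash is chained across arguments is also just an assumption for illustration, not the proposed spec behavior.

```python
import struct
import zlib
from typing import Sequence


def single_arg_bucket(value: int, num_buckets: int) -> int:
    """Bucket one source value: hash it, clear the sign bit, take the modulus."""
    # Stand-in hash (CRC32 over the 8-byte little-endian encoding), NOT Iceberg's Murmur3.
    h = zlib.crc32(struct.pack("<q", value))
    return (h & 0x7FFFFFFF) % num_buckets


def multi_arg_bucket(values: Sequence[int], num_buckets: int) -> int:
    """Bucket several source values together (hypothetical combination rule)."""
    h = 0
    for v in values:
        # Feed each argument into the running hash so every value affects the result.
        h = zlib.crc32(struct.pack("<q", v), h)
    return (h & 0x7FFFFFFF) % num_buckets


if __name__ == "__main__":
    # One partition value derived from two source columns, e.g. (tenant_id, user_id).
    print(multi_arg_bucket((42, 1001), 16))
```

   With this combination rule, a multi-arg bucket over a single value gives the same result as the single-arg bucket, so the existing one-argument case stays a special case of the general one.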
   
   ------
   I also wrote a PoC of how a multi-arg bucket transform could be supported in 
Spark: . Some parts are not modified yet, such as UpdatePartitionSpec and the 
TableMetadata-related code.
   I'd like to get feedback from the community before going much further. 
   
   Once we reach consensus that multi-arg transforms should be supported and 
the spec changes have stabilized after review, I will update my code 
accordingly and extend support to the Flink engine.
   
   ### Query engine
   
   Spark


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

