rdblue commented on issue #1482:
URL: https://github.com/apache/iceberg/issues/1482#issuecomment-756492770


   @samidalouche, sorry for the late reply. I didn't see this until today.
   
   Right now, we have a fixed set of transforms because we want to have 
consistent support across engines. To do that, we want a small set of 
transforms that are well-defined so that there isn't a huge cost to 
implementing support. Otherwise, we'd end up with inconsistent features across 
processing engines, which we really want to avoid.
   
   It is possible to add new transforms, and possibly even to provide a way to 
customize transforms, but we would need a way to plug custom code into 
Iceberg. The hardest thing to implement is projection, which takes a predicate 
on rows in a table and converts it to a predicate that matches partition 
values. That's difficult to write and can cause problems if you get it wrong, 
such as filtering out partitions that contain matching rows. I think we can 
generalize it into two cases, though: monotonic transforms (like year, month, 
day, hour, truncate) and other transforms (like bucket). Then you'd only have 
to supply the `apply` implementation and a few booleans, and Iceberg could 
handle the rest.
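
   To make the two cases concrete, here is a rough sketch in Python of what 
that generalization might look like. It is only an illustration, not 
Iceberg's API: `CustomTransform`, `project`, the single `monotonic` flag, and 
the two example transforms are all hypothetical names, and `bucket4` uses a 
plain modulo as a stand-in for a real hash. The projection is deliberately 
conservative (inclusive), so it may keep extra partitions but never drops one 
that could match.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical sketch: none of these names exist in Iceberg.

# A predicate is modeled as (op, literal); op is one of the strings below.
Predicate = Tuple[str, int]
_OPS = ("==", "<", "<=", ">", ">=")


@dataclass(frozen=True)
class CustomTransform:
    name: str
    apply: Callable[[int], int]  # user-supplied: source value -> partition value
    monotonic: bool              # True if a <= b implies apply(a) <= apply(b)


def project(t: CustomTransform, pred: Predicate) -> Optional[Predicate]:
    """Convert a row predicate into an inclusive partition predicate.

    Returns None when the predicate cannot be used to prune partitions.
    """
    op, literal = pred
    if op not in _OPS:
        raise ValueError(f"unsupported operator: {op}")
    value = t.apply(literal)
    if op == "==":
        # Equal inputs always produce equal partition values.
        return ("==", value)
    if not t.monotonic:
        # Ordering is not preserved (e.g. bucket), so ranges cannot be projected.
        return None
    if op in ("<", "<="):
        # x < literal implies apply(x) <= apply(literal) for a monotonic apply.
        return ("<=", value)
    return (">=", value)


# Example transforms (assumptions for illustration only).
truncate10 = CustomTransform("truncate[10]", lambda v: v - (v % 10), monotonic=True)
bucket4 = CustomTransform("bucket[4]", lambda v: v % 4, monotonic=False)

print(project(truncate10, ("<", 25)))
print(project(bucket4, ("==", 25)))
print(project(bucket4, ("<", 25)))
```

   With this shape, a range predicate on a monotonic transform becomes a 
range on partition values, while a non-monotonic transform can only use 
equality predicates for pruning.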
   
   I think it is reasonable for users to need additional partition functions, 
but I hope that we can add functions to Iceberg when they are needed so that 
everyone benefits.


