rdblue commented on pull request #1368: URL: https://github.com/apache/iceberg/pull/1368#issuecomment-692968192
@zhangdove, I brought up this PR at the last Iceberg sync to talk through the possible effects with the broader community. The general consensus was that it isn't a good idea to parameterize the date/time transforms. The main concern was that a fixed offset would appear to be correct, but daylight saving time would unexpectedly change the required offset, and we would have the same problem again.

We came up with a few alternatives:

* Use hourly partitioning to ensure you can drop data at any hour boundary
* Add the ability to register custom partition functions
* Use v2 tables and delete files to avoid the need to align deletes or overwrites with partitions. This option seems like the best in the long term, but it isn't very helpful right now

I think that those are good alternatives to parameterizing these functions. What do you think? Would any of those work? The last two options will require some work, but I think they would be better solutions in the long term.

If you want to discuss this more, maybe you could start a thread on the dev list to raise any issues that you see with these options. I think it would be great to have a wider discussion on this, since I found the comments during the sync to be valuable.
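
To illustrate the daylight saving time concern, here is a minimal Python sketch. It is not Iceberg code: `day_with_offset` is a hypothetical stand-in for a parameterized day transform, `hour_partition` is a simplified stand-in for hourly partitioning, and the example zone is assumed to be UTC-8 in winter and UTC-7 in summer (as in US Pacific time). It checks whether local midnight lands exactly on a partition boundary.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def day_with_offset(ts, offset_hours):
    """Hypothetical parameterized day transform: days since the epoch
    after shifting the timestamp by a fixed number of hours."""
    return (ts - EPOCH + timedelta(hours=offset_hours)) // timedelta(days=1)

def hour_partition(ts):
    """Simplified hourly transform: whole hours since the epoch in UTC."""
    return (ts - EPOCH) // timedelta(hours=1)

def is_boundary(transform, local_midnight):
    """True if the partition value changes exactly at local midnight."""
    just_before = local_midnight - timedelta(minutes=1)
    return transform(just_before) != transform(local_midnight)

# Assumed fixed offsets for one zone: standard time and daylight time.
WINTER = timezone(timedelta(hours=-8))
SUMMER = timezone(timedelta(hours=-7))

winter_midnight = datetime(2020, 1, 15, tzinfo=WINTER)
summer_midnight = datetime(2020, 7, 15, tzinfo=SUMMER)

def fixed_offset_day(ts):
    # Day transform parameterized with the winter offset only.
    return day_with_offset(ts, -8)

# The fixed offset lines up with local midnight in winter but not in summer.
print(is_boundary(fixed_offset_day, winter_midnight))  # True
print(is_boundary(fixed_offset_day, summer_midnight))  # False

# Hour boundaries line up with local midnight in both seasons.
print(is_boundary(hour_partition, winter_midnight))    # True
print(is_boundary(hour_partition, summer_midnight))    # True
```

With the fixed offset, the summer partition boundary falls at 01:00 local time, so a delete aligned to local midnight would no longer match a whole partition. Hourly partitions avoid this for zones whose offsets are a whole number of hours.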
