rdblue commented on issue #417: Adding support for time-based partitioning on long column type URL: https://github.com/apache/incubator-iceberg/issues/417#issuecomment-526384694 @shardulm94, do you intend to use this for tables with existing data?

If so, then I'm not sure you want the time-based hidden partitioning transforms. The problem with changing the partitioning of an existing table that uses identity partitioning is that your queries may start to fail because you're no longer producing the old partition columns (e.g., `ts_date=cast(cast(ts as date) as string)`). And if you are still producing the old partition columns, then there isn't much point in adding extra time-based partitioning (splits will also be pruned using time ranges from min/max metadata). If you don't intend to use existing data, then do normal timestamps work?

I guess there's another case, where you want to rebuild the table metadata but reuse old data files. In that case, is there anything to distinguish the data in these columns from timestamps with a different format, like long values that store microseconds from the epoch?

The problem is correctness when other people start using this. If Iceberg supports interpreting a long column as an instant, then it must be obvious what the unit of the long type is. Maybe we could allow this if the column name includes some clue, like `timestamp_millis` vs `timestamp_micros`, but that sounds hacky to me. Another solution is to add a way to promote from long to timestamp type and store the units of the long in metadata somewhere. Then you would be able to use old data as real timestamp columns.
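To illustrate why the unit has to be unambiguous, here is a small sketch (not Iceberg code; the stored value is a made-up example) showing how the same long interpreted as milliseconds vs microseconds yields wildly different instants:

```java
import java.time.Instant;

public class EpochUnitAmbiguity {
    public static void main(String[] args) {
        // Hypothetical stored value: 2019-08-30T00:00:00Z encoded as
        // microseconds from the epoch.
        long value = 1_567_123_200_000_000L;

        // Interpreted as microseconds: the intended instant.
        Instant asMicros = Instant.ofEpochSecond(
                value / 1_000_000L, (value % 1_000_000L) * 1_000L);

        // Misinterpreted as milliseconds: an instant tens of
        // thousands of years in the future.
        Instant asMillis = Instant.ofEpochMilli(value);

        System.out.println("as micros: " + asMicros); // 2019-08-30T00:00:00Z
        System.out.println("as millis: " + asMillis);
    }
}
```

Nothing in the long value itself reveals the unit, which is why a transform that silently assumes one of them can return wrong partition values for readers who assumed the other.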
