velvia commented on issue #686: URL: https://github.com/apache/arrow-datafusion/issues/686#issuecomment-876625232
@adamhopper personally I'd be OK with your recommendations, but some thoughts: - There is some interop considerations between `TIMESTAMP WITH TIMEZONE` ie TimestampType(_, UTC) vs TimestampType(_, None). The reason is that so far DataFusion has been generating `TimestampType(_, None)`. If this changes to UTC, which makes sense as far as conventions go, then we need some way for people with data with Timestamp(_, None) to be able to convert it over to (_, UTC). - I'm not sure it would be acceptable to throw an error at `isAdjustedToUTC = false` as I'm sure people have Parquet files out there with this, and then DataFusion/Arrow wouldn't be able to read it. But maybe it's OK if you give people a way to cast / convert the timestamps over. Also, I see your point about `to_timestamp()` having a timezone argument being confusing. However there is a second use of `to_timestamp()` right now, which is for conversion purposes, from other types to Timestamp(resolution, tz). The reason it exists like this is because there is no standard SQL/PG way to work with multiple timestamp resolutions, which are required and supported by all Arrow implementations. Thus, while `TO_TIMESTAMP('2021-01-01T00:00:00', 'America/Montreal')` might be confusing, I believe the following would be more clear: * `to_timestamp_millis(my_timestamp_col, 'UTC')` -> cast my_timestamp_col from int or another timestamp resolution to millisecond resolution, and make sure it has UTC timezone -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org