velvia commented on issue #686: URL: https://github.com/apache/arrow-datafusion/issues/686#issuecomment-881755559
@alamb @westonpace and others: just to continue this conversation and try to bring it to a close.

1) It sounds like the Arrow/Rust community would like to transition to supporting `Timestamp(_, UTC)` as the standard for UTC-based processing, rather than `Timestamp(_, None)`.
2) Since some people already use the existing `Timestamp(_, None)` to really mean `Timestamp(_, UTC)` (including us in some cases), and there is also data (in Parquet) that has at least UTC timezone info, we should provide interop to convert between the two, or at least to cast between them.
3) A `timezone()`, `with_timezone()`, or `at_timezone()` function would be clearer than adding a second argument to `to_timestamp(...)`.

Do the above sound about right?

I would also like to note the current obstacles to 1) and 2) in the Arrow/DF codebase (there are more; these are just what I've found in the last ~2 weeks):
- `ScalarValue` in DataFusion only supports Timestamp types with a `None` timezone
- Coercion of `Timestamp(_, UTC)` to `Timestamp(_, None)` (or the other way around) is potentially wrong and also has some schema type check issues
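
To make points 2) and 3) concrete, here is a rough std-only sketch of what a `None` <-> `UTC` interop cast could look like. It is a toy model, not the arrow crate's API: `TimestampColumn`, `cast_utc_interop`, and this `with_timezone` are hypothetical names, and the sketch assumes the `Timestamp(_, None)` values were already UTC-based, so the cast only relabels the timezone metadata and leaves the stored `i64` values alone.

```rust
// Toy model only: these types mimic the shape of Arrow's
// `Timestamp(unit, timezone)` type but are NOT the arrow crate's API.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimeUnit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// Hypothetical stand-in for a timestamp column: raw epoch values plus
/// the unit and timezone metadata that make up its type.
#[derive(Debug, Clone, PartialEq)]
pub struct TimestampColumn {
    pub unit: TimeUnit,
    /// `None` models `Timestamp(_, None)`; `Some("UTC")` models `Timestamp(_, UTC)`.
    pub timezone: Option<String>,
    pub values: Vec<i64>,
}

impl TimestampColumn {
    /// Hypothetical `with_timezone`-style helper (point 3): returns a copy
    /// with new timezone metadata and the stored values untouched.
    pub fn with_timezone(&self, tz: Option<&str>) -> TimestampColumn {
        TimestampColumn {
            unit: self.unit,
            timezone: tz.map(|s| s.to_string()),
            values: self.values.clone(),
        }
    }
}

/// True for the two representations this sketch treats as interchangeable.
fn is_utc_like(tz: Option<&str>) -> bool {
    matches!(tz, None | Some("UTC"))
}

/// Hypothetical interop cast (point 2). Only `None` <-> `"UTC"` is allowed;
/// any other zone is rejected rather than silently reinterpreted.
pub fn cast_utc_interop(
    col: &TimestampColumn,
    target: Option<&str>,
) -> Result<TimestampColumn, String> {
    if is_utc_like(col.timezone.as_deref()) && is_utc_like(target) {
        Ok(col.with_timezone(target))
    } else {
        Err(format!(
            "unsupported cast: {:?} -> {:?}",
            col.timezone, target
        ))
    }
}
```

Rejecting every zone other than `None` and `"UTC"` is a deliberate choice in this sketch: it keeps the "potentially wrong" coercion case an explicit error instead of a silent relabel.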
