velvia commented on issue #686:
URL: 
https://github.com/apache/arrow-datafusion/issues/686#issuecomment-876625232


   @adamhopper personally I'd be OK with your recommendations, but some 
thoughts:
   - There is some interop considerations between `TIMESTAMP WITH TIMEZONE` ie 
TimestampType(_, UTC) vs TimestampType(_, None).   The reason is that so far 
DataFusion has been generating `TimestampType(_, None)`.  If this changes to 
UTC, which makes sense as far as conventions go, then we need some way for 
people with data with Timestamp(_, None) to be able to convert it over to (_, 
UTC).
   - I'm not sure it would be acceptable to throw an error at `isAdjustedToUTC 
= false` as I'm sure people have Parquet files out there with this, and then 
DataFusion/Arrow wouldn't be able to read it.   But maybe it's OK if you give 
people a way to cast / convert the timestamps over.
   
   Also, I see your point about `to_timestamp()` having a timezone argument 
being confusing.    However there is a second use of `to_timestamp()` right 
now, which is for conversion purposes, from other types to 
Timestamp(resolution, tz).  The reason it exists like this is because there is 
no standard SQL/PG way to work with multiple timestamp resolutions, which are 
required and supported by all Arrow implementations.    
   
   Thus, while `TO_TIMESTAMP('2021-01-01T00:00:00', 'America/Montreal')` might 
be confusing, I believe the following would be more clear:
   
   * `to_timestamp_millis(my_timestamp_col, 'UTC')` -> cast my_timestamp_col 
from int or another timestamp resolution to millisecond resolution, and make 
sure it has UTC timezone
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to