avantgardnerio commented on issue #3100: URL: https://github.com/apache/arrow-datafusion/issues/3100#issuecomment-1213230893
@waitingkuo I think 95% of what you propose is very sensible. The other 5% I don't think is incorrect, but it does give cause for consideration. Some random thoughts:

1. I generally think we should try to stick to parity with postgres, as it is ubiquitous & mature, makes integration testing easier, and arbitrates disputes about desired behavior
2. counterpoint to #1: arrow is the foundation for datafusion, and as such I think we should go with it over postgres when pragmatic / sensible to do so
3. when possible I think we should also delegate date / time things to chrono, as these things are very difficult, and it would be nice to not have to maintain a date/time library as well as a parser, query engine, etc
4. as we push towards higher performance, I think #3 will become increasingly less possible (unless we want to merge GPU/SIMD support into chrono)
5. We should be clear that there are two types of timezone: Pacific Time (a legal entity) and PST/PDT (UTC-8/-7 hours). It appears postgres only supports the latter? Which is kind of strange because it limits the purpose of timezones IMO: having a meeting at a recurring time throughout the year requires application logic to switch between PST & PDT when appropriate.
6. The above legal manifestations add a lot of complexity. On linux chrono pulls its database from tzinfo, and I'm not sure if there's an equivalent on windows. Supporting those means having an always-updated database outside of our codebase (so I hope to either defer to chrono, or not do it).
7. It's important to remember that a timezone (of any kind) is not related to time, but rather location. It's where something is happening, not when.
8. Due to all of the above I've found it useful in OLTP systems to keep all timestamps as UTC and convert to the user locale when displaying.

That being said, datafusion is designed for OLAP, so I can see uses where the user has a SQL console open and there is no app layer to convert. That makes a good case for supporting timezones in the query engine, but it comes with a high cost of implementation, so we should be very considerate before adding each layer of complexity (UTC offsets, calendar operations, timezones).
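
To make point 5 concrete, here is a rough Python sketch of the application logic needed when only fixed offsets (PST/PDT) are available. It is not how postgres, chrono, or datafusion do it. The helpers `nth_sunday`, `pacific_offset`, and `meeting_in_utc` are hypothetical names made up for this illustration, the DST rule is the US one in force since 2007 (second Sunday of March to first Sunday of November, switching at 02:00 local), and the ambiguous or skipped hour at each transition is deliberately ignored.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets: the "PST/PDT" kind of timezone from point 5.
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")


def nth_sunday(year, month, n):
    """Return midnight on the n-th Sunday of the given month (naive datetime)."""
    first = datetime(year, month, 1)
    # weekday(): Monday == 0 ... Sunday == 6
    days_to_sunday = (6 - first.weekday()) % 7
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))


def pacific_offset(local_naive):
    """Pick PST or PDT for a naive Pacific wall-clock time.

    Simplified assumption: US rules since 2007, no handling of the
    ambiguous or nonexistent hour around each transition.
    """
    dst_start = nth_sunday(local_naive.year, 3, 2).replace(hour=2)
    dst_end = nth_sunday(local_naive.year, 11, 1).replace(hour=2)
    return PDT if dst_start <= local_naive < dst_end else PST


def meeting_in_utc(year, month, day):
    """A recurring 09:00 'Pacific Time' meeting, stored as UTC (point 8)."""
    local = datetime(year, month, day, 9, 0)
    return local.replace(tzinfo=pacific_offset(local)).astimezone(timezone.utc)


if __name__ == "__main__":
    # Same wall-clock time, different UTC instants in winter and summer.
    print(meeting_in_utc(2022, 1, 15))
    print(meeting_in_utc(2022, 7, 15))
```

The hard-coded rule is exactly the kind of thing that goes stale when legislation changes, which is the maintenance burden point 6 is worried about.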
