[GitHub] [arrow] jorisvandenbossche edited a comment on pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-06-24 Thread GitBox


jorisvandenbossche edited a comment on pull request #10457:
URL: https://github.com/apache/arrow/pull/10457#issuecomment-867391225


   > We're using the `tz.h` library, which needs an up-to-date timezone database to correctly handle timezone-aware timestamps. See the [installation instructions](https://howardhinnant.github.io/date/tz.html#Installation).
   > 
   > We have the following options (if I understand correctly) for getting a timezone database:
   > 
   > 1. Local (non-Windows) OS timezone database: no work required.
   > 2. Arrow-bundled folder: we could bundle the database at build time for Windows. The database would slowly go stale.
   > 3. Download from the IANA Time Zone Database at runtime: `tz.h` fetches the database at runtime, but curl (and 7-zip on Windows) are required.
   > 4. Local user-provided folder: the user could provide a location at build time. Nice to have.
   
   Would a 5th option, allowing runtime configuration of the database location, be possible as well? 
(I assume that would need some modification to `tz.h`.)
   
   For reference, the recent Python PEP on this topic: 
https://www.python.org/dev/peps/pep-0615/#sources-for-time-zone-data
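
To make the runtime-configuration idea concrete, here is a minimal sketch of a lookup order over the sources listed above, with a runtime override taking precedence. Everything in it is hypothetical: `resolve_tzdb_dir` and its parameters are illustrative names only, not part of Arrow or `tz.h`, and the ordering is just one possible choice.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def resolve_tzdb_dir(
    runtime_override: Optional[str],
    os_dirs: Iterable[str],
    bundled_dir: Optional[str],
) -> Optional[Path]:
    """Return the first existing directory among the candidate tzdata sources.

    Assumed precedence: runtime override, then OS locations, then bundled copy.
    """
    candidates = []
    if runtime_override:
        candidates.append(runtime_override)
    candidates.extend(os_dirs)
    if bundled_dir:
        candidates.append(bundled_dir)
    for candidate in candidates:
        path = Path(candidate)
        if path.is_dir():
            return path
    return None  # the caller decides whether to download or raise


# Small demonstration with temporary directories standing in for real sources.
with tempfile.TemporaryDirectory() as user_dir, tempfile.TemporaryDirectory() as bundled:
    missing_os_dir = str(Path(bundled) / "no-such-zoneinfo")
    with_override = resolve_tzdb_dir(user_dir, [missing_os_dir], bundled)
    without_override = resolve_tzdb_dir(None, [missing_os_dir], bundled)
    override_wins = with_override == Path(user_dir)
    falls_back_to_bundled = without_override == Path(bundled)
```

In this sketch a missing OS database (as on Windows) simply falls through to the bundled copy, and a user-supplied location always wins when it exists.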
   
   > For now it would probably be best to create a Jira for this and keep these tests disabled on Windows.
   
   That's fine with me. But to be explicit: is it currently using the default 
configuration of `tz.h` (which is to download the latest version)? If we postpone 
the discussion of how to install the tzdata sources to a follow-up JIRA, I would 
suggest using the `USE_OS_TZDB` compile flag for now.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] jorisvandenbossche edited a comment on pull request #10457: ARROW-12980: [C++] Kernels to extract datetime components should be timezone aware

2021-06-10 Thread GitBox


jorisvandenbossche edited a comment on pull request #10457:
URL: https://github.com/apache/arrow/pull/10457#issuecomment-858395074


   I think it would be good to write some tests in Python as well. Currently 
the C++ tests are very hard to verify, since we don't yet have the ability to 
parse strings as local times in a given timezone: the strings in the tests are 
interpreted as UTC and not as local time in the "Australia/Broken_Hill" timezone, 
so the expected values can't be read off or verified from the strings. 
In Python, we could create the localized input timestamps with pandas.
   
   Given that, it might also make sense to first add a "localize" kernel for 
converting timestamps from naive to a certain timezone.
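
To illustrate the difference between the two interpretations, here is a minimal standard-library sketch. It assumes a fixed +09:30 offset as a stand-in for "Australia/Broken_Hill" standard time (the real zone also observes daylight saving time, so an actual test would use a proper timezone database, e.g. via pandas), and the timestamp is an arbitrary example value.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed +09:30 offset standing in for "Australia/Broken_Hill"
# standard time; the real zone has DST rules that this ignores.
BROKEN_HILL_STD = timezone(timedelta(hours=9, minutes=30))

naive = datetime.fromisoformat("2021-06-10 20:45:00")

# Interpretation 1: the string is a UTC instant, then viewed in the zone.
as_utc = naive.replace(tzinfo=timezone.utc)
viewed_local = as_utc.astimezone(BROKEN_HILL_STD)

# Interpretation 2 ("localize"): the string is wall-clock time in the zone.
localized = naive.replace(tzinfo=BROKEN_HILL_STD)

print(viewed_local.day, viewed_local.hour, viewed_local.minute)  # 11 6 15
print(localized.day, localized.hour, localized.minute)           # 10 20 45
```

Under the first interpretation the extracted day and hour no longer match what is written in the string, which is why the expected values are hard to check by eye. With localized input, the components can be read directly from the string.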


