canassa opened a new pull request, #49353: URL: https://github.com/apache/arrow/pull/49353
### Rationale for This Change

The vendored Howard Hinnant date library hardcodes `/usr/share/zoneinfo` as the timezone database path in `discover_tz_dir()`. It does not check the `TZDIR` environment variable, the conventional mechanism on Unix-like systems for overriding this path. As a result, timezone operations fail on non-FHS Linux distributions such as NixOS, where `zoneinfo` resides under a non-standard path like `/nix/store/.../share/zoneinfo`. The upstream library also lacks `TZDIR` support (HowardHinnant/date#858).

---

### What Changes Are Included in This PR?

* `cpp/src/arrow/vendored/datetime/tz.cpp`: Check the `TZDIR` environment variable in `discover_tz_dir()` before falling back to the platform-specific hardcoded paths. The check uses `stat()` and `S_ISDIR()` for validation, matching the existing pattern in the function.
* `cpp/src/arrow/vendored/datetime/README.md`: Document the patch.
* `cpp/src/arrow/public_api_test.cc`: Add a non-Windows test that sets `TZDIR` and verifies that timezone resolution succeeds through Arrow's compute API.
* `python/pyarrow/conftest.py`: Respect `TZDIR` in the emscripten `timezone_data` test marker.

---

### Are These Changes Tested?

Yes. A new `Misc.TZDIREnvironmentVariable` test sets `TZDIR` to a valid `zoneinfo` directory and casts a UTC timestamp to `America/New_York`, verifying that the code path works end to end.

---

### Are There Any User-Facing Changes?

Arrow now respects the `TZDIR` environment variable on non-Windows platforms, enabling timezone operations on systems without `/usr/share/zoneinfo`.

Fixes #49351

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
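For readers who want a feel for the `TZDIR` lookup described in the PR body, here is a minimal C++ sketch of the idea. It is not the patch itself: `resolve_tz_dir` and `is_directory` are hypothetical names, the fallback list is an assumption, and the real `discover_tz_dir()` in `tz.cpp` may differ in its fallback paths and error handling. The sketch only illustrates "prefer `TZDIR` if it names a directory, validated with `stat()` and `S_ISDIR()`, otherwise fall back".

```cpp
#include <sys/stat.h>  // stat(), S_ISDIR (POSIX only)

#include <cstdlib>     // std::getenv
#include <string>
#include <vector>

// Returns true if `path` names an existing directory.
static bool is_directory(const std::string& path) {
  struct stat sb;
  return ::stat(path.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode);
}

// Hypothetical stand-in for discover_tz_dir(), for illustration only.
// Prefers $TZDIR when it points at a directory; otherwise returns the first
// fallback that exists. Returns an empty string if nothing is found.
// The default fallback list is an assumption, not the vendored library's list.
std::string resolve_tz_dir(
    const std::vector<std::string>& fallbacks = {"/usr/share/zoneinfo"}) {
  const char* env = std::getenv("TZDIR");
  if (env != nullptr && *env != '\0' && is_directory(env)) {
    return env;  // the environment override wins
  }
  for (const auto& candidate : fallbacks) {
    if (is_directory(candidate)) {
      return candidate;
    }
  }
  return std::string();  // caller decides how to report the failure
}
```

Under these assumptions, a process started with `TZDIR=/nix/store/.../share/zoneinfo` would get that directory back from `resolve_tz_dir()`, while an unset or invalid `TZDIR` leaves the hardcoded fallback behavior unchanged.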
