kahemker commented on issue #46080:
URL: https://github.com/apache/arrow/issues/46080#issuecomment-2802176540

   Let's start by reviewing the [Installing 
PyArrow](https://arrow.apache.org/docs/python/install.html) documentation.  
[Using Pip](https://arrow.apache.org/docs/python/install.html#using-pip) simply 
states:
   
   > Install the latest version from PyPI (Windows, Linux and macOS)
   `pip install pyarrow`
   
   I had no issues importing the pip wheels, so the Visual C++ 
Redistributable note can be skipped.
   
   When I run [my minimal reproducible 
example](https://github.com/kahemker/pyarrow-orc-failure), I get 
a timezone-related runtime error, so let's read the [tzdata on 
Windows](https://arrow.apache.org/docs/python/install.html#tzdata-on-windows) 
text in the install instructions:
   
   > While Arrow uses the OS-provided timezone database on Linux and macOS, it 
requires a user-provided database on Windows. To download and extract the text 
version of the IANA timezone database follow the instructions in the C++ 
[Runtime 
Dependencies](https://arrow.apache.org/docs/cpp/build_system.html#download-timezone-database)
 or use pyarrow utility function pyarrow.util.download_tzdata_on_windows() that 
does the same.
   
   > By default, the timezone database will be detected at 
%USERPROFILE%\Downloads\tzdata. If the database has been downloaded in a 
different location, you will need to set a custom path to the database from 
Python:
   
   `import pyarrow as pa`
   
   `pa.set_timezone_db_path("custom_path")`
   
   I run `pyarrow.util.download_tzdata_on_windows()`, and this does not solve my 
timezone problem.
   
   I then run `pa.set_timezone_db_path("C:\Users\kyleh\Downloads\tzdata")`, which I 
am led to believe should point PyArrow to the timezone information downloaded 
by `pyarrow.util.download_tzdata_on_windows()`. I still get the runtime 
error.
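
The two documented steps can be combined into one script. This is only a sketch of what I tried: `default_tzdata_dir` and `configure_arrow_tzdata` are hypothetical helper names, and the path is the default location quoted from the docs above. Only `pyarrow.util.download_tzdata_on_windows()` and `pa.set_timezone_db_path()` come from PyArrow itself.

```python
import os
import sys
from pathlib import Path


def default_tzdata_dir() -> Path:
    # Location quoted in the install docs: %USERPROFILE%\Downloads\tzdata.
    # Falls back to the home directory when USERPROFILE is unset (non-Windows).
    profile = os.environ.get("USERPROFILE") or str(Path.home())
    return Path(profile) / "Downloads" / "tzdata"


def configure_arrow_tzdata() -> bool:
    """Download the database and register its path; True if attempted."""
    if sys.platform != "win32":
        # Linux and macOS use the OS-provided timezone database.
        return False

    # Imported lazily so the helper above stays usable without PyArrow.
    import pyarrow as pa
    import pyarrow.util

    pyarrow.util.download_tzdata_on_windows()
    pa.set_timezone_db_path(str(default_tzdata_dir()))
    return True
```

Building the path with `pathlib` also avoids writing backslashes inside a Python string literal. As described in this comment, running these steps did not make the ORC runtime error go away for me.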
   
   Only after I perform the following steps does the runtime error go away:
   
   > `pip install tzdata` or in my case since I am using uv `uv add tzdata`
   `$Env:TZDIR = 
"C:\Users\kyleh\PycharmProjects\pyarrow-orc-failure\.venv\Lib\site-packages\tzdata\zoneinfo"`
   
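   For anyone who prefers not to hard-code the virtual environment path, the `TZDIR` value above can be derived from the installed `tzdata` package. This is a rough sketch, not something taken from the PyArrow docs: `tzdata_zoneinfo_dir` and `export_tzdir` are hypothetical helper names, and it is an assumption that a `TZDIR` set from inside the Python process (before the ORC file is read) is picked up the same way as the one set in PowerShell, which is the only variant I confirmed.

```python
import importlib.util
import os
from pathlib import Path
from typing import Optional


def tzdata_zoneinfo_dir() -> Optional[Path]:
    """Return the zoneinfo directory of the installed tzdata package, if any."""
    spec = importlib.util.find_spec("tzdata")
    if spec is None or not spec.submodule_search_locations:
        return None  # tzdata is not installed in this environment
    package_dir = Path(next(iter(spec.submodule_search_locations)))
    candidate = package_dir / "zoneinfo"
    return candidate if candidate.is_dir() else None


def export_tzdir() -> Optional[str]:
    """Point TZDIR at tzdata's zoneinfo directory when it can be found."""
    zoneinfo = tzdata_zoneinfo_dir()
    if zoneinfo is not None:
        os.environ["TZDIR"] = str(zoneinfo)
    # Returns whatever TZDIR ends up being (None if unset and tzdata missing).
    return os.environ.get("TZDIR")
```
   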
   None of this information is discussed in the PyArrow installation 
documentation.  Therefore, `pyarrow.util.download_tzdata_on_windows()` does not 
work anymore.
   
   Maybe it is better to say: `pyarrow.util.download_tzdata_on_windows()` 
works, but it does not solve all timezone-related issues with PyArrow, and the 
PyArrow installation documentation should add guidance covering 
the edge case presented in this issue.
   


