Dustin Moriarty created ARROW-15966:
---------------------------------------
Summary: pytz required by pyarrow but not included in package
metadata requirements.
Key: ARROW-15966
URL: https://issues.apache.org/jira/browse/ARROW-15966
Project: Apache Arrow
Issue Type: Bug
Components: Python
Reporter: Dustin Moriarty
Pyarrow raises a ModuleNotFoundError for pytz when a timezone-aware timestamp
is used. However, pytz is not declared in the pyarrow package metadata, either
as a standard requirement or as an extra.
Pyarrow Version: 7.0.0
Python Version: 3.10.2
OS Version: macOS 12.3
How to reproduce.
1. Create a clean environment. I use pyenv, but there are many ways to get the
same result; any deterministic tool works (that is, not something like
conda).
{code:java}
pyenv virtualenv 3.10.2 pyarrow_test_env
pyenv activate pyarrow_test_env{code}
2. Install pyarrow.
{code:java}
pip install pyarrow{code}
3. Create a table with a datetime with a timezone.
{code:java}
>>> import pyarrow
>>> from datetime import datetime
>>> from datetime import timezone
>>> pyarrow.table({"my_time": [datetime(2022,1,1, tzinfo=timezone.utc)]})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/table.pxi", line 2577, in pyarrow.lib.table
  File "pyarrow/table.pxi", line 1868, in pyarrow.lib.Table.from_pydict
  File "pyarrow/table.pxi", line 2658, in pyarrow.lib._from_pydict
  File "pyarrow/array.pxi", line 342, in pyarrow.lib.asarray
  File "pyarrow/array.pxi", line 316, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 39, in pyarrow.lib._sequence_to_array
  File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
ModuleNotFoundError: No module named 'pytz'{code}
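For anyone who wants to run the reproduction as a script rather than in the REPL, here is a minimal sketch. The helper name has_module is my own and not part of pyarrow; it only uses the standard library to check whether pytz can be found before attempting the same pyarrow.table call as above.

```python
import importlib.util
from datetime import datetime, timezone


def has_module(name):
    """Return True if a top-level module can be located, without importing it.

    Hypothetical helper for this report, not a pyarrow API.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("pyarrow importable:", has_module("pyarrow"))
    print("pytz importable:", has_module("pytz"))

    if has_module("pyarrow"):
        import pyarrow

        try:
            # Same call as in step 3 of the reproduction.
            pyarrow.table({"my_time": [datetime(2022, 1, 1, tzinfo=timezone.utc)]})
            print("timezone-aware table created")
        except ModuleNotFoundError as exc:
            print("reproduced:", exc)
```

In a clean environment with only pyarrow installed, I would expect the second check to report that pytz is not importable and the except branch to be taken.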
The only package pyarrow declares as a requirement is numpy, and no extras are
defined. If pytz is an optional dependency, it should be declared in the
package metadata (e.g. extras_require in setup.py).
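The declared requirements can be checked directly from the installed package metadata. The following is a rough sketch using the standard library importlib.metadata module (Python 3.8+); the function names declared_dependencies and mentions are hypothetical helpers written for this report, and the requirement strings in the comments are illustrative, not copied from the actual pyarrow metadata.

```python
from importlib import metadata


def declared_dependencies(dist_name):
    """Return (requirements, extras) from an installed distribution's metadata.

    Returns None if the distribution is not installed.
    """
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return None
    requirements = meta.get_all("Requires-Dist") or []
    extras = meta.get_all("Provides-Extra") or []
    return requirements, extras


def mentions(requirements, package):
    """Return True if any requirement string names `package` (case-insensitive)."""
    package = package.lower()
    for req in requirements:
        # Illustrative shapes: "numpy>=1.16" or "pytz ; extra == 'tz'".
        # Cut at the first character that cannot be part of a project name.
        name = req.strip()
        for i, ch in enumerate(name):
            if not (ch.isalnum() or ch in "-_."):
                name = name[:i]
                break
        if name.lower() == package:
            return True
    return False


if __name__ == "__main__":
    result = declared_dependencies("pyarrow")
    if result is None:
        print("pyarrow is not installed")
    else:
        requirements, extras = result
        print("Requires-Dist:", requirements)
        print("Provides-Extra:", extras)
        print("pytz declared:", mentions(requirements, "pytz"))
```

If the report above is accurate, running this against pyarrow 7.0.0 should list only numpy under Requires-Dist, nothing under Provides-Extra, and report that pytz is not declared.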