buhrmann commented on issue #36558: URL: https://github.com/apache/arrow/issues/36558#issuecomment-1798910222
Related to this (in the sense that offsets are inconsistently supported in Arrow), one can also cast to a timestamp type whose offset is accepted at cast time but rejected later, e.g. when converting to pandas. For example, while this works:

```python
from datetime import datetime
import pyarrow as pa
arr = pa.array([datetime(2021, 1, 1)]).cast(pa.timestamp(unit="ns", tz="+04:00"))
print(arr.to_pandas())
```

This doesn't:

```python
arr = pa.array([datetime(2021, 1, 1)]).cast(pa.timestamp(unit="ns", tz="+0400"))
print(arr.to_pandas())
```

It will happily cast with the offset `+0400`, but will then fail in the conversion to pandas:

```
File ~/micromamba/envs/grp/lib/python3.9/site-packages/pyarrow/array.pxi:867, in pyarrow.lib._PandasConvertible.to_pandas()

File ~/micromamba/envs/grp/lib/python3.9/site-packages/pyarrow/array.pxi:1491, in pyarrow.lib.Array._to_pandas()

File ~/micromamba/envs/grp/lib/python3.9/site-packages/pyarrow/array.pxi:1741, in pyarrow.lib._array_like_to_pandas()

File ~/micromamba/envs/grp/lib/python3.9/site-packages/pyarrow/pandas_compat.py:1215, in make_tz_aware(series, tz)
   1211 def make_tz_aware(series, tz):
   1212     """
   1213     Make a datetime64 Series timezone-aware for the given tz
   1214     """
-> 1215     tz = pa.lib.string_to_tzinfo(tz)
   1216     series = (series.dt.tz_localize('utc')
   1217                     .dt.tz_convert(tz))
   1218     return series

File ~/micromamba/envs/grp/lib/python3.9/site-packages/pyarrow/types.pxi:3576, in pyarrow.lib.string_to_tzinfo()

File ~/micromamba/envs/grp/lib/python3.9/site-packages/pyarrow/error.pxi:144, in pyarrow.lib.pyarrow_internal_check_status()

File ~/micromamba/envs/grp/lib/python3.9/site-packages/pytz/__init__.py:188, in timezone(zone)
    186         fp.close()
    187     else:
--> 188         raise UnknownTimeZoneError(zone)
    190     return _tzinfo_cache[zone]

UnknownTimeZoneError: '+0400'
```

Note that `+0400` is a valid ISO 8601 offset, and pyarrow will itself create timestamp types with this as the `tz` parameter when reading a CSV, for example.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
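A possible workaround for the failure reported in the comment above, sketched under assumptions: since the `+04:00` form converts to pandas while `+0400` does not, one could rewrite compact offsets into the colon form before building the timestamp type. The helper name `normalize_utc_offset` is hypothetical and not part of pyarrow; it only manipulates the string.

```python
import re

# Matches a compact offset such as "+0400" or "-0530" (sign, two-digit hours,
# two-digit minutes, no colon).
_COMPACT_OFFSET = re.compile(r"([+-])(\d{2})(\d{2})")


def normalize_utc_offset(tz: str) -> str:
    """Hypothetical helper: turn "+HHMM" into "+HH:MM"; leave anything else as-is."""
    match = _COMPACT_OFFSET.fullmatch(tz)
    if match is None:
        # Named zones ("Europe/Berlin") and already-normalized offsets ("+04:00")
        # pass through unchanged.
        return tz
    sign, hours, minutes = match.groups()
    return f"{sign}{hours}:{minutes}"
```

With this, something like `pa.timestamp("ns", tz=normalize_utc_offset("+0400"))` should yield the `+04:00` type that converts cleanly in the first example. This only sidesteps the inconsistency on the caller's side; it does not address why the cast accepts a string that `string_to_tzinfo` later rejects.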