sm-Fifteen opened a new issue, #36558: URL: https://github.com/apache/arrow/issues/36558
### Describe the bug, including details regarding any error messages, version, and platform.

See https://github.com/pola-rs/polars/issues/9586#issuecomment-1625418583, where the issue was identified.

```py
import pyarrow as pa
import pyarrow.compute as pc
from datetime import datetime

pc.assume_timezone(pa.array([datetime(2020, 1, 1)]), '+00:00')
```

> ArrowInvalid: Cannot locate timezone '+00:00': +00:00 not found in timezone database

The [Arrow format specification](https://github.com/apache/arrow/blob/dd3670583d6fd8b95783f89f7b80c04588e16fb1/format/Schema.fbs#L319-L361) describes 3 timestamp formats: "naive date-time" (timezone string is null), "zoned date-time" (timezone string is from tzdb) and "offset date-time" (timezone string is a fixed RFC 3339 num-offset, so no Z).

[The doc for assume_timezone](https://arrow.apache.org/docs/python/generated/pyarrow.compute.assume_timezone.html) makes no specific mention of it, but it cannot handle being passed an offset because it [only performs a lookup in tzdb](https://github.com/apache/arrow/blob/f8256bd625ac0b06238011fc13ea0249956e3859/cpp/src/arrow/compute/kernels/scalar_temporal_unary.cc#L116-L134) (via `LocateZone`) and doesn't try to parse the offset. That's despite documentation in other places saying that [offsets are completely fine and that "+00:00" should be considered as identical to "UTC"](https://arrow.apache.org/docs/cpp/api/datatype.html#classarrow_1_1_timestamp_type).

Given all this, I would expect "+00:00" to be properly recognized as UTC, and "+01:00" to be recognized as a fixed offset.

### Component(s)

Python
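To make the expected behavior concrete, here is a minimal standard-library sketch of the parsing step that the tzdb lookup skips. It is not Arrow's implementation: `parse_num_offset` and `assume_fixed_offset` are hypothetical helper names introduced only for illustration, and the sketch assumes the offset string has the strict `±HH:MM` shape.

```python
import re
from datetime import datetime, timedelta, timezone

# Strict "+HH:MM" / "-HH:MM" shape (an assumption of this sketch).
_NUM_OFFSET = re.compile(r"([+-])(\d{2}):(\d{2})")


def parse_num_offset(tz):
    """Hypothetical helper: return a fixed-offset tzinfo for a num-offset
    string such as "+01:00", or None if the string is not an offset."""
    match = _NUM_OFFSET.fullmatch(tz)
    if match is None:
        return None  # e.g. "Europe/Paris": the caller would fall back to tzdb
    sign, hh, mm = match.groups()
    hours, minutes = int(hh), int(mm)
    if hours > 23 or minutes > 59:
        return None
    delta = timedelta(hours=hours, minutes=minutes)
    if sign == "-":
        delta = -delta
    return timezone(delta)


def assume_fixed_offset(naive, tz):
    """Hypothetical helper: interpret a naive datetime as local wall time
    at the given fixed offset and return an aware datetime."""
    offset = parse_num_offset(tz)
    if offset is None:
        raise ValueError(f"not a fixed offset: {tz!r}")
    return naive.replace(tzinfo=offset)


# "+00:00" behaves like UTC; "+01:00" is one hour ahead of UTC.
as_utc = assume_fixed_offset(datetime(2020, 1, 1), "+00:00")
as_plus_one = assume_fixed_offset(datetime(2020, 1, 1), "+01:00")
print(as_utc.isoformat())
print(as_plus_one.astimezone(timezone.utc).isoformat())
```

In this sketch a string that does not match the offset pattern yields `None` rather than an error, so a caller could try the offset parser first and only then fall back to a tzdb lookup.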
