[ https://issues.apache.org/jira/browse/ARROW-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437877#comment-17437877 ]
Joris Van den Bossche commented on ARROW-14567:
-----------------------------------------------
The PrettyPrint for the Timestamp type is implemented in
{{StringFormatter<TimestampType>}} in {{formatting.h}}
(https://github.com/apache/arrow/blob/bf67ec74635db2183619601f025e4724bd5a6b75/cpp/src/arrow/util/formatting.h#L431-L473).
Currently this ignores any timezone information of the type and simply
formats the stored UTC epoch value.
The formatting code already uses the date.h utilities (which we have
vendored anyway), so in principle we could use date.h to first localize the
epoch value. However, that would make printing dependent on finding a timezone
database (e.g. this is not yet supported on Windows at the moment).
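To illustrate what "localizing the epoch value" would mean for the example in this issue, here is a minimal Python sketch using only the standard library. It is not the date.h code path that Arrow would use, and the helper names {{parse_fixed_offset}} and {{format_localized}} are hypothetical. It only handles fixed offsets such as {{+02:00}}, which need no timezone database; named zones are exactly the case where a database lookup would be required.

```python
from datetime import datetime, timedelta, timezone

# Reference point for the stored values: the UTC epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_fixed_offset(tz: str) -> timezone:
    """Parse a fixed offset such as '+02:00' or '-05:30' (hypothetical helper)."""
    sign = -1 if tz[0] == "-" else 1
    hours, minutes = tz[1:].split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


def format_localized(epoch_seconds: int, tz: str) -> str:
    """Format a UTC epoch value as wall clock time in a fixed-offset zone."""
    utc_time = EPOCH + timedelta(seconds=epoch_seconds)
    local_time = utc_time.astimezone(parse_fixed_offset(tz))
    return local_time.strftime("%Y-%m-%d %H:%M:%S")


print(format_localized(0, "+02:00"))  # 1970-01-01 02:00:00
```

With localization, the array from the issue would print {{1970-01-01 02:00:00}} instead of {{1970-01-01 00:00:00}}.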
An alternative could be to keep the printed value in UTC but add a {{+00:00}}
suffix to make it clear that these are the UTC times being printed (and not
the wall clock time in the timezone of the type).
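As a rough sketch of this alternative, again in plain Python rather than the actual C++ formatter: the stored epoch is formatted in UTC and an explicit offset is appended, so no timezone database is needed. The function name {{format_utc_with_offset}} and the exact placement of the suffix are assumptions made for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Reference point for the stored values: the UTC epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_utc_with_offset(epoch_seconds: int) -> str:
    """Format a UTC epoch value in UTC and mark it with an explicit +00:00."""
    utc_time = EPOCH + timedelta(seconds=epoch_seconds)
    # The suffix signals that the printed value is UTC, not local wall clock time.
    return utc_time.strftime("%Y-%m-%d %H:%M:%S") + "+00:00"


print(format_utc_with_offset(0))  # 1970-01-01 00:00:00+00:00
```

The printed value stays the same as today, but the suffix removes the ambiguity about which timezone it refers to.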
cc [~rokm] [~apitrou] [~lidavidm]
> [C++][Python] PrettyPrint ignores timezone
> ------------------------------------------
>
> Key: ARROW-14567
> URL: https://issues.apache.org/jira/browse/ARROW-14567
> Project: Apache Arrow
> Issue Type: Improvement
> Reporter: Alenka Frim
> Priority: Major
>
> When printing TimestampArray in pyarrow the timezone information is ignored
> by PrettyPrint (__str__ calls to_string() in array.pxi).
> {code:python}
> import pyarrow as pa
> a = pa.array([0], pa.timestamp('s', tz='+02:00'))
> print(a) # representation not correct?
> # <pyarrow.lib.TimestampArray object at 0x7f834c7cb9a8>
> # [
> # 1970-01-01 00:00:00
> # ]
> {code}