[
https://issues.apache.org/jira/browse/ARROW-5895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881585#comment-16881585
]
Joris Van den Bossche commented on ARROW-5895:
----------------------------------------------
So what changed in 0.14.0 compared to 0.13 is that timestamp columns are now
also annotated with the new LogicalType (e.g. {{TIMESTAMP(unit=MICROS)}}) in
addition to the older ConvertedType ({{TIMESTAMP_MILLIS/MICROS}}). However,
there is a compatibility problem: the older ConvertedType is omitted
for tz-naive data (see ARROW-5889).
Could you try with timezone-aware data to check whether you encounter the
same issue? It might be that the S3 parquet reader does not yet
understand the new LogicalTypes, in which case the absence of the ConvertedType
annotation could lead it to interpret the column as plain integers (as you see
in the output).
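To illustrate that the two values in the report are the same instant, here is a minimal standard-library sketch that renders the raw integer as an ISO string. It assumes the integer is a count of milliseconds since the Unix epoch in UTC, and the helper name {{epoch_ms_to_iso}} is made up for this example; it is not part of pyarrow.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the stored integer counts milliseconds since 1970-01-01T00:00:00Z.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_iso(ms: int) -> str:
    """Render an epoch-milliseconds integer as an ISO 8601 UTC string.

    Hypothetical helper for illustration only; not a pyarrow function.
    """
    dt = EPOCH + timedelta(milliseconds=ms)
    millis = dt.microsecond // 1000  # keep millisecond precision only
    return f"{dt:%Y-%m-%dT%H:%M:%S}.{millis:03d}Z"


raw = 1561939200507  # integer value quoted in the report
print(epoch_ms_to_iso(raw))  # prints 2019-07-01T00:00:00.507Z
```

So the stored data would be identical in both versions; what differs is whether the reader recognises the annotation that tells it to treat the integers as timestamps.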
I don't think there is an option to *not* write those new LogicalTypes, but the
omission of the ConvertedType annotation is a bug that should be fixed for
0.14.1.
> [Python] New version stores timestamps as epoch ms instead of ISO timestamp
> string
> ----------------------------------------------------------------------------------
>
> Key: ARROW-5895
> URL: https://issues.apache.org/jira/browse/ARROW-5895
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.14.0
> Environment: Linux dev.office.whoop.com 3.10.0-957.21.3.el7.x86_64 #1
> SMP Tue Jun 18 16:35:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> Reporter: John Wilson
> Priority: Major
>
> Just upgraded from pyarrow 0.13 to 0.14.
> Columns of type TimestampType(timestamp[ns]) now get written as epoch ms
> values:
> 1561939200507
> Where 0.13 wrote TimestampType(timestamp[ns]) as an ISO string:
> 2019-07-01T00:00:00.507Z
> This broke my implementation. How do I get pyarrow to write ISO strings
> again in 0.14?
>
> Here is my table write:
> {{ pyarrow.parquet.write_to_dataset(table=tbl, root_path=local_path,}}
> {{ partition_cols=['env', 'dt'],}}
> {{ coerce_timestamps='ms',}}
> {{ allow_truncated_timestamps=True,}}
> {{ version='2.0',}}
> {{ compression='SNAPPY')}}
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)