David Li created ARROW-10523:
--------------------------------

             Summary: [Python] Pandas timestamps are inferred to have only microsecond precision
                 Key: ARROW-10523
                 URL: https://issues.apache.org/jira/browse/ARROW-10523
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
    Affects Versions: 2.0.0
            Reporter: David Li


{code:python}
import pandas as pd
import pyarrow as pa

# A pandas Timestamp carrying a sub-microsecond (nanosecond) component
arr = pa.array([pd.Timestamp(year=2020, month=1, day=1, nanosecond=999)])
print(arr)
print(arr.type)
{code}
This gives:
{noformat}
[
  2020-01-01 00:00:00.000000
]
timestamp[us]
{noformat}
However, Pandas Timestamps have nanosecond precision, which would be nice to
preserve during type inference. In the output above, the 999 ns component is
dropped.

The reason is that TypeInferrer [hardcodes microseconds|https://github.com/apache/arrow/blob/apache-arrow-2.0.0/cpp/src/arrow/python/inference.cc#L466]
because it only knows about the standard library datetime, so I'm treating
this as a feature request rather than a bug. It can be worked around easily
by specifying an explicit type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
