Joe Muruganandam created ARROW-5359:
---------------------------------------

             Summary: timestamp_as_object support for pa.Table.to_pandas in pyarrow
                 Key: ARROW-5359
                 URL: https://issues.apache.org/jira/browse/ARROW-5359
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.13.0
         Environment: Ubuntu
            Reporter: Joe Muruganandam


Creating a ticket for the issue reported on GitHub ([https://github.com/apache/arrow/issues/4284]).

h2. pyarrow (Issue with timestamp conversion from arrow to pandas)

pyarrow's Table.to_pandas has a date_as_object option but no similar option for timestamps. When a timestamp column in an Arrow table is converted to pandas, the target datatype is pd.Timestamp, which cannot represent times later than 2262-04-11 23:47:16.854775807. As a result, in the scenario below the dates are converted to incorrect values. Adding a timestamp_as_object option to pa.Table.to_pandas would help in this scenario.

#Python(3.6.8)

import pandas as pd
import pyarrow as pa

pd.__version__
'0.24.1'

pa.__version__
'0.13.0'

import datetime

df = pd.DataFrame({"test_date": [datetime.datetime(3000,12,31,12,0),datetime.datetime(3100,12,31,12,0)]})

df
             test_date
0  3000-12-31 12:00:00
1  3100-12-31 12:00:00

pa_table = pa.Table.from_pandas(df)

pa_table[0]
Column name='test_date' type=TimestampType(timestamp[us])
[
  [
    32535172800000000,
    35690846400000000
  ]
]

pa_table.to_pandas()
                      test_date
0 1831-11-22 12:50:52.580896768
1 1931-11-22 12:50:52.580896768



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
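
The incorrect dates in the report are consistent with a signed 64-bit overflow: the Arrow column stores microseconds since the epoch, while pd.Timestamp counts nanoseconds in an int64. The sketch below reproduces that arithmetic with the standard library only. It is an illustration of the suspected mechanism, not pyarrow's actual conversion code, and the helper names (wrap_int64, us_to_wrapped_datetime) are hypothetical.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1


def wrap_int64(value):
    """Return the value a signed 64-bit integer would hold after wrapping."""
    return (value + 2**63) % 2**64 - 2**63


def us_to_wrapped_datetime(us):
    """Scale microseconds to nanoseconds, wrap to int64, and decode the result.

    Assumption: the conversion multiplies by 1000 without an overflow check.
    datetime only has microsecond resolution, so the nanosecond remainder
    is dropped (floor division) when building the result.
    """
    wrapped_ns = wrap_int64(us * 1000)
    return EPOCH + timedelta(microseconds=wrapped_ns // 1000)


# Microsecond values printed for pa_table[0] in the report.
values_us = [32535172800000000, 35690846400000000]

results = [us_to_wrapped_datetime(v) for v in values_us]
for v, r in zip(values_us, results):
    overflows = v * 1000 > INT64_MAX
    print(v, "overflows int64 as ns:", overflows, "->", r)
```

The decoded values fall on 1831-11-22 and 1931-11-22 at 12:50:52, which agrees with the to_pandas() output in the report down to microsecond precision.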