[GitHub] [arrow] guidocioni opened a new issue #9781: Reading a parquet with timestamp out of bounds in pandas

GitBox Tue, 23 Mar 2021 07:28:07 -0700


guidocioni opened a new issue #9781:
URL: https://github.com/apache/arrow/issues/9781



   Hi,
   I'm using `pandas` to read a parquet file which unfortunately contains some 
timestamps in the year 2500. As expected, this fails with the error `Casting from 
timestamp[us, tz=UTC] to timestamp[ns] would result in out of bounds timestamp: 
16740777600000000`. 
   Ideally `read_table` would accept an option such as `safe = False` that 
coerces these timestamps to `NaT`, but I cannot find anything like that in the 
official documentation. 
   Right now I'm blocked: I cannot fix the original parquet file because I 
cannot even open it.
   Is there any workaround to filter out these values before reading the file 
into pandas? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
