[
https://issues.apache.org/jira/browse/ARROW-12011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17306230#comment-17306230
]
Joris Van den Bossche commented on ARROW-12011:
-----------------------------------------------
bq. I'm not sure if we should disallow these values entirely, since the format
(as far as I can see) says nothing about the range of valid values, and the
underlying value is valid, if extreme - but at least you'd expect it to not
crash when printing.
I agree: if the format doesn't specify a valid range, we should allow such
values, but of course ensure that printing them doesn't crash.
bq. There is a related issue for validating temporal data, so perhaps these
out-of-bounds values should be rejected as suggested.
I think that issue might be about the fact that, for the {{date64}} type, the
values should be multiples of 86400000 (the number of milliseconds in a day).
Are there other inherent restrictions for any of the other temporal types?
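
To make the {{date64}} restriction concrete, here is a minimal sketch of such a
check in plain Python. The helper name {{invalid_date64_values}} is
hypothetical and is not part of pyarrow; it only illustrates the
"multiple of 86400000" rule on raw integer values.

```python
MS_PER_DAY = 86_400_000  # milliseconds in one day


def invalid_date64_values(values):
    """Return the values that do not fall on a whole-day boundary.

    Hypothetical helper for illustration only; not a pyarrow API.
    """
    return [v for v in values if v % MS_PER_DAY != 0]


# Only 86_400_001 is not a whole number of days.
print(invalid_date64_values([0, 86_400_000, 86_400_001, -86_400_000]))
```

Negative values are handled as well, since Python's {{%}} returns a
non-negative remainder for a positive divisor.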
The date library that we vendor and use for this formatting is
https://github.com/HowardHinnant/date. It might also be worth updating our
vendored version.
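
As a rough way to reason about the wrong results quoted in the issue
description, the sketch below converts a {{date32}} value (days since
1970-01-01) to a calendar date using the well-known days-to-civil algorithm,
with Python's unbounded integers. It then wraps the year to a signed 16-bit
range. That wrapping is an assumption about how the formatting library stores
years, not something confirmed here, and the function names are illustrative.

```python
def civil_from_days(days):
    """Convert days since 1970-01-01 to (year, month, day), proleptic Gregorian."""
    z = days + 719468                      # shift the epoch to 0000-03-01
    era = z // 146097                      # 400-year cycles (floor division)
    doe = z - era * 146097                 # day of era, 0..146096
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)
    mp = (5 * doy + 2) // 153              # month index, counting from March
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    year = yoe + era * 400 + (1 if month <= 2 else 0)
    return year, month, day


def wrap_int16(value):
    """Wrap an integer into the signed 16-bit range (assumed year storage)."""
    return (value + 32768) % 65536 - 32768


for days in (-2000000000, 2000000000):
    year, month, day = civil_from_days(days)
    print(days, (year, month, day), "wrapped year:", wrap_int16(year))
```

Under that assumption, -2000000000 gives year -5473845, which wraps to
31179-12-27, and 2000000000 gives year 5477784, which wraps to -27240-01-06.
Both match the reported output, so the month and day appear to be computed
correctly and only the year overflows.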
> [C++][Python] Crashes and incorrect results when converting large integers to
> dates
> -----------------------------------------------------------------------------------
>
> Key: ARROW-12011
> URL: https://issues.apache.org/jira/browse/ARROW-12011
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Python
> Affects Versions: 3.0.0
> Environment: OS: Windows 10 Pro (Version 20H2)
> CPU: AMD Ryzen 5 1600 Six-Core Processor 3.20 GHz
> Python: 3.8.8 AMD64
> pyarrow is latest version installed with pip
> Reporter: Tim Evans
> Priority: Major
>
> Running this code snippet will cause a crash. This happens for a range of
> numbers around this one as well:
>
> {code:python}
> import pyarrow
> date = pyarrow.array([-1448879500], pyarrow.date32())
> print(date)
> {code}
> I don't know where this crash is coming from, so it might be in the C++ code
> rather than the Python bindings.
> For other extreme numbers you get the wrong result. It looks like something
> is overflowing. Here is the input and result for a few different examples:
> * -2000000000 -> 31179-12-27
> * -1000000000 -> 16574-12-29
> * 2000000000 -> -27240-01-06
> * 1000000000 -> -12635-01-03
> I would prefer if these gave errors rather than silently overflowing.
>