[ https://issues.apache.org/jira/browse/ARROW-12011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304974#comment-17304974 ]
David Li commented on ARROW-12011:
----------------------------------
Thanks for the report! I can confirm this happens on the main branch (commit
43d00e9629fe34dc40c78ea96c008de186726a39).
In all cases, the cause is that the given date either overflows or is an
invalid value for the underlying C++ date type. I'm not sure we should disallow
these values entirely, since the format (as far as I can see) says nothing
about the range of valid values, and the underlying value is valid, if extreme.
At the very least, though, you'd expect printing not to crash. I see [~bkietz]
and [~jorisvandenbossche] have looked at similar issues before; what do you
think?
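To make the "disallow" option concrete, here is a rough Python sketch of the range check it would imply. It assumes the limiting factor is the vendored date library's year type, whose valid range is -32767 to 32767; the helper names (days_from_civil, is_printable_date32) are made up for illustration and are not Arrow APIs.

```python
# Sketch only: which date32 values map to a year that a 16-bit date::year
# can hold? The names below are hypothetical, not part of Arrow.
MIN_YEAR = -32767  # assumed lower bound of a valid date::year
MAX_YEAR = 32767   # assumed upper bound of a valid date::year


def days_from_civil(year, month, day):
    """Days since 1970-01-01 in the proleptic Gregorian calendar."""
    y = year - (1 if month <= 2 else 0)      # treat Jan/Feb as end of previous year
    era = y // 400                           # 400-year cycle index (floor division)
    yoe = y - era * 400                      # year within the cycle, 0..399
    mp = month - 3 if month > 2 else month + 9   # March-based month, 0..11
    doy = (153 * mp + 2) // 5 + day - 1      # day within the March-based year
    doe = yoe * 365 + yoe // 4 - yoe // 100 + doy
    return era * 146097 + doe - 719468


MIN_PRINTABLE_DAYS = days_from_civil(MIN_YEAR, 1, 1)
MAX_PRINTABLE_DAYS = days_from_civil(MAX_YEAR, 12, 31)


def is_printable_date32(days):
    """True if the day count lands on a year within [MIN_YEAR, MAX_YEAR]."""
    return MIN_PRINTABLE_DAYS <= days <= MAX_PRINTABLE_DAYS
```

Under these assumptions the value from the report, -1448879500, falls well outside the printable range, as do the four other values listed in the issue. Whether to reject such values on construction or only guard the printing path is the open question above.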
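A trimmed backtrace for the crash follows. The main issue is that the
date::year_month_day value is invalid (in particular, its year is invalid:
it's -32768), so formatting it with "%F" throws std::__ios_failure, which ends
in std::terminate (frames #4 to #8).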
{noformat}
(gdb) bt
#0 0x00007ffff6e54fb7 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff6e56921 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff40b3892 in __gnu_cxx::__verbose_terminate_handler () at
/home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x00007ffff40b1f69 in __cxxabiv1::__terminate (handler=<optimized out>) at
/home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:48
#4 0x00007ffff40b1fab in std::terminate () at
/home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:58
#5 0x00007ffff40b2194 in __cxxabiv1::__cxa_throw
(obj=obj@entry=0x555555de36d0, tinfo=tinfo@entry=0x7ffff416d1a8 <typeinfo for
std::__ios_failure>, dest=dest@entry=0x7ffff40d11d4
<std::__ios_failure::~__ios_failure()>)
at
/home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:95
#6 0x00007ffff40af3a2 in std::__throw_ios_failure
(__s=__s@entry=0x7ffff412e067 "basic_ios::clear")
at
/home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/src/gcc/libstdc++-v3/src/c++11/cxx11-ios_failure.cc:115
#7 0x00007ffff40eb0aa in std::basic_ios<char, std::char_traits<char> >::clear
(this=<optimized out>, __state=<optimized out>)
at
/home/conda/feedstock_root/build_artifacts/ctng-compilers_1610729750655/work/.build/x86_64-conda-linux-gnu/build/build-cc-gcc-final/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/ios_base.h:166
#8 0x00007ffff5289af3 in arrow_vendored::date::to_stream<char,
std::char_traits<char>, std::chrono::duration<long, std::ratio<1l, 1l> > >
(os=..., fmt=0x7ffff5cf063d "F", fds=..., abbrev=0x7fffffffb170,
offset_sec=0x7fffffffb168)
at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/vendored/datetime/date.h:5078
#9 0x00007ffff527678c in arrow_vendored::date::to_stream<char,
std::char_traits<char>, std::chrono::duration<int, std::ratio<86400l, 1l> > >
(os=..., fmt=0x7ffff5cf063c "%F", tp=...)
at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/vendored/datetime/date.h:5995
#10 0x00007ffff52718f4 in arrow_vendored::date::format<char,
std::chrono::time_point<std::chrono::_V2::system_clock,
std::chrono::duration<int, std::ratio<86400l, 1l> > > > (fmt=0x7ffff5cf063c
"%F", tp=...)
at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/vendored/datetime/date.h:6021
#11 0x00007ffff5353770 in
arrow::ArrayPrinter::FormatDateTime<std::chrono::duration<int,
std::ratio<86400l, 1l> > > (this=0x7fffffffb610, fmt=0x7ffff5cf063c "%F",
value=-1448879500, add_epoch=true)
at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:398
#12 0x00007ffff5350d86 in std::enable_if<std::is_base_of<arrow::DateType,
arrow::NumericArray<arrow::Date32Type>::TypeClass>::value, arrow::Status>::type
arrow::ArrayPrinter::WriteDataValues<arrow::NumericArray<arrow::Date32Type>
>(arrow::NumericArray<arrow::Date32Type>
const&)::{lambda(long)#1}::operator()(long) const (this=0x7fffffffb610, i=0) at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:170
#13 0x00007ffff535395b in
arrow::ArrayPrinter::WriteValues<std::enable_if<std::is_base_of<arrow::DateType,
arrow::NumericArray<arrow::Date32Type>::TypeClass>::value,
arrow::Status>::type
arrow::ArrayPrinter::WriteDataValues<arrow::NumericArray<arrow::Date32Type>
>(arrow::NumericArray<arrow::Date32Type>
const&)::{lambda(long)#1}>(arrow::Array const&,
std::enable_if<std::is_base_of<arrow::DateType,
arrow::NumericArray<arrow::Date32Type>::TypeClass>::value, arrow::Status>::type
arrow::ArrayPrinter::WriteDataValues<arrow::NumericArray<arrow::Date32Type>
>(arrow::NumericArray<arrow::Date32Type> const&)::{lambda(long)#1}&&)
(this=0x7fffffffb610, array=..., func=...)
at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:137
#14 0x00007ffff5350dd5 in
arrow::ArrayPrinter::WriteDataValues<arrow::NumericArray<arrow::Date32Type> >
(this=0x7fffffffb610, array=...) at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:170
#15 0x00007ffff534ee4f in
arrow::ArrayPrinter::Visit<arrow::NumericArray<arrow::Date32Type> >
(this=0x7fffffffb610, array=...) at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:314
#16 0x00007ffff534ccd1 in arrow::VisitArrayInline<arrow::ArrayPrinter>
(array=..., visitor=0x7fffffffb610) at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/visitor_inline.h:126
#17 0x00007ffff534b352 in arrow::ArrayPrinter::Print (this=0x7fffffffb610,
array=...) at
/home/lidavidm/Code/upstream/arrow-12011/cpp/src/arrow/pretty_print.cc:389
{noformat}
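For what it's worth, the -32768 year and the wrong dates in the report can be reproduced with a small Python model. This is a sketch, not the library's code: it converts the day count to a calendar date with exact integers, then assumes the year is narrowed to a signed 16-bit value, which is my reading of how the year ends up stored. The function names are hypothetical.

```python
# Sketch only: model of "exact civil date, then year truncated to int16".
# civil_from_days / wrap_int16 / date32_as_printed are illustrative names.

def civil_from_days(days):
    """Convert days since 1970-01-01 to (year, month, day), proleptic Gregorian."""
    z = days + 719468                        # shift epoch to 0000-03-01
    era = z // 146097                        # 400-year cycle index (floor division)
    doe = z - era * 146097                   # day within the cycle, 0..146096
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    year = yoe + era * 400
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)
    mp = (5 * doy + 2) // 153                # March-based month, 0..11
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    if month <= 2:                           # Jan/Feb belong to the next civil year
        year += 1
    return year, month, day


def wrap_int16(value):
    """Two's-complement truncation of an integer to a signed 16-bit range."""
    return (value + 32768) % 65536 - 32768


def date32_as_printed(days):
    """Date with the year narrowed to int16 (assumed storage width)."""
    year, month, day = civil_from_days(days)
    return wrap_int16(year), month, day


print(civil_from_days(-1448879500))    # exact date: year -3964928
print(date32_as_printed(-1448879500))  # year wraps to -32768, the invalid value
print(date32_as_printed(1000000000))   # (-12635, 1, 3), as in the report
```

The same model gives 31179-12-27 for -2000000000, 16574-12-29 is not checked here, and -27240-01-06 for 2000000000, so at least three of the four reported results are consistent with a plain 16-bit wraparound of the year.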
> [Python] Crashes and incorrect results when converting large integers to dates
> ------------------------------------------------------------------------------
>
> Key: ARROW-12011
> URL: https://issues.apache.org/jira/browse/ARROW-12011
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 3.0.0
> Environment: OS: Windows 10 Pro (Version 20H2)
> CPU: AMD Ryzen 5 1600 Six-Core Processor 3.20 GHz
> Python: 3.8.8 AMD64
> pyarrow is latest version installed with pip
> Reporter: Tim Evans
> Priority: Major
>
> Running this code snippet will cause a crash. This happens for a range of
> numbers around this one as well:
>
> {code:java}
> import pyarrow
> date = pyarrow.array([-1448879500], pyarrow.date32())
> print(date)
> {code}
> I don't know where this crash is coming from, so it might be in the C++ code
> rather than the Python bindings.
> For other extreme numbers you get the wrong result. It looks like something
> is overflowing. Here is the input and result for a few different examples:
> * -2000000000 -> 31179-12-27
> * -1000000000 -> 16574-12-29
> * 2000000000 -> -27240-01-06
> * 1000000000 -> -12635-01-03
> I would prefer if these gave errors rather than silently overflowing.
>