[
https://issues.apache.org/jira/browse/ARROW-15221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Li updated ARROW-15221:
-----------------------------
Description:
The test seems to be flaky. [Full log|https://github.com/ursacomputing/crossbow/runs/4664466384?check_suite_focus=true]
{noformat}
44/84 Test #35: arrow-compute-hash-join-node-test .........***Failed 8.63 sec
Running arrow-compute-hash-join-node-test, redirecting output into
/build/cpp/build/test-logs/arrow-compute-hash-join-node-test.txt (attempt 1/1)
/arrow/cpp/build-support/run-test.sh: line 88: 19125 Segmentation fault
(core dumped) $TEST_EXECUTABLE "$@" > $LOGFILE.raw 2>&1
Running main() from
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc
[==========] Running 23 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 7 tests from HashJoin
[ RUN ] HashJoin.Random
/arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:934: Failure
Failed
'_error_or_value46.status()' failed with Cancelled: Scheduler cancelled
Google Test trace:
/arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:1053: FULL_OUTER IS
parallel = false
/build/cpp/src/arrow/compute/exec
{noformat}
Another failure was observed in AMD64 Conda C++. [Full log|https://github.com/apache/arrow/runs/5055044211?check_suite_focus=true]
{noformat}
[----------] 7 tests from HashJoin
[ RUN ] HashJoin.Random
Found core dump, printing backtrace:
warning: core file may not match specified executable file.
[New LWP 19309]
[New LWP 19308]
[New LWP 19306]
[New LWP 19310]
[New LWP 19307]
[New LWP 19311]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/build/cpp/debug/arrow-compute-hash-join-node-test'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000011479 in ?? ()
[Current thread is 1 (Thread 0x7f8cfcb7d700 (LWP 19309))]

Thread 6 (Thread 0x7f8cf9fff700 (LWP 19311)):
#0 0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized
out>, abstime=0x7f8cf9ffd4a0, expected=0, futex_word=0x7f8cff40a790) at
../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 __pthread_cond_wait_common (abstime=0x7f8cf9ffd4a0, mutex=0x7f8cff40a7d8,
cond=0x7f8cff40a768) at pthread_cond_wait.c:539
#2 __pthread_cond_timedwait (cond=0x7f8cff40a768, mutex=0x7f8cff40a7d8,
abstime=0x7f8cf9ffd4a0) at pthread_cond_wait.c:667
#3 0x00007f8d04411496 in background_thread_sleep (tsdn=<optimized out>,
interval=<optimized out>, info=0x7f8cff40a760) at src/background_thread.c:255
#4 background_work_sleep_once (ind=<optimized out>, info=<optimized out>,
tsdn=<optimized out>) at src/background_thread.c:307
#5 background_work (ind=<optimized out>, tsd=<optimized out>) at
src/background_thread.c:497
#6 background_thread_entry () at src/background_thread.c:522
#7 0x00007f8d0112a6db in start_thread (arg=0x7f8cf9fff700) at
pthread_create.c:463
#8 0x00007f8d0180171f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 5 (Thread 0x7f8cfedff700 (LWP 19307)):
#0 0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized
out>, abstime=0x7f8cfedfd4a0, expected=0, futex_word=0x7f8cff40a5f0) at
../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 __pthread_cond_wait_common (abstime=0x7f8cfedfd4a0, mutex=0x7f8cff40a638,
cond=0x7f8cff40a5c8) at pthread_cond_wait.c:539
#2 __pthread_cond_timedwait (cond=0x7f8cff40a5c8, mutex=0x7f8cff40a638,
abstime=0x7f8cfedfd4a0) at pthread_cond_wait.c:667
#3 0x00007f8d04411bc6 in background_thread_sleep (tsdn=<optimized out>,
interval=<optimized out>, info=<optimized out>) at src/background_thread.c:255
#4 background_work_sleep_once (ind=0, info=<optimized out>, tsdn=<optimized
out>) at src/background_thread.c:307
#5 background_thread0_work (tsd=<optimized out>) at src/background_thread.c:452
#6 background_work (ind=<optimized out>, tsd=<optimized out>) at
src/background_thread.c:490
#7 background_thread_entry () at src/background_thread.c:522
#8 0x00007f8d0112a6db in start_thread (arg=0x7f8cfedff700) at
pthread_create.c:463
#9 0x00007f8d0180171f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7f8cfb5ff700 (LWP 19310)):
#0 0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized
out>, abstime=0x7f8cfb5fd4a0, expected=0, futex_word=0x7f8cff40a6c0) at
../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 __pthread_cond_wait_common (abstime=0x7f8cfb5fd4a0, mutex=0x7f8cff40a708,
cond=0x7f8cff40a698) at pthread_cond_wait.c:539
#2 __pthread_cond_timedwait (cond=0x7f8cff40a698, mutex=0x7f8cff40a708,
abstime=0x7f8cfb5fd4a0) at pthread_cond_wait.c:667
#3 0x00007f8d04411496 in background_thread_sleep (tsdn=<optimized out>,
interval=<optimized out>, info=0x7f8cff40a690) at src/background_thread.c:255
#4 background_work_sleep_once (ind=<optimized out>, info=<optimized out>,
tsdn=<optimized out>) at src/background_thread.c:307
#5 background_work (ind=<optimized out>, tsd=<optimized out>) at
src/background_thread.c:497
#6 background_thread_entry () at src/background_thread.c:522
#7 0x00007f8d0112a6db in start_thread (arg=0x7f8cfb5ff700) at
pthread_create.c:463
#8 0x00007f8d0180171f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7f8cff6db0c0 (LWP 19306)):
#0 0x00005630dcb287d4 in __gnu_cxx::operator==<int const*, std::vector<int,
std::allocator<int> > > (__lhs=..., __rhs=...) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_iterator.h:890
#1 0x00005630dcb1d3d1 in std::vector<int, std::allocator<int> >::empty
(this=0x7ffc7ab72ae0) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_vector.h:1005
#2 0x00005630dcafd4ea in arrow::compute::HashJoinSimpleInt
(join_type=arrow::compute::JoinType::FULL_OUTER, l=..., null_in_key_l=...,
r=..., null_in_key_r=..., result_l=0x7ffc7ab72cb0, result_r=0x7ffc7ab72cd0,
output_length_limit=100000, length_limit_reached=0x7ffc7ab72e77) at
/arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:781
#3 0x00005630dcafe22c in arrow::compute::HashJoinSimple (ctx=0x5630de57a320,
join_type=arrow::compute::JoinType::FULL_OUTER, cmp=..., num_key_fields=1,
key_id_l=..., key_id_r=..., original_l=..., original_r=..., l=..., r=...,
output_ids_l=..., output_ids_r=..., output_length_limit=100000,
length_limit_reached=0x7ffc7ab72e77) at
/arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:887
#4 0x00005630dcb011c0 in arrow::compute::HashJoin_Random_Test::TestBody
(this=0x5630de47a300) at
/arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:1067
#5 0x00007f8d056a3c9c in
testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>
(object=0x5630de47a300, method=&virtual testing::Test::TestBody(),
location=0x7f8d056b897b "the test body") at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
#6 0x00007f8d0569add2 in
testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>
(object=0x5630de47a300, method=&virtual testing::Test::TestBody(),
location=0x7f8d056b897b "the test body") at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
#7 0x00007f8d05675c03 in testing::Test::Run (this=0x5630de47a300) at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2682
#8 0x00007f8d0567663b in testing::TestInfo::Run (this=0x5630de476b50) at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2861
#9 0x00007f8d05677010 in testing::TestSuite::Run (this=0x5630de476c70) at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:3015
#10 0x00007f8d0568731c in testing::internal::UnitTestImpl::RunAllTests
(this=0x5630de4762e0) at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5855
#11 0x00007f8d056a4ce8 in
testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
bool> (object=0x5630de4762e0, method=(bool
(testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const))
0x7f8d05686ed8 <testing::internal::UnitTestImpl::RunAllTests()>,
location=0x7f8d056b9468 "auxiliary test code (environments or event
listeners)") at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
#12 0x00007f8d0569c064 in
testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
bool> (object=0x5630de4762e0, method=(bool
(testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const))
0x7f8d05686ed8 <testing::internal::UnitTestImpl::RunAllTests()>,
location=0x7f8d056b9468 "auxiliary test code (environments or event
listeners)") at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
#13 0x00007f8d056857b7 in testing::UnitTest::Run (this=0x7f8d056e5260
<testing::UnitTest::GetInstance()::instance>) at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5438
#14 0x00007f8d056e6919 in RUN_ALL_TESTS () at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/include/gtest/gtest.h:2490
#15 0x00007f8d056e695c in main (argc=1, argv=0x7ffc7ab739d8) at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc:52
#16 0x00007f8d01701bf7 in __libc_start_main (main=0x7f8d056e691b <main(int,
char**)>, argc=1, argv=0x7ffc7ab739d8, init=<optimized out>, fini=<optimized
out>, rtld_fini=<optimized out>, stack_end=0x7ffc7ab739c8) at
../csu/libc-start.c:310
#17 0x00005630dcaedf49 in _start ()

Thread 2 (Thread 0x7f8cfdb7e700 (LWP 19308)):
#0 0x00007f8d01130ad3 in futex_wait_cancelable (private=<optimized out>,
expected=0, futex_word=0x5630de565a80) at
../sysdeps/unix/sysv/linux/futex-internal.h:88
#1 __pthread_cond_wait_common (abstime=0x0, mutex=0x5630de565a30,
cond=0x5630de565a58) at pthread_cond_wait.c:502
#2 __pthread_cond_wait (cond=0x5630de565a58, mutex=0x5630de565a30) at
pthread_cond_wait.c:655
#3 0x00007f8d01b994d1 in __gthread_cond_wait (__mutex=<error reading variable:
dwarf2_find_location_expression: Corrupted DWARF expression.>,
__cond=<optimized out>) at
/home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/src/c++11/condition_variable.cc:865
#4 std::__condvar::wait (__m=<error reading variable:
dwarf2_find_location_expression: Corrupted DWARF expression.>, this=<optimized
out>) at ../../../../../libstdc++-v3/src/c++11/gthr-default.h:155
#5 std::condition_variable::wait (this=<optimized out>, __lock=...) at
../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
#6 0x00007f8d02f8fbb7 in arrow::internal::WorkerLoop (state=..., it=...) at
/arrow/cpp/src/arrow/util/thread_pool.cc:195
#7 0x00007f8d02f90960 in
arrow::internal::ThreadPool::<lambda()>::operator()(void) const
(__closure=0x5630de561958) at /arrow/cpp/src/arrow/util/thread_pool.cc:344
#8 0x00007f8d02f97498 in std::__invoke_impl<void,
arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
>(std::__invoke_other, arrow::internal::ThreadPool::<lambda()> &&) (__f=...) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:60
#9 0x00007f8d02f97438 in
std::__invoke<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
>(arrow::internal::ThreadPool::<lambda()> &&) (__fn=...) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:95
#10 0x00007f8d02f973d6 in
std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5630de561958) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:244
#11 0x00007f8d02f97293 in
std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> >::operator()(void) (this=0x5630de561958) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:251
#12 0x00007f8d02f971e4 in
std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> > >::_M_run(void) (this=0x5630de561950) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:195
#13 0x00007f8d01b9d9d4 in std::execute_native_thread_routine (__p=<optimized
out>) at
/home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/new_allocator.h:82
#14 0x00007f8d0112a6db in start_thread (arg=0x7f8cfdb7e700) at
pthread_create.c:463
#15 0x00007f8d0180171f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f8cfcb7d700 (LWP 19309)):
#0 0x0000000000011479 in ?? ()
#1 0x00007f8d0331fae3 in arrow::compute::TaskSchedulerImpl::ScheduleMore
(this=0x5630de572960, thread_id=0, num_tasks_finished=0) at
/arrow/cpp/src/arrow/compute/exec/task_util.cc:326
#2 0x00007f8d0331e94c in arrow::compute::TaskSchedulerImpl::StartTaskGroup
(this=0x5630de572960, thread_id=0, group_id=1, total_num_tasks=0) at
/arrow/cpp/src/arrow/compute/exec/task_util.cc:153
#3 0x00007f8d0327d952 in arrow::compute::HashJoinBasicImpl::ProbeQueuedBatches
(this=0x7f8cec24aee0, thread_index=0) at
/arrow/cpp/src/arrow/compute/exec/hash_join.cc:726
#4 0x00007f8d0327d13b in
arrow::compute::HashJoinBasicImpl::BuildHashTable_on_finished
(this=0x7f8cec24aee0, thread_index=0) at
/arrow/cpp/src/arrow/compute/exec/hash_join.cc:663
#5 0x00007f8d0327d2db in
arrow::compute::HashJoinBasicImpl::RegisterBuildHashTable()::{lambda(unsigned
long)#2}::operator()(unsigned long) const (__closure=0x5630de654840,
thread_index=0) at /arrow/cpp/src/arrow/compute/exec/hash_join.cc:674
#6 0x00007f8d0328213c in std::_Function_handler<arrow::Status (unsigned long),
arrow::compute::HashJoinBasicImpl::RegisterBuildHashTable()::{lambda(unsigned
long)#2}>::_M_invoke(std::_Any_data const&, unsigned long&&) (__functor=...,
__args#0=@0x7f8cfcb7b138: 0) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:286
#7 0x00007f8d032aa81e in std::function<arrow::Status (unsigned
long)>::operator()(unsigned long) const (this=0x5630de654840, __args#0=0) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:688
#8 0x00007f8d0331f041 in
arrow::compute::TaskSchedulerImpl::OnTaskGroupFinished (this=0x5630de572960,
thread_id=0, group_id=0, all_task_groups_finished=0x7f8cfcb7b230) at
/arrow/cpp/src/arrow/compute/exec/task_util.cc:244
#9 0x00007f8d0331f934 in
arrow::compute::TaskSchedulerImpl::<lambda(size_t)>::operator()(size_t) const
(__closure=0x5630de6a1390, thread_id=0) at
/arrow/cpp/src/arrow/compute/exec/task_util.cc:349
#10 0x00007f8d0332152f in std::_Function_handler<arrow::Status(long unsigned
int), arrow::compute::TaskSchedulerImpl::ScheduleMore(size_t,
int)::<lambda(size_t)> >::_M_invoke(const std::_Any_data &, unsigned long &&)
(__functor=..., __args#0=@0x7f8cfcb7b2b8: 0) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:286
#11 0x00007f8d032aa81e in std::function<arrow::Status (unsigned
long)>::operator()(unsigned long) const (this=0x5630de654f70, __args#0=0) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:688
#12 0x00007f8d032a7f8c in
arrow::compute::HashJoinNode::ScheduleTaskCallback(std::function<arrow::Status
(unsigned long)>)::{lambda()#1}::operator()() const (__closure=0x5630de654f68)
at /arrow/cpp/src/arrow/compute/exec/hash_join_node.cc:604
#13 0x00007f8d032b9329 in arrow::internal::FnOnce<void
()>::FnImpl<arrow::compute::HashJoinNode::ScheduleTaskCallback(std::function<arrow::Status
(unsigned long)>)::{lambda()#1}>::invoke() (this=0x5630de654f60) at
/arrow/cpp/src/arrow/util/functional.h:152
#14 0x00007f8d02f91ade in arrow::internal::FnOnce<void ()>::operator()() &&
(this=0x7f8cfcb7b3f0) at /arrow/cpp/src/arrow/util/functional.h:140
#15 0x00007f8d02f8fa87 in arrow::internal::WorkerLoop (state=..., it=...) at
/arrow/cpp/src/arrow/util/thread_pool.cc:177
#16 0x00007f8d02f90960 in
arrow::internal::ThreadPool::<lambda()>::operator()(void) const
(__closure=0x5630de659468) at /arrow/cpp/src/arrow/util/thread_pool.cc:344
#17 0x00007f8d02f97498 in std::__invoke_impl<void,
arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
>(std::__invoke_other, arrow::internal::ThreadPool::<lambda()> &&) (__f=...) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:60
#18 0x00007f8d02f97438 in
std::__invoke<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
>(arrow::internal::ThreadPool::<lambda()> &&) (__fn=...) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:95
#19 0x00007f8d02f973d6 in
std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5630de659468) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:244
#20 0x00007f8d02f97293 in
std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> >::operator()(void) (this=0x5630de659468) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:251
#21 0x00007f8d02f971e4 in
std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> > >::_M_run(void) (this=0x5630de659460) at
/opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:195
#22 0x00007f8d01b9d9d4 in std::execute_native_thread_routine (__p=<optimized
out>) at
/home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/new_allocator.h:82
#23 0x00007f8d0112a6db in start_thread (arg=0x7f8cfcb7d700) at
pthread_create.c:463
#24 0x00007f8d0180171f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
/build/cpp/src/arrow/compute/exec
{noformat}
was:
The test seems to be flaky. [Full log|https://github.com/ursacomputing/crossbow/runs/4664466384?check_suite_focus=true]
{noformat}
44/84 Test #35: arrow-compute-hash-join-node-test .........***Failed 8.63 sec
Running arrow-compute-hash-join-node-test, redirecting output into
/build/cpp/build/test-logs/arrow-compute-hash-join-node-test.txt (attempt 1/1)
/arrow/cpp/build-support/run-test.sh: line 88: 19125 Segmentation fault
(core dumped) $TEST_EXECUTABLE "$@" > $LOGFILE.raw 2>&1
Running main() from
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc
[==========] Running 23 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 7 tests from HashJoin
[ RUN ] HashJoin.Random
/arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:934: Failure
Failed
'_error_or_value46.status()' failed with Cancelled: Scheduler cancelled
Google Test trace:
/arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:1053: FULL_OUTER IS
parallel = false
/build/cpp/src/arrow/compute/exec
{noformat}
> [C++] Occasional failure arrow-compute-hash-join-node-test
> ----------------------------------------------------------
>
> Key: ARROW-15221
> URL: https://issues.apache.org/jira/browse/ARROW-15221
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: David Li
> Priority: Major
> Labels: query-engine
> Attachments: log.txt
>
>
> The test seems to be flaky. [Full log|https://github.com/ursacomputing/crossbow/runs/4664466384?check_suite_focus=true]
> {noformat}
> 44/84 Test #35: arrow-compute-hash-join-node-test .........***Failed 8.63 sec
> Running arrow-compute-hash-join-node-test, redirecting output into
> /build/cpp/build/test-logs/arrow-compute-hash-join-node-test.txt (attempt 1/1)
> /arrow/cpp/build-support/run-test.sh: line 88: 19125 Segmentation fault
> (core dumped) $TEST_EXECUTABLE "$@" > $LOGFILE.raw 2>&1
> Running main() from
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc
> [==========] Running 23 tests from 2 test suites.
> [----------] Global test environment set-up.
> [----------] 7 tests from HashJoin
> [ RUN ] HashJoin.Random
> /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:934: Failure
> Failed
> '_error_or_value46.status()' failed with Cancelled: Scheduler cancelled
> Google Test trace:
> /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:1053: FULL_OUTER IS
> parallel = false
> /build/cpp/src/arrow/compute/exec
> {noformat}
> Another failure was observed in AMD64 Conda C++. [Full log|https://github.com/apache/arrow/runs/5055044211?check_suite_focus=true]
> {noformat}
> [----------] 7 tests from HashJoin
> [ RUN ] HashJoin.Random
> Found core dump, printing backtrace:
> warning: core file may not match specified executable file.
> [New LWP 19309]
> [New LWP 19308]
> [New LWP 19306]
> [New LWP 19310]
> [New LWP 19307]
> [New LWP 19311]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/build/cpp/debug/arrow-compute-hash-join-node-test'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x0000000000011479 in ?? ()
> [Current thread is 1 (Thread 0x7f8cfcb7d700 (LWP 19309))]
>
> Thread 6 (Thread 0x7f8cf9fff700 (LWP 19311)):
> #0 0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized
> out>, abstime=0x7f8cf9ffd4a0, expected=0, futex_word=0x7f8cff40a790) at
> ../sysdeps/unix/sysv/linux/futex-internal.h:205
> #1 __pthread_cond_wait_common (abstime=0x7f8cf9ffd4a0, mutex=0x7f8cff40a7d8,
> cond=0x7f8cff40a768) at pthread_cond_wait.c:539
> #2 __pthread_cond_timedwait (cond=0x7f8cff40a768, mutex=0x7f8cff40a7d8,
> abstime=0x7f8cf9ffd4a0) at pthread_cond_wait.c:667
> #3 0x00007f8d04411496 in background_thread_sleep (tsdn=<optimized out>,
> interval=<optimized out>, info=0x7f8cff40a760) at src/background_thread.c:255
> #4 background_work_sleep_once (ind=<optimized out>, info=<optimized out>,
> tsdn=<optimized out>) at src/background_thread.c:307
> #5 background_work (ind=<optimized out>, tsd=<optimized out>) at
> src/background_thread.c:497
> #6 background_thread_entry () at src/background_thread.c:522
> #7 0x00007f8d0112a6db in start_thread (arg=0x7f8cf9fff700) at
> pthread_create.c:463
> #8 0x00007f8d0180171f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Thread 5 (Thread 0x7f8cfedff700 (LWP 19307)):
> #0 0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized
> out>, abstime=0x7f8cfedfd4a0, expected=0, futex_word=0x7f8cff40a5f0) at
> ../sysdeps/unix/sysv/linux/futex-internal.h:205
> #1 __pthread_cond_wait_common (abstime=0x7f8cfedfd4a0, mutex=0x7f8cff40a638,
> cond=0x7f8cff40a5c8) at pthread_cond_wait.c:539
> #2 __pthread_cond_timedwait (cond=0x7f8cff40a5c8, mutex=0x7f8cff40a638,
> abstime=0x7f8cfedfd4a0) at pthread_cond_wait.c:667
> #3 0x00007f8d04411bc6 in background_thread_sleep (tsdn=<optimized out>,
> interval=<optimized out>, info=<optimized out>) at src/background_thread.c:255
> #4 background_work_sleep_once (ind=0, info=<optimized out>, tsdn=<optimized
> out>) at src/background_thread.c:307
> #5 background_thread0_work (tsd=<optimized out>) at
> src/background_thread.c:452
> #6 background_work (ind=<optimized out>, tsd=<optimized out>) at
> src/background_thread.c:490
> #7 background_thread_entry () at src/background_thread.c:522
> #8 0x00007f8d0112a6db in start_thread (arg=0x7f8cfedff700) at
> pthread_create.c:463
> #9 0x00007f8d0180171f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Thread 4 (Thread 0x7f8cfb5ff700 (LWP 19310)):
> #0 0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized
> out>, abstime=0x7f8cfb5fd4a0, expected=0, futex_word=0x7f8cff40a6c0) at
> ../sysdeps/unix/sysv/linux/futex-internal.h:205
> #1 __pthread_cond_wait_common (abstime=0x7f8cfb5fd4a0, mutex=0x7f8cff40a708,
> cond=0x7f8cff40a698) at pthread_cond_wait.c:539
> #2 __pthread_cond_timedwait (cond=0x7f8cff40a698, mutex=0x7f8cff40a708,
> abstime=0x7f8cfb5fd4a0) at pthread_cond_wait.c:667
> #3 0x00007f8d04411496 in background_thread_sleep (tsdn=<optimized out>,
> interval=<optimized out>, info=0x7f8cff40a690) at src/background_thread.c:255
> #4 background_work_sleep_once (ind=<optimized out>, info=<optimized out>,
> tsdn=<optimized out>) at src/background_thread.c:307
> #5 background_work (ind=<optimized out>, tsd=<optimized out>) at
> src/background_thread.c:497
> #6 background_thread_entry () at src/background_thread.c:522
> #7 0x00007f8d0112a6db in start_thread (arg=0x7f8cfb5ff700) at
> pthread_create.c:463
> #8 0x00007f8d0180171f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Thread 3 (Thread 0x7f8cff6db0c0 (LWP 19306)):
> #0 0x00005630dcb287d4 in __gnu_cxx::operator==<int const*, std::vector<int,
> std::allocator<int> > > (__lhs=..., __rhs=...) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_iterator.h:890
> #1 0x00005630dcb1d3d1 in std::vector<int, std::allocator<int> >::empty
> (this=0x7ffc7ab72ae0) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_vector.h:1005
> #2 0x00005630dcafd4ea in arrow::compute::HashJoinSimpleInt
> (join_type=arrow::compute::JoinType::FULL_OUTER, l=..., null_in_key_l=...,
> r=..., null_in_key_r=..., result_l=0x7ffc7ab72cb0, result_r=0x7ffc7ab72cd0,
> output_length_limit=100000, length_limit_reached=0x7ffc7ab72e77) at
> /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:781
> #3 0x00005630dcafe22c in arrow::compute::HashJoinSimple (ctx=0x5630de57a320,
> join_type=arrow::compute::JoinType::FULL_OUTER, cmp=..., num_key_fields=1,
> key_id_l=..., key_id_r=..., original_l=..., original_r=..., l=..., r=...,
> output_ids_l=..., output_ids_r=..., output_length_limit=100000,
> length_limit_reached=0x7ffc7ab72e77) at
> /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:887
> #4 0x00005630dcb011c0 in arrow::compute::HashJoin_Random_Test::TestBody
> (this=0x5630de47a300) at
> /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:1067
> #5 0x00007f8d056a3c9c in
> testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test,
> void> (object=0x5630de47a300, method=&virtual testing::Test::TestBody(),
> location=0x7f8d056b897b "the test body") at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
> #6 0x00007f8d0569add2 in
> testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>
> (object=0x5630de47a300, method=&virtual testing::Test::TestBody(),
> location=0x7f8d056b897b "the test body") at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
> #7 0x00007f8d05675c03 in testing::Test::Run (this=0x5630de47a300) at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2682
> #8 0x00007f8d0567663b in testing::TestInfo::Run (this=0x5630de476b50) at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2861
> #9 0x00007f8d05677010 in testing::TestSuite::Run (this=0x5630de476c70) at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:3015
> #10 0x00007f8d0568731c in testing::internal::UnitTestImpl::RunAllTests
> (this=0x5630de4762e0) at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5855
> #11 0x00007f8d056a4ce8 in
> testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
> bool> (object=0x5630de4762e0, method=(bool
> (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl *
> const)) 0x7f8d05686ed8 <testing::internal::UnitTestImpl::RunAllTests()>,
> location=0x7f8d056b9468 "auxiliary test code (environments or event
> listeners)") at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
> #12 0x00007f8d0569c064 in
> testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
> bool> (object=0x5630de4762e0, method=(bool
> (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl *
> const)) 0x7f8d05686ed8 <testing::internal::UnitTestImpl::RunAllTests()>,
> location=0x7f8d056b9468 "auxiliary test code (environments or event
> listeners)") at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
> #13 0x00007f8d056857b7 in testing::UnitTest::Run (this=0x7f8d056e5260
> <testing::UnitTest::GetInstance()::instance>) at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5438
> #14 0x00007f8d056e6919 in RUN_ALL_TESTS () at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/include/gtest/gtest.h:2490
> #15 0x00007f8d056e695c in main (argc=1, argv=0x7ffc7ab739d8) at
> /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc:52
> #16 0x00007f8d01701bf7 in __libc_start_main (main=0x7f8d056e691b <main(int,
> char**)>, argc=1, argv=0x7ffc7ab739d8, init=<optimized out>, fini=<optimized
> out>, rtld_fini=<optimized out>, stack_end=0x7ffc7ab739c8) at
> ../csu/libc-start.c:310
> #17 0x00005630dcaedf49 in _start ()
>
> Thread 2 (Thread 0x7f8cfdb7e700 (LWP 19308)):
> #0 0x00007f8d01130ad3 in futex_wait_cancelable (private=<optimized out>,
> expected=0, futex_word=0x5630de565a80) at
> ../sysdeps/unix/sysv/linux/futex-internal.h:88
> #1 __pthread_cond_wait_common (abstime=0x0, mutex=0x5630de565a30,
> cond=0x5630de565a58) at pthread_cond_wait.c:502
> #2 __pthread_cond_wait (cond=0x5630de565a58, mutex=0x5630de565a30) at
> pthread_cond_wait.c:655
> #3 0x00007f8d01b994d1 in __gthread_cond_wait (__mutex=<error reading
> variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
> __cond=<optimized out>) at
> /home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/src/c++11/condition_variable.cc:865
> #4 std::__condvar::wait (__m=<error reading variable:
> dwarf2_find_location_expression: Corrupted DWARF expression.>,
> this=<optimized out>) at
> ../../../../../libstdc++-v3/src/c++11/gthr-default.h:155
> #5 std::condition_variable::wait (this=<optimized out>, __lock=...) at
> ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
> #6 0x00007f8d02f8fbb7 in arrow::internal::WorkerLoop (state=..., it=...) at
> /arrow/cpp/src/arrow/util/thread_pool.cc:195
> #7 0x00007f8d02f90960 in
> arrow::internal::ThreadPool::<lambda()>::operator()(void) const
> (__closure=0x5630de561958) at /arrow/cpp/src/arrow/util/thread_pool.cc:344
> #8 0x00007f8d02f97498 in std::__invoke_impl<void,
> arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> >(std::__invoke_other, arrow::internal::ThreadPool::<lambda()> &&) (__f=...)
> at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:60
> #9 0x00007f8d02f97438 in
> std::__invoke<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> >(arrow::internal::ThreadPool::<lambda()> &&) (__fn=...) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:95
> #10 0x00007f8d02f973d6 in
> std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5630de561958) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:244
> #11 0x00007f8d02f97293 in
> std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> > >::operator()(void) (this=0x5630de561958) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:251
> #12 0x00007f8d02f971e4 in
> std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> > > >::_M_run(void) (this=0x5630de561950) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:195
> #13 0x00007f8d01b9d9d4 in std::execute_native_thread_routine (__p=<optimized
> out>) at
> /home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/new_allocator.h:82
> #14 0x00007f8d0112a6db in start_thread (arg=0x7f8cfdb7e700) at
> pthread_create.c:463
> #15 0x00007f8d0180171f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> Thread 1 (Thread 0x7f8cfcb7d700 (LWP 19309)):
> #0 0x0000000000011479 in ?? ()
> #1 0x00007f8d0331fae3 in arrow::compute::TaskSchedulerImpl::ScheduleMore
> (this=0x5630de572960, thread_id=0, num_tasks_finished=0) at
> /arrow/cpp/src/arrow/compute/exec/task_util.cc:326
> #2 0x00007f8d0331e94c in arrow::compute::TaskSchedulerImpl::StartTaskGroup
> (this=0x5630de572960, thread_id=0, group_id=1, total_num_tasks=0) at
> /arrow/cpp/src/arrow/compute/exec/task_util.cc:153
> #3 0x00007f8d0327d952 in
> arrow::compute::HashJoinBasicImpl::ProbeQueuedBatches (this=0x7f8cec24aee0,
> thread_index=0) at /arrow/cpp/src/arrow/compute/exec/hash_join.cc:726
> #4 0x00007f8d0327d13b in
> arrow::compute::HashJoinBasicImpl::BuildHashTable_on_finished
> (this=0x7f8cec24aee0, thread_index=0) at
> /arrow/cpp/src/arrow/compute/exec/hash_join.cc:663
> #5 0x00007f8d0327d2db in
> arrow::compute::HashJoinBasicImpl::RegisterBuildHashTable()::{lambda(unsigned
> long)#2}::operator()(unsigned long) const (__closure=0x5630de654840,
> thread_index=0) at /arrow/cpp/src/arrow/compute/exec/hash_join.cc:674
> #6 0x00007f8d0328213c in std::_Function_handler<arrow::Status (unsigned
> long),
> arrow::compute::HashJoinBasicImpl::RegisterBuildHashTable()::{lambda(unsigned
> long)#2}>::_M_invoke(std::_Any_data const&, unsigned long&&) (__functor=...,
> __args#0=@0x7f8cfcb7b138: 0) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:286
> #7 0x00007f8d032aa81e in std::function<arrow::Status (unsigned
> long)>::operator()(unsigned long) const (this=0x5630de654840, __args#0=0) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:688
> #8 0x00007f8d0331f041 in
> arrow::compute::TaskSchedulerImpl::OnTaskGroupFinished (this=0x5630de572960,
> thread_id=0, group_id=0, all_task_groups_finished=0x7f8cfcb7b230) at
> /arrow/cpp/src/arrow/compute/exec/task_util.cc:244
> #9 0x00007f8d0331f934 in
> arrow::compute::TaskSchedulerImpl::<lambda(size_t)>::operator()(size_t) const
> (__closure=0x5630de6a1390, thread_id=0) at
> /arrow/cpp/src/arrow/compute/exec/task_util.cc:349
> #10 0x00007f8d0332152f in std::_Function_handler<arrow::Status(long unsigned
> int), arrow::compute::TaskSchedulerImpl::ScheduleMore(size_t,
> int)::<lambda(size_t)> >::_M_invoke(const std::_Any_data &, unsigned long &&)
> (__functor=..., __args#0=@0x7f8cfcb7b2b8: 0) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:286
> #11 0x00007f8d032aa81e in std::function<arrow::Status (unsigned
> long)>::operator()(unsigned long) const (this=0x5630de654f70, __args#0=0) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:688
> #12 0x00007f8d032a7f8c in
> arrow::compute::HashJoinNode::ScheduleTaskCallback(std::function<arrow::Status
> (unsigned long)>)::{lambda()#1}::operator()() const
> (__closure=0x5630de654f68) at
> /arrow/cpp/src/arrow/compute/exec/hash_join_node.cc:604
> #13 0x00007f8d032b9329 in arrow::internal::FnOnce<void
> ()>::FnImpl<arrow::compute::HashJoinNode::ScheduleTaskCallback(std::function<arrow::Status
> (unsigned long)>)::{lambda()#1}>::invoke() (this=0x5630de654f60) at
> /arrow/cpp/src/arrow/util/functional.h:152
> #14 0x00007f8d02f91ade in arrow::internal::FnOnce<void ()>::operator()() &&
> (this=0x7f8cfcb7b3f0) at /arrow/cpp/src/arrow/util/functional.h:140
> #15 0x00007f8d02f8fa87 in arrow::internal::WorkerLoop (state=..., it=...) at
> /arrow/cpp/src/arrow/util/thread_pool.cc:177
> #16 0x00007f8d02f90960 in
> arrow::internal::ThreadPool::<lambda()>::operator()(void) const
> (__closure=0x5630de659468) at /arrow/cpp/src/arrow/util/thread_pool.cc:344
> #17 0x00007f8d02f97498 in std::__invoke_impl<void,
> arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> >(std::__invoke_other, arrow::internal::ThreadPool::<lambda()> &&) (__f=...)
> at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:60
> #18 0x00007f8d02f97438 in
> std::__invoke<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> >(arrow::internal::ThreadPool::<lambda()> &&) (__fn=...) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:95
> #19 0x00007f8d02f973d6 in
> std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5630de659468) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:244
> #20 0x00007f8d02f97293 in
> std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> > >::operator()(void) (this=0x5630de659468) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:251
> #21 0x00007f8d02f971e4 in
> std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()>
> > > >::_M_run(void) (this=0x5630de659460) at
> /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:195
> #22 0x00007f8d01b9d9d4 in std::execute_native_thread_routine (__p=<optimized
> out>) at
> /home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/new_allocator.h:82
> #23 0x00007f8d0112a6db in start_thread (arg=0x7f8cfcb7d700) at
> pthread_create.c:463
> #24 0x00007f8d0180171f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> /build/cpp/src/arrow/compute/exec
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)