[ https://issues.apache.org/jira/browse/ARROW-15221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weston Pace reassigned ARROW-15221:
-----------------------------------

    Assignee: Weston Pace

> [C++] Occasional failure arrow-compute-hash-join-node-test
> ----------------------------------------------------------
>
> Key: ARROW-15221
> URL: https://issues.apache.org/jira/browse/ARROW-15221
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: David Li
> Assignee: Weston Pace
> Priority: Major
> Labels: query-engine
> Attachments: log.txt
>
>
> The test seems to be flaky. [Full log|https://github.com/ursacomputing/crossbow/runs/4664466384?check_suite_focus=true]
> {noformat}
> 44/84 Test #35: arrow-compute-hash-join-node-test .........***Failed 8.63 sec
> Running arrow-compute-hash-join-node-test, redirecting output into /build/cpp/build/test-logs/arrow-compute-hash-join-node-test.txt (attempt 1/1)
> /arrow/cpp/build-support/run-test.sh: line 88: 19125 Segmentation fault (core dumped) $TEST_EXECUTABLE "$@" > $LOGFILE.raw 2>&1
> Running main() from /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc
> [==========] Running 23 tests from 2 test suites.
> [----------] Global test environment set-up.
> [----------] 7 tests from HashJoin
> [ RUN ] HashJoin.Random
> /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:934: Failure
> Failed
> '_error_or_value46.status()' failed with Cancelled: Scheduler cancelled
> Google Test trace:
> /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:1053: FULL_OUTER IS parallel = false
> /build/cpp/src/arrow/compute/exec
> {noformat}
> Another one observed in AMD64 Conda C++ [Full Log|https://github.com/apache/arrow/runs/5055044211?check_suite_focus=true]
> {noformat}
> [----------] 7 tests from HashJoin
> [ RUN ] HashJoin.Random
> Found core dump, printing backtrace:
> warning: core file may not match specified executable file.
> [New LWP 19309]
> [New LWP 19308]
> [New LWP 19306]
> [New LWP 19310]
> [New LWP 19307]
> [New LWP 19311]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/build/cpp/debug/arrow-compute-hash-join-node-test'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x0000000000011479 in ?? ()
> [Current thread is 1 (Thread 0x7f8cfcb7d700 (LWP 19309))]
>
> Thread 6 (Thread 0x7f8cf9fff700 (LWP 19311)):
> #0 0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f8cf9ffd4a0, expected=0, futex_word=0x7f8cff40a790) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
> #1 __pthread_cond_wait_common (abstime=0x7f8cf9ffd4a0, mutex=0x7f8cff40a7d8, cond=0x7f8cff40a768) at pthread_cond_wait.c:539
> #2 __pthread_cond_timedwait (cond=0x7f8cff40a768, mutex=0x7f8cff40a7d8, abstime=0x7f8cf9ffd4a0) at pthread_cond_wait.c:667
> #3 0x00007f8d04411496 in background_thread_sleep (tsdn=<optimized out>, interval=<optimized out>, info=0x7f8cff40a760) at src/background_thread.c:255
> #4 background_work_sleep_once (ind=<optimized out>, info=<optimized out>, tsdn=<optimized out>) at src/background_thread.c:307
> #5 background_work (ind=<optimized out>, tsd=<optimized out>) at src/background_thread.c:497
> #6 background_thread_entry () at src/background_thread.c:522
> #7 0x00007f8d0112a6db in start_thread (arg=0x7f8cf9fff700) at pthread_create.c:463
> #8 0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Thread 5 (Thread 0x7f8cfedff700 (LWP 19307)):
> #0 0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f8cfedfd4a0, expected=0, futex_word=0x7f8cff40a5f0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
> #1 __pthread_cond_wait_common (abstime=0x7f8cfedfd4a0, mutex=0x7f8cff40a638, cond=0x7f8cff40a5c8) at pthread_cond_wait.c:539
> #2 __pthread_cond_timedwait
(cond=0x7f8cff40a5c8, mutex=0x7f8cff40a638, abstime=0x7f8cfedfd4a0) at pthread_cond_wait.c:667
> #3 0x00007f8d04411bc6 in background_thread_sleep (tsdn=<optimized out>, interval=<optimized out>, info=<optimized out>) at src/background_thread.c:255
> #4 background_work_sleep_once (ind=0, info=<optimized out>, tsdn=<optimized out>) at src/background_thread.c:307
> #5 background_thread0_work (tsd=<optimized out>) at src/background_thread.c:452
> #6 background_work (ind=<optimized out>, tsd=<optimized out>) at src/background_thread.c:490
> #7 background_thread_entry () at src/background_thread.c:522
> #8 0x00007f8d0112a6db in start_thread (arg=0x7f8cfedff700) at pthread_create.c:463
> #9 0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Thread 4 (Thread 0x7f8cfb5ff700 (LWP 19310)):
> #0 0x00007f8d01131065 in futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f8cfb5fd4a0, expected=0, futex_word=0x7f8cff40a6c0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
> #1 __pthread_cond_wait_common (abstime=0x7f8cfb5fd4a0, mutex=0x7f8cff40a708, cond=0x7f8cff40a698) at pthread_cond_wait.c:539
> #2 __pthread_cond_timedwait (cond=0x7f8cff40a698, mutex=0x7f8cff40a708, abstime=0x7f8cfb5fd4a0) at pthread_cond_wait.c:667
> #3 0x00007f8d04411496 in background_thread_sleep (tsdn=<optimized out>, interval=<optimized out>, info=0x7f8cff40a690) at src/background_thread.c:255
> #4 background_work_sleep_once (ind=<optimized out>, info=<optimized out>, tsdn=<optimized out>) at src/background_thread.c:307
> #5 background_work (ind=<optimized out>, tsd=<optimized out>) at src/background_thread.c:497
> #6 background_thread_entry () at src/background_thread.c:522
> #7 0x00007f8d0112a6db in start_thread (arg=0x7f8cfb5ff700) at pthread_create.c:463
> #8 0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Thread 3 (Thread 0x7f8cff6db0c0 (LWP 19306)):
> #0 0x00005630dcb287d4 in
__gnu_cxx::operator==<int const*, std::vector<int, std::allocator<int> > > (__lhs=..., __rhs=...) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_iterator.h:890
> #1 0x00005630dcb1d3d1 in std::vector<int, std::allocator<int> >::empty (this=0x7ffc7ab72ae0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/stl_vector.h:1005
> #2 0x00005630dcafd4ea in arrow::compute::HashJoinSimpleInt (join_type=arrow::compute::JoinType::FULL_OUTER, l=..., null_in_key_l=..., r=..., null_in_key_r=..., result_l=0x7ffc7ab72cb0, result_r=0x7ffc7ab72cd0, output_length_limit=100000, length_limit_reached=0x7ffc7ab72e77) at /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:781
> #3 0x00005630dcafe22c in arrow::compute::HashJoinSimple (ctx=0x5630de57a320, join_type=arrow::compute::JoinType::FULL_OUTER, cmp=..., num_key_fields=1, key_id_l=..., key_id_r=..., original_l=..., original_r=..., l=..., r=..., output_ids_l=..., output_ids_r=..., output_length_limit=100000, length_limit_reached=0x7ffc7ab72e77) at /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:887
> #4 0x00005630dcb011c0 in arrow::compute::HashJoin_Random_Test::TestBody (this=0x5630de47a300) at /arrow/cpp/src/arrow/compute/exec/hash_join_node_test.cc:1067
> #5 0x00007f8d056a3c9c in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x5630de47a300, method=&virtual testing::Test::TestBody(), location=0x7f8d056b897b "the test body") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
> #6 0x00007f8d0569add2 in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x5630de47a300, method=&virtual testing::Test::TestBody(), location=0x7f8d056b897b "the test body") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
> #7 0x00007f8d05675c03 in testing::Test::Run (this=0x5630de47a300) at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2682
> #8 0x00007f8d0567663b in testing::TestInfo::Run (this=0x5630de476b50) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2861
> #9 0x00007f8d05677010 in testing::TestSuite::Run (this=0x5630de476c70) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:3015
> #10 0x00007f8d0568731c in testing::internal::UnitTestImpl::RunAllTests (this=0x5630de4762e0) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5855
> #11 0x00007f8d056a4ce8 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x5630de4762e0, method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x7f8d05686ed8 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x7f8d056b9468 "auxiliary test code (environments or event listeners)") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
> #12 0x00007f8d0569c064 in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x5630de4762e0, method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x7f8d05686ed8 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x7f8d056b9468 "auxiliary test code (environments or event listeners)") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
> #13 0x00007f8d056857b7 in testing::UnitTest::Run (this=0x7f8d056e5260 <testing::UnitTest::GetInstance()::instance>) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5438
> #14 0x00007f8d056e6919 in RUN_ALL_TESTS () at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/include/gtest/gtest.h:2490
> #15 0x00007f8d056e695c in main (argc=1, argv=0x7ffc7ab739d8) at
/build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc:52
> #16 0x00007f8d01701bf7 in __libc_start_main (main=0x7f8d056e691b <main(int, char**)>, argc=1, argv=0x7ffc7ab739d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffc7ab739c8) at ../csu/libc-start.c:310
> #17 0x00005630dcaedf49 in _start ()
>
> Thread 2 (Thread 0x7f8cfdb7e700 (LWP 19308)):
> #0 0x00007f8d01130ad3 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x5630de565a80) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
> #1 __pthread_cond_wait_common (abstime=0x0, mutex=0x5630de565a30, cond=0x5630de565a58) at pthread_cond_wait.c:502
> #2 __pthread_cond_wait (cond=0x5630de565a58, mutex=0x5630de565a30) at pthread_cond_wait.c:655
> #3 0x00007f8d01b994d1 in __gthread_cond_wait (__mutex=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, __cond=<optimized out>) at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/src/c++11/condition_variable.cc:865
> #4 std::__condvar::wait (__m=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, this=<optimized out>) at ../../../../../libstdc++-v3/src/c++11/gthr-default.h:155
> #5 std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:41
> #6 0x00007f8d02f8fbb7 in arrow::internal::WorkerLoop (state=..., it=...) at /arrow/cpp/src/arrow/util/thread_pool.cc:195
> #7 0x00007f8d02f90960 in arrow::internal::ThreadPool::<lambda()>::operator()(void) const (__closure=0x5630de561958) at /arrow/cpp/src/arrow/util/thread_pool.cc:344
> #8 0x00007f8d02f97498 in std::__invoke_impl<void, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> >(std::__invoke_other, arrow::internal::ThreadPool::<lambda()> &&) (__f=...)
at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:60
> #9 0x00007f8d02f97438 in std::__invoke<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> >(arrow::internal::ThreadPool::<lambda()> &&) (__fn=...) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:95
> #10 0x00007f8d02f973d6 in std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5630de561958) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:244
> #11 0x00007f8d02f97293 in std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > >::operator()(void) (this=0x5630de561958) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:251
> #12 0x00007f8d02f971e4 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > > >::_M_run(void) (this=0x5630de561950) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:195
> #13 0x00007f8d01b9d9d4 in std::execute_native_thread_routine (__p=<optimized out>) at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/new_allocator.h:82
> #14 0x00007f8d0112a6db in start_thread (arg=0x7f8cfdb7e700) at pthread_create.c:463
> #15 0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Thread 1 (Thread 0x7f8cfcb7d700 (LWP 19309)):
> #0 0x0000000000011479 in ??
()
> #1 0x00007f8d0331fae3 in arrow::compute::TaskSchedulerImpl::ScheduleMore (this=0x5630de572960, thread_id=0, num_tasks_finished=0) at /arrow/cpp/src/arrow/compute/exec/task_util.cc:326
> #2 0x00007f8d0331e94c in arrow::compute::TaskSchedulerImpl::StartTaskGroup (this=0x5630de572960, thread_id=0, group_id=1, total_num_tasks=0) at /arrow/cpp/src/arrow/compute/exec/task_util.cc:153
> #3 0x00007f8d0327d952 in arrow::compute::HashJoinBasicImpl::ProbeQueuedBatches (this=0x7f8cec24aee0, thread_index=0) at /arrow/cpp/src/arrow/compute/exec/hash_join.cc:726
> #4 0x00007f8d0327d13b in arrow::compute::HashJoinBasicImpl::BuildHashTable_on_finished (this=0x7f8cec24aee0, thread_index=0) at /arrow/cpp/src/arrow/compute/exec/hash_join.cc:663
> #5 0x00007f8d0327d2db in arrow::compute::HashJoinBasicImpl::RegisterBuildHashTable()::{lambda(unsigned long)#2}::operator()(unsigned long) const (__closure=0x5630de654840, thread_index=0) at /arrow/cpp/src/arrow/compute/exec/hash_join.cc:674
> #6 0x00007f8d0328213c in std::_Function_handler<arrow::Status (unsigned long), arrow::compute::HashJoinBasicImpl::RegisterBuildHashTable()::{lambda(unsigned long)#2}>::_M_invoke(std::_Any_data const&, unsigned long&&) (__functor=..., __args#0=@0x7f8cfcb7b138: 0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:286
> #7 0x00007f8d032aa81e in std::function<arrow::Status (unsigned long)>::operator()(unsigned long) const (this=0x5630de654840, __args#0=0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:688
> #8 0x00007f8d0331f041 in arrow::compute::TaskSchedulerImpl::OnTaskGroupFinished (this=0x5630de572960, thread_id=0, group_id=0, all_task_groups_finished=0x7f8cfcb7b230) at /arrow/cpp/src/arrow/compute/exec/task_util.cc:244
> #9 0x00007f8d0331f934 in arrow::compute::TaskSchedulerImpl::<lambda(size_t)>::operator()(size_t) const (__closure=0x5630de6a1390, thread_id=0) at
/arrow/cpp/src/arrow/compute/exec/task_util.cc:349
> #10 0x00007f8d0332152f in std::_Function_handler<arrow::Status(long unsigned int), arrow::compute::TaskSchedulerImpl::ScheduleMore(size_t, int)::<lambda(size_t)> >::_M_invoke(const std::_Any_data &, unsigned long &&) (__functor=..., __args#0=@0x7f8cfcb7b2b8: 0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:286
> #11 0x00007f8d032aa81e in std::function<arrow::Status (unsigned long)>::operator()(unsigned long) const (this=0x5630de654f70, __args#0=0) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_function.h:688
> #12 0x00007f8d032a7f8c in arrow::compute::HashJoinNode::ScheduleTaskCallback(std::function<arrow::Status (unsigned long)>)::{lambda()#1}::operator()() const (__closure=0x5630de654f68) at /arrow/cpp/src/arrow/compute/exec/hash_join_node.cc:604
> #13 0x00007f8d032b9329 in arrow::internal::FnOnce<void ()>::FnImpl<arrow::compute::HashJoinNode::ScheduleTaskCallback(std::function<arrow::Status (unsigned long)>)::{lambda()#1}>::invoke() (this=0x5630de654f60) at /arrow/cpp/src/arrow/util/functional.h:152
> #14 0x00007f8d02f91ade in arrow::internal::FnOnce<void ()>::operator()() && (this=0x7f8cfcb7b3f0) at /arrow/cpp/src/arrow/util/functional.h:140
> #15 0x00007f8d02f8fa87 in arrow::internal::WorkerLoop (state=..., it=...) at /arrow/cpp/src/arrow/util/thread_pool.cc:177
> #16 0x00007f8d02f90960 in arrow::internal::ThreadPool::<lambda()>::operator()(void) const (__closure=0x5630de659468) at /arrow/cpp/src/arrow/util/thread_pool.cc:344
> #17 0x00007f8d02f97498 in std::__invoke_impl<void, arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> >(std::__invoke_other, arrow::internal::ThreadPool::<lambda()> &&) (__f=...)
at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:60
> #18 0x00007f8d02f97438 in std::__invoke<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> >(arrow::internal::ThreadPool::<lambda()> &&) (__fn=...) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/invoke.h:95
> #19 0x00007f8d02f973d6 in std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > >::_M_invoke<0>(std::_Index_tuple<0>) (this=0x5630de659468) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:244
> #20 0x00007f8d02f97293 in std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > >::operator()(void) (this=0x5630de659468) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:251
> #21 0x00007f8d02f971e4 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::<lambda()> > > >::_M_run(void) (this=0x5630de659460) at /opt/conda/envs/arrow/x86_64-conda-linux-gnu/include/c++/9.4.0/thread:195
> #22 0x00007f8d01b9d9d4 in std::execute_native_thread_routine (__p=<optimized out>) at /home/conda/feedstock_root/build_artifacts/gcc_compilers_1643063175398/work/build/x86_64-conda-linux-gnu/libstdc++-v3/include/bits/new_allocator.h:82
> #23 0x00007f8d0112a6db in start_thread (arg=0x7f8cfcb7d700) at pthread_create.c:463
> #24 0x00007f8d0180171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> /build/cpp/src/arrow/compute/exec
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)