[ 
https://issues.apache.org/jira/browse/IMPALA-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500556#comment-16500556
 ] 

Dan Hecht commented on IMPALA-7101:
-----------------------------------

I reproduced a hang by running test_cancellation.py in a loop; it seems to 
reproduce after a few hours. The stuck query is:
{code:java}
dhecht  tpch_seq_gzip   compute stats lineitem  DDL     2018-06-01 20:22:11.463390000   61h33m  N/A     RUNNING Planning finished       0{code}
The {{async child queries}} and {{wait-thread}} threads for the parent query are 
still alive and blocked, with these stacks:
{code:java}
(gdb) thread 199
[Switching to thread 199 (Thread 0x7f332f1a9700 (LWP 28425))]
#0 pthread_cond_wait@@GLIBC_2.3.2 () at 
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185 in ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
(gdb) bt
#0 0x00007f33cd47a360 in pthread_cond_wait@@GLIBC_2.3.2 () at 
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000001864443 in 
impala::ConditionVariable::Wait(boost::unique_lock<boost::mutex>&) 
(this=0x1949a4d0, lock=...) at 
/home/dhecht/src/Impala/be/src/util/condition-variable.h:51
#2 0x0000000001ce0f65 in impala::Promise<bool>::Get() (this=0x1949a4d0) at 
/home/dhecht/src/Impala/be/src/util/promise.h:67
#3 0x0000000001e613e2 in impala::CountingBarrier::Wait() (this=0x1949a4d0) at 
/home/dhecht/src/Impala/be/src/util/counting-barrier.h:56
#4 0x00000000030d8bcf in impala::Coordinator::WaitForBackends() 
(this=0x192f9080) at /home/dhecht/src/Impala/be/src/runtime/coordinator.cc:581
#5 0x00000000030d7c2a in 
impala::Coordinator::HandleExecStateTransition(impala::Coordinator::ExecState, 
impala::Coordinator::ExecState) (this=0x192f9080, 
old_state=impala::Coordinator::ExecState::EXECUTING, 
new_state=impala::Coordinator::ExecState::RETURNED_RESULTS) at 
/home/dhecht/src/Impala/be/src/runtime/coordinator.cc:530
#6 0x00000000030d68df in 
impala::Coordinator::SetNonErrorTerminalState(impala::Coordinator::ExecState) 
(this=0x192f9080, state=impala::Coordinator::ExecState::RETURNED_RESULTS) at 
/home/dhecht/src/Impala/be/src/runtime/coordinator.cc:451
#7 0x00000000030d9827 in impala::Coordinator::GetNext(impala::QueryResultSet*, 
int, bool*) (this=0x192f9080, results=0xf22f860, max_rows=1024, eos=0x12cac471) 
at /home/dhecht/src/Impala/be/src/runtime/coordinator.cc:630
#8 0x0000000001daa8d9 in impala::ClientRequestState::FetchRowsInternal(int, 
impala::QueryResultSet*) (this=0x12cac000, max_rows=1024, 
fetched_rows=0xf22f860) at 
/home/dhecht/src/Impala/be/src/service/client-request-state.cc:801
#9 0x0000000001da9ccd in impala::ClientRequestState::FetchRows(int, 
impala::QueryResultSet*) (this=0x12cac000, max_rows=1024, 
fetched_rows=0xf22f860) at 
/home/dhecht/src/Impala/be/src/service/client-request-state.cc:707
#10 0x0000000001dc22b9 in impala::ImpalaServer::FetchInternal(impala::TUniqueId 
const&, int, bool, apache::hive::service::cli::thrift::TFetchResultsResp*) 
(this=0xce1d000, query_id=..., fetch_size=1024, fetch_first=false, 
fetch_results=0xccf6450)
at /home/dhecht/src/Impala/be/src/service/impala-hs2-server.cc:214
#11 0x0000000001dcc1bf in 
impala::ImpalaServer::FetchResults(apache::hive::service::cli::thrift::TFetchResultsResp&,
 apache::hive::service::cli::thrift::TFetchResultsReq const&) (this=0xce1d000, 
return_val=..., request=...)
at /home/dhecht/src/Impala/be/src/service/impala-hs2-server.cc:765
#12 0x0000000001decfef in impala::ChildQuery::ExecAndFetch() (this=0xccf63c0) 
at /home/dhecht/src/Impala/be/src/service/child-query.cc:85
#13 0x0000000001df3127 in impala::ChildQueryExecutor::ExecChildQueries() 
(this=0x3838fc50) at /home/dhecht/src/Impala/be/src/service/child-query.cc:182
#14 0x0000000001df4f65 in boost::_mfi::mf0<void, 
impala::ChildQueryExecutor>::operator()(impala::ChildQueryExecutor*) const 
(this=0x7f332f1a8ce8, p=0x3838fc50) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/mem_fn_template.hpp:49
#15 0x0000000001df4cae in 
boost::_bi::list1<boost::_bi::value<impala::ChildQueryExecutor*> 
>::operator()<boost::_mfi::mf0<void, impala::ChildQueryExecutor>, 
boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, 
impala::ChildQueryExecutor>&, boost::_bi::list0&, int) (this=0x7f332f1a8cf8, 
f=..., a=...) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:253
#16 0x0000000001df4ae9 in boost::_bi::bind_t<void, boost::_mfi::mf0<void, 
impala::ChildQueryExecutor>, 
boost::_bi::list1<boost::_bi::value<impala::ChildQueryExecutor*> > 
>::operator()() (this=0x7f332f1a8ce8)
at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20
#17 0x0000000001df4938 in 
boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, 
boost::_mfi::mf0<void, impala::ChildQueryExecutor>, 
boost::_bi::list1<boost::_bi::value<impala::ChildQueryExecutor*> > >, 
void>::invoke(boost::detail::function::function_buffer&) (function_obj_ptr=...) 
at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
#18 0x0000000001bef182 in boost::function0<void>::operator()() const 
(this=0x7f332f1a8ce0) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
#19 0x0000000001edad77 in impala::Thread::SuperviseThread(std::string const&, 
std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, 
impala::Promise<long>*) (name="async child queries", 
category="query-exec-state", functor=..., parent_thread_info=0x7f33331b0990, 
thread_started=0x7f33331ac9d0) at 
/home/dhecht/src/Impala/be/src/util/thread.cc:356
#20 0x0000000001ee2f13 in boost::_bi::list5<boost::_bi::value<std::string>, 
boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, 
boost::_bi::value<impala::ThreadDebugInfo*>, 
boost::_bi::value<impala::Promise<long>*> >::operator()<void (*)(std::string 
const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo 
const*, impala::Promise<long>*), boost::_bi::list0>(boost::_bi::type<void>, 
void (*&)(std::string const&, std::string const&, boost::function<void ()>, 
impala::ThreadDebugInfo const*, impala::Promise<long>*), boost::_bi::list0&, 
int) (this=0x13bacdc0, f=@0x13bacdb8: 0x1edaa10 
<impala::Thread::SuperviseThread(std::string const&, std::string const&, 
boost::function<void ()>, impala::ThreadDebugInfo const*, 
impala::Promise<long>*)>, a=...) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525
#21 0x0000000001ee2e37 in boost::_bi::bind_t<void, void (*)(std::string const&, 
std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, 
impala::Promise<long>*), boost::_bi::list5<boost::_bi::value<std::string>, 
boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, 
boost::_bi::value<impala::ThreadDebugInfo*>, 
boost::_bi::value<impala::Promise<long>*> > >::operator()() (this=0x13bacdb8)
at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20
#22 0x0000000001ee2dfa in boost::detail::thread_data<boost::_bi::bind_t<void, 
void (*)(std::string const&, std::string const&, boost::function<void ()>, 
impala::ThreadDebugInfo const*, impala::Promise<long>*), 
boost::_bi::list5<boost::_bi::value<std::string>, 
boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, 
boost::_bi::value<impala::ThreadDebugInfo*>, 
boost::_bi::value<impala::Promise<long>*> > > >::run() (this=0x13bacc00)
at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/thread/detail/thread.hpp:116
#23 0x00000000031ddffa in thread_proxy ()
#24 0x00007f33cd4746ba in start_thread (arg=0x7f332f1a9700) at 
pthread_create.c:333
#25 0x00007f33cd1aa41d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) thread 200
[Switching to thread 200 (Thread 0x7f334b2e2700 (LWP 28426))]
#0 pthread_cond_wait@@GLIBC_2.3.2 () at 
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185 in ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S
(gdb) bt
#0 0x00007f33cd47a360 in pthread_cond_wait@@GLIBC_2.3.2 () at 
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000001e02599 in 
boost::condition_variable::wait(boost::unique_lock<boost::mutex>&) 
(this=0x13bacc58, m=...) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/thread/pthread/condition_variable.hpp:73
#2 0x00000000031dea2c in boost::thread::join_noexcept() ()
#3 0x0000000001ba1487 in boost::thread::join() (this=0xd41cb90) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/thread/detail/thread.hpp:767
#4 0x0000000001ba1c04 in impala::Thread::Join() const (this=0x3f7f9c80) at 
/home/dhecht/src/Impala/be/src/util/thread.h:119
#5 0x0000000001df334e in 
impala::ChildQueryExecutor::WaitForAll(std::vector<impala::ChildQuery*, 
std::allocator<impala::ChildQuery*> >*) (this=0x3838fc50, 
completed_queries=0x7f334b2e16e0)
at /home/dhecht/src/Impala/be/src/service/child-query.cc:202
#6 0x0000000001da96f0 in impala::ClientRequestState::WaitInternal() 
(this=0xcdce000) at 
/home/dhecht/src/Impala/be/src/service/client-request-state.cc:668
#7 0x0000000001da9228 in impala::ClientRequestState::Wait() (this=0xcdce000) at 
/home/dhecht/src/Impala/be/src/service/client-request-state.cc:639
#8 0x0000000001db367f in boost::_mfi::mf0<void, 
impala::ClientRequestState>::operator()(impala::ClientRequestState*) const 
(this=0x7f334b2e1ce8, p=0xcdce000) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/mem_fn_template.hpp:49
#9 0x0000000001db33e0 in 
boost::_bi::list1<boost::_bi::value<impala::ClientRequestState*> 
>::operator()<boost::_mfi::mf0<void, impala::ClientRequestState>, 
boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, 
impala::ClientRequestState>&, boost::_bi::list0&, int) (this=0x7f334b2e1cf8, 
f=..., a=...) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:253
#10 0x0000000001db31eb in boost::_bi::bind_t<void, boost::_mfi::mf0<void, 
impala::ClientRequestState>, 
boost::_bi::list1<boost::_bi::value<impala::ClientRequestState*> > 
>::operator()() (this=0x7f334b2e1ce8)
at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20
#11 0x0000000001db2f8c in 
boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, 
boost::_mfi::mf0<void, impala::ClientRequestState>, 
boost::_bi::list1<boost::_bi::value<impala::ClientRequestState*> > >, 
void>::invoke(boost::detail::function::function_buffer&) (function_obj_ptr=...) 
at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
#12 0x0000000001bef182 in boost::function0<void>::operator()() const 
(this=0x7f334b2e1ce0) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
#13 0x0000000001edad77 in impala::Thread::SuperviseThread(std::string const&, 
std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, 
impala::Promise<long>*) (name="wait-thread", category="query-exec-state", 
functor=..., parent_thread_info=0x7f33331b0990, thread_started=0x7f33331afa40) 
at /home/dhecht/src/Impala/be/src/util/thread.cc:356
#14 0x0000000001ee2f13 in boost::_bi::list5<boost::_bi::value<std::string>, 
boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, 
boost::_bi::value<impala::ThreadDebugInfo*>, 
boost::_bi::value<impala::Promise<long>*> >::operator()<void (*)(std::string 
const&, std::string const&, boost::function<void ()>, impala::ThreadDebugInfo 
const*, impala::Promise<long>*), boost::_bi::list0>(boost::_bi::type<void>, 
void (*&)(std::string const&, std::string const&, boost::function<void ()>, 
impala::ThreadDebugInfo const*, impala::Promise<long>*), boost::_bi::list0&, 
int) (this=0x174047c0, f=@0x174047b8: 0x1edaa10 
<impala::Thread::SuperviseThread(std::string const&, std::string const&, 
boost::function<void ()>, impala::ThreadDebugInfo const*, 
impala::Promise<long>*)>, a=...) at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525
#15 0x0000000001ee2e37 in boost::_bi::bind_t<void, void (*)(std::string const&, 
std::string const&, boost::function<void ()>, impala::ThreadDebugInfo const*, 
impala::Promise<long>*), boost::_bi::list5<boost::_bi::value<std::string>, 
boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, 
boost::_bi::value<impala::ThreadDebugInfo*>, 
boost::_bi::value<impala::Promise<long>*> > >::operator()() (this=0x174047b8)
at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20
#16 0x0000000001ee2dfa in boost::detail::thread_data<boost::_bi::bind_t<void, 
void (*)(std::string const&, std::string const&, boost::function<void ()>, 
impala::ThreadDebugInfo const*, impala::Promise<long>*), 
boost::_bi::list5<boost::_bi::value<std::string>, 
boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, 
boost::_bi::value<impala::ThreadDebugInfo*>, 
boost::_bi::value<impala::Promise<long>*> > > >::run() (this=0x17404600)
at 
/home/dhecht/toolchain/boost-1.57.0-p3/include/boost/thread/detail/thread.hpp:116
#17 0x00000000031ddffa in thread_proxy ()
#18 0x00007f33cd4746ba in start_thread (arg=0x7f334b2e2700) at 
pthread_create.c:333
#19 0x00007f33cd1aa41d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:109{code}
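
To make the shape of the two stacks easier to follow, here is a minimal sketch of the blocking pattern they show: one thread waits on a counting barrier (as thread 199 does in {{CountingBarrier::Wait()}}), and a second thread joins the first (as thread 200 does in {{Thread::Join()}}). This is not Impala's code. {{SketchBarrier}} and {{BarrierReleased}} are hypothetical names, and the timeout exists only so the sketch terminates; the wait in the backtrace is untimed. The sketch does not claim to identify the root cause; it only illustrates that if an expected notification never arrived, the waiter would never return, and neither would the thread joining it.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a counting barrier; NOT Impala's implementation.
class SketchBarrier {
 public:
  explicit SketchBarrier(int count) : remaining_(count) { assert(count > 0); }

  // Called once by each participant when it has finished.
  void Notify() {
    std::lock_guard<std::mutex> l(mu_);
    if (--remaining_ == 0) cv_.notify_all();
  }

  // Waits until every participant has notified or the timeout expires.
  // Returns true if the barrier was released. An untimed wait in this
  // position would block forever if any notification were missed.
  bool WaitFor(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> l(mu_);
    return cv_.wait_for(l, timeout, [this] { return remaining_ == 0; });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  int remaining_;
};

// Starts a waiter on a barrier expecting `expected` notifications, delivers
// only `delivered` of them, then joins the waiter. Returns whether the
// barrier was released before the timeout.
bool BarrierReleased(int expected, int delivered,
                     std::chrono::milliseconds timeout) {
  SketchBarrier barrier(expected);
  bool released = false;
  std::thread waiter([&] { released = barrier.WaitFor(timeout); });

  std::vector<std::thread> participants;
  for (int i = 0; i < delivered; ++i) {
    participants.emplace_back([&] { barrier.Notify(); });
  }
  for (auto& t : participants) t.join();

  // Like the parent's join: this returns only once the waiter returns.
  waiter.join();
  return released;
}
```

With all notifications delivered, {{BarrierReleased(3, 3, ...)}} returns true. With one missing, {{BarrierReleased(3, 2, ...)}} returns false once the timeout expires; without the timeout, both the waiter and the joining thread would stay blocked indefinitely.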
 

> Builds are timing out/hanging
> -----------------------------
>
>                 Key: IMPALA-7101
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7101
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Thomas Tauber-Marshall
>            Assignee: Tim Armstrong
>            Priority: Blocker
>              Labels: broken-build
>
> We've seen a large number of builds in the last week or two that appear to 
> have hung and gotten killed after a 24-hour timeout.
> Exactly where the hang is occurring is different in each build, but I 
> suspect it has something to do with cancellation not working correctly.


