Re: Does this mem-tracker.h assertion ring a bell?
Filed https://issues.apache.org/jira/browse/IMPALA-6118.

-- Philip

On Wed, Oct 25, 2017 at 2:50 PM, Tim Armstrong wrote:
> Will you file a JIRA for this bug? Sounds like something we don't want to
> lose track of.
>
> - Tim
Re: Does this mem-tracker.h assertion ring a bell?
Will you file a JIRA for this bug? Sounds like something we don't want to lose track of.

- Tim

On Wed, Oct 25, 2017 at 11:05 AM, Philip Zeyliger wrote:
> Thanks. I'm beginning to think my patch is not causing these breakages. A
> different run was almost clean, with some TPC-DS query tests failing that
> I think are also new.
>
> -- Philip
Re: Does this mem-tracker.h assertion ring a bell?
Thanks. I'm beginning to think my patch is not causing these breakages. A different run was almost clean, with some TPC-DS query tests failing that I think are also new.

-- Philip

On Wed, Oct 25, 2017 at 10:36 AM, Tim Armstrong wrote:
> Yeah, it's probably another consequence of
> https://issues.apache.org/jira/browse/IMPALA-5789. Maybe your patch
> changed the timing enough to trigger it.
>
> I think the bug might be related to using directory.capacity() as the
> argument to Release(). Calling directory.clear() after releasing the
> memory in FilterState::Disable() won't necessarily deallocate the memory,
> so we could end up releasing it twice.
Re: Does this mem-tracker.h assertion ring a bell?
Yeah, it's probably another consequence of https://issues.apache.org/jira/browse/IMPALA-5789. Maybe your patch changed the timing enough to trigger it.

I think the bug might be related to using directory.capacity() as the argument to Release(). Calling directory.clear() after releasing the memory in FilterState::Disable() won't necessarily deallocate the memory, so capacity() can still report the old size and we could end up releasing the same memory twice.

On Wed, Oct 25, 2017 at 10:11 AM, Mostafa Mokhtar wrote:
> Maybe related to https://issues.apache.org/jira/browse/IMPALA-6099?
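The double-release scenario suggested above can be illustrated with a minimal sketch. This is not Impala code: ToyTracker and DoubleReleaseDemo are hypothetical names, and std::vector<char> stands in for the directory buffer (an assumption; the real type may differ). It only shows that clear() empties a container without shrinking capacity(), so a second Release() keyed on capacity() drives the tracked total negative.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a memory tracker; NOT Impala's MemTracker.
struct ToyTracker {
  int64_t consumption = 0;
  void Consume(int64_t bytes) { consumption += bytes; }
  void Release(int64_t bytes) { consumption -= bytes; }
};

// Replays the suspected sequence and returns the final tracked value.
int64_t DoubleReleaseDemo(std::size_t bytes) {
  ToyTracker tracker;
  std::vector<char> directory(bytes);  // assumed stand-in for the filter directory

  // Account for the buffer when it is allocated.
  tracker.Consume(static_cast<int64_t>(directory.capacity()));

  // "Disable" path: release the memory, then clear the buffer.
  tracker.Release(static_cast<int64_t>(directory.capacity()));
  directory.clear();  // size() becomes 0; capacity() is left unchanged

  // A later path releases again based on capacity(), which is still non-zero.
  tracker.Release(static_cast<int64_t>(directory.capacity()));

  return tracker.consumption;
}
```

Calling DoubleReleaseDemo(1 << 20) returns a negative value, which is the kind of state the consumption >= 0 check in the log rejects. Whether this is the actual cause is for the JIRA to settle; the sketch only shows that the mechanism is plausible.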
Re: Does this mem-tracker.h assertion ring a bell?
Maybe related to https://issues.apache.org/jira/browse/IMPALA-6099?

On Wed, Oct 25, 2017 at 10:02 AM, Philip Zeyliger wrote:
> Hi folks,
>
> I'm debugging some test failures related to an LLVM/AvroCodegen patch I've
> got going on. The failures are in the parallel EE tests, and most of them
> are complaining that Impala is out to lunch. It looks like the following
> assertion is firing, causing an impalad to fail, causing many tests to
> start failing.
>
> F1025 02:20:43.786911 82485 mem-tracker.h:231] Check failed:
> tracker->consumption_->current_value() >= 0 (-1052615 vs. 0)
> Runtime Filter (Coordinator): Total=-1.00 MB Peak=1.00 MB
Does this mem-tracker.h assertion ring a bell?
Hi folks,

I'm debugging some test failures related to an LLVM/AvroCodegen patch I've got going on. The failures are in the parallel EE tests, and most of them are complaining that Impala is out to lunch. It looks like the following assertion is firing, causing an impalad to fail, causing many tests to start failing. (I've also got a minidump, but the build was on jenkins.impala.io, so I don't think I have the symbols/binaries to use it.)

If this sort of thing rings a bell for anyone, please holler!

Obviously I'll work on reproducing this locally to figure it out.

F1025 02:20:43.786911 82485 mem-tracker.h:231] Check failed: tracker->consumption_->current_value() >= 0 (-1052615 vs. 0)
Runtime Filter (Coordinator): Total=-1.00 MB Peak=1.00 MB
*** Check failure stack trace: ***
    @        0x2f1e11d  google::LogMessage::Fail()
    @        0x2f1f9c2  google::LogMessage::SendToLog()
    @        0x2f1daf7  google::LogMessage::Flush()
    @        0x2f210be  google::LogMessageFatal::~LogMessageFatal()
    @        0x17425fb  impala::MemTracker::Release()
    @        0x1fa7e8b  impala::Coordinator::UpdateFilter()
    @        0x186e3cf  impala::ImpalaServer::UpdateFilter()
    @        0x18d824f  impala::ImpalaInternalService::UpdateFilter()
    @        0x1dda35a  impala::ImpalaInternalServiceProcessor::process_UpdateFilter()
    @        0x1dd8308  impala::ImpalaInternalServiceProcessor::dispatchCall()
    @        0x15410ea  apache::thrift::TDispatchProcessor::process()
    @        0x171042b  apache::thrift::server::TAcceptQueueServer::Task::run()
    @        0x170c307  impala::ThriftThread::RunRunnable()
    @        0x170da13  boost::_mfi::mf2<>::operator()()
    @        0x170d8a9  boost::_bi::list3<>::operator()<>()
    @        0x170d5f5  boost::_bi::bind_t<>::operator()()
    @        0x170d508  boost::detail::function::void_function_obj_invoker0<>::invoke()
    @        0x171bdfc  boost::function0<>::operator()()
    @        0x19f3393  impala::Thread::SuperviseThread()
    @        0x19fbf26  boost::_bi::list4<>::operator()<>()
    @        0x19fbe69  boost::_bi::bind_t<>::operator()()
    @        0x19fbe2c  boost::detail::thread_data<>::run()
    @        0x20a7c9a  thread_proxy
    @   0x7fe6536186ba  start_thread
    @   0x7fe65334e3dd  clone
r.java:81)