Re: Does this mem-tracker.h assertion ring a bell?

2017-10-27 Thread Philip Zeyliger
Filed https://issues.apache.org/jira/browse/IMPALA-6118.

-- Philip



Re: Does this mem-tracker.h assertion ring a bell?

2017-10-25 Thread Tim Armstrong
Will you file a JIRA for this bug? Sounds like something we don't want to
lose track of.

- Tim



Re: Does this mem-tracker.h assertion ring a bell?

2017-10-25 Thread Philip Zeyliger
Thanks. I'm beginning to think my patch is not causing these breakages. A
different run was almost clean, apart from some TPC-DS query tests failing,
and I think those failures are also new.

-- Philip



Re: Does this mem-tracker.h assertion ring a bell?

2017-10-25 Thread Tim Armstrong
Yeah, it's probably another consequence of
https://issues.apache.org/jira/browse/IMPALA-5789. Maybe your patch changed
the timing enough to trigger it.

I think the bug might be related to using directory.capacity() as the
argument to Release(). Calling directory.clear() after releasing the memory
in FilterState::Disable() won't necessarily deallocate the memory, so
capacity() can stay nonzero and we could end up releasing it twice.
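
To make the suspected failure mode concrete, here is a minimal sketch. It is
not Impala's code: ToyTracker, ToyFilterState, tracked_bytes and the Disable*
helpers are hypothetical names made up for illustration, and it assumes the
directory is a std::string-like buffer whose clear() keeps the allocation (the
usual behaviour in common standard library implementations, though not
something the code should rely on either way).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a memory tracker (NOT Impala's MemTracker).
class ToyTracker {
 public:
  void Consume(int64_t bytes) { assert(bytes >= 0); consumption_ += bytes; }
  void Release(int64_t bytes) { assert(bytes >= 0); consumption_ -= bytes; }
  int64_t consumption() const { return consumption_; }

 private:
  int64_t consumption_ = 0;
};

// Hypothetical filter state owning a buffer that is charged to the tracker.
struct ToyFilterState {
  std::string directory;
  int64_t tracked_bytes = 0;  // bytes currently charged to the tracker
};

// Suspected pattern: release by capacity(), then clear(). If clear() keeps
// the allocation, capacity() is still nonzero on the next call.
void DisableByCapacity(ToyFilterState* state, ToyTracker* tracker) {
  tracker->Release(static_cast<int64_t>(state->directory.capacity()));
  state->directory.clear();
}

// One possible alternative: release exactly what was charged and zero the
// counter, so a repeated call releases nothing.
void DisableByTrackedBytes(ToyFilterState* state, ToyTracker* tracker) {
  tracker->Release(state->tracked_bytes);
  state->tracked_bytes = 0;
  state->directory.clear();
}

// Charges a 1 MiB buffer once, disables the filter twice, and returns the
// tracker's final consumption.
int64_t ConsumptionAfterTwoDisables(bool use_tracked_bytes) {
  ToyTracker tracker;
  ToyFilterState state;
  state.directory.assign(1048576, '\0');
  state.tracked_bytes = static_cast<int64_t>(state.directory.capacity());
  tracker.Consume(state.tracked_bytes);
  for (int i = 0; i < 2; ++i) {
    if (use_tracked_bytes) {
      DisableByTrackedBytes(&state, &tracker);
    } else {
      DisableByCapacity(&state, &tracker);
    }
  }
  return tracker.consumption();
}
```

Calling ConsumptionAfterTwoDisables(false) should go negative whenever
clear() leaves the capacity untouched, which is the kind of state a
"consumption >= 0" check would catch; ConsumptionAfterTwoDisables(true) stays
at zero because the second call has nothing left to release. Whether the real
fix should track the charged bytes or guard against a second Disable() is a
separate question this sketch doesn't answer.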



Re: Does this mem-tracker.h assertion ring a bell?

2017-10-25 Thread Mostafa Mokhtar
Maybe related to https://issues.apache.org/jira/browse/IMPALA-6099?



Does this mem-tracker.h assertion ring a bell?

2017-10-25 Thread Philip Zeyliger
Hi folks,

I'm debugging some test failures related to an LLVM/AvroCodegen patch I've
got going on. The failures are in the parallel EE tests, and most of them
are complaining that Impala is out to lunch. It looks like the following
assertion is firing and crashing an impalad, which in turn causes many tests
to start failing. (I've also got a minidump, but the build was on
jenkins.impala.io, so I don't think I have the symbols/binaries to use it.)

If this sort of thing rings a bell for anyone, please holler!

Obviously I'll work on reproducing this locally to figure it out.

F1025 02:20:43.786911 82485 mem-tracker.h:231] Check failed: tracker->consumption_->current_value() >= 0 (-1052615 vs. 0)
Runtime Filter (Coordinator): Total=-1.00 MB Peak=1.00 MB
*** Check failure stack trace: ***
@  0x2f1e11d  google::LogMessage::Fail()
@  0x2f1f9c2  google::LogMessage::SendToLog()
@  0x2f1daf7  google::LogMessage::Flush()
@  0x2f210be  google::LogMessageFatal::~LogMessageFatal()
@  0x17425fb  impala::MemTracker::Release()
@  0x1fa7e8b  impala::Coordinator::UpdateFilter()
@  0x186e3cf  impala::ImpalaServer::UpdateFilter()
@  0x18d824f  impala::ImpalaInternalService::UpdateFilter()
@  0x1dda35a  impala::ImpalaInternalServiceProcessor::process_UpdateFilter()
@  0x1dd8308  impala::ImpalaInternalServiceProcessor::dispatchCall()
@  0x15410ea  apache::thrift::TDispatchProcessor::process()
@  0x171042b  apache::thrift::server::TAcceptQueueServer::Task::run()
@  0x170c307  impala::ThriftThread::RunRunnable()
@  0x170da13  boost::_mfi::mf2<>::operator()()
@  0x170d8a9  boost::_bi::list3<>::operator()<>()
@  0x170d5f5  boost::_bi::bind_t<>::operator()()
@  0x170d508  boost::detail::function::void_function_obj_invoker0<>::invoke()
@  0x171bdfc  boost::function0<>::operator()()
@  0x19f3393  impala::Thread::SuperviseThread()
@  0x19fbf26  boost::_bi::list4<>::operator()<>()
@  0x19fbe69  boost::_bi::bind_t<>::operator()()
@  0x19fbe2c  boost::detail::thread_data<>::run()
@  0x20a7c9a  thread_proxy
@ 0x7fe6536186ba  start_thread
@ 0x7fe65334e3dd  clone