Sorry for the delay - I was finally able to reproduce it, and I have also fixed it.
The commit is a bit larger than my first try.

https://github.com/cruppstahl/hypertable/commits/v0.9.5

commit b45ba15b701373c3a1f689f8997f31bde8ff5165
Author: Christoph Rupp <[email protected]>
Date:   Wed Apr 25 18:54:11 2012 +0200

    issue 827: fixed deadlock when scanning secondary indices

Thanks again for your great help!

Best regards
Christoph

2012/4/26 gcc.lua <[email protected]>

> Hi,
>
> thanks for the quick reply, but the commit just removes m_mutex inside
> virtual ~IndexScannerCallback().
> I tried it, and a new problem occurred (see the end of this report).
> Here is some additional info on how I reproduced the issue before your commit:
>
> void run()
> {
>     TableScannerPtr aScanner = tbSourcelist->create_scanner( specbuilder.get(), 5000 );
>
>     while( aScanner->next( gotCell ) )
>     {
>         ....
>         if(condition)
>             break; // break while more results are pending; the internal scanner thread is still running
>         ....
>     }
>     return; // triggers the TableScanner destructor; see my first post for what happens next
> }
>
> //////////////////////////////////////////////////////////////////////////////////////
>
> pure virtual method called
> terminate called without an active exception
>
> Program received signal SIGABRT, Aborted.
> [Switching to Thread 0x7fffe6ff5700 (LWP 23887)]
> 0x00007ffff48db1b5 in raise () from /lib/libc.so.6
>
> (gdb) where
> #0  0x00007ffff48db1b5 in raise () from /lib/libc.so.6
> #1  0x00007ffff48ddfc0 in abort () from /lib/libc.so.6
> #2  0x00007ffff516fdc5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
> #3  0x00007ffff516e166 in ?? () from /usr/lib/libstdc++.so.6
> #4  0x00007ffff516e193 in std::terminate() () from /usr/lib/libstdc++.so.6
> #5  0x00007ffff516ea6f in __cxa_pure_virtual () from /usr/lib/libstdc++.so.6
> #6  0x00000000005c43c6 in Hypertable::TableScannerAsync::maybe_callback_ok (this=0x7fffb432ecd0, scanner_id=19373, next=true, do_callback=true, cells=...)
>     at /root/qiao/Project/hypertable-0.9.5.6/src/cc/Hypertable/Lib/TableScannerAsync.cc:520
> #7  0x00000000005c393f in Hypertable::TableScannerAsync::handle_result (this=0x7fffb432ecd0, scanner_id=19373, event=..., is_create=true)
>     at /root/qiao/Project/hypertable-0.9.5.6/src/cc/Hypertable/Lib/TableScannerAsync.cc:464
> #8  0x00000000005fdc5e in Hypertable::TableScannerHandler::run (this=0x7fff99915850)
>     at /root/qiao/Project/hypertable-0.9.5.6/src/cc/Hypertable/Lib/TableScannerHandler.cc:40
> #9  0x000000000045f2c5 in Hypertable::ApplicationQueue::Worker::operator() (this=0xaaa120)
>     at /root/qiao/Project/hypertable-0.9.5.6/src/cc/AsyncComm/ApplicationQueue.h:173
> #10 0x0000000000470f04 in boost::detail::thread_data<Hypertable::ApplicationQueue::Worker>::run (this=0xaa9ff0)
>     at /usr/include/boost/thread/detail/thread.hpp:56
> #11 0x00007ffff77b5200 in thread_proxy () from /usr/lib/libboost_thread.so.1.42.0
> #12 0x00007ffff79c58ca in start_thread () from /lib/libpthread.so.0
> #13 0x00007ffff497892d in clone () from /lib/libc.so.6
> #14 0x0000000000000000 in ?? ()
>
> On April 26, 12:56 AM, Christoph Rupp <[email protected]> wrote:
> > Hi,
> >
> > thanks for the great bug report.
> >
> > I am not able to reproduce this issue, but I think I came up with a fix.
> > If you want to check out the sources then you can get them here:
> > https://github.com/cruppstahl/hypertable branch "v0.9.5"
> >
> > This is the commit:
> > commit 2572b5dcb524e1c36dc23307c37784fd34c1bdde
> > Author: Christoph Rupp <[email protected]>
> > Date:   Wed Apr 25 18:54:11 2012 +0200
> >
> >     issue 827: fixed deadlock when scanning secondary indices
> >
> > And here's the diff:
> >
> > diff --git a/src/cc/Hypertable/Lib/IndexScannerCallback.h b/src/cc/Hypertable/Li
> > index 70ffda7..1b37127 100644
> > --- a/src/cc/Hypertable/Lib/IndexScannerCallback.h
> > +++ b/src/cc/Hypertable/Lib/IndexScannerCallback.h
> > @@ -118,13 +118,12 @@ static String last;
> >      }
> >
> >      virtual ~IndexScannerCallback() {
> > -      ScopedLock lock(m_mutex);
> > -      if (m_mutator)
> > -        delete m_mutator;
> >        foreach (TableScannerAsync *s, m_scanners)
> >          delete s;
> >        m_scanners.clear();
> >        sspecs_clear();
> > +      if (m_mutator)
> > +        delete m_mutator;
> >
> > Can you please give it a try and see if this helps?
> >
> > Thanks
> > Christoph
> >
> > 2012/4/24 gcc.lua <[email protected]>
> >
> > > The user thread logic is as follows:
> > >
> > > TableScannerPtr aScanner = tbSourcelist->create_scanner( specbuilder.get(), 5000 );
> > > while( aScanner->next( gotCell ) )
> > > {
> > >     .....
> > > }
> > >
> > > Deadlock between the user thread and the scanner thread:
> > >
> > > 1. user thread TableScanner
> > >
> > > TableScannerAsync::~TableScannerAsync() {
> > >   try {
> > >     cancel();
> > >     wait_for_completion();
> > >   }
> > >   catch (Exception &e) {
> > >     HT_ERROR_OUT << e << HT_END;
> > >   }
> > >   if (m_use_index) {
> > >     delete m_cb; // <== deadlock entry
> > >     m_cb = 0;
> > >   }
> > > }
> > > /////////////////////////////////////////
> > > virtual ~IndexScannerCallback() {
> > >   ScopedLock lock(m_mutex); // <== user thread now holds IndexScannerCallback::m_mutex
> > >   if (m_mutator)
> > >     delete m_mutator;
> > >
> > >   foreach (TableScannerAsync *s, m_scanners)
> > >     delete s; // deadlock 1 <== user thread waits for TableScannerAsync::m_mutex
> > >
> > > 2. scanner thread
> > >
> > > void TableScannerAsync::handle_result(int scanner_id, EventPtr &event, bool is_create) {
> > >
> > >   bool cancelled = is_cancelled();
> > >   ScopedLock lock(m_mutex); // <== scanner thread now holds TableScannerAsync::m_mutex
> > >   ScanCellsPtr cells;
> > >
> > >   . . . . . .
> > >   maybe_callback_ok(); // <== calls m_cb->scan_ok(this, cells);
> > >
> > > }
> > > //////////////////////////////
> > > class IndexScannerCallback : public ResultCallback {
> > >
> > >   virtual void scan_ok(TableScannerAsync *scanner, ScanCellsPtr &scancells) {
> > >     bool is_eos = scancells->get_eos();
> > >     String table_name = scanner->get_table_name();
> > >
> > >     ScopedLock lock(m_mutex); // deadlock 2 <== scanner thread waits for IndexScannerCallback::m_mutex
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups
> > > "Hypertable Development" group.
> > > To post to this group, send email to [email protected].
> > > To unsubscribe from this group, send email to
> > > [email protected].
> > > For more options, visit this group at
> > > http://groups.google.com/group/hypertable-dev?hl=en.
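
For anyone reading this thread later, the lock-order inversion described in the first report can be sketched in isolation. This is only a minimal sketch under stated assumptions, not the Hypertable code: Scanner, Callback and run_and_teardown are hypothetical stand-ins, std::thread and std::mutex replace the Boost primitives, and joining the worker thread stands in for wait_for_completion(). It shows one possible way to avoid the hang, namely releasing the callback's own mutex before destroying the scanner; it does not claim to reproduce what commit b45ba15 actually changes.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

class Callback;  // hypothetical stand-in for a result callback

// Hypothetical stand-in for an asynchronous scanner with one worker thread.
class Scanner {
public:
  explicit Scanner(Callback *cb) : m_cb(cb), m_worker(&Scanner::work, this) {}
  ~Scanner() {
    m_cancelled = true;  // plays the role of cancel()
    m_worker.join();     // plays the role of wait_for_completion()
  }
private:
  void work();           // defined below, once Callback is a complete type
  Callback *m_cb;
  std::atomic<bool> m_cancelled{false};
  std::thread m_worker;  // declared last so it starts after the other members
};

class Callback {
public:
  Callback() { m_scanner = new Scanner(this); }
  ~Callback() {
    Scanner *s = nullptr;
    {
      // Hold m_mutex only long enough to detach the scanner pointer.
      std::lock_guard<std::mutex> lock(m_mutex);
      s = m_scanner;
      m_scanner = nullptr;
    }          // m_mutex is released here
    delete s;  // joins the worker WITHOUT holding m_mutex
  }
  // Called from the worker thread; takes the callback's mutex.
  void scan_ok() {
    std::lock_guard<std::mutex> lock(m_mutex);
    ++m_results;
  }
  int results() {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_results;
  }
private:
  std::mutex m_mutex;
  int m_results = 0;
  Scanner *m_scanner = nullptr;
};

void Scanner::work() {
  while (!m_cancelled) {
    m_cb->scan_ok();  // may block on Callback::m_mutex
    std::this_thread::yield();
  }
}

// Starts a scan, waits for at least one result, then tears everything down.
// Returns the number of results seen just before teardown.
int run_and_teardown() {
  Callback *cb = new Callback();
  while (cb->results() == 0)
    std::this_thread::yield();
  int seen = cb->results();
  assert(seen >= 1);
  delete cb;  // must return instead of hanging
  return seen;
}
```

If ~Callback() kept m_mutex locked across "delete s", the worker could block inside scan_ok() while the destructor waits in join(), which has the same shape as the hang in the first report. Note that this sketch only addresses the lock ordering; it does not cover the "pure virtual method called" abort from the second report.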
