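I believe I have identified the cause of the deadlock. ibis::category::prepare acquires ibis::column::mutex. It then calls ibis::category::fillRows, which constructs an ibis::direkte object, and the ibis::direkte::direkte constructor attempts to acquire the same column mutex again (through ibis::column::getNullMask, as in the stack trace quoted below). Because the second lock attempt comes from the thread that already holds a non-recursive mutex, it never returns, which matches the hang in pthread_mutex_lock. Perhaps ibis::column::mutex should be initialized with PTHREAD_MUTEX_RECURSIVE?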
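
To illustrate what I mean, here is a minimal standalone sketch of the locking pattern. It is not FastBit code: the Column class and the innerLock/outerThenInner methods are names I made up for the example, standing in for a method that takes the mutex and a caller that already holds it. The mutex type is passed in so the same code can be tried with PTHREAD_MUTEX_RECURSIVE (the nested lock succeeds) and PTHREAD_MUTEX_ERRORCHECK (the nested lock is reported as EDEADLK instead of hanging).

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical stand-in for a class that guards its state with a pthread
// mutex. This is NOT FastBit code; all names are invented for illustration.
class Column {
public:
    // type is expected to be PTHREAD_MUTEX_RECURSIVE or
    // PTHREAD_MUTEX_ERRORCHECK.
    explicit Column(int type) {
        pthread_mutexattr_t attr;
        int rc = pthread_mutexattr_init(&attr);
        assert(rc == 0);
        rc = pthread_mutexattr_settype(&attr, type);
        assert(rc == 0);
        rc = pthread_mutex_init(&mutex_, &attr);
        assert(rc == 0);
        (void)rc;  // silence "unused" warnings when NDEBUG disables assert
        pthread_mutexattr_destroy(&attr);
    }

    ~Column() { pthread_mutex_destroy(&mutex_); }

    // Plays the role of a method that takes the lock by itself.
    // Returns 0 on success, or the error code from pthread_mutex_lock.
    int innerLock() {
        int rc = pthread_mutex_lock(&mutex_);
        if (rc == 0) {
            pthread_mutex_unlock(&mutex_);
        }
        return rc;
    }

    // Plays the role of a caller that already holds the lock and then
    // reaches innerLock() on the same thread.
    int outerThenInner() {
        int rc = pthread_mutex_lock(&mutex_);
        if (rc != 0) {
            return rc;
        }
        rc = innerLock();  // second acquisition by the same thread
        pthread_mutex_unlock(&mutex_);
        return rc;
    }

private:
    pthread_mutex_t mutex_;
    Column(const Column&);             // not copyable
    Column& operator=(const Column&);  // not assignable
};
```

Built with g++ -pthread, Column(PTHREAD_MUTEX_RECURSIVE).outerThenInner() should return 0, while Column(PTHREAD_MUTEX_ERRORCHECK).outerThenInner() should return EDEADLK. The error-checking type might also be a cheap way to confirm the diagnosis in a debug build, since it reports the relock instead of blocking.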

On 11/1/14, 3:53 PM, "Enns, Steven" <[email protected]> wrote:

>Hey Sean,
>
>What specifically was wrong with your index data? I am experiencing the
>same issue.
>
>Thanks,
>Steve
>
>On 3/15/14, 4:08 PM, "Sean McNamara" <[email protected]> wrote:
>
>>John-
>>
>>There was an issue with the index data that we had generated. After
>>rebuilding the indexes /w 1.3.9 everything works great!
>>
>>Sorry for the false alarm.
>>
>>Thanks,
>>
>>Sean
>>
>>
>>________________________________________
>>From: [email protected]
>>[[email protected]] on behalf of K. John Wu
>>[[email protected]]
>>Sent: Saturday, March 15, 2014 1:52 AM
>>To: FastBit Users
>>Subject: Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>
>>Hi, Sean,
>>
>>Please check out SVN revision 706 and give it a try. Let us know if
>>you continue to encounter problems.
>>
>>John
>>
>>PS: You can use the following command line to check out the latest
>>code from SVN
>>
>>svn checkout https://codeforge.lbl.gov/anonscm/fastbit
>>
>>
>>On 3/14/14, 2:26 PM, Sean McNamara wrote:
>>> Hey John-
>>>
>>> I just tried the 1.3.9 release and I still see the same issue. The
>>> stacktrace is pasted below. I believe it is getting stuck in
>>> column.cpp line 737: ibis::util::mutexLock lock(&mutex,
>>> "column::getNullMask");
>>>
>>> I can only reproduce the issue on ubuntu. If I can get the issue
>>> reproduced on my mac /w generated data I will send it your way in case
>>> you wouldn't mind examining. Btw- I am building with c++0x instead of
>>> c++0x11 on ubuntu since it has an older gcc that doesn't support 0x11.
>>>
>>> Thanks again,
>>>
>>> Sean
>>>
>>>
>>> #0 0x00007ffff59d989c in __lll_lock_wait () from
>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>> (gdb) bt
>>> #0 0x00007ffff59d989c in __lll_lock_wait () from
>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>> #1 0x00007ffff59d5065 in _L_lock_858 () from
>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>> #2 0x00007ffff59d4eba in pthread_mutex_lock () from
>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>> #3 0x00007ffff708e613 in ibis::column::getNullMask(ibis::bitvector&)
>>> const () from /usr/local/lib/libfastbit.so.0
>>> #4 0x00007ffff79573d0 in ibis::direkte::direkte(ibis::column const*,
>>> unsigned int, unsigned int) ()
>>> from /usr/local/lib/libfastbit.so.0
>>> #5 0x00007ffff77f2ec3 in ibis::category::fillIndex(char const*) const
>>> () from /usr/local/lib/libfastbit.so.0
>>> #6 0x00007ffff77f6c68 in ibis::category::prepareMembers() const ()
>>> from /usr/local/lib/libfastbit.so.0
>>> #7 0x00007ffff77fbd85 in ibis::category::getDictionary() const ()
>>> from /usr/local/lib/libfastbit.so.0
>>> #8 0x00007ffff6fe085d in ibis::bord::bord(char const*, char const*,
>>> ibis::selectClause const&, std::vector<ibis::part const*,
>>> std::allocator<ibis::part const*> > const&) () from
>>> /usr/local/lib/libfastbit.so.0
>>> #9 0x00007ffff78b0c69 in ibis::filter::sift2(ibis::selectClause
>>> const&, std::vector<ibis::part const*, std::allocator<ibis::part
>>> const*> > const&, ibis::whereClause const&) () from
>>> /usr/local/lib/libfastbit.so.0
>>> #10 0x00007ffff78b8c28 in ibis::table::select(std::vector<ibis::part
>>> const*, std::allocator<ibis::part const*> > const&, char const*, char
>>> const*) () from /usr/local/lib/libfastbit.so.0
>>> #11 0x00007ffff771513b in ibis::mensa::select(char const*, char
>>> const*) const () from /usr/local/lib/libfastbit.so.0
>>>
>>>
>>> ----------------------------------------------------------------------
>>> *From:* [email protected]
>>> [[email protected]] on behalf of Sean McNamara
>>> [[email protected]]
>>> *Sent:* Friday, March 14, 2014 12:19 PM
>>> *To:* FastBit Users
>>> *Subject:* Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>
>>> John-
>>>
>>> Unfortunately I cannot share this dataset. I may try to make a
>>> dataset that I can share if I can repo the issue.
>>>
>>> In case it is helpful here is a stacktrace:
>>> http://pastebin.com/FT3qsLH6
>>>
>>> I tried pulling the data down to my local machine and it works fine
>>> there, no issues whatsoever. (I have a newer version of fastbit
>>> installed locally). So first I will try deploying the latest and
>>> greatest on our cluster. I will let you know how that goes.
>>>
>>> Thanks again!
>>>
>>> Sean
>>>
>>> ----------------------------------------------------------------------
>>> *From:* [email protected]
>>> [[email protected]] on behalf of John [[email protected]]
>>> *Sent:* Friday, March 14, 2014 12:04 PM
>>> *To:* FastBit Users
>>> *Subject:* Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>
>>> Hi, Sean,
>>>
>>> Thanks for bring this issue up. It appears to be some sort of
>>> deadlock. I could look into further if you can share the sample data.
>>> Is the link you give the data or the log messages?
>>>
>>> -- John --
>>>
>>> On Mar 14, 2014, at 10:53 AM, Sean McNamara <[email protected]> wrote:
>>>
>>>> Hi-
>>>>
>>>> I'm trying to troubleshoot an issue that I just started seeing.
>>>> Queries seem to hang, but only for certain columns and it's not
>>>> clear to me why. If it's any help, I am using fastbit a few commits
>>>> after 692.
>>>>
>>>> Here is the strace for the query:
>>>>
>>>> strace ibis -d /mnt/data/test -q "select daily_binned_datetime"
>>>>
>>>> http://pastebin.com/xczKJVWL
>>>>
>>>>
>>>> Here is the tail of what ibis is doing with verbosity:
>>>>
>>>> fileManager::storage(0x258e630, 0) cleared
>>>> array_t<i>::freeMemory this=0x24421a0 actual=0x24515f0 and m_begin=0
>>>> (active references: 0, past references: 1)
>>>> fileManager::storage(0x24515f0, 0) cleared
>>>> fileManager::flushFile will do nothing because
>>>> "/mnt/data/explore/keyidx/35000/rp13/2014/02/03/daily_binned_datetime.idx"
>>>> is not tracked by the file manager
>>>> fileManager::storage(0x24515f0, 0) initialization completed
>>>> array_t<i> constructed at 0x2451350 with actual=0x24515f0, m_begin=0
>>>> and m_end=0
>>>> fileManager::storage(0x258e630, 0) initialization completed
>>>> array_t<l> constructed at 0x2451368 with actual=0x258e630, m_begin=0
>>>> and m_end=0
>>>> fileManager::storage(0x2451170, 0) initialization completed
>>>> array_t<PN4ibis9bitvectorE> constructed at 0x2451380 with
>>>> actual=0x2451170, m_begin=0 and m_end=0
>>>> array_t<PN4ibis9bitvectorE>::freeMemory this=0x2451380
>>>> actual=0x2451170 and m_begin=0 (active references: 0, past
>>>> references: 1)
>>>> fileManager::storage(0x2451170, 0) cleared
>>>> fileManager::storage(0x2451170, 0x2451290) added 16 bytes to
>>>> increase totalBytes to 80192
>>>> fileManager::storage(0x2451170, 0x2451290) initialization completed
>>>> with 16 elements
>>>> fileManager::storage(0x24512e0, 0) initialization completed
>>>> array_t<j> constructed at 0x24512c0 with actual=0x24512e0, m_begin=0
>>>> and m_end=0
>>>> bitvector (0x24512b0) constructed with m_vec at 0x24512c0 <-- hangs here
>>>>
>>>>
>>>> Does anyone have any insight?
>>>>
>>>> Thanks,
>>>>
>>>> Sean
>>>>
>>>> _______________________________________________
>>>> FastBit-users mailing list
>>>> [email protected]
>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
