Hi, Steven,

Sean's problem was fixed with an update I made on March 13, 2014. The change is in the SVN repository. Please give the code from the SVN repository a try when you get the chance.
By the way, the command to access the SVN repository is

svn checkout https://codeforge.lbl.gov/anonscm/fastbit

Please feel free to let us know if you encounter any issues with the new code.

Thanks.

John

On 11/1/14 10:31 PM, Enns, Steven wrote:
> I’ve attached an alternative diff, which is to disable the optimization in
> ibis::category::fillIndex for when every entry has the same value. This
> also resolves my deadlock issues.
>
>
> On 11/1/14, 8:36 PM, "Enns, Steven" <[email protected]> wrote:
>
>> Attached is my proposed fix using recursive mutex that is confirmed to
>> resolve deadlock.
>>
>> The deadlock only seems to occur when the column contains a single
>> distinct value, so dictionary is of size 1, and the following conditional
>> runs in ibis::category::fillIndex:
>>
>> if (dic.size() == 1) { // assume every entry has the given value
>>     rlc = new ibis::direkte(this, 1, thePart->nRows());
>> }
>>
>>
>> On 11/1/14, 4:21 PM, "Enns, Steven" <[email protected]> wrote:
>>
>>> I believe I have identified the cause of deadlock.
>>> ibis::category::prepare acquires ibis::column::mutex. Then it calls
>>> ibis::category::fillRows, which constructs ibis::direkte::direkte, which
>>> attempts to acquire the column mutex again. Perhaps ibis::column::mutex
>>> should be initialized with PTHREAD_MUTEX_RECURSIVE?
>>>
>>>
>>> On 11/1/14, 3:53 PM, "Enns, Steven" <[email protected]> wrote:
>>>
>>>> Hey Sean,
>>>>
>>>> What specifically was wrong with your index data? I am experiencing the
>>>> same issue.
>>>>
>>>> Thanks,
>>>> Steve
>>>>
>>>> On 3/15/14, 4:08 PM, "Sean McNamara" <[email protected]>
>>>> wrote:
>>>>
>>>>> John-
>>>>>
>>>>> There was an issue with the index data that we had generated. After
>>>>> rebuilding the indexes /w 1.3.9 everything works great!
>>>>>
>>>>> Sorry for the false alarm.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Sean
>>>>>
>>>>>
>>>>> ________________________________________
>>>>> From: [email protected]
>>>>> [[email protected]] on behalf of K. John Wu
>>>>> [[email protected]]
>>>>> Sent: Saturday, March 15, 2014 1:52 AM
>>>>> To: FastBit Users
>>>>> Subject: Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>>>
>>>>> Hi, Sean,
>>>>>
>>>>> Please check out SVN revision 706 and give it a try. Let us know if
>>>>> you continue to encounter problems.
>>>>>
>>>>> John
>>>>>
>>>>> PS: You can use the following command line to check out the latest
>>>>> code from SVN
>>>>>
>>>>> svn checkout https://codeforge.lbl.gov/anonscm/fastbit
>>>>>
>>>>> On 3/14/14, 2:26 PM, Sean McNamara wrote:
>>>>>> Hey John-
>>>>>>
>>>>>> I just tried the 1.3.9 release and I still see the same issue. The
>>>>>> stacktrace is pasted below. I believe it is getting stuck in
>>>>>> column.cpp line 737: ibis::util::mutexLock lock(&mutex,
>>>>>> "column::getNullMask");
>>>>>>
>>>>>> I can only reproduce the issue on ubuntu. If I can get the issue
>>>>>> reproduced on my mac /w generated data I will send it your way in
>>>>>> case you wouldn't mind examining. Btw- I am building with c++0x
>>>>>> instead of c++0x11 on ubuntu since it has an older gcc that doesn't
>>>>>> support 0x11.
>>>>>>
>>>>>> Thanks again,
>>>>>>
>>>>>> Sean
>>>>>>
>>>>>>
>>>>>> #0 0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> (gdb) bt
>>>>>> #0 0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> #1 0x00007ffff59d5065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> #2 0x00007ffff59d4eba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> #3 0x00007ffff708e613 in ibis::column::getNullMask(ibis::bitvector&) const () from /usr/local/lib/libfastbit.so.0
>>>>>> #4 0x00007ffff79573d0 in ibis::direkte::direkte(ibis::column const*, unsigned int, unsigned int) () from /usr/local/lib/libfastbit.so.0
>>>>>> #5 0x00007ffff77f2ec3 in ibis::category::fillIndex(char const*) const () from /usr/local/lib/libfastbit.so.0
>>>>>> #6 0x00007ffff77f6c68 in ibis::category::prepareMembers() const () from /usr/local/lib/libfastbit.so.0
>>>>>> #7 0x00007ffff77fbd85 in ibis::category::getDictionary() const () from /usr/local/lib/libfastbit.so.0
>>>>>> #8 0x00007ffff6fe085d in ibis::bord::bord(char const*, char const*, ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&) () from /usr/local/lib/libfastbit.so.0
>>>>>> #9 0x00007ffff78b0c69 in ibis::filter::sift2(ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, ibis::whereClause const&) () from /usr/local/lib/libfastbit.so.0
>>>>>> #10 0x00007ffff78b8c28 in ibis::table::select(std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, char const*, char const*) () from /usr/local/lib/libfastbit.so.0
>>>>>> #11 0x00007ffff771513b in ibis::mensa::select(char const*, char const*) const () from /usr/local/lib/libfastbit.so.0
>>>>>>
>>>>>>
>>>>>> ----------------------------------------------------------------------
>>>>>> *From:* [email protected]
>>>>>> [[email protected]] on behalf of Sean McNamara
>>>>>> [[email protected]]
>>>>>> *Sent:* Friday, March 14, 2014 12:19 PM
>>>>>> *To:* FastBit Users
>>>>>> *Subject:* Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>>>>
>>>>>> John-
>>>>>>
>>>>>> Unfortunately I cannot share this dataset. I may try to make a
>>>>>> dataset that I can share if I can repro the issue.
>>>>>>
>>>>>> In case it is helpful here is a stacktrace:
>>>>>> http://pastebin.com/FT3qsLH6
>>>>>>
>>>>>> I tried pulling the data down to my local machine and it works fine
>>>>>> there, no issues whatsoever. (I have a newer version of fastbit
>>>>>> installed locally). So first I will try deploying the latest and
>>>>>> greatest on our cluster. I will let you know how that goes.
>>>>>>
>>>>>> Thanks again!
>>>>>>
>>>>>> Sean
>>>>>>
>>>>>>
>>>>>> ----------------------------------------------------------------------
>>>>>> *From:* [email protected]
>>>>>> [[email protected]] on behalf of John
>>>>>> [[email protected]]
>>>>>> *Sent:* Friday, March 14, 2014 12:04 PM
>>>>>> *To:* FastBit Users
>>>>>> *Subject:* Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>>>>
>>>>>> Hi, Sean,
>>>>>>
>>>>>> Thanks for bringing this issue up. It appears to be some sort of
>>>>>> deadlock. I could look into it further if you can share the sample
>>>>>> data. Is the link you gave the data or the log messages?
>>>>>>
>>>>>> -- John --
>>>>>>
>>>>>> On Mar 14, 2014, at 10:53 AM, Sean McNamara
>>>>>> <[email protected] <mailto:[email protected]>>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi-
>>>>>>>
>>>>>>> I'm trying to troubleshoot an issue that I just started seeing.
>>>>>>> Queries seem to hang, but only for certain columns and it's not
>>>>>>> clear to me why. If it's any help, I am using fastbit a few commits
>>>>>>> after 692.
>>>>>>>
>>>>>>> Here is the strace for the query:
>>>>>>>
>>>>>>> strace ibis -d /mnt/data/test -q "select daily_binned_datetime"
>>>>>>>
>>>>>>> http://pastebin.com/xczKJVWL
>>>>>>>
>>>>>>>
>>>>>>> Here is the tail of what ibis is doing with verbosity:
>>>>>>>
>>>>>>> fileManager::storage(0x258e630, 0) cleared
>>>>>>> array_t<i>::freeMemory this=0x24421a0 actual=0x24515f0 and m_begin=0 (active references: 0, past references: 1)
>>>>>>> fileManager::storage(0x24515f0, 0) cleared
>>>>>>> fileManager::flushFile will do nothing because "/mnt/data/explore/keyidx/35000/rp13/2014/02/03/daily_binned_datetime.idx" is not tracked by the file manager
>>>>>>> fileManager::storage(0x24515f0, 0) initialization completed
>>>>>>> array_t<i> constructed at 0x2451350 with actual=0x24515f0, m_begin=0 and m_end=0
>>>>>>> fileManager::storage(0x258e630, 0) initialization completed
>>>>>>> array_t<l> constructed at 0x2451368 with actual=0x258e630, m_begin=0 and m_end=0
>>>>>>> fileManager::storage(0x2451170, 0) initialization completed
>>>>>>> array_t<PN4ibis9bitvectorE> constructed at 0x2451380 with actual=0x2451170, m_begin=0 and m_end=0
>>>>>>> array_t<PN4ibis9bitvectorE>::freeMemory this=0x2451380 actual=0x2451170 and m_begin=0 (active references: 0, past references: 1)
>>>>>>> fileManager::storage(0x2451170, 0) cleared
>>>>>>> fileManager::storage(0x2451170, 0x2451290) added 16 bytes to increase totalBytes to 80192
>>>>>>> fileManager::storage(0x2451170, 0x2451290) initialization completed with 16 elements
>>>>>>> fileManager::storage(0x24512e0, 0) initialization completed
>>>>>>> array_t<j> constructed at 0x24512c0 with actual=0x24512e0, m_begin=0 and m_end=0
>>>>>>> bitvector (0x24512b0) constructed with m_vec at 0x24512c0 <-- hangs here
>>>>>>>
>>>>>>>
>>>>>>> Does anyone have any insight?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Sean
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> FastBit-users mailing list
>>>>>>> [email protected] <mailto:[email protected]>
>>>>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users

_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
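
For readers following the thread: the hang discussed above is one thread trying to lock a mutex it already holds, and Steven's suggestion was to initialize the mutex with PTHREAD_MUTEX_RECURSIVE. The sketch below illustrates only that general pattern with plain pthreads. It is not FastBit code and not necessarily the change that went into SVN; the Column struct and the outerStep, innerStep and relockResult names are hypothetical stand-ins for the prepareMembers -> fillIndex -> direkte -> getNullMask chain in the stack trace. To avoid hanging the demo, it uses PTHREAD_MUTEX_ERRORCHECK to make the relock report an error instead of blocking.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical stand-in for a column guarded by a single mutex.
// This is NOT FastBit code; it only mirrors the shape of the call chain.
struct Column {
    pthread_mutex_t mutex;

    // mutexType is one of the POSIX types, e.g. PTHREAD_MUTEX_RECURSIVE.
    explicit Column(int mutexType) {
        pthread_mutexattr_t attr;
        int rc = pthread_mutexattr_init(&attr);
        assert(rc == 0);
        rc = pthread_mutexattr_settype(&attr, mutexType);
        assert(rc == 0);
        rc = pthread_mutex_init(&mutex, &attr);
        assert(rc == 0);
        (void)rc;  // silence unused warning when asserts are compiled out
        pthread_mutexattr_destroy(&attr);
    }
    ~Column() { pthread_mutex_destroy(&mutex); }

    // Inner step (the role getNullMask plays in the trace):
    // locks the column mutex on its own.
    int innerStep() {
        int rc = pthread_mutex_lock(&mutex);
        if (rc == 0) pthread_mutex_unlock(&mutex);
        return rc;
    }

    // Outer step (the role prepareMembers plays in the trace):
    // locks the mutex, then calls the inner step while still holding it.
    int outerStep() {
        int rc = pthread_mutex_lock(&mutex);
        if (rc != 0) return rc;
        int inner = innerStep();
        pthread_mutex_unlock(&mutex);
        return inner;
    }
};

// Returns the result of the nested lock attempt for a given mutex type.
int relockResult(int mutexType) {
    Column col(mutexType);
    return col.outerStep();
}
```

Under POSIX, relockResult(PTHREAD_MUTEX_ERRORCHECK) returns EDEADLK and relockResult(PTHREAD_MUTEX_RECURSIVE) returns 0. With PTHREAD_MUTEX_NORMAL the nested lock would block forever, which matches the pthread_mutex_lock frame at the top of the stack trace. A recursive mutex is the smallest change; the alternative Steven describes (skipping the code path that triggers the nested lock) avoids relocking altogether.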
