You’re right - my mistake. Thank you John!

On 11/1/14, 10:06 PM, "K. John Wu" <[email protected]> wrote:
>Hi, Steven,
>
>Sean's problem was fixed with an update I made on March 13, 2014. The
>change is in SVN repository. Please give the code from SVN repository
>a try when you get the chance.
>
>By the way, the command to access the SVN repository is
>
>svn checkout https://codeforge.lbl.gov/anonscm/fastbit
>
>Please feel free to let us know if you encounter any issue with the
>new code.
>
>Thanks.
>
>John
>
>On 11/1/14 10:31 PM, Enns, Steven wrote:
>> I’ve attached an alternative diff, which is to disable the optimization in
>> ibis::category::fillIndex for when every entry has the same value. This
>> also resolves my deadlock issues.
>>
>> On 11/1/14, 8:36 PM, "Enns, Steven" <[email protected]> wrote:
>>
>>> Attached is my proposed fix using recursive mutex that is confirmed to
>>> resolve deadlock.
>>>
>>> The deadlock only seems to occur when the column contains a single
>>> distinct value, so dictionary is of size 1, and the following conditional
>>> runs in ibis::category::fillIndex:
>>>
>>> if (dic.size() == 1) { // assume every entry has the given value
>>>     rlc = new ibis::direkte(this, 1, thePart->nRows());
>>> }
>>>
>>> On 11/1/14, 4:21 PM, "Enns, Steven" <[email protected]> wrote:
>>>
>>>> I believe I have identified the cause of deadlock.
>>>> ibis::category::prepare acquires ibis::column::mutex. Then it calls
>>>> ibis::category::fillRows, which constructs ibis::direkte::direkte, which
>>>> attempts to acquire the column mutex again. Perhaps ibis::column::mutex
>>>> should be initialized with PTHREAD_MUTEX_RECURSIVE?
>>>>
>>>> On 11/1/14, 3:53 PM, "Enns, Steven" <[email protected]> wrote:
>>>>
>>>>> Hey Sean,
>>>>>
>>>>> What specifically was wrong with your index data? I am experiencing the
>>>>> same issue.
>>>>>
>>>>> Thanks,
>>>>> Steve
>>>>>
>>>>> On 3/15/14, 4:08 PM, "Sean McNamara" <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> John-
>>>>>>
>>>>>> There was an issue with the index data that we had generated.
>>>>>> After rebuilding the indexes w/ 1.3.9 everything works great!
>>>>>>
>>>>>> Sorry for the false alarm.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Sean
>>>>>>
>>>>>> ________________________________________
>>>>>> From: [email protected]
>>>>>> [[email protected]] on behalf of K. John Wu
>>>>>> [[email protected]]
>>>>>> Sent: Saturday, March 15, 2014 1:52 AM
>>>>>> To: FastBit Users
>>>>>> Subject: Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>>>>
>>>>>> Hi, Sean,
>>>>>>
>>>>>> Please check out SVN revision 706 and give it a try. Let us know if
>>>>>> you continue to encounter problems.
>>>>>>
>>>>>> John
>>>>>>
>>>>>> PS: You can use the following command line to check out the latest
>>>>>> code from SVN
>>>>>>
>>>>>> svn checkout https://codeforge.lbl.gov/anonscm/fastbit
>>>>>>
>>>>>> On 3/14/14, 2:26 PM, Sean McNamara wrote:
>>>>>>> Hey John-
>>>>>>>
>>>>>>> I just tried the 1.3.9 release and I still see the same issue. The
>>>>>>> stacktrace is pasted below. I believe it is getting stuck in
>>>>>>> column.cpp line 737: ibis::util::mutexLock lock(&mutex,
>>>>>>> "column::getNullMask");
>>>>>>>
>>>>>>> I can only reproduce the issue on ubuntu. If I can get the issue
>>>>>>> reproduced on my mac w/ generated data I will send it your way in
>>>>>>> case you wouldn't mind examining. Btw- I am building with c++0x
>>>>>>> instead of c++11 on ubuntu since it has an older gcc that doesn't
>>>>>>> support c++11.
>>>>>>>
>>>>>>> Thanks again,
>>>>>>>
>>>>>>> Sean
>>>>>>>
>>>>>>> #0 0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> (gdb) bt
>>>>>>> #0 0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> #1 0x00007ffff59d5065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> #2 0x00007ffff59d4eba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> #3 0x00007ffff708e613 in ibis::column::getNullMask(ibis::bitvector&) const () from /usr/local/lib/libfastbit.so.0
>>>>>>> #4 0x00007ffff79573d0 in ibis::direkte::direkte(ibis::column const*, unsigned int, unsigned int) () from /usr/local/lib/libfastbit.so.0
>>>>>>> #5 0x00007ffff77f2ec3 in ibis::category::fillIndex(char const*) const () from /usr/local/lib/libfastbit.so.0
>>>>>>> #6 0x00007ffff77f6c68 in ibis::category::prepareMembers() const () from /usr/local/lib/libfastbit.so.0
>>>>>>> #7 0x00007ffff77fbd85 in ibis::category::getDictionary() const () from /usr/local/lib/libfastbit.so.0
>>>>>>> #8 0x00007ffff6fe085d in ibis::bord::bord(char const*, char const*, ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&) () from /usr/local/lib/libfastbit.so.0
>>>>>>> #9 0x00007ffff78b0c69 in ibis::filter::sift2(ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, ibis::whereClause const&) () from /usr/local/lib/libfastbit.so.0
>>>>>>> #10 0x00007ffff78b8c28 in ibis::table::select(std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, char const*, char const*) () from /usr/local/lib/libfastbit.so.0
>>>>>>> #11 0x00007ffff771513b in ibis::mensa::select(char const*, char const*) const () from /usr/local/lib/libfastbit.so.0
>>>>>>>
>>>>>>> ----------------------------------------------------------------------
>>>>>>> *From:* [email protected]
>>>>>>> [[email protected]] on behalf of Sean McNamara
>>>>>>> [[email protected]]
>>>>>>> *Sent:* Friday, March 14, 2014 12:19 PM
>>>>>>> *To:* FastBit Users
>>>>>>> *Subject:* Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>>>>>
>>>>>>> John-
>>>>>>>
>>>>>>> Unfortunately I cannot share this dataset. I may try to make a
>>>>>>> dataset that I can share if I can repro the issue.
>>>>>>>
>>>>>>> In case it is helpful here is a stacktrace:
>>>>>>> http://pastebin.com/FT3qsLH6
>>>>>>>
>>>>>>> I tried pulling the data down to my local machine and it works fine
>>>>>>> there, no issues whatsoever. (I have a newer version of fastbit
>>>>>>> installed locally). So first I will try deploying the latest and
>>>>>>> greatest on our cluster. I will let you know how that goes.
>>>>>>>
>>>>>>> Thanks again!
>>>>>>>
>>>>>>> Sean
>>>>>>>
>>>>>>> ----------------------------------------------------------------------
>>>>>>> *From:* [email protected]
>>>>>>> [[email protected]] on behalf of John
>>>>>>> [[email protected]]
>>>>>>> *Sent:* Friday, March 14, 2014 12:04 PM
>>>>>>> *To:* FastBit Users
>>>>>>> *Subject:* Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>>>>>
>>>>>>> Hi, Sean,
>>>>>>>
>>>>>>> Thanks for bringing this issue up. It appears to be some sort of
>>>>>>> deadlock. I could look into it further if you can share the sample
>>>>>>> data. Is the link you gave the data or the log messages?
>>>>>>>
>>>>>>> -- John --
>>>>>>>
>>>>>>> On Mar 14, 2014, at 10:53 AM, Sean McNamara
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi-
>>>>>>>>
>>>>>>>> I’m trying to troubleshoot an issue that I just started seeing.
>>>>>>>> Queries seem to hang, but only for certain columns and it’s not
>>>>>>>> clear to me why. If it’s any help, I am using fastbit a few commits
>>>>>>>> after 692.
>>>>>>>>
>>>>>>>> Here is the strace for the query:
>>>>>>>>
>>>>>>>> strace ibis -d /mnt/data/test -q "select daily_binned_datetime"
>>>>>>>>
>>>>>>>> http://pastebin.com/xczKJVWL
>>>>>>>>
>>>>>>>> Here is the tail of what ibis is doing with verbosity:
>>>>>>>>
>>>>>>>> fileManager::storage(0x258e630, 0) cleared
>>>>>>>> array_t<i>::freeMemory this=0x24421a0 actual=0x24515f0 and m_begin=0 (active references: 0, past references: 1)
>>>>>>>> fileManager::storage(0x24515f0, 0) cleared
>>>>>>>> fileManager::flushFile will do nothing because "/mnt/data/explore/keyidx/35000/rp13/2014/02/03/daily_binned_datetime.idx" is not tracked by the file manager
>>>>>>>> fileManager::storage(0x24515f0, 0) initialization completed
>>>>>>>> array_t<i> constructed at 0x2451350 with actual=0x24515f0, m_begin=0 and m_end=0
>>>>>>>> fileManager::storage(0x258e630, 0) initialization completed
>>>>>>>> array_t<l> constructed at 0x2451368 with actual=0x258e630, m_begin=0 and m_end=0
>>>>>>>> fileManager::storage(0x2451170, 0) initialization completed
>>>>>>>> array_t<PN4ibis9bitvectorE> constructed at 0x2451380 with actual=0x2451170, m_begin=0 and m_end=0
>>>>>>>> array_t<PN4ibis9bitvectorE>::freeMemory this=0x2451380 actual=0x2451170 and m_begin=0 (active references: 0, past references: 1)
>>>>>>>> fileManager::storage(0x2451170, 0) cleared
>>>>>>>> fileManager::storage(0x2451170, 0x2451290) added 16 bytes to increase totalBytes to 80192
>>>>>>>> fileManager::storage(0x2451170, 0x2451290) initialization completed with 16 elements
>>>>>>>> fileManager::storage(0x24512e0, 0) initialization completed
>>>>>>>> array_t<j> constructed at 0x24512c0 with actual=0x24512e0, m_begin=0 and m_end=0
>>>>>>>> bitvector (0x24512b0) constructed with m_vec at 0x24512c0 <-- hangs here
>>>>>>>>
>>>>>>>> Does anyone have any insight?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Sean
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> FastBit-users mailing list
>>>>>>>> [email protected]
>>>>>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
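
For readers following the diagnosis in this thread (one thread locks the column mutex, then a nested call tries to lock the same mutex again), here is a minimal sketch of that pattern using plain POSIX mutexes. It does not use FastBit's own lock wrappers; `relock_result` is a hypothetical helper written only for this illustration. A default mutex would typically block forever on the second lock on Linux, matching the futex wait in the backtrace, so the sketch uses `PTHREAD_MUTEX_ERRORCHECK` as a stand-in to make the self-deadlock visible as an error code, and `PTHREAD_MUTEX_RECURSIVE` to show why the proposed recursive mutex avoids it.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper (not part of FastBit): create a mutex of the given
// type, lock it twice from the calling thread, and return the result code
// of the second pthread_mutex_lock call.
int relock_result(int mutex_type) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, mutex_type);
    assert(rc == 0);

    pthread_mutex_t m;
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);               // outer lock, held by the caller
    int second = pthread_mutex_lock(&m);  // nested lock from the same thread
    if (second == 0) {
        pthread_mutex_unlock(&m);         // balance the nested lock
    }
    pthread_mutex_unlock(&m);             // release the outer lock
    pthread_mutex_destroy(&m);
    return second;
}
```

Calling `relock_result(PTHREAD_MUTEX_ERRORCHECK)` should return `EDEADLK`, while `relock_result(PTHREAD_MUTEX_RECURSIVE)` should return 0, because a recursive mutex keeps a per-owner lock count. Build with something like `g++ -pthread`. The alternative fix mentioned in the thread (skipping the code path that takes the second lock) avoids the nested acquisition entirely rather than tolerating it.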
