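I believe I have identified the cause of the deadlock.
ibis::category::prepare acquires ibis::column::mutex.  Then it calls
ibis::category::fillRows, which constructs ibis::direkte::direkte, which
attempts to acquire the column mutex again (through
ibis::column::getNullMask, as in the stack trace quoted below).  If the
mutex is a default, non-recursive pthread mutex, that second lock attempt
from the same thread never returns.  Perhaps ibis::column::mutex should be
initialized with PTHREAD_MUTEX_RECURSIVE?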


On 11/1/14, 3:53 PM, "Enns, Steven" <[email protected]> wrote:

>Hey Sean,
>
>What specifically was wrong with your index data?  I am experiencing the
>same issue.  
>
>Thanks,
>Steve
>
>On 3/15/14, 4:08 PM, "Sean McNamara" <[email protected]> wrote:
>
>>John-
>>
>>There was an issue with the index data that we had generated. After
>>rebuilding the indexes /w 1.3.9 everything works great!
>>
>>Sorry for the false alarm.
>>
>>Thanks,
>>
>>Sean
>>
>>
>>________________________________________
>>From: [email protected]
>>[[email protected]] on behalf of K. John Wu
>>[[email protected]]
>>Sent: Saturday, March 15, 2014 1:52 AM
>>To: FastBit Users
>>Subject: Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>
>>Hi, Sean,
>>
>>Please check out SVN revision 706 and give it a try.  Let us know if
>>you continue to encounter problems.
>>
>>John
>>
>>PS: You can use the following command line to check out the latest
>>code from SVN
>>
>>svn checkout https://codeforge.lbl.gov/anonscm/fastbit
>>
>>
>>
>>
>>
>>
>>On 3/14/14, 2:26 PM, Sean McNamara wrote:
>>> Hey John-
>>>
>>> I just tried the 1.3.9 release and I still see the same issue.  The
>>> stacktrace is pasted below.  I believe it is getting stuck in
>>> column.cpp line 737: ibis::util::mutexLock lock(&mutex,
>>> "column::getNullMask");
>>>
>>> I can only reproduce the issue on Ubuntu.  If I can get the issue
>>> reproduced on my Mac /w generated data I will send it your way in case
>>> you wouldn't mind examining it.  Btw- I am building with c++0x instead
>>> of c++11 on Ubuntu since it has an older gcc that doesn't support c++11.
>>>
>>> Thanks again,
>>>
>>> Sean
>>>
>>>
>>> #0  0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>> (gdb) bt
>>> #0  0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>> #1  0x00007ffff59d5065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
>>> #2  0x00007ffff59d4eba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
>>> #3  0x00007ffff708e613 in ibis::column::getNullMask(ibis::bitvector&) const () from /usr/local/lib/libfastbit.so.0
>>> #4  0x00007ffff79573d0 in ibis::direkte::direkte(ibis::column const*, unsigned int, unsigned int) () from /usr/local/lib/libfastbit.so.0
>>> #5  0x00007ffff77f2ec3 in ibis::category::fillIndex(char const*) const () from /usr/local/lib/libfastbit.so.0
>>> #6  0x00007ffff77f6c68 in ibis::category::prepareMembers() const () from /usr/local/lib/libfastbit.so.0
>>> #7  0x00007ffff77fbd85 in ibis::category::getDictionary() const () from /usr/local/lib/libfastbit.so.0
>>> #8  0x00007ffff6fe085d in ibis::bord::bord(char const*, char const*, ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&) () from /usr/local/lib/libfastbit.so.0
>>> #9  0x00007ffff78b0c69 in ibis::filter::sift2(ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, ibis::whereClause const&) () from /usr/local/lib/libfastbit.so.0
>>> #10 0x00007ffff78b8c28 in ibis::table::select(std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, char const*, char const*) () from /usr/local/lib/libfastbit.so.0
>>> #11 0x00007ffff771513b in ibis::mensa::select(char const*, char const*) const () from /usr/local/lib/libfastbit.so.0
>>>
>>>
>>>
>>>
>>> ----------------------------------------------------------------------
>>> *From:* [email protected]
>>> [[email protected]] on behalf of Sean McNamara
>>> [[email protected]]
>>> *Sent:* Friday, March 14, 2014 12:19 PM
>>> *To:* FastBit Users
>>> *Subject:* Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>
>>> John-
>>>
>>> Unfortunately I cannot share this dataset.  I may try to make a
>>> dataset that I can share if I can repro the issue.
>>>
>>> In case it is helpful here is a stacktrace:
>>> http://pastebin.com/FT3qsLH6
>>>
>>> I tried pulling the data down to my local machine and it works fine
>>> there, no issues whatsoever. (I have a newer version of fastbit
>>> installed locally).  So first I will try deploying the latest and
>>> greatest on our cluster. I will let you know how that goes.
>>>
>>> Thanks again!
>>>
>>> Sean
>>>
>>> ----------------------------------------------------------------------
>>> *From:* [email protected]
>>> [[email protected]] on behalf of John [[email protected]]
>>> *Sent:* Friday, March 14, 2014 12:04 PM
>>> *To:* FastBit Users
>>> *Subject:* Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>
>>> Hi, Sean,
>>>
>>> Thanks for bringing this issue up.  It appears to be some sort of
>>> deadlock.  I could look into it further if you can share the sample
>>> data.  Is the link you gave the data or the log messages?
>>>
>>> -- John --
>>>
>>> On Mar 14, 2014, at 10:53 AM, Sean McNamara
>>> <[email protected] <mailto:[email protected]>> wrote:
>>>
>>>> Hi-
>>>>
>>>> I'm trying to troubleshoot an issue that I just started seeing.
>>>>  Queries seem to hang, but only for certain columns and it's not
>>>> clear to me why.  If it's any help, I am using fastbit a few commits
>>>> after 692.
>>>>
>>>> Here is the strace for the query:
>>>>
>>>> strace ibis -d /mnt/data/test -q "select daily_binned_datetime"
>>>>
>>>> http://pastebin.com/xczKJVWL
>>>>
>>>>
>>>> Here is the tail of what ibis is doing with verbosity:
>>>>
>>>> fileManager::storage(0x258e630, 0) cleared
>>>> array_t<i>::freeMemory this=0x24421a0 actual=0x24515f0 and m_begin=0 (active references: 0, past references: 1)
>>>> fileManager::storage(0x24515f0, 0) cleared
>>>> fileManager::flushFile will do nothing because "/mnt/data/explore/keyidx/35000/rp13/2014/02/03/daily_binned_datetime.idx" is not tracked by the file manager
>>>> fileManager::storage(0x24515f0, 0) initialization completed
>>>> array_t<i> constructed at 0x2451350 with actual=0x24515f0, m_begin=0 and m_end=0
>>>> fileManager::storage(0x258e630, 0) initialization completed
>>>> array_t<l> constructed at 0x2451368 with actual=0x258e630, m_begin=0 and m_end=0
>>>> fileManager::storage(0x2451170, 0) initialization completed
>>>> array_t<PN4ibis9bitvectorE> constructed at 0x2451380 with actual=0x2451170, m_begin=0 and m_end=0
>>>> array_t<PN4ibis9bitvectorE>::freeMemory this=0x2451380 actual=0x2451170 and m_begin=0 (active references: 0, past references: 1)
>>>> fileManager::storage(0x2451170, 0) cleared
>>>> fileManager::storage(0x2451170, 0x2451290) added 16 bytes to increase totalBytes to 80192
>>>> fileManager::storage(0x2451170, 0x2451290) initialization completed with 16 elements
>>>> fileManager::storage(0x24512e0, 0) initialization completed
>>>> array_t<j> constructed at 0x24512c0 with actual=0x24512e0, m_begin=0 and m_end=0
>>>> bitvector (0x24512b0) constructed with m_vec at 0x24512c0   <- hangs here
>>>>
>>>>
>>>> Does anyone have any insight?
>>>>
>>>> Thanks,
>>>>
>>>> Sean
>>>>
>>>> _______________________________________________
>>>> FastBit-users mailing list
>>>> [email protected] <mailto:[email protected]>
>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>>>
>>>
