Hi, Steven,

Sean's problem was fixed with an update I made on March 13, 2014.  The
change is in the SVN repository.  Please give the code from the SVN
repository a try when you get the chance.

By the way, the command to access the SVN repository is

svn checkout https://codeforge.lbl.gov/anonscm/fastbit

Please feel free to let us know if you encounter any issues with the
new code.

Thanks.

John




On 11/1/14 10:31 PM, Enns, Steven wrote:
> I’ve attached an alternative diff, which disables the optimization in
> ibis::category::fillIndex for the case where every entry has the same
> value.  This also resolves my deadlock issues.
> 
> 
> On 11/1/14, 8:36 PM, "Enns, Steven" <[email protected]> wrote:
> 
>> Attached is my proposed fix using a recursive mutex, which I have
>> confirmed resolves the deadlock.
>>
>> The deadlock only seems to occur when the column contains a single
>> distinct value, so the dictionary has size 1 and the following
>> conditional runs in ibis::category::fillIndex:
>>
>> if (dic.size() == 1) { // assume every entry has the given value
>>     rlc = new ibis::direkte(this, 1, thePart->nRows());
>> }
>>
>>
>> On 11/1/14, 4:21 PM, "Enns, Steven" <[email protected]> wrote:
>>
>>> I believe I have identified the cause of the deadlock.
>>> ibis::category::prepare acquires ibis::column::mutex.  Then it calls
>>> ibis::category::fillIndex, which constructs ibis::direkte::direkte, which
>>> attempts to acquire the column mutex again.  Perhaps ibis::column::mutex
>>> should be initialized with PTHREAD_MUTEX_RECURSIVE?
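
To illustrate the re-acquisition described above, here is a minimal sketch using plain pthreads. It is not FastBit code: the helper name can_relock is hypothetical, and pthread_mutex_trylock stands in for the second lock call so that the example reports the outcome instead of hanging.

```cpp
#include <pthread.h>

// Hypothetical helper (not part of FastBit): returns true if a thread that
// already holds a mutex of the given type can acquire it a second time.
// trylock is used for the second acquisition so the demo never blocks.
bool can_relock(int mutex_type) {
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) return false;
    if (pthread_mutexattr_settype(&attr, mutex_type) != 0) {
        pthread_mutexattr_destroy(&attr);
        return false;
    }

    pthread_mutex_t m;
    int rc = pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) return false;

    pthread_mutex_lock(&m);                            // outer acquisition
    bool relocked = (pthread_mutex_trylock(&m) == 0);  // nested acquisition
    if (relocked) pthread_mutex_unlock(&m);            // undo nested lock
    pthread_mutex_unlock(&m);                          // undo outer lock
    pthread_mutex_destroy(&m);
    return relocked;
}
```

With PTHREAD_MUTEX_RECURSIVE the nested acquisition succeeds. With PTHREAD_MUTEX_NORMAL, pthread_mutex_trylock reports the mutex as busy, which is the point where a blocking pthread_mutex_lock would wait forever.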
>>>
>>>
>>> On 11/1/14, 3:53 PM, "Enns, Steven" <[email protected]> wrote:
>>>
>>>> Hey Sean,
>>>>
>>>> What specifically was wrong with your index data?  I am experiencing the
>>>> same issue.  
>>>>
>>>> Thanks,
>>>> Steve
>>>>
>>>> On 3/15/14, 4:08 PM, "Sean McNamara" <[email protected]>
>>>> wrote:
>>>>
>>>>> John-
>>>>>
>>>>> There was an issue with the index data that we had generated. After
>>>>> rebuilding the indexes /w 1.3.9 everything works great!
>>>>>
>>>>> Sorry for the false alarm.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Sean
>>>>>
>>>>>
>>>>> ________________________________________
>>>>> From: [email protected]
>>>>> [[email protected]] on behalf of K. John Wu
>>>>> [[email protected]]
>>>>> Sent: Saturday, March 15, 2014 1:52 AM
>>>>> To: FastBit Users
>>>>> Subject: Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>>>
>>>>> Hi, Sean,
>>>>>
>>>>> Please check out SVN revision 706 and give it a try.  Let us know if
>>>>> you continue to encounter problems.
>>>>>
>>>>> John
>>>>>
>>>>> PS: You can use the following command line to check out the latest
>>>>> code from SVN
>>>>>
>>>>> svn checkout https://codeforge.lbl.gov/anonscm/fastbit
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 3/14/14, 2:26 PM, Sean McNamara wrote:
>>>>>> Hey John-
>>>>>>
>>>>>> I just tried the 1.3.9 release and I still see the same issue.  The
>>>>>> stacktrace is pasted below.  I believe it is getting stuck in
>>>>>> column.cpp line 737: ibis::util::mutexLock lock(&mutex,
>>>>>> "column::getNullMask");
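
For readers unfamiliar with scoped locks, the following is a minimal sketch of the general pattern, assuming a guard that wraps a pthread mutex. The class name ScopedLock and the helper released_after_scope are hypothetical; this is not the implementation of ibis::util::mutexLock.

```cpp
#include <pthread.h>

// Hypothetical RAII guard (not FastBit's ibis::util::mutexLock).
class ScopedLock {
public:
    explicit ScopedLock(pthread_mutex_t* m) : mtx_(m) {
        pthread_mutex_lock(mtx_);  // blocks here if the mutex is already held
    }
    ~ScopedLock() { pthread_mutex_unlock(mtx_); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    pthread_mutex_t* mtx_;
};

// Returns true if the mutex is busy while the guard is alive and free
// again once the guard has gone out of scope.
bool released_after_scope() {
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    bool held_inside = false;
    {
        ScopedLock guard(&m);
        held_inside = (pthread_mutex_trylock(&m) != 0);  // busy while guarded
    }
    bool free_after = (pthread_mutex_trylock(&m) == 0);
    if (free_after) pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return held_inside && free_after;
}
```

Because the guard's constructor is where the lock is taken, a thread that constructs such a guard on a non-recursive mutex it already holds would stop inside pthread_mutex_lock, which matches the top frames of the stack trace below.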
>>>>>>
>>>>>> I can only reproduce the issue on Ubuntu.  If I can get the issue
>>>>>> reproduced on my Mac /w generated data I will send it your way in
>>>>>> case you wouldn't mind examining.  Btw- I am building with c++0x
>>>>>> instead of c++11 on Ubuntu since it has an older gcc that doesn't
>>>>>> support c++11.
>>>>>>
>>>>>> Thanks again,
>>>>>>
>>>>>> Sean
>>>>>>
>>>>>>
>>>>>> #0  0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> (gdb) bt
>>>>>> #0  0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> #1  0x00007ffff59d5065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> #2  0x00007ffff59d4eba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>> #3  0x00007ffff708e613 in ibis::column::getNullMask(ibis::bitvector&) const () from /usr/local/lib/libfastbit.so.0
>>>>>> #4  0x00007ffff79573d0 in ibis::direkte::direkte(ibis::column const*, unsigned int, unsigned int) () from /usr/local/lib/libfastbit.so.0
>>>>>> #5  0x00007ffff77f2ec3 in ibis::category::fillIndex(char const*) const () from /usr/local/lib/libfastbit.so.0
>>>>>> #6  0x00007ffff77f6c68 in ibis::category::prepareMembers() const () from /usr/local/lib/libfastbit.so.0
>>>>>> #7  0x00007ffff77fbd85 in ibis::category::getDictionary() const () from /usr/local/lib/libfastbit.so.0
>>>>>> #8  0x00007ffff6fe085d in ibis::bord::bord(char const*, char const*, ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&) () from /usr/local/lib/libfastbit.so.0
>>>>>> #9  0x00007ffff78b0c69 in ibis::filter::sift2(ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, ibis::whereClause const&) () from /usr/local/lib/libfastbit.so.0
>>>>>> #10 0x00007ffff78b8c28 in ibis::table::select(std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, char const*, char const*) () from /usr/local/lib/libfastbit.so.0
>>>>>> #11 0x00007ffff771513b in ibis::mensa::select(char const*, char const*) const () from /usr/local/lib/libfastbit.so.0
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ----------------------------------------------------------------------
>>>>>> *From:* [email protected]
>>>>>> [[email protected]] on behalf of Sean McNamara
>>>>>> [[email protected]]
>>>>>> *Sent:* Friday, March 14, 2014 12:19 PM
>>>>>> *To:* FastBit Users
>>>>>> *Subject:* Re: [FastBit-users] fastbit query hangs on
>>>>>> FUTEX_WAIT_PRIVATE
>>>>>>
>>>>>> John-
>>>>>>
>>>>>> Unfortunately I cannot share this dataset.  I may try to make a
>>>>>> dataset that I can share if I can repro the issue.
>>>>>>
>>>>>> In case it is helpful here is a stacktrace:
>>>>>> http://pastebin.com/FT3qsLH6
>>>>>>
>>>>>> I tried pulling the data down to my local machine and it works fine
>>>>>> there, no issues whatsoever. (I have a newer version of fastbit
>>>>>> installed locally).  So first I will try deploying the latest and
>>>>>> greatest on our cluster. I will let you know how that goes.
>>>>>>
>>>>>> Thanks again!
>>>>>>
>>>>>> Sean
>>>>>>
>>>>>>
>>>>>> ----------------------------------------------------------------------
>>>>>> *From:* [email protected]
>>>>>> [[email protected]] on behalf of John
>>>>>> [[email protected]]
>>>>>> *Sent:* Friday, March 14, 2014 12:04 PM
>>>>>> *To:* FastBit Users
>>>>>> *Subject:* Re: [FastBit-users] fastbit query hangs on
>>>>>> FUTEX_WAIT_PRIVATE
>>>>>>
>>>>>> Hi, Sean,
>>>>>>
>>>>>> Thanks for bringing this issue up.  It appears to be some sort of
>>>>>> deadlock.  I could look into it further if you can share the sample
>>>>>> data.  Is the link you gave the data or the log messages?
>>>>>>
>>>>>> -- John --
>>>>>>
>>>>>> On Mar 14, 2014, at 10:53 AM, Sean McNamara <[email protected]> wrote:
>>>>>>
>>>>>>> Hi-
>>>>>>>
>>>>>>> I'm trying to troubleshoot an issue that I just started seeing.
>>>>>>> Queries seem to hang, but only for certain columns and it's not
>>>>>>> clear to me why.  If it's any help, I am using fastbit a few commits
>>>>>>> after 692.
>>>>>>>
>>>>>>> Here is the strace for the query:
>>>>>>>
>>>>>>> strace ibis -d /mnt/data/test -q "select daily_binned_datetime"
>>>>>>>
>>>>>>> http://pastebin.com/xczKJVWL
>>>>>>>
>>>>>>>
>>>>>>> Here is the tail of what ibis is doing with verbosity:
>>>>>>>
>>>>>>> fileManager::storage(0x258e630, 0) cleared
>>>>>>> array_t<i>::freeMemory this=0x24421a0 actual=0x24515f0 and m_begin=0
>>>>>>> (active references: 0, past references: 1)
>>>>>>> fileManager::storage(0x24515f0, 0) cleared
>>>>>>> fileManager::flushFile will do nothing because
>>>>>>> "/mnt/data/explore/keyidx/35000/rp13/2014/02/03/daily_binned_datetime.idx"
>>>>>>> is not tracked by the file manager
>>>>>>> fileManager::storage(0x24515f0, 0) initialization completed
>>>>>>> array_t<i> constructed at 0x2451350 with actual=0x24515f0, m_begin=0
>>>>>>> and m_end=0
>>>>>>> fileManager::storage(0x258e630, 0) initialization completed
>>>>>>> array_t<l> constructed at 0x2451368 with actual=0x258e630, m_begin=0
>>>>>>> and m_end=0
>>>>>>> fileManager::storage(0x2451170, 0) initialization completed
>>>>>>> array_t<PN4ibis9bitvectorE> constructed at 0x2451380 with
>>>>>>> actual=0x2451170, m_begin=0 and m_end=0
>>>>>>> array_t<PN4ibis9bitvectorE>::freeMemory this=0x2451380
>>>>>>> actual=0x2451170 and m_begin=0 (active references: 0, past
>>>>>>> references: 1)
>>>>>>> fileManager::storage(0x2451170, 0) cleared
>>>>>>> fileManager::storage(0x2451170, 0x2451290) added 16 bytes to
>>>>>>> increase totalBytes to 80192
>>>>>>> fileManager::storage(0x2451170, 0x2451290) initialization completed
>>>>>>> with 16 elements
>>>>>>> fileManager::storage(0x24512e0, 0) initialization completed
>>>>>>> array_t<j> constructed at 0x24512c0 with actual=0x24512e0, m_begin=0
>>>>>>> and m_end=0
>>>>>>> bitvector (0x24512b0) constructed with m_vec at 0x24512c0   <-- hangs here
>>>>>>>
>>>>>>>
>>>>>>> Does anyone have any insight?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Sean
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> FastBit-users mailing list
>>>>>>> [email protected]
>>>>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
