You’re right - my mistake.  Thank you John!

On 11/1/14, 10:06 PM, "K. John Wu" <[email protected]> wrote:

>Hi, Steven,
>
>Sean's problem was fixed with an update I made on March 13, 2014.  The
>change is in the SVN repository.  Please give the code from the SVN
>repository a try when you get the chance.
>
>By the way, the command to access the SVN repository is
>
>svn checkout https://codeforge.lbl.gov/anonscm/fastbit
>
>Please feel free to let us know if you encounter any issue with the
>new code.
>
>Thanks.
>
>John
>
>
>
>
>On 11/1/14 10:31 PM, Enns, Steven wrote:
>> I’ve attached an alternative diff, which disables the optimization in
>> ibis::category::fillIndex for when every entry has the same value.  This
>> also resolves my deadlock issues.
>> 
>> 
>> On 11/1/14, 8:36 PM, "Enns, Steven" <[email protected]> wrote:
>> 
>>> Attached is my proposed fix using a recursive mutex, which is confirmed
>>> to resolve the deadlock.
>>>
>>> The deadlock only seems to occur when the column contains a single
>>> distinct value, so the dictionary is of size 1, and the following
>>> conditional runs in ibis::category::fillIndex:
>>>
>>> if (dic.size() == 1) { // assume every entry has the given value
>>>     rlc = new ibis::direkte(this, 1, thePart->nRows());
>>> }
>>>
>>>
>>> On 11/1/14, 4:21 PM, "Enns, Steven" <[email protected]> wrote:
>>>
>>>> I believe I have identified the cause of the deadlock.
>>>> ibis::category::prepare acquires ibis::column::mutex.  Then it calls
>>>> ibis::category::fillRows, which constructs ibis::direkte::direkte, which
>>>> attempts to acquire the column mutex again.  Perhaps ibis::column::mutex
>>>> should be initialized with PTHREAD_MUTEX_RECURSIVE?
>>>>
>>>>
>>>> On 11/1/14, 3:53 PM, "Enns, Steven" <[email protected]> wrote:
>>>>
>>>>> Hey Sean,
>>>>>
>>>>> What specifically was wrong with your index data?  I am experiencing
>>>>> the same issue.
>>>>>
>>>>> Thanks,
>>>>> Steve
>>>>>
>>>>> On 3/15/14, 4:08 PM, "Sean McNamara" <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> John-
>>>>>>
>>>>>> There was an issue with the index data that we had generated. After
>>>>>> rebuilding the indexes /w 1.3.9 everything works great!
>>>>>>
>>>>>> Sorry for the false alarm.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Sean
>>>>>>
>>>>>>
>>>>>> ________________________________________
>>>>>> From: [email protected]
>>>>>> [[email protected]] on behalf of K. John Wu
>>>>>> [[email protected]]
>>>>>> Sent: Saturday, March 15, 2014 1:52 AM
>>>>>> To: FastBit Users
>>>>>> Subject: Re: [FastBit-users] fastbit query hangs on FUTEX_WAIT_PRIVATE
>>>>>>
>>>>>> Hi, Sean,
>>>>>>
>>>>>> Please check out SVN revision 706 and give it a try.  Let us know if
>>>>>> you continue to encounter problems.
>>>>>>
>>>>>> John
>>>>>>
>>>>>> PS: You can use the following command line to check out the latest
>>>>>> code from SVN
>>>>>>
>>>>>> svn checkout https://codeforge.lbl.gov/anonscm/fastbit
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/14/14, 2:26 PM, Sean McNamara wrote:
>>>>>>> Hey John-
>>>>>>>
>>>>>>> I just tried the 1.3.9 release and I still see the same issue.  The
>>>>>>> stacktrace is pasted below.  I believe it is getting stuck in
>>>>>>> column.cpp line 737: ibis::util::mutexLock lock(&mutex,
>>>>>>> "column::getNullMask");
>>>>>>>
>>>>>>> I can only reproduce the issue on Ubuntu.  If I can get the issue
>>>>>>> reproduced on my Mac w/ generated data I will send it your way in
>>>>>>> case you wouldn't mind examining it.  Btw, I am building with c++0x
>>>>>>> instead of c++11 on Ubuntu since it has an older gcc that doesn't
>>>>>>> support c++11.
>>>>>>>
>>>>>>> Thanks again,
>>>>>>>
>>>>>>> Sean
>>>>>>>
>>>>>>>
>>>>>>> #0  0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> (gdb) bt
>>>>>>> #0  0x00007ffff59d989c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> #1  0x00007ffff59d5065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> #2  0x00007ffff59d4eba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
>>>>>>> #3  0x00007ffff708e613 in ibis::column::getNullMask(ibis::bitvector&) const () from /usr/local/lib/libfastbit.so.0
>>>>>>> #4  0x00007ffff79573d0 in ibis::direkte::direkte(ibis::column const*, unsigned int, unsigned int) () from /usr/local/lib/libfastbit.so.0
>>>>>>> #5  0x00007ffff77f2ec3 in ibis::category::fillIndex(char const*) const () from /usr/local/lib/libfastbit.so.0
>>>>>>> #6  0x00007ffff77f6c68 in ibis::category::prepareMembers() const () from /usr/local/lib/libfastbit.so.0
>>>>>>> #7  0x00007ffff77fbd85 in ibis::category::getDictionary() const () from /usr/local/lib/libfastbit.so.0
>>>>>>> #8  0x00007ffff6fe085d in ibis::bord::bord(char const*, char const*, ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&) () from /usr/local/lib/libfastbit.so.0
>>>>>>> #9  0x00007ffff78b0c69 in ibis::filter::sift2(ibis::selectClause const&, std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, ibis::whereClause const&) () from /usr/local/lib/libfastbit.so.0
>>>>>>> #10 0x00007ffff78b8c28 in ibis::table::select(std::vector<ibis::part const*, std::allocator<ibis::part const*> > const&, char const*, char const*) () from /usr/local/lib/libfastbit.so.0
>>>>>>> #11 0x00007ffff771513b in ibis::mensa::select(char const*, char const*) const () from /usr/local/lib/libfastbit.so.0
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ----------------------------------------------------------------------
>>>>>>> *From:* [email protected]
>>>>>>> [[email protected]] on behalf of Sean McNamara
>>>>>>> [[email protected]]
>>>>>>> *Sent:* Friday, March 14, 2014 12:19 PM
>>>>>>> *To:* FastBit Users
>>>>>>> *Subject:* Re: [FastBit-users] fastbit query hangs on
>>>>>>> FUTEX_WAIT_PRIVATE
>>>>>>>
>>>>>>> John-
>>>>>>>
>>>>>>> Unfortunately I cannot share this dataset.  I may try to make a
>>>>>>> dataset that I can share if I can repro the issue.
>>>>>>>
>>>>>>> In case it is helpful here is a stacktrace:
>>>>>>> http://pastebin.com/FT3qsLH6
>>>>>>>
>>>>>>> I tried pulling the data down to my local machine and it works fine
>>>>>>> there, no issues whatsoever. (I have a newer version of fastbit
>>>>>>> installed locally).  So first I will try deploying the latest and
>>>>>>> greatest on our cluster. I will let you know how that goes.
>>>>>>>
>>>>>>> Thanks again!
>>>>>>>
>>>>>>> Sean
>>>>>>>
>>>>>>>
>>>>>>> ----------------------------------------------------------------------
>>>>>>> *From:* [email protected]
>>>>>>> [[email protected]] on behalf of John
>>>>>>> [[email protected]]
>>>>>>> *Sent:* Friday, March 14, 2014 12:04 PM
>>>>>>> *To:* FastBit Users
>>>>>>> *Subject:* Re: [FastBit-users] fastbit query hangs on
>>>>>>> FUTEX_WAIT_PRIVATE
>>>>>>>
>>>>>>> Hi, Sean,
>>>>>>>
>>>>>>> Thanks for bringing this issue up.  It appears to be some sort of
>>>>>>> deadlock.  I could look into it further if you can share the sample
>>>>>>> data.  Is the link you gave the data or the log messages?
>>>>>>>
>>>>>>> -- John --
>>>>>>>
>>>>>>> On Mar 14, 2014, at 10:53 AM, Sean McNamara
>>>>>>> <[email protected] <mailto:[email protected]>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi-
>>>>>>>>
>>>>>>>> I’m trying to troubleshoot an issue that I just started seeing.
>>>>>>>> Queries seem to hang, but only for certain columns and it’s not
>>>>>>>> clear to me why.  If it’s any help, I am using fastbit a few
>>>>>>>> commits after 692.
>>>>>>>>
>>>>>>>> Here is the strace for the query:
>>>>>>>>
>>>>>>>> strace ibis -d /mnt/data/test -q "select daily_binned_datetime"
>>>>>>>>
>>>>>>>> http://pastebin.com/xczKJVWL
>>>>>>>>
>>>>>>>>
>>>>>>>> Here is the tail of what ibis is doing with verbosity:
>>>>>>>>
>>>>>>>> fileManager::storage(0x258e630, 0) cleared
>>>>>>>> array_t<i>::freeMemory this=0x24421a0 actual=0x24515f0 and m_begin=0 (active references: 0, past references: 1)
>>>>>>>> fileManager::storage(0x24515f0, 0) cleared
>>>>>>>> fileManager::flushFile will do nothing because "/mnt/data/explore/keyidx/35000/rp13/2014/02/03/daily_binned_datetime.idx" is not tracked by the file manager
>>>>>>>> fileManager::storage(0x24515f0, 0) initialization completed
>>>>>>>> array_t<i> constructed at 0x2451350 with actual=0x24515f0, m_begin=0 and m_end=0
>>>>>>>> fileManager::storage(0x258e630, 0) initialization completed
>>>>>>>> array_t<l> constructed at 0x2451368 with actual=0x258e630, m_begin=0 and m_end=0
>>>>>>>> fileManager::storage(0x2451170, 0) initialization completed
>>>>>>>> array_t<PN4ibis9bitvectorE> constructed at 0x2451380 with actual=0x2451170, m_begin=0 and m_end=0
>>>>>>>> array_t<PN4ibis9bitvectorE>::freeMemory this=0x2451380 actual=0x2451170 and m_begin=0 (active references: 0, past references: 1)
>>>>>>>> fileManager::storage(0x2451170, 0) cleared
>>>>>>>> fileManager::storage(0x2451170, 0x2451290) added 16 bytes to increase totalBytes to 80192
>>>>>>>> fileManager::storage(0x2451170, 0x2451290) initialization completed with 16 elements
>>>>>>>> fileManager::storage(0x24512e0, 0) initialization completed
>>>>>>>> array_t<j> constructed at 0x24512c0 with actual=0x24512e0, m_begin=0 and m_end=0
>>>>>>>> bitvector (0x24512b0) constructed with m_vec at 0x24512c0   <-- hangs here
>>>>>>>>
>>>>>>>>
>>>>>>>> Does anyone have any insight?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Sean
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> FastBit-users mailing list
>>>>>>>> [email protected] <mailto:[email protected]>
>>>>>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
