Hi, Steven,

You probably want to try SVN revision 634 instead of 633 - the fix
from last night was not very consistent.

John


On 5/28/13 11:45 AM, Enns, Steven wrote:
> Thanks for the quick reply, John!  I've copied and built svn r633, so I
> don't suppose we need the tar ball.
> 
> On 5/27/13 11:28 PM, "K. John Wu" <[email protected]> wrote:
> 
>> Hi, Steven,
>>
>> Thanks for the detailed report.  There are a couple of similar places
>> with the same problem.  A slightly different fix has been used in all
>> of them in SVN revision 633.  When you get the chance, please give it
>> a try.
>>
>> Presumably, you would want to have a tar ball release to use.  If this
>> is the case, let me know and I will make a new tar ball.
>>
>> Thanks again.
>>
>> John
>>
>>
>> On 5/26/13 6:57 PM, Enns, Steven wrote:
>>> Hi everyone,
>>>
>>> I believe I've found a bug in countQuery::doEvaluate which is
>>> producing incorrect query results.  We first noticed that the
>>> following queries incorrectly returned the same results using
>>> ibis::table::select, but distinct and seemingly correct values using
>>> ibis::query::getHitRows:
>>>
>>> (behaviorSegments contains '63' OR behaviorSegments contains '662')
>>> AND (nielsenDma contains '503')
>>> (behaviorSegments contains '63' OR behaviorSegments contains '662')
>>> AND (nielsenDma contains '501')
>>> (behaviorSegments contains '63' OR behaviorSegments contains '662')
>>> AND (nielsenDma contains 'not a keyword!!')
>>>
>>> I enabled verbosity level 4 using the ibis executable and found that
>>> the value of ierr after evaluating the left-hand side of the AND clause
>>> was 0 (indicating 0 hit rows), causing evaluation to stop without
>>> evaluating the right-hand side.
>>>
>>> Can I suggest the following fix for countQuery.cpp?  We are still
>>> using 1.3.5.  Is there a formal channel for filing bugs such as jira?
>>>
>>>     case ibis::qExpr::LOGICAL_OR: {
>>>         ierr = doEvaluate(term->getLeft(), mask, ht);
>>>         if (ierr >= 0 && ht.cnt() < mask.cnt()) {
>>>             int leftIerr = ierr;
>>>             ibis::bitvector b1;
>>>             if (ht.cnt() > mask.bytes() + ht.bytes()) {
>>>                 ibis::bitvector* newmask = mask - ht;
>>>                 ierr = doEvaluate(term->getRight(), *newmask, b1);
>>>                 delete newmask;
>>>             }
>>>             else {
>>>                 ierr = doEvaluate(term->getRight(), mask, b1);
>>>             }
>>>             if (ierr > 0) {
>>>                 ht |= b1;
>>>                 ierr = ht.sloppyCount();
>>>             } else {
>>>                 ierr = leftIerr;
>>>             }
>>>         }
>>>         break;}
>>>
>>>
>>> ibis > where (behaviorSegments contains '63' OR behaviorSegments
>>> contains '662') AND (nielsenDma contains '503')
>>> doQuery -- processing " FROM 8r7nhJy0RL WHERE (behaviorSegments
>>> contains '63' OR behaviorSegments contains '662') AND (nielsenDma
>>> contains '503')"
>>> Constructing selectClause @ 0x7fff15ffa9c8
>>> newToken -- generated new token "s315Pp5XgCB----2" for user saenns
>>> query[s315Pp5XgCB----2]::setWhereClause -- add a new where clause
>>> "(behaviorSegments contains '63' OR behaviorSegments contains '662')
>>> AND (nielsenDma contains '503')".
>>> query[s315Pp5XgCB----2]::setWhereClause -- where "(behaviorSegments
>>> contains '63' OR behaviorSegments contains '662') AND (nielsenDma
>>> contains '503')"
>>> Translated the WHERE clause into: ((behaviorSegments CONTAINS '63' OR
>>> behaviorSegments CONTAINS '662') AND nielsenDma CONTAINS '503')
>>> query[s315Pp5XgCB----2]::evaluate -- starting to evaluate the query
>>> for user "saenns"
>>> query[s315Pp5XgCB----2]::doEvaluate(0x43d0f00: behaviorSegments
>>> CONTAINS '63', mask.cnt()=7795176) --> 209101, ierr = 209101
>>> query[s315Pp5XgCB----2]::doEvaluate(0x43d7320: behaviorSegments
>>> CONTAINS '662', mask.cnt()=7795176) --> 0, ierr = 0
>>> query[s315Pp5XgCB----2]::doEvaluate(0x43c8b20: (0x43d0f00 OR
>>> 0x43d7320), mask.cnt()=7795176) --> 209101, ierr = 209101
>>> query[s315Pp5XgCB----2]::doEvaluate(0x43d11a0: nielsenDma CONTAINS
>>> '503', mask.cnt()=209101) --> 126, ierr = 2
>>> query[s315Pp5XgCB----2]::doEvaluate(0x43cc620: (0x43c8b20 AND
>>> 0x43d11a0), mask.cnt()=7795176) --> 126, ierr = 2
>>> query[s315Pp5XgCB----2]::evaluate -- the hit contains xxx bits with
>>> 126 bits set(=1) taking up xxx bytes; the estimated clustering factor
>>> is 1; had the bits been randomly spread out, the expected size would
>>> be xxx bytes; estimated number of bytes to be read in order to access
>>> 4-byte values is xxx
>>> query[s315Pp5XgCB----2]::evaluate -- time to compute the 126 hits: 0
>>> sec(CPU), 0.19129 sec(elapsed).
>>> query[s315Pp5XgCB----2]::evaluate -- user saenns FROM 8r7nhJy0RL WHERE
>>> (behaviorSegments contains '63' OR behaviorSegments contains '662')
>>> AND (nielsenDma contains '503') ==> 126 hits.
>>> doQuery:: evaluate( FROM 8r7nhJy0RL WHERE (behaviorSegments contains
>>> '63' OR behaviorSegments contains '662') AND (nielsenDma contains
>>> '503')) produced 126 hits, took 0 CPU seconds, 0.340331 elapsed seconds
>>> countQuery::setWhereClause -- add a new where clause "(0x43d71e0 AND
>>> 0x43d7180)"
>>> countQuery::evaluate -- start timer ...
>>> countQuery::doEvaluate(0x43d7210: behaviorSegments CONTAINS '63',
>>> mask.cnt()=7795176) --> 209101, ierr = 209101
>>> countQuery::doEvaluate(0x43d7290: behaviorSegments CONTAINS '662',
>>> mask.cnt()=7795176) --> 0, ierr = 0
>>> countQuery::doEvaluate(0x43d71e0: (0x43d7210 OR 0x43d7290),
>>> mask.cnt()=7795176) --> 209101, ierr = 0
>>> countQuery::doEvaluate(0x43d91f0: (0x43d71e0 AND 0x43d7180),
>>> mask.cnt()=7795176) --> 209101, ierr = 0
>>> countQuery::evaluate -- Select count(*) From 8r7nhJy0RL Where
>>> (0x43d71e0 AND 0x43d7180) --> 209101
>>> countQuery::evaluate -- duration: 0 sec(CPU), 0.002358 sec(elapsed)
>>> Warning -- countQuery.getNumHits returned 209101, while
>>> query.getNumHits returned 126
>>> Freeing selectClause @ 0x7fff15ffa9c8
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> FastBit-users mailing list
>>> [email protected]
>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
>>>
> 
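
The failure in the original report can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Bits`, `count`, `orInto`, `evalOrBuggy`, `evalOrFixed` and `countAnd` are hypothetical names, `std::vector<bool>` stands in for ibis::bitvector, and the "buggy" variant is inferred from the log output and the proposed patch rather than copied from the 1.3.5 sources. It shows how a right-hand OR operand with 0 hits can reset the status to 0, so that an enclosing AND skips its own right-hand side and the count stays too high.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for ibis::bitvector (assumption, not the FastBit class).
using Bits = std::vector<bool>;

// Number of set bits.
inline long count(const Bits& b) {
    long n = 0;
    for (bool x : b) n += x ? 1 : 0;
    return n;
}

// In-place union (ht |= b1); both vectors are assumed to have equal length.
inline void orInto(Bits& ht, const Bits& b1) {
    for (std::size_t i = 0; i < ht.size(); ++i) ht[i] = ht[i] || b1[i];
}

// Assumed pre-fix behaviour: the status of the right operand replaces the
// status of the whole OR, even when the right operand has 0 hits.
inline long evalOrBuggy(const Bits& left, const Bits& right, Bits& ht) {
    ht = left;
    long ierr = count(ht);
    if (ierr >= 0) {
        ierr = count(right);      // 0 hits on the right gives ierr = 0
        if (ierr > 0) {
            orInto(ht, right);
            ierr = count(ht);
        }
    }
    return ierr;                  // may be 0 although ht still holds hits
}

// Same logic with the idea from the proposed patch: remember the left
// status and restore it when the right operand contributes nothing.
inline long evalOrFixed(const Bits& left, const Bits& right, Bits& ht) {
    ht = left;
    long ierr = count(ht);
    if (ierr >= 0) {
        const long leftIerr = ierr;
        ierr = count(right);
        if (ierr > 0) {
            orInto(ht, right);
            ierr = count(ht);
        } else {
            ierr = leftIerr;
        }
    }
    return ierr;
}

using OrFn = long (*)(const Bits&, const Bits&, Bits&);

// Counts hits of (a OR b) AND c. As described in the report, the AND stops
// without evaluating c when the OR reports a status that is not positive.
inline long countAnd(OrFn evalOr, const Bits& a, const Bits& b, const Bits& c) {
    Bits ht;
    const long ierr = evalOr(a, b, ht);
    if (ierr > 0) {
        for (std::size_t i = 0; i < ht.size(); ++i) ht[i] = ht[i] && c[i];
    }
    return count(ht);
}
```

With `a` matching two rows, `b` matching none and `c` matching one of the rows in `a`, the buggy variant reports 2 hits and the fixed variant reports 1, the same kind of mismatch as the 209101 versus 126 in the log.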
