Re: why is slow when use OR clause instead of IN clause

Ning Zhang Wed, 04 Aug 2010 10:14:22 -0700

Currently an expression tree (series of ORs in this case) is not collapsed to 
one operator or any other optimizations. It would be great to have this 
optimization rule to convert an OR operator tree to one IN operator. Would you 
be able to file a JIRA and contribute a patch?


On Aug 4, 2010, at 7:46 AM, Mark Tozzi wrote:

> I haven't looked at the code, but I assume the query parser would sort
> the 'in' terms and then do a binary search lookup into them for each
> row, while the 'or' terms don't have that kind of obvious relationship
> and are probably tested in sequence.  This would give the in O(log N)
> performance compared to a chain of or's having O(N) performance, per
> row queried.  For large N, that could add up.  That being said, I'm
> just speculating here.  The query parser may be smart enough to
> optimize the related or's in the same way, or it may not optimize that
> at all.  If I get a chance, I'll try to dig around and see what it's
> doing, as I have also had a lot of large 'in' queries and could use
> every drop of performance I can get.
> 
> --Mark
> 
> On Wed, Aug 4, 2010 at 9:47 AM, Edward Capriolo <[email protected]> wrote:
>> On Wed, Aug 4, 2010 at 6:10 AM, lei liu <[email protected]> wrote:
>>> Because my company reuire we use 0.4.1 version, the version don't support IN
>>> clause. I want to  use the OR clause(example:where id=1 or id=2 or id=3) to
>>> implement the IN clause(example: id in(1,2,3) ).  I know it will be slower
>>> especially when the list after "in" is very long.  Could anybody can tell me
>>> why is slow when use OR clause to implement In clause?
>>> 
>>> 
>>> Thanks,
>>> 
>>> 
>>> LiuLei
>>> 
>> 
>> I can not imagine the performance difference between 'or' or 'in'
>> would be that great but I never benchmarked it. The big looming
>> problems is that if you string enough 'or' together (say 8000) the
>> query parser which uses java beans serialization will OOM.
>> 
>> Edward
>>

Re: why is slow when use OR clause instead of IN clause

Reply via email to