I was reading a book on SQL query tuning.  The gist of it was that the
way to get the best performance (fastest execution) out of a SQL select
statement was to "create" execution plans where the most selective term
in the "where" clause is used first, the next most selective term is
used next, etc.  Those of you who write SQL statements know that's
easier said than done.

 

However, that got me to thinking about the lucene queries I do.  I often
have a Boolean query that's made of a number of sub-queries.  Suppose I
have a BooleanQuery Q which is made up of sub-queries Q1, Q2, and Q3.
Q1 is more selective (gets fewer hits if applied all by itself) than Q2
or Q3.  Q2 is more selective than Q3 (but less selective than Q1).

 

Does it matter what order I add the sub-queries to the BooleanQuery Q.
That is, is the execution speed for the search faster (slower) if I do:

 

            Q.add(Q1, BooleanClause.Occur.MUST);

            Q.add(Q2, BooleanClause.Occur.MUST);

            Q.add(Q3, BooleanClause.Occur.MUST);

 

As opposed to:

 

            Q.add(Q3, BooleanClause.Occur.MUST);

            Q.add(Q2, BooleanClause.Occur.MUST);

            Q.add(Q1, BooleanClause.Occur.MUST);

 

Or does it matter at all?

 

There are cases where I know that, for 99% of the time, certain portions
of my queries are likely to be more selective and I could affect the
order they get added to the BooleanQuery.  Those of you who know lucene
internals, is there anything worth doing here?

 

Scott 

 

Reply via email to