Michael McCandless wrote:
On Mon, May 18, 2009 at 11:31 PM, Robert Muir <rcm...@gmail.com> wrote:
I am curious about this: do you think it's a better default because it avoids
the max boolean clauses problem? Or because for a lot of these, scoring
doesn't make much sense anyway?

I think you're referring to constant score mode default, for
MultiTermQuery & QueryParser, right?

I ran tests on a pretty big index, and you pay a price for the constant
score/filter method. It's slower for the common-case searches; it only starts
to win for queries that return > 10% or so of the index, and it's significantly
slower for narrow queries...

I'm just trying to imagine a case where queries that return > 10% or so of
the index are actually the common/default...?

Excellent points!  And this also makes clear why healthy discussion on
each default is important, as well as how good it'd be to have
Settings online so that we are free to even have such discussions
(vs being bound by back-compat which prevents any improvements
to the defaults).

I was actually referring to the fact that scores for MultiTermQuery
rewritten to BooleanQuery are often meaningless to the app (I
think?).  But you're right the performance cost of the "make a filter
up front" approach is too high for smallish queries.

Thinking more on this... I'd love to have a constant-score mode, but
implemented as a BooleanQuery, meaning the scores would be the same
(constant) regardless of whether under-the-hood the query was
rewritten to BooleanQuery vs pre-compiled up front into a BitSet.

This would then decouple scoring from rewrite method, which in turn
would give us the freedom to pick and choose the fastest impl based on
particulars of the query.

So if we had such a ConstantScoreBooleanQuery, and we fixed
MultiTermQuery to conditionally use that, then I think we'd want
MultiTermQuery to do constant scoring by default.  (And, it'd then be
free to pick whether "create filter up front" or "use
ConstantScoreBooleanQuery" was most performant, query by query).
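
To make the decoupling concrete, here is a rough toy sketch in plain Java. It
is not Lucene code: the class, the method names, and the CLAUSE_LIMIT
threshold are all hypothetical, and postings are modeled as plain int arrays
of doc IDs. It only illustrates that if both strategies assign the same
constant score, the caller can switch between them without changing results.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these names exist in Lucene.
final class ConstantScoreSketch {
    static final float CONSTANT_SCORE = 1.0f;
    // Hypothetical cutoff; a real one would have to come from benchmarks.
    static final int CLAUSE_LIMIT = 3;

    // "Create filter up front": union every term's postings into a BitSet,
    // then give each set bit the constant score.
    static Map<Integer, Float> scoreViaBitSet(List<int[]> postings, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int[] docs : postings) {
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        Map<Integer, Float> scores = new TreeMap<>();
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            scores.put(doc, CONSTANT_SCORE);
        }
        return scores;
    }

    // "Boolean query" style: visit each term as its own clause, but ignore
    // per-term weights so every hit still gets the constant score.
    static Map<Integer, Float> scoreViaClauses(List<int[]> postings) {
        Map<Integer, Float> scores = new TreeMap<>();
        for (int[] docs : postings) {
            for (int doc : docs) {
                scores.put(doc, CONSTANT_SCORE);
            }
        }
        return scores;
    }

    // Pick a strategy per query; the scores do not depend on the choice.
    static Map<Integer, Float> search(List<int[]> postings, int maxDoc) {
        return postings.size() > CLAUSE_LIMIT
                ? scoreViaBitSet(postings, maxDoc)
                : scoreViaClauses(postings);
    }

    public static void main(String[] args) {
        List<int[]> postings =
                Arrays.asList(new int[]{0, 3}, new int[]{3, 5}, new int[]{7});
        // Both strategies agree on hits and scores.
        System.out.println(
                scoreViaBitSet(postings, 8).equals(scoreViaClauses(postings)));
    }
}
```

Since the two paths return identical maps, the choice inside search() becomes
purely a performance decision, which is the freedom described above.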

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

+1. ConstantScoreQuery is only a performance win when there are lots of matches (it seems), but the lack of TooManyClauses exceptions is also a big win. I want the best of both worlds :)

--
- Mark

http://www.lucidimagination.com



