Re: Lucene's default settings & back compatibility

DM Smith Tue, 19 May 2009 03:48:14 -0700


On May 18, 2009, at 11:31 PM, Robert Muir wrote:

I am curious about this. Do you think it's a better default because it avoids the max boolean clauses problem, or because for a lot of these scoring doesn't make much sense anyway?
I ran tests on a pretty big index, and you pay a price for the constant score/filter method. It's slower for the common-case searches; it only starts to win for queries that return > 10% or so of the index, but it's significantly slower for narrow queries...
I'm just trying to imagine a case where queries that return > 10% or so of the index are actually the common/default...?

It is common in my application, a Bible program that indexes each verse (think of a verse as a numbered sentence) as a separate document. We index everything, including words that are typically stop words, as those might be important to our end users. Besides this, the top 280 word roots represent 90% of the occurrences.

And on searches, we return everything in book order, unless the user wants to score the result. In that case, we return a small, user-configurable number of hits ordered by score.

And we are using Lucene out of the box for the most part. We've deviated only to incrementally solve performance problems.




 * Constant score rewrite ought to be the default for most multi-term
   queries
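
For anyone following the thread who hasn't looked at the two rewrite strategies, here is a rough toy model of the difference. It is a sketch only and not Lucene code: the dict-based index and the names `boolean_rewrite`, `constant_score_rewrite` and `matching_terms` are made up for illustration. The limit of 1024 mirrors the default maximum clause count of Lucene's BooleanQuery, and the "score" is just a count of matching terms, not Lucene's real scoring.

```python
from typing import Dict, List, Set

# Toy stand-in for BooleanQuery's default clause limit (1024 in Lucene).
MAX_CLAUSE_COUNT = 1024


class TooManyClauses(Exception):
    """Raised when a multi-term query expands to more terms than allowed."""


def matching_terms(index: Dict[str, List[int]], prefix: str) -> List[str]:
    """Return the indexed terms that a prefix query would expand to."""
    return sorted(term for term in index if term.startswith(prefix))


def boolean_rewrite(index: Dict[str, List[int]], prefix: str,
                    max_clauses: int = MAX_CLAUSE_COUNT) -> Dict[int, float]:
    """Expand the prefix into one clause per term and score each document.

    Fails once the expansion exceeds max_clauses, which is the
    "max boolean clauses problem" mentioned in the thread.
    """
    terms = matching_terms(index, prefix)
    if len(terms) > max_clauses:
        raise TooManyClauses(f"{len(terms)} terms > limit {max_clauses}")
    scores: Dict[int, float] = {}
    for term in terms:
        for doc in index[term]:
            # Hypothetical scoring: one point per matching term.
            scores[doc] = scores.get(doc, 0.0) + 1.0
    return scores


def constant_score_rewrite(index: Dict[str, List[int]], prefix: str) -> Set[int]:
    """Collect matching documents into a set; no clause limit, no per-term score."""
    hits: Set[int] = set()
    for term in matching_terms(index, prefix):
        hits.update(index[term])
    return hits


if __name__ == "__main__":
    # Each posting list holds the ids of the verses containing the term.
    toy_index = {"verse": [1, 2], "verb": [2, 3], "word": [4]}
    print(boolean_rewrite(toy_index, "ver"))
    print(sorted(constant_score_rewrite(toy_index, "ver")))
```

In this model the constant score path can never hit the clause limit, which matters when results come back in book order and the per-term scores are thrown away anyway. The toy says nothing about the speed trade-off for narrow queries that Robert measured.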




--
Robert Muir
rcm...@gmail.com
