[jira] Commented: (LUCENE-443) ConjunctionScorer tune-up

Abdul Chaudhry (JIRA) Mon, 10 Oct 2005 23:37:35 -0700

    [ http://issues.apache.org/jira/browse/LUCENE-443?page=comments#action_12331775 ]


Abdul Chaudhry commented on LUCENE-443:
---------------------------------------

OK, this makes sense, as the scoring engine runs something like this:

while (scorer.next()) {
  int doc = scorer.doc();
  float score = scorer.score();
  collector.collect(doc, score);
}

That is, next() will have ordered everything, so by the time we call 
scorer.score(), everything should be in order.

Thanks, I'll give that a go.

The impression I have of Lucene, and correct me if I'm wrong, is that for 
complex queries with many terms and clauses the performance bottleneck is the 
ordering phase: scorer.next() requires everything to be in document order, and 
all the sub-scorers must comply. Collection is a moot point there, as you 
probably have a small number of hits. At the other end of the scale, for 
queries with one or two very high-frequency terms, the bottleneck is really 
collection, that is, the priority queue in collector.collect().

Essentially this is a sorting issue, somewhat masked and manipulated at various 
stages. It looks to me like Lucene needs a "Query Plan".
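
To make the collection cost concrete, here is a minimal sketch of a top-N collector backed by a bounded min-heap. It is not Lucene's own collector: the names TopNCollector, Hit and topDocs are hypothetical, and only the collect(doc, score) shape is taken from the loop above. Every hit that beats the weakest retained hit costs a heap update, which is why a term matching millions of documents can spend most of its time here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not Lucene's collector implementation.
class TopNCollector {
    // A single retained hit: document id plus its score.
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    private final int n;
    // Min-heap ordered by score, so the weakest retained hit is at the head.
    private final PriorityQueue<Hit> heap =
        new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

    TopNCollector(int n) { this.n = n; }

    // Called once per matching document, as in the scoring loop above.
    void collect(int doc, float score) {
        if (heap.size() < n) {
            heap.add(new Hit(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll();                    // evict the weakest hit
            heap.add(new Hit(doc, score));
        }
    }

    // Document ids of the retained hits, best score first.
    int[] topDocs() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        int[] docs = new int[hits.size()];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = hits.get(i).doc;
        }
        return docs;
    }
}
```

With n fixed, each collect() call is at most O(log n), so the total collection cost grows with the number of hits rather than with the number of clauses in the query.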


> ConjunctionScorer tune-up
> -------------------------
>
>          Key: LUCENE-443
>          URL: http://issues.apache.org/jira/browse/LUCENE-443
>      Project: Lucene - Java
>         Type: Bug
>   Components: Search
>     Versions: 1.9
>  Environment: Linux, Java 1.5, Large Index with 4 million items and some 
> heavily nested boolean queries
>     Reporter: Abdul Chaudhry
>  Attachments: ConjunctionScorer.java, ConjunctionScorer.java
>
> I recently ran a load test on the latest code from Lucene, which is using a 
> new BooleanScore, and noticed the ConjunctionScorer was crunching through 
> objects, especially while sorting as part of the skipTo call. It turns a 
> linked list into an array, sorts the array, then converts the array back to 
> a linked list for further processing by the scoring engines below.
> I'm not sure if anyone else is experiencing this, as I have a very large 
> index (> 4 million items) and I am issuing some heavily nested queries.
> Anyway, I decided to change the linked list into an array and use a first 
> and last marker to "simulate" a linked list.
> This scaled much better during my load test, as the Java garbage collector 
> was less - umm - virulent.
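
The array-with-markers idea described above could look roughly like the following sketch. It is not the attached ConjunctionScorer.java: DocIterator is a hypothetical stand-in for a sub-scorer over a sorted int array, and ArrayConjunction and intersect are made-up names. The array is sorted once at start-up; after that, the first and last indices rotate around it, so no list or array is allocated per skipTo call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a sub-scorer: walks a sorted array of doc ids.
class DocIterator {
    private final int[] docs;
    private int pos = -1;

    DocIterator(int... sortedDocs) { this.docs = sortedDocs; }

    boolean next() { return ++pos < docs.length; }

    // Advance to the first doc >= target; returns false when exhausted.
    boolean skipTo(int target) {
        if (pos < 0) pos = 0;
        while (pos < docs.length && docs[pos] < target) pos++;
        return pos < docs.length;
    }

    int doc() { return docs[pos]; }
}

// Conjunction over sub-iterators held in a fixed array used as a ring.
class ArrayConjunction {
    private final DocIterator[] subs;
    private int first;       // index of the sub-iterator with the smallest doc
    private int last;        // index of the sub-iterator with the largest doc
    private boolean more;
    private boolean started;

    ArrayConjunction(DocIterator... subs) { this.subs = subs.clone(); }

    boolean next() {
        if (!started) {
            init();
        } else if (more) {
            more = subs[last].next();   // move past the current match
        }
        return align();
    }

    int doc() { return subs[first].doc(); }

    // Position every sub-iterator on its first doc, then sort once by doc.
    private void init() {
        started = true;
        more = subs.length > 0;
        for (DocIterator s : subs) {
            if (more) more = s.next();
        }
        if (more) {
            Arrays.sort(subs, Comparator.comparingInt(DocIterator::doc));
        }
        first = 0;
        last = subs.length - 1;
    }

    // Skip the smallest iterator up to the largest until all of them agree.
    private boolean align() {
        while (more && subs[first].doc() < subs[last].doc()) {
            more = subs[first].skipTo(subs[last].doc());
            last = first;                        // it now holds the largest doc
            first = (first + 1) % subs.length;   // rotate instead of re-sorting
        }
        return more;
    }

    // Convenience helper for trying the sketch out.
    static List<Integer> intersect(DocIterator... subs) {
        ArrayConjunction c = new ArrayConjunction(subs);
        List<Integer> out = new ArrayList<>();
        while (c.next()) out.add(c.doc());
        return out;
    }
}
```

The rotation works because the iterator that was just skipped forward always ends up with the largest doc, so the ring stays in ascending order from first to last without another sort.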
