[ 
https://issues.apache.org/jira/browse/LUCENE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704198#action_12704198
 ] 

Michael McCandless commented on LUCENE-1593:
--------------------------------------------

bq. So I'm now convinced this breaks back-compat.

Woops, yes it does.  Grr.

The thing is... I'm not sure we can make such a change even in 3.0.  Ie, all 
that's "special" about 3.0 is we get to remove deprecated APIs, and begin using 
Java 1.5 language features.  I'm not sure if a sudden change in runtime 
behavior ("you must call Scorer.init() before calling next or skipTo") is 
allowed.

Maybe we could make a Weight.initializableScorer, that returns a Scorer that 
requires init() be first called.  But since Weight is an interface, we can't 
change it.  So maybe we can make a new abstract class called AbstractWeight 
(for lack of a better name), implementing Weight.  We would deprecate Weight 
(and remove it at 3.0).  We can make a new "get me a Scorer" API in 
AbstractWeight, eg, require that Scorers returned from there must have "init" 
called first, pass in an "isTopScorer" boolean, etc.  Query would have a 
"abstractWeight()" method, emulated by wrapping the "weight()" method.  Could 
something crazy like this work....?  Maybe we should break out the two goals: 
this [new] goal is simply to migrate away from Weight as interface to 
AbstractWeight as abstract class, then step 2 is to make the optimizations we 
are discussing here.
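
To make the AbstractWeight idea a bit more concrete, here is a rough sketch of how the wrapping could look. It is only an illustration under assumptions: the Scorer and Weight types below are tiny stand-ins and not the real Lucene classes, and the names InitializableScorer, WeightWrapper, abstractWeight(Weight) and legacyWeightOver are hypothetical. The point is just that old Weight implementations keep working through a wrapper, while Scorers obtained through the new method may insist on init() being called first.

```java
public class AbstractWeightSketch {

    /** Stand-in for the existing Scorer contract: usable right away. */
    interface Scorer {
        boolean next();
        int doc();
    }

    /** Stand-in for the existing Weight interface, which cannot gain methods. */
    interface Weight {
        Scorer scorer();
    }

    /** Hypothetical Scorer under the new contract: init() must precede next(). */
    static abstract class InitializableScorer implements Scorer {
        private boolean initialized;

        public final void init() {
            initialized = true;
        }

        protected final void ensureInit() {
            if (!initialized) {
                throw new IllegalStateException("init() must be called before next()");
            }
        }
    }

    /** Hypothetical abstract class that would replace the Weight interface. */
    static abstract class AbstractWeight implements Weight {
        /** New "get me a Scorer" API; the caller is responsible for init(). */
        public abstract InitializableScorer initializableScorer(boolean isTopScorer);

        /** Old API keeps its old behavior by calling init() for the caller. */
        public Scorer scorer() {
            InitializableScorer s = initializableScorer(false);
            s.init();
            return s;
        }
    }

    /** Emulates the new API on top of an old-style Weight. */
    static final class WeightWrapper extends AbstractWeight {
        private final Weight legacy;

        WeightWrapper(Weight legacy) {
            this.legacy = legacy;
        }

        public InitializableScorer initializableScorer(boolean isTopScorer) {
            final Scorer delegate = legacy.scorer();
            return new InitializableScorer() {
                public boolean next() {
                    ensureInit();
                    return delegate.next();
                }

                public int doc() {
                    return delegate.doc();
                }
            };
        }
    }

    /** What a Query.abstractWeight() could do: wrap only when needed. */
    static AbstractWeight abstractWeight(Weight w) {
        return (w instanceof AbstractWeight) ? (AbstractWeight) w : new WeightWrapper(w);
    }

    /** Demo helper (hypothetical): an old-style Weight over fixed doc ids. */
    static Weight legacyWeightOver(final int... docs) {
        return new Weight() {
            public Scorer scorer() {
                return new Scorer() {
                    private int pos = -1;

                    public boolean next() {
                        return ++pos < docs.length;
                    }

                    public int doc() {
                        return docs[pos];
                    }
                };
            }
        };
    }
}
```

Existing callers of scorer() see no behavior change in this sketch, since the old method calls init() on their behalf; only code that opts into initializableScorer() takes on the new requirement.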

This is like running in a potato sack race!

> Optimizations to TopScoreDocCollector and TopFieldCollector
> -----------------------------------------------------------
>
>                 Key: LUCENE-1593
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1593
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 2.9
>
>         Attachments: LUCENE-1593.patch, PerfTest.java
>
>
> This is a spin-off of LUCENE-1575 and proposes to optimize TSDC and TFC code 
> to remove unnecessary checks. The plan is:
> # Ensure that IndexSearcher returns segments in increasing doc Id order, 
> instead of numDocs().
> # Change TSDC and TFC's code to not use the doc id as a tie breaker. New docs 
> will always have larger ids and therefore cannot compete.
> # Pre-populate HitQueue with sentinel values in TSDC (score = Float.NEG_INF) 
> and remove the check if reusableSD == null.
> # Also move to use "changing top" and then call adjustTop(), in case we 
> update the queue.
> # Some methods in Sort explicitly add SortField.FIELD_DOC as a "tie breaker" 
> for the last SortField. But, doing so should not be necessary (since we 
> already break ties by docID), and is in fact less efficient (once the above 
> optimization is in).
> # Investigate PQ - can we deprecate insert() and have only 
> insertWithOverflow()? Add a addDummyObjects method which will populate the 
> queue without "arranging" it, just store the objects in the array (this can 
> be used to pre-populate sentinel values)?
> I will post a patch as well as some perf measurements as soon as I have them.
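
The sentinel and "change top, then adjustTop()" steps in the plan above can be sketched roughly as follows. This is a self-contained toy, not the TSDC or HitQueue code from the patch: the class SentinelTopDocs, its hand-rolled heap and the results() helper are assumptions made for illustration. Float.NEG_INF in the description is taken to mean the JDK constant Float.NEGATIVE_INFINITY, and docs are assumed to arrive in increasing id order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class SentinelTopDocs {

    static final class ScoreDoc {
        int doc;
        float score;

        ScoreDoc(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // 1-based binary min-heap; heap[1] is always the weakest entry.
    private final ScoreDoc[] heap;
    private final int size;

    SentinelTopDocs(int n) {
        size = n;
        heap = new ScoreDoc[n + 1];
        // Pre-populate with sentinels. They are all equal, so the heap
        // invariant holds without any "arranging", and collect() never
        // needs a null or "queue not full yet" check.
        for (int i = 1; i <= n; i++) {
            heap[i] = new ScoreDoc(Integer.MAX_VALUE, Float.NEGATIVE_INFINITY);
        }
    }

    /** Assumes docs are collected in increasing doc id order. */
    void collect(int doc, float score) {
        ScoreDoc top = heap[1];
        // A tie cannot compete: the new doc has a larger id than anything
        // already queued, so the doc id is not consulted here.
        if (score <= top.score) {
            return;
        }
        // Change the top in place, then restore heap order.
        top.doc = doc;
        top.score = score;
        adjustTop();
    }

    private static boolean lessThan(ScoreDoc a, ScoreDoc b) {
        if (a.score != b.score) {
            return a.score < b.score;
        }
        return a.doc > b.doc; // among equal scores, the larger id is weaker
    }

    /** Sift the (modified) root down to its proper position. */
    private void adjustTop() {
        int i = 1;
        ScoreDoc node = heap[i];
        while (2 * i <= size) {
            int child = 2 * i;
            if (child + 1 <= size && lessThan(heap[child + 1], heap[child])) {
                child++;
            }
            if (!lessThan(heap[child], node)) {
                break;
            }
            heap[i] = heap[child];
            i = child;
        }
        heap[i] = node;
    }

    /** Real hits only, best first; leftover sentinels are dropped. */
    List<ScoreDoc> results() {
        List<ScoreDoc> out = new ArrayList<ScoreDoc>();
        for (int i = 1; i <= size; i++) {
            if (heap[i].score != Float.NEGATIVE_INFINITY) {
                out.add(heap[i]);
            }
        }
        Collections.sort(out, new Comparator<ScoreDoc>() {
            public int compare(ScoreDoc a, ScoreDoc b) {
                if (a.score != b.score) {
                    return a.score > b.score ? -1 : 1;
                }
                return a.doc < b.doc ? -1 : (a.doc > b.doc ? 1 : 0);
            }
        });
        return out;
    }
}
```

One consequence of this sketch is that a real hit scoring exactly negative infinity could never enter the queue, since it cannot beat a sentinel.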

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

