[ 
https://issues.apache.org/jira/browse/LUCENE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705711#action_12705711
 ] 

Shai Erera commented on LUCENE-1593:
------------------------------------

Ok, will do that. I would also like to summarize the latest posts here:

# Deprecate Weight and create QueryWeight (abstract class) with a new 
scorer(reader, scoreDocsInOrder), replacing the current scorer(reader) method. 
QueryWeight implements Weight: scorer(reader) calls scorer(reader, false /* 
out-of-order */), and scorer(reader, scoreDocsInOrder) is declared abstract.
#* Also add QueryWeightWrapper to wrap a given Weight implementation. The 
wrapper itself will be deprecated and package-private.
#* Add to Query variants of createWeight and weight which return QueryWeight. 
For now, I prefer to add a default impl which wraps the Weight variant instead 
of overriding in all Query extensions, and in 3.0 when we remove the Weight 
variants - override in all extending classes.
# Add to Scorer isOutOfOrder, which returns false by default, and override it 
in BS to return true.
# Modify BooleanWeight to extend QueryWeight and implement the new scorer 
method to return BS2 or BS based on the number of required scorers and 
setAllowOutOfOrder.
# Add to Collector an abstract _acceptsDocsOutOfOrder_ which returns true/false.
#* Use it in IndexSearcher.search methods, that accept a Collector, in order to 
create the appropriate Scorer, using the new QueryWeight.
#* Provide a static create method on TFC and TSDC which accepts this as an 
argument and creates the proper instance.
#* Wherever we create a Collector (TSDC or TFC), always ask for out-of-order 
Scorer and check on the resulting Scorer isOutOfOrder(), so that we can create 
the optimized Collector instance.
# Modify IndexSearcher to use all of the above logic.
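
To make the flow in the list above concrete, here is a rough sketch of how the pieces might fit together. Everything in it is a simplified stand-in, not the real Lucene code: IndexReaderStub, InOrderScorerStub, OutOfOrderScorerStub, BooleanWeightSketch and SearchSketch are hypothetical names, and the rule used to pick the out-of-order scorer (no required scorers, and out-of-order allowed) is only an assumption about what "based on the number of required scorers and setAllowOutOfOrder" would mean.

```java
// Simplified stand-ins for illustration only; not the real Lucene classes.
interface IndexReaderStub {}

abstract class Scorer {
    // In-order by default; an out-of-order scorer (BS in the summary) overrides this.
    boolean isOutOfOrder() { return false; }
}

interface Weight {
    Scorer scorer(IndexReaderStub reader);
}

abstract class QueryWeight implements Weight {
    // New abstract variant: the caller says whether docs must be scored in order.
    abstract Scorer scorer(IndexReaderStub reader, boolean scoreDocsInOrder);

    // The old variant delegates to the new one and permits out-of-order scoring.
    @Override
    public Scorer scorer(IndexReaderStub reader) {
        return scorer(reader, false /* out-of-order */);
    }
}

abstract class Collector {
    // Each Collector states whether it can handle docs arriving out of order.
    abstract boolean acceptsDocsOutOfOrder();
}

class InOrderScorerStub extends Scorer {}        // plays the role of BS2

class OutOfOrderScorerStub extends Scorer {      // plays the role of BS
    @Override
    boolean isOutOfOrder() { return true; }
}

class BooleanWeightSketch extends QueryWeight {
    private final int requiredScorers;
    private final boolean allowOutOfOrder;

    BooleanWeightSketch(int requiredScorers, boolean allowOutOfOrder) {
        this.requiredScorers = requiredScorers;
        this.allowOutOfOrder = allowOutOfOrder;
    }

    @Override
    Scorer scorer(IndexReaderStub reader, boolean scoreDocsInOrder) {
        // Assumed rule: go out of order only if the caller permits it, the
        // setting allows it, and there are no required scorers.
        boolean outOfOrderOk =
            !scoreDocsInOrder && allowOutOfOrder && requiredScorers == 0;
        return outOfOrderOk ? new OutOfOrderScorerStub() : new InOrderScorerStub();
    }
}

class SearchSketch {
    // The searcher asks the Collector what it accepts and requests a matching Scorer.
    static Scorer scorerFor(QueryWeight weight, IndexReaderStub reader,
                            Collector collector) {
        return weight.scorer(reader, !collector.acceptsDocsOutOfOrder());
    }
}
```

With this shape, a Collector that returns false from acceptsDocsOutOfOrder always gets an in-order Scorer, and code that builds its own Collector can check isOutOfOrder() on the returned Scorer before choosing which Collector variant to create.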

The only class I'm worried about, and would like to verify with you, is 
Searchable. If we want to deprecate all the search methods on IndexSearcher, 
Searcher and Searchable which accept Weight and add new ones which accept 
QueryWeight, we must do the following:
* Deprecate Searchable in favor of Searcher.
* Add to Searcher the new QueryWeight variants. Here we have two choices: (1) 
break back-compat and add them as abstract (like we've done with the new 
Collector method) or (2) add them with a default impl to call the Weight 
versions, documenting these will become abstract in 3.0.
* Have Searcher extend UnicastRemoteObject and have RemoteSearchable extend 
Searcher. That's the part I'm a little bit worried about - Searchable 
extends java.rmi.Remote, which means there could be an implementation out 
there which implements Searchable and extends something other than 
UnicastRemoteObject, like Activatable. I think there is a very small chance 
this has actually happened, but would like to confirm with you guys first.
* Add a deprecated, package-private, SearchableWrapper which extends Searcher 
and delegates all calls to the Searchable member.
* Deprecate all uses of Searchable and add Searcher instead, defaulting the old 
ones to use SearchableWrapper.
* Make all the necessary changes to IndexSearcher, MultiSearcher etc. regarding 
overriding these new methods.
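
The SearchableWrapper idea from the list above is plain delegation, roughly as sketched below. The two methods shown are a cut-down, assumed subset chosen for illustration (the real Searchable has many more, and docFreq takes a Term rather than a String), and the sketch leaves out the UnicastRemoteObject part entirely.

```java
// Simplified stand-ins for illustration only; not the real Lucene types.
interface Searchable extends java.rmi.Remote {
    int maxDoc() throws java.io.IOException;
    int docFreq(String term) throws java.io.IOException;
}

// In the proposal this would also extend UnicastRemoteObject; omitted here.
abstract class Searcher {
    abstract int maxDoc() throws java.io.IOException;
    abstract int docFreq(String term) throws java.io.IOException;
}

/** Adapts a legacy Searchable to Searcher by forwarding every call. */
@Deprecated
final class SearchableWrapper extends Searcher {
    private final Searchable delegate;

    SearchableWrapper(Searchable delegate) {
        this.delegate = delegate;
    }

    @Override
    int maxDoc() throws java.io.IOException {
        return delegate.maxDoc();
    }

    @Override
    int docFreq(String term) throws java.io.IOException {
        return delegate.docFreq(term);
    }
}
```

Old entry points that accept a Searchable could then wrap the argument and forward to the new Searcher-based variants, so existing callers keep working until the deprecated methods are removed.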

I really hope I covered everything in this summary.

> Optimizations to TopScoreDocCollector and TopFieldCollector
> -----------------------------------------------------------
>
>                 Key: LUCENE-1593
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1593
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 2.9
>
>         Attachments: LUCENE-1593.patch, LUCENE-1593.patch, PerfTest.java
>
>
> This is a spin-off of LUCENE-1575 and proposes to optimize TSDC and TFC code 
> to remove unnecessary checks. The plan is:
> # Ensure that IndexSearcher returns segments in increasing doc Id order, 
> instead of numDocs().
> # Change TSDC and TFC's code to not use the doc id as a tie breaker. New docs 
> will always have larger ids and therefore cannot compete.
> # Pre-populate HitQueue with sentinel values in TSDC (score = Float.NEG_INF) 
> and remove the check if reusableSD == null.
> # Also move to use "changing top" and then call adjustTop(), in case we 
> update the queue.
> # some methods in Sort explicitly add SortField.FIELD_DOC as a "tie breaker" 
> for the last SortField. But, doing so should not be necessary (since we 
> already break ties by docID), and is in fact less efficient (once the above 
> optimization is in).
> # Investigate PQ - can we deprecate insert() and have only 
> insertWithOverflow()? Add an addDummyObjects method which will populate the 
> queue without "arranging" it, just store the objects in the array (this can 
> be used to pre-populate sentinel values)?
> I will post a patch as well as some perf measurements as soon as I have them.
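
For reference, the sentinel and tie-breaking points in the quoted plan can be illustrated with a small stand-alone sketch. SentinelTopDocs is a hypothetical class, not Lucene's HitQueue or TSDC; it assumes docs are collected in increasing doc id order and that real scores are finite.

```java
// Hypothetical sketch; names are illustrative and not part of Lucene.
final class SentinelTopDocs {
    private final float[] scores; // binary min-heap; slot 0 is the weakest kept hit
    private final int[] docs;

    SentinelTopDocs(int numHits) {
        scores = new float[numHits];
        docs = new int[numHits];
        // Pre-populate with sentinels. Equal entries already form a valid heap,
        // so nothing needs "arranging" and collect() never sees an empty slot.
        java.util.Arrays.fill(scores, Float.NEGATIVE_INFINITY);
        java.util.Arrays.fill(docs, Integer.MAX_VALUE);
    }

    /** Assumes docs arrive in increasing doc id order. */
    void collect(int doc, float score) {
        // No doc id tie-break: a later doc with an equal score cannot compete.
        if (score <= scores[0]) return;
        scores[0] = score; // change the top in place...
        docs[0] = doc;
        downHeap();        // ...then restore heap order
    }

    // Heap order: lower score is "less"; on equal scores the larger doc id is "less".
    private boolean lessThan(int i, int j) {
        if (scores[i] != scores[j]) return scores[i] < scores[j];
        return docs[i] > docs[j];
    }

    private void downHeap() {
        int i = 0;
        int n = scores.length;
        while (true) {
            int left = 2 * i + 1, right = left + 1, smallest = i;
            if (left < n && lessThan(left, smallest)) smallest = left;
            if (right < n && lessThan(right, smallest)) smallest = right;
            if (smallest == i) return;
            float s = scores[i]; scores[i] = scores[smallest]; scores[smallest] = s;
            int d = docs[i]; docs[i] = docs[smallest]; docs[smallest] = d;
            i = smallest;
        }
    }

    /** Returns the collected doc ids, best first; sentinel slots are skipped. */
    int[] topDocs() {
        int n = scores.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Best first is the reverse of the heap's lessThan order.
        java.util.Arrays.sort(order,
            (a, b) -> lessThan(a, b) ? 1 : (lessThan(b, a) ? -1 : 0));
        int count = 0;
        while (count < n && docs[order[count]] != Integer.MAX_VALUE) count++;
        int[] result = new int[count];
        for (int k = 0; k < count; k++) result[k] = docs[order[k]];
        return result;
    }
}
```

The queue itself still orders equal scores by doc id so that the larger id is evicted first; what goes away is the per-hit doc id comparison and the null check in collect().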
