[jira] [Updated] (LUCENE-6172) Improve the in-order / out-of-order collection decision process

Adrien Grand (JIRA) Fri, 09 Jan 2015 10:39:17 -0800

     [ https://issues.apache.org/jira/browse/LUCENE-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Adrien Grand updated LUCENE-6172:
---------------------------------
    Attachment: LUCENE-6172.patch

Here is an (in-progress) patch which should give an idea of what I'm trying to 
do. The interesting bits are mainly in Top(Docs|Field)Collector, IndexSearcher 
and BooleanWeight. There is one failing lucene/facets test and a couple of 
failing solr tests that I still need to understand.

> Improve the in-order / out-of-order collection decision process
> ---------------------------------------------------------------
>
>                 Key: LUCENE-6172
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6172
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 5.0, Trunk
>
>         Attachments: LUCENE-6172.patch
>
>
> Today the logic is the following:
>  - IndexSearcher checks whether the weight can score out-of-order
>  - Depending on the result, it creates the appropriate top docs/field collector
> I think this has several issues:
>  - Only IndexSearcher can actually make the decision correctly, and it only 
> works for top docs/field collectors. If you want to build a multi collector in 
> order to have both facets and top docs, then you have no way to know whether 
> you should create a top docs collector that supports out-of-order collection.
>  - It is quite fragile: you need to make sure that 
> Weight.scoresDocsOutOfOrder and Weight.bulkScorer agree on when they can 
> score out-of-order. Some queries like BooleanQuery duplicate the logic and 
> other queries like FilteredQuery just always return true to avoid complexity. 
> This is inefficient as this means that IndexSearcher will create a collector 
> that supports out-of-order collection while the common case actually scores 
> documents in order (leap frog between the query and the filter).
> Instead I would like to take advantage of the new collection API to make 
> out-of-order scoring an implementation detail of the bulk scorers. My current 
> idea is as follows:
>  - remove Weight.scoresDocsOutOfOrder
>  - change Collector.getLeafCollector(LeafReaderContext) to 
> Collector.getLeafCollector(LeafReaderContext, boolean canScoreOutOfOrder)
> This new boolean in Collector.getLeafCollector tells the collector that the 
> scorer supports out-of-order scoring. By returning a leaf collector that 
> supports out-of-order collection, the collector lets the scorer use its 
> faster out-of-order path.
> The new logic would be the following. First, Weights would no longer 
> advertise whether they support out-of-order scoring. However, when a weight 
> knows it supports out-of-order scoring, it will pass canScoreOutOfOrder=true 
> when getting the leaf collector. If the returned collector accepts documents 
> out of order, then the weight will return an out-of-order scorer. Otherwise, 
> an in-order scorer is returned.
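
The negotiation described above could be sketched roughly as follows. This is a simplified, self-contained model and not the attached patch: the Collector and LeafCollector interfaces below are hypothetical stand-ins that drop LeafReaderContext and scoring, and RecordingCollector and score are names invented for illustration. The weight side offers out-of-order scoring, and the collector decides whether to accept it.

```java
import java.util.ArrayList;
import java.util.List;

class OutOfOrderSketch {

    /** Hypothetical stand-in for a leaf collector (no scorer, no reader context). */
    public interface LeafCollector {
        void collect(int doc);
        boolean acceptsDocsOutOfOrder();
    }

    /** Hypothetical stand-in for a collector with the proposed extra boolean. */
    public interface Collector {
        LeafCollector getLeafCollector(boolean canScoreOutOfOrder);
    }

    /** Records doc ids in arrival order; accepts out-of-order docs only if it tolerates them. */
    public static final class RecordingCollector implements Collector {
        public final List<Integer> docs = new ArrayList<>();
        private final boolean toleratesOutOfOrder;

        public RecordingCollector(boolean toleratesOutOfOrder) {
            this.toleratesOutOfOrder = toleratesOutOfOrder;
        }

        @Override
        public LeafCollector getLeafCollector(boolean canScoreOutOfOrder) {
            // Accept out-of-order collection only when the scorer offers it
            // and this collector can cope with it.
            final boolean outOfOrder = canScoreOutOfOrder && toleratesOutOfOrder;
            return new LeafCollector() {
                @Override
                public void collect(int doc) {
                    docs.add(doc);
                }

                @Override
                public boolean acceptsDocsOutOfOrder() {
                    return outOfOrder;
                }
            };
        }
    }

    /**
     * Weight-side logic for a query that is able to score out of order.
     * sortedMatches must be in increasing doc id order.
     */
    public static void score(int[] sortedMatches, Collector collector) {
        LeafCollector leaf = collector.getLeafCollector(true);
        if (leaf.acceptsDocsOutOfOrder()) {
            // Simulated out-of-order path: emit the matches in reverse.
            for (int i = sortedMatches.length - 1; i >= 0; i--) {
                leaf.collect(sortedMatches[i]);
            }
        } else {
            // In-order path: emit the matches by increasing doc id.
            for (int doc : sortedMatches) {
                leaf.collect(doc);
            }
        }
    }

    public static void main(String[] args) {
        RecordingCollector strict = new RecordingCollector(false);
        RecordingCollector relaxed = new RecordingCollector(true);
        score(new int[] {1, 4, 7}, strict);
        score(new int[] {1, 4, 7}, relaxed);
        System.out.println(strict.docs);  // [1, 4, 7]
        System.out.println(relaxed.docs); // [7, 4, 1]
    }
}
```

In this model nothing outside the scoring code needs to know in advance whether scoring is out of order, so a wrapping collector (for example one combining facets and top docs) would only need to accept out-of-order collection when all of its children do.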



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
