[ 
https://issues.apache.org/jira/browse/SOLR-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-6581:
---------------------------------
    Attachment: SOLR-6581.patch

> Prepare CollapsingQParserPlugin and ExpandComponent for 5.0
> -----------------------------------------------------------
>
>                 Key: SOLR-6581
>                 URL: https://issues.apache.org/jira/browse/SOLR-6581
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Minor
>             Fix For: 5.0
>
>         Attachments: SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
> SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, renames.diff
>
>
> *Background*
> The 4x implementations of the CollapsingQParserPlugin and the ExpandComponent 
> are optimized to work with a top level FieldCache. Top level FieldCaches have 
> a very fast docID to top-level ordinal lookup. Fast access to the top-level 
> ordinals allows for very high performance field collapsing on high 
> cardinality fields. 
> LUCENE-5666 unified the DocValues and FieldCache APIs so that the top level 
> FieldCache is no longer in regular use. Instead all top level caches are 
> accessed through MultiDocValues. 
> There are some major advantages of using MultiDocValues rather than a top 
> level FieldCache. But there is one disadvantage: the lookup from docID to 
> top-level ordinals is slower using MultiDocValues.
> My testing has shown that *after optimizing* the CollapsingQParserPlugin code 
> to use MultiDocValues, the performance drop is around 100%.  For some use 
> cases this performance drop is a blocker.
> *What About Faceting?*
> String faceting also relies on the top level ordinals. Is faceting 
> performance affected also? My testing has shown that the faceting performance 
> is affected much less than collapsing. 
> One possible reason for this may be that field collapsing is memory bound and 
> faceting is not. So the additional memory accesses needed for MultiDocValues 
> affect field collapsing much more than faceting.
> *Proposed Solution*
> The proposed solution is to have the default Collapse and Expand algorithms 
> use MultiDocValues, but to provide an option to use a top level FieldCache if 
> the performance of MultiDocValues is a blocker.
> The proposed mechanism for switching to the FieldCache would be a new "hint" 
> parameter. If the hint parameter is set to "FAST_QUERY" then the top-level 
> FieldCache would be used for both Collapse and Expand.
> Example syntax:
> {code}
> fq={!collapse field=x hint=FAST_QUERY}
> {code}
>  
>  
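
To illustrate the docID to top-level ordinal lookup discussed in the
Background section, here is a minimal sketch in plain Java. It does not use
Lucene classes; the arrays only stand in for a top-level cache and for
per-segment doc values with a segment-to-global ordinal map. The class name,
method names, and data are hypothetical, and the only value taken from the
issue is the proposed "FAST_QUERY" hint. The point is that the multi-segment
path needs a segment search plus two extra array reads per document, which is
the kind of additional memory access the description refers to.

```java
import java.util.Arrays;

/** Illustrative only: plain arrays stand in for Lucene's structures. */
class OrdinalLookupSketch {

    /** Top-level cache style: a single array read per document. */
    static int topLevelOrd(int[] docToGlobalOrd, int docId) {
        return docToGlobalOrd[docId];
    }

    /**
     * Multi-segment style: find the segment, read the segment-local
     * ordinal, then map it to the global ordinal.
     * segmentStarts must be sorted, start at 0, and contain no duplicates.
     */
    static int multiSegmentOrd(int[] segmentStarts,
                               int[][] docToSegmentOrd,
                               int[][] segmentToGlobalOrd,
                               int docId) {
        int idx = Arrays.binarySearch(segmentStarts, docId);
        // Not found: binarySearch returns -(insertionPoint) - 1,
        // and the owning segment is insertionPoint - 1.
        int seg = idx >= 0 ? idx : -idx - 2;
        int localDoc = docId - segmentStarts[seg];
        int segOrd = docToSegmentOrd[seg][localDoc];
        return segmentToGlobalOrd[seg][segOrd];
    }

    /** Hypothetical switch mirroring the proposed hint parameter. */
    static boolean useTopLevelCache(String hint) {
        return "FAST_QUERY".equals(hint);
    }

    public static void main(String[] args) {
        // Global terms: a=0, b=1, c=2.
        // Segment 0 holds docs 0..2 (b, a, b); segment 1 holds docs 3..4 (c, b).
        int[] docToGlobalOrd = {1, 0, 1, 2, 1};
        int[] segmentStarts = {0, 3};
        int[][] docToSegmentOrd = {{1, 0, 1}, {1, 0}};
        int[][] segmentToGlobalOrd = {{0, 1}, {1, 2}};

        String hint = args.length > 0 ? args[0] : null;
        for (int doc = 0; doc < docToGlobalOrd.length; doc++) {
            int ord = useTopLevelCache(hint)
                ? topLevelOrd(docToGlobalOrd, doc)
                : multiSegmentOrd(segmentStarts, docToSegmentOrd,
                                  segmentToGlobalOrd, doc);
            System.out.println("doc " + doc + " -> ord " + ord);
        }
    }
}
```

Both paths return the same ordinal for every document; they differ only in
how many memory reads each lookup costs.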



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
