[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-4600:
---------------------------------------

    Attachment: LUCENE-4600.patch

Initial prototype patch. I created a CountingFacetsCollector that aggregates
per segment and "hardwires" the dgap/vint decoding.
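
For readers unfamiliar with the encoding, here is a minimal sketch of what
decoding dgap/vint ordinals straight into a counts array can look like. It is
not the code from the attached patch: the class and method names are
hypothetical, and it assumes the low-order-first VInt byte layout (7 data bits
per byte, high bit set when more bytes follow), which may differ from the
byte order the facet module actually uses.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch; not the classes from LUCENE-4600.patch. */
class DgapVIntCountingSketch {

  /** Encodes sorted ordinals as d-gaps (differences), each gap as a VInt. */
  static byte[] encode(int[] sortedOrdinals) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int prev = 0;
    for (int ord : sortedOrdinals) {
      int gap = ord - prev; // store the difference from the previous ordinal
      prev = ord;
      while ((gap & ~0x7F) != 0) {
        out.write((gap & 0x7F) | 0x80); // low 7 bits, high bit = more follows
        gap >>>= 7;
      }
      out.write(gap); // last byte of this VInt, high bit clear
    }
    return out.toByteArray();
  }

  /**
   * Decodes one document's bytes and increments counts[ord] for each ordinal,
   * without materializing an intermediate int[] of ordinals.
   */
  static void countOrdinals(byte[] bytes, int offset, int length, int[] counts) {
    int end = offset + length;
    int ord = 0;
    while (offset < end) {
      int gap = 0;
      int shift = 0;
      byte b;
      do {
        b = bytes[offset++];
        gap |= (b & 0x7F) << shift;
        shift += 7;
      } while (b < 0); // a negative byte has its high bit set
      ord += gap; // undo the d-gap
      counts[ord]++;
    }
  }

  public static void main(String[] args) {
    int[] counts = new int[400];
    byte[] doc1 = encode(new int[] {3, 7, 300});
    byte[] doc2 = encode(new int[] {3, 130});
    countOrdinals(doc1, 0, doc1.length, counts);
    countOrdinals(doc2, 0, doc2.length, counts);
    // prints "2 1 1 1"
    System.out.println(counts[3] + " " + counts[7] + " " + counts[130] + " " + counts[300]);
  }
}
```

Fusing the decode and the increment into one loop is what "hardwiring" is
taken to mean here; a pluggable decoder would instead fill a buffer of
ordinals first and count them in a second step.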

I tested using luceneutil's date faceting and it gives decent speedups for 
TermQuery:
{noformat}
                HighTerm        0.54      (2.7%)        0.63      (1.4%)   17.6% (  13% -   22%)
                 LowTerm        7.69      (1.6%)        9.15      (2.1%)   18.9% (  14% -   23%)
                 MedTerm        3.39      (1.2%)        4.48      (1.3%)   32.2% (  29% -   35%)
{noformat}
                
> Facets should aggregate during collection, not at the end
> ---------------------------------------------------------
>
>                 Key: LUCENE-4600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4600
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with 
> a float[] to hold scores as well, if you will aggregate them) during 
> collection, and then at the end when you call getFacetsResults(), it makes a 
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't 
> have to tie up transient RAM (fairly small for the bit set but possibly big 
> for the float[]).
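
The difference between the two approaches in the description can be sketched
roughly as follows. This is an illustration only, not the facet module's API:
the class, the method names, and the int[][] of per-document ordinals are
hypothetical stand-ins, and it assumes each matching doc is delivered exactly
once, as a collector would see it.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical sketch contrasting the two aggregation strategies. */
class AggregateDuringCollectionSketch {

  /** Two passes: record every hit in a bitset, then aggregate at the end. */
  static int[] countAfterCollection(int[] hits, int[][] docOrdinals, int numOrdinals) {
    BitSet matched = new BitSet(docOrdinals.length); // held until aggregation
    for (int doc : hits) {
      matched.set(doc);
    }
    int[] counts = new int[numOrdinals];
    for (int doc = matched.nextSetBit(0); doc >= 0; doc = matched.nextSetBit(doc + 1)) {
      for (int ord : docOrdinals[doc]) {
        counts[ord]++;
      }
    }
    return counts;
  }

  /** One pass: bump the counts as each hit arrives, keeping no hit set. */
  static int[] countDuringCollection(int[] hits, int[][] docOrdinals, int numOrdinals) {
    int[] counts = new int[numOrdinals];
    for (int doc : hits) {
      for (int ord : docOrdinals[doc]) {
        counts[ord]++;
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    int[][] docOrdinals = {{0, 2}, {1}, {0, 1}, {2}}; // ordinals per doc id
    int[] hits = {0, 2, 3};                           // matching doc ids
    System.out.println(Arrays.toString(countAfterCollection(hits, docOrdinals, 3)));
    System.out.println(Arrays.toString(countDuringCollection(hits, docOrdinals, 3)));
  }
}
```

Both methods return the same counts; the second simply never allocates the
bitset (or, when scores are aggregated, the float[]) that the first must keep
alive until the end.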
