[ 
https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-4600:
-------------------------------

    Attachment: LUCENE-4600.patch

Patch introduces CountingFacetsCollector, very similar to Mike's version, only 
"productized".

Made FacetsCollector abstract with a utility create() method which returns 
either CountingFacetsCollector or StandardFacetsCollector (previously, FC), 
given the parameters.
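
To illustrate the shape of that factory, here is a minimal sketch. It is not the code in the patch: the classes are stripped-down stand-ins nested in a hypothetical FactorySketch holder, and the single countOnly flag is an assumed placeholder for whatever parameters create() really inspects.

```java
final class FactorySketch {
    // Stand-in for the abstract collector; the real class has more methods.
    static abstract class FacetsCollector {
        abstract String describe();

        // Hypothetical selection rule: the specialized collector is chosen only
        // when the request is a plain count. The real criteria are not shown here.
        static FacetsCollector create(boolean countOnly) {
            return countOnly ? new CountingFacetsCollector()
                             : new StandardFacetsCollector();
        }
    }

    static final class CountingFacetsCollector extends FacetsCollector {
        @Override String describe() { return "counting"; }
    }

    static final class StandardFacetsCollector extends FacetsCollector {
        @Override String describe() { return "standard"; }
    }
}
```

Callers only ever see the abstract type, so the choice of implementation can change without touching them.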

All tests were migrated to use FC.create and all pass (utilizing the new 
collector). Still, I wrote a dedicated test for the new Collector too.

The preliminary results we have show nice improvements with this Collector. 
Mike, can you paste them here?

There are some nocommits, which I will resolve before committing. But before 
that, I'd like to compare this Collector to ones that use different 
abstractions from the code, e.g. IntDecoder (vs hard-wiring to dgap+vint), 
CategoryListIterator etc.
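
For readers unfamiliar with what "hard-wiring to dgap+vint" means, the following is a rough sketch of counting ordinals straight out of a delta-gap, variable-length-int byte stream with no decoder abstraction in between. The class name DGapVIntSketch and the byte layout (low 7 bits first, high bit as continuation flag) are assumptions for illustration and may not match the encoding the facet module actually uses.

```java
import java.io.ByteArrayOutputStream;

final class DGapVIntSketch {
    // Encode sorted, non-negative ordinals as gaps from the previous value,
    // writing each gap as a variable-length int (7 payload bits per byte).
    static byte[] encode(int[] sortedOrdinals) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sortedOrdinals) {
            int gap = ord - prev;
            prev = ord;
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80); // more bytes follow
                gap >>>= 7;
            }
            out.write(gap); // last byte of this value
        }
        return out.toByteArray();
    }

    // Decode and count in one tight loop: no intermediate int[] of ordinals
    // and no per-value virtual call.
    static void countOrdinals(byte[] buf, int[] counts) {
        int ord = 0;
        int pos = 0;
        while (pos < buf.length) {
            int gap = 0;
            int shift = 0;
            byte b;
            do {
                b = buf[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            ord += gap;    // undo the delta encoding
            counts[ord]++; // aggregate immediately
        }
    }
}
```

The comparison I have in mind is between a loop like this and one that goes through the general-purpose abstractions for each document.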

I also want to compare this Collector to one that only marks a bitset in 
collect() and does all the work in getFacetResults().
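
The two strategies being compared can be sketched roughly as below. This is a simplified model, not Lucene code: CollectStrategiesSketch, InlineCounter and DeferredCounter are hypothetical names, and the int[][] docOrdinals array is an assumed stand-in for reading each document's category ordinals from the index.

```java
import java.util.BitSet;

final class CollectStrategiesSketch {
    // Aggregates while collecting: counts are updated on every hit.
    static final class InlineCounter {
        private final int[][] docOrdinals;
        private final int[] counts;

        InlineCounter(int[][] docOrdinals, int numOrdinals) {
            this.docOrdinals = docOrdinals;
            this.counts = new int[numOrdinals];
        }

        void collect(int doc) {
            for (int ord : docOrdinals[doc]) {
                counts[ord]++;
            }
        }

        int[] getCounts() {
            return counts;
        }
    }

    // Only remembers hits while collecting; aggregation is a second pass.
    static final class DeferredCounter {
        private final int[][] docOrdinals;
        private final int numOrdinals;
        private final BitSet hits = new BitSet();

        DeferredCounter(int[][] docOrdinals, int numOrdinals) {
            this.docOrdinals = docOrdinals;
            this.numOrdinals = numOrdinals;
        }

        void collect(int doc) {
            hits.set(doc);
        }

        int[] getCounts() {
            int[] counts = new int[numOrdinals];
            for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
                for (int ord : docOrdinals[doc]) {
                    counts[ord]++;
                }
            }
            return counts;
        }
    }
}
```

Both produce the same counts; the open question is which one is faster, since the deferred version visits hits in doc order in a single tight loop while the inline version avoids holding the bitset.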
                
> Explore facets aggregation during documents collection
> ------------------------------------------------------
>
>                 Key: LUCENE-4600
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4600
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4600-cli.patch, LUCENE-4600.patch, 
> LUCENE-4600.patch, LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with 
> a float[] to hold scores as well, if you will aggregate them) during 
> collection, and then at the end when you call getFacetResults(), it makes a 
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't 
> have to tie up transient RAM (fairly small for the bit set but possibly big 
> for the float[]).
