[ https://issues.apache.org/jira/browse/LUCENE-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-4600:
---------------------------------------
Attachment: LUCENE-4600.patch
Initial prototype patch. I created a CountingFacetsCollector that aggregates
per segment, and it "hardwires" the dgap/vint decoding.
I tested it using luceneutil's date faceting, and it gives decent speedups for
TermQuery:
{noformat}
HighTerm    0.54  (2.7%)    0.63  (1.4%)   17.6% ( 13% - 22%)
LowTerm     7.69  (1.6%)    9.15  (2.1%)   18.9% ( 14% - 23%)
MedTerm     3.39  (1.2%)    4.48  (1.3%)   32.2% ( 29% - 35%)
{noformat}
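For readers unfamiliar with the term, here is a minimal, standalone sketch of what "hardwired" dgap/vint decoding with inline counting could look like. It is not code from the patch: the class name DgapVIntCounting, both method names, and the byte layout (low-order 7 bits first, high bit as continuation flag, as in Lucene's DataOutput.writeVInt) are assumptions for illustration, and the facet module's actual encoding may differ.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not the CountingFacetsCollector from the patch.
final class DgapVIntCounting {

    // Encode ascending ordinals as delta gaps ("dgap"), each gap written as a
    // variable-length int: 7 payload bits per byte, high bit set on all but the last byte.
    static byte[] encode(int[] sortedOrdinals) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sortedOrdinals) {
            int gap = ord - prev;
            prev = ord;
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80);
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    // Decode one document's encoded ordinals and bump the count of each ordinal
    // directly, without materializing an intermediate int[] of ordinals.
    static void countOrdinals(byte[] buf, int offset, int length, int[] counts) {
        int ord = 0;   // running ordinal, rebuilt by summing the gaps
        int gap = 0;   // gap currently being assembled
        int shift = 0; // bit position of the next 7-bit group
        for (int i = offset, end = offset + length; i < end; i++) {
            byte b = buf[i];
            gap |= (b & 0x7F) << shift;
            if (b >= 0) {          // high bit clear: this byte ends the gap
                ord += gap;
                counts[ord]++;
                gap = 0;
                shift = 0;
            } else {
                shift += 7;
            }
        }
    }

    public static void main(String[] args) {
        int[] counts = new int[301];
        byte[] doc = encode(new int[] {3, 7, 300});
        countOrdinals(doc, 0, doc.length, counts);
        System.out.println(counts[3] + " " + counts[7] + " " + counts[300]);
    }
}
```

Fusing the decode loop with the counting loop is the point of the sketch: a specialized collector can skip the generic decoder abstraction and the temporary buffers it would otherwise fill per document.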
> Facets should aggregate during collection, not at the end
> ---------------------------------------------------------
>
> Key: LUCENE-4600
> URL: https://issues.apache.org/jira/browse/LUCENE-4600
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Attachments: LUCENE-4600.patch
>
>
> Today the facet module simply gathers all hits (as a bitset, optionally with
> a float[] to hold scores as well, if you will aggregate them) during
> collection, and then at the end when you call getFacetsResults(), it makes a
> 2nd pass over all those hits doing the actual aggregation.
> We should investigate just aggregating as we collect instead, so we don't
> have to tie up transient RAM (fairly small for the bit set but possibly big
> for the float[]).
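The difference between the two strategies described in the issue can be sketched as follows. This is a simplified model, not the facet module's API: the class AggregationStrategies, its methods, and the ordinalsByDoc array (standing in for the per-document category ordinals) are all hypothetical, and each hit is assumed to be a distinct doc ID.

```java
import java.util.BitSet;

// Hypothetical model of the two approaches; none of these names exist in Lucene.
final class AggregationStrategies {

    // Current behavior as described: remember every hit in a bitset during
    // collection, then make a second pass over the hits to aggregate.
    static int[] twoPass(int[] hits, int[][] ordinalsByDoc, int numOrdinals) {
        BitSet matched = new BitSet(ordinalsByDoc.length); // transient RAM held until the end
        for (int doc : hits) {
            matched.set(doc);
        }
        int[] counts = new int[numOrdinals];
        for (int doc = matched.nextSetBit(0); doc >= 0; doc = matched.nextSetBit(doc + 1)) {
            for (int ord : ordinalsByDoc[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }

    // Proposed behavior: aggregate each hit as it is collected, so no
    // per-hit state has to be kept around.
    static int[] duringCollection(int[] hits, int[][] ordinalsByDoc, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        for (int doc : hits) {
            for (int ord : ordinalsByDoc[doc]) {
                counts[ord]++;
            }
        }
        return counts;
    }
}
```

Both methods produce the same counts for distinct hits; the second simply never allocates the bitset (or, in the scoring case, the float[]).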