[
https://issues.apache.org/jira/browse/LUCENE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890880#comment-13890880
]
Michael McCandless commented on LUCENE-5425:
--------------------------------------------
I compared trunk to the 2nd patch here, on wikimediumall (33.3 M docs), Date
faceting, using DirectDVFormat for facets:
{noformat}
Task                QPS base   StdDev   QPS comp   StdDev   Pct diff
HighTerm               21.28   (2.8%)      18.96   (1.2%)   -10.9% ( -14% -  -7%)
LowTerm                83.98   (3.6%)      75.31   (1.3%)   -10.3% ( -14% -  -5%)
MedTerm                30.96   (2.6%)      27.86   (1.2%)   -10.0% ( -13% -  -6%)
OrHighNotLow           12.47   (3.8%)      11.23   (3.6%)   -10.0% ( -16% -  -2%)
OrHighMed              13.39   (3.6%)      12.07   (3.4%)    -9.9% ( -16% -  -2%)
OrHighLow               9.94   (4.0%)       8.97   (3.7%)    -9.8% ( -16% -  -2%)
OrHighNotMed           15.52   (3.4%)      14.13   (3.3%)    -8.9% ( -15% -  -2%)
OrHighNotHigh           7.16   (4.4%)       6.58   (3.8%)    -8.1% ( -15% -   0%)
OrHighHigh              4.81   (4.3%)       4.42   (3.9%)    -8.1% ( -15% -   0%)
OrNotHighHigh           5.83   (4.7%)       5.49   (4.3%)    -5.9% ( -14% -   3%)
AndHighLow            292.07   (1.5%)     274.97   (2.2%)    -5.9% (  -9% -  -2%)
MedPhrase             143.35   (4.7%)     135.01   (4.5%)    -5.8% ( -14% -   3%)
HighSpanNear            6.52   (4.8%)       6.23   (4.2%)    -4.5% ( -12% -   4%)
HighPhrase              3.57   (5.9%)       3.42   (5.8%)    -4.4% ( -15% -   7%)
MedSpanNear            26.30   (3.1%)      25.55   (2.7%)    -2.8% (  -8% -   3%)
AndHighMed             29.54   (1.6%)      28.81   (1.5%)    -2.5% (  -5% -   0%)
AndHighHigh            23.98   (1.5%)      23.41   (1.4%)    -2.4% (  -5% -   0%)
OrNotHighMed           18.00   (5.6%)      17.59   (4.5%)    -2.3% ( -11% -   8%)
LowSloppyPhrase        37.65   (1.9%)      36.84   (1.6%)    -2.2% (  -5% -   1%)
LowPhrase              11.98   (2.0%)      11.76   (2.3%)    -1.8% (  -6% -   2%)
LowSpanNear             9.57   (3.0%)       9.39   (2.5%)    -1.8% (  -7% -   3%)
Prefix3                75.35   (1.4%)      74.68   (2.3%)    -0.9% (  -4% -   2%)
HighSloppyPhrase        3.13   (6.9%)       3.11   (6.3%)    -0.9% ( -13% -  13%)
IntNRQ                  4.12   (2.6%)       4.08   (4.2%)    -0.8% (  -7% -   6%)
MedSloppyPhrase         3.25   (3.9%)       3.22   (3.7%)    -0.7% (  -8% -   7%)
Wildcard               17.39   (2.8%)      17.39   (2.9%)     0.0% (  -5% -   5%)
OrNotHighLow           23.17   (6.9%)      23.35   (5.8%)     0.8% ( -11% -  14%)
Fuzzy1                 63.79   (1.8%)      64.51   (1.8%)     1.1% (  -2% -   4%)
Fuzzy2                 44.03   (2.1%)      44.71   (2.1%)     1.5% (  -2% -   5%)
Respell                46.73   (2.9%)      47.46   (2.8%)     1.6% (  -3% -   7%)
{noformat}
Looks like there is some penalty for the added abstraction ... but I agree w/
Rob: we can just have the common-case Facets impl (FastTaxonomyFacetCounts)
specialize for the normal case when it's a FixedBitSet we are iterating over ...
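
To make the specialization idea concrete, here is a rough sketch of what such a fast path could look like. It is not the patch and not the real FastTaxonomyFacetCounts code: MatchingDocSet, BitSetDocs and count() are hypothetical names, java.util.BitSet stands in for FixedBitSet, and a plain int[] of per-document ordinals stands in for the doc values lookup.

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;

public class SpecializedCounting {

    /** Hypothetical abstraction over the set of matching docs in one segment. */
    interface MatchingDocSet {
        PrimitiveIterator.OfInt iterator();
    }

    /** Default, bit-set-backed implementation (stand-in for the FixedBitSet case). */
    static final class BitSetDocs implements MatchingDocSet {
        final BitSet bits;

        BitSetDocs(BitSet bits) {
            this.bits = bits;
        }

        @Override
        public PrimitiveIterator.OfInt iterator() {
            return bits.stream().iterator();
        }
    }

    /**
     * Counts how many matching docs carry each ordinal.
     * ordinals[doc] is an assumed stand-in for a per-document facet ordinal.
     */
    static int[] count(MatchingDocSet docs, int[] ordinals, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        if (docs instanceof BitSetDocs) {
            // Common case: walk the bits directly, skipping the iterator abstraction.
            BitSet bits = ((BitSetDocs) docs).bits;
            for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
                counts[ordinals[doc]]++;
            }
        } else {
            // General case: any custom doc set goes through the generic iterator.
            PrimitiveIterator.OfInt it = docs.iterator();
            while (it.hasNext()) {
                counts[ordinals[it.nextInt()]]++;
            }
        }
        return counts;
    }
}
```

Both branches must produce the same counts; only the common bit-set case avoids the extra indirection, so custom implementations still work but pay for the abstraction.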
> Make creation of FixedBitSet in FacetsCollector overridable
> -----------------------------------------------------------
>
> Key: LUCENE-5425
> URL: https://issues.apache.org/jira/browse/LUCENE-5425
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Affects Versions: 4.6
> Reporter: John Wang
> Attachments: facetscollector.patch, facetscollector.patch,
> fixbitset.patch
>
>
> In FacetsCollector, the bits in MatchingDocs are allocated per query.
> For large indexes, where maxDoc is large, creating a bitset of maxDoc bits
> is expensive and creates a lot of garbage.
> The attached patch makes this allocation customizable while maintaining
> current behavior.