[
https://issues.apache.org/jira/browse/LUCENE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892131#comment-13892131
]
Shai Erera commented on LUCENE-5425:
------------------------------------
I ran this on a 2013 Wikipedia dump w/ 6.7M docs (full docs, not 1K) and Date
facet:
{noformat}
Task              QPS base   StdDev  QPS comp   StdDev  Pct diff
HighSloppyPhrase      1.80   (9.7%)      1.75   (6.1%)     -2.9%  ( -16% - 14%)
OrHighLow             5.53   (2.2%)      5.43   (2.7%)     -1.9%  ( -6% - 3%)
OrHighHigh            3.81   (2.2%)      3.74   (2.7%)     -1.8%  ( -6% - 3%)
OrHighNotLow          9.27   (2.1%)      9.13   (2.6%)     -1.5%  ( -5% - 3%)
HighSpanNear          3.77   (5.4%)      3.71   (6.2%)     -1.4%  ( -12% - 10%)
OrHighNotMed         14.84   (2.1%)     14.64   (2.5%)     -1.3%  ( -5% - 3%)
OrHighNotHigh         8.06   (2.4%)      7.96   (2.8%)     -1.2%  ( -6% - 4%)
MedSloppyPhrase       1.66   (7.2%)      1.64   (4.3%)     -1.2%  ( -11% - 11%)
OrNotHighLow         30.04   (4.6%)     29.71   (4.9%)     -1.1%  ( -10% - 8%)
OrHighMed            12.16   (2.2%)     12.04   (2.3%)     -1.0%  ( -5% - 3%)
HighPhrase            2.28  (10.2%)      2.26   (9.2%)     -0.7%  ( -18% - 20%)
OrNotHighHigh        13.08   (3.0%)     13.00   (3.1%)     -0.7%  ( -6% - 5%)
Respell              24.49   (3.3%)     24.33   (3.3%)     -0.6%  ( -7% - 6%)
OrNotHighMed         18.02   (4.1%)     17.99   (4.0%)     -0.2%  ( -7% - 8%)
LowPhrase             5.73   (7.0%)      5.72   (6.9%)     -0.2%  ( -13% - 14%)
MedSpanNear          14.97   (3.8%)     14.99   (4.3%)      0.1%  ( -7% - 8%)
AndHighLow          199.51   (2.9%)    200.05   (3.6%)      0.3%  ( -6% - 6%)
LowSpanNear           4.57   (4.0%)      4.59   (4.7%)      0.3%  ( -8% - 9%)
MedPhrase            79.00   (7.4%)     79.23   (6.3%)      0.3%  ( -12% - 15%)
Fuzzy2               25.42   (3.0%)     25.56   (3.1%)      0.6%  ( -5% - 6%)
Fuzzy1               35.84   (2.7%)     36.11   (3.7%)      0.7%  ( -5% - 7%)
LowSloppyPhrase      20.55   (2.7%)     20.73   (2.3%)      0.9%  ( -4% - 6%)
HighTerm             22.31   (3.7%)     22.59   (2.6%)      1.2%  ( -4% - 7%)
AndHighMed           16.17   (1.8%)     16.39   (2.3%)      1.3%  ( -2% - 5%)
AndHighHigh          15.85   (2.3%)     16.17   (1.7%)      2.1%  ( -1% - 6%)
MedTerm              26.51   (3.9%)     27.11   (4.0%)      2.3%  ( -5% - 10%)
LowTerm              98.07   (4.6%)    101.55   (5.5%)      3.5%  ( -6% - 14%)
IntNRQ                8.61   (4.3%)      9.20   (4.6%)      6.9%  ( -1% - 16%)
Wildcard             12.96   (3.0%)     14.30   (3.6%)     10.3%  ( 3% - 17%)
Prefix3              74.18   (2.7%)     96.70   (4.9%)     30.4%  ( 22% - 38%)
{noformat}
Results are consistent with yours. So should we proceed w/ the API change?
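
For anyone reading the table above: the Pct diff column appears to be the relative change of the mean QPS of comp over base. A minimal sketch of that arithmetic, assuming this is how the column is derived (the class and method names are made up for illustration, and the parenthesized range column is not reproduced here):

```java
/** Hypothetical helper; not part of Lucene or the benchmark tooling. */
class PctDiff {
    /** Relative change of qpsComp over qpsBase, in percent. */
    static double pctDiff(double qpsBase, double qpsComp) {
        return 100.0 * (qpsComp - qpsBase) / qpsBase;
    }

    public static void main(String[] args) {
        // Values copied from the Prefix3 and Wildcard rows of the table.
        System.out.printf("Prefix3  %.1f%%%n", pctDiff(74.18, 96.70));
        System.out.printf("Wildcard %.1f%%%n", pctDiff(12.96, 14.30));
    }
}
```

Rounded to one decimal place, both rows agree with the Pct diff values reported in the table.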
> Make creation of FixedBitSet in FacetsCollector overridable
> -----------------------------------------------------------
>
> Key: LUCENE-5425
> URL: https://issues.apache.org/jira/browse/LUCENE-5425
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Affects Versions: 4.6
> Reporter: John Wang
> Attachments: LUCENE-5425.patch, facetscollector.patch,
> facetscollector.patch, fixbitset.patch
>
>
> In FacetsCollector, the bits in MatchingDocs are allocated per query.
> For large indexes where maxDoc is large, creating a bitset of maxDoc bits
> is expensive and creates a lot of garbage.
> The attached patch makes this allocation customizable while maintaining
> the current behavior.
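
The idea in the description above, an overridable allocation hook whose default keeps the current behavior, can be sketched roughly as follows. This is only an illustration of the pattern, not the attached patch: it uses java.util.BitSet in place of FixedBitSet so that it runs without Lucene, and SketchCollector, PooledSketchCollector, createBits, collectSegment and release are all hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Deque;

/** Hypothetical stand-in for a collector that records matching doc IDs. */
class SketchCollector {
    /** Extension point: the default allocates a fresh bitset on every call. */
    protected BitSet createBits(int maxDoc) {
        return new BitSet(maxDoc);
    }

    /** Marks the given doc IDs as matches for one segment. */
    public final BitSet collectSegment(int maxDoc, int... docIds) {
        BitSet bits = createBits(maxDoc);
        for (int doc : docIds) {
            bits.set(doc);
        }
        return bits;
    }
}

/** Hypothetical subclass that recycles bitsets instead of allocating per query. */
class PooledSketchCollector extends SketchCollector {
    private final Deque<BitSet> pool = new ArrayDeque<>();

    @Override
    protected BitSet createBits(int maxDoc) {
        BitSet bits = pool.poll();
        if (bits == null) {
            return new BitSet(maxDoc); // pool is empty: fall back to allocation
        }
        bits.clear(); // wipe matches left over from the previous query
        return bits;
    }

    /** Hands a bitset back once the caller is done with its results. */
    public void release(BitSet bits) {
        pool.push(bits);
    }
}
```

Callers that never subclass get the same per-query allocation as before, while an application with very large segments can plug in its own reuse strategy. The pooled variant is not thread-safe as written.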
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)