[
https://issues.apache.org/jira/browse/LUCENE-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890040#comment-13890040
]
Lei Wang commented on LUCENE-5425:
----------------------------------
I tried this with luceneutil, but ran into a problem: I cannot get the same
numbers from two identical checkouts of trunk. Even when both sides are trunk,
I get different numbers:
Report after iter 19:
Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff
OrHighMed                74.15   (7.1%)                    71.24   (8.3%)  -3.9% (-18% - 12%)
LowTerm                 515.68  (15.1%)                   496.20  (12.3%)  -3.8% (-27% - 27%)
OrNotHighLow             72.22   (8.2%)                    70.36   (7.6%)  -2.6% (-17% - 14%)
OrNotHighMed             79.01   (7.3%)                    77.43   (8.4%)  -2.0% (-16% - 14%)
OrHighNotHigh            38.66   (4.5%)                    37.90   (6.4%)  -2.0% (-12% - 9%)
Respell                  51.21   (7.1%)                    50.23   (6.5%)  -1.9% (-14% - 12%)
MedPhrase                69.67   (7.5%)                    68.35   (7.4%)  -1.9% (-15% - 14%)
OrHighLow                67.24   (7.8%)                    66.00   (9.0%)  -1.8% (-17% - 16%)
Fuzzy1                   27.37   (5.7%)                    26.96   (5.5%)  -1.5% (-11% - 10%)
Fuzzy2                   37.21   (3.8%)                    36.71   (5.6%)  -1.3% (-10% - 8%)
MedSloppyPhrase           9.94   (5.4%)                     9.83   (3.9%)  -1.1% (-9% - 8%)
LowSpanNear               8.60   (3.9%)                     8.54   (3.8%)  -0.7% (-8% - 7%)
AndHighHigh              40.23   (3.1%)                    40.03   (2.5%)  -0.5% (-5% - 5%)
HighTerm                 76.07   (9.0%)                    75.96   (9.1%)  -0.2% (-16% - 19%)
OrHighHigh               11.62   (3.0%)                    11.62   (4.8%)  -0.1% (-7% - 7%)
IntNRQ                    9.51   (3.9%)                     9.51   (8.3%)   0.0% (-11% - 12%)
HighPhrase               25.61   (7.0%)                    25.63   (7.7%)   0.1% (-13% - 15%)
LowSloppyPhrase          30.21   (5.2%)                    30.24   (4.3%)   0.1% (-8% - 10%)
PKLookup                212.03   (9.0%)                   212.25  (11.5%)   0.1% (-18% - 22%)
OrNotHighHigh            27.75   (3.5%)                    27.80   (6.5%)   0.2% (-9% - 10%)
OrHighNotMed             58.14   (5.9%)                    58.27   (8.3%)   0.2% (-13% - 15%)
MedSpanNear              22.73   (3.7%)                    22.80   (5.1%)   0.3% (-8% - 9%)
Wildcard                 42.84   (5.0%)                    42.97   (5.4%)   0.3% (-9% - 11%)
HighSloppyPhrase         23.99   (7.4%)                    24.08   (6.3%)   0.4% (-12% - 15%)
AndHighLow              625.62   (6.6%)                   629.52  (10.5%)   0.6% (-15% - 18%)
Prefix3                  77.68   (7.2%)                    78.17   (6.2%)   0.6% (-11% - 15%)
LowPhrase                14.58   (4.7%)                    14.77   (5.0%)   1.3% (-8% - 11%)
HighSpanNear             11.84   (4.3%)                    11.99   (5.2%)   1.3% (-7% - 11%)
OrHighNotLow             66.04   (8.4%)                    67.28   (9.2%)   1.9% (-14% - 21%)
AndHighMed               66.55   (4.3%)                    67.91   (6.2%)   2.1% (-8% - 13%)
MedTerm                 139.78   (9.5%)                   145.63  (10.3%)   4.2% (-14% - 26%)
With the patch, the numbers also differ, but the differences are no bigger
than those in the trunk-vs-trunk run:
Report after iter 19:
Task              QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff
AndHighLow              730.30  (11.5%)                   700.95  (10.6%)  -4.0% (-23% - 20%)
LowTerm                 520.94  (10.6%)                   504.25  (11.4%)  -3.2% (-22% - 21%)
Fuzzy1                   57.55   (5.1%)                    56.26   (4.8%)  -2.2% (-11% - 8%)
Respell                  35.85   (4.7%)                    35.18   (4.1%)  -1.9% (-10% - 7%)
OrHighNotHigh            37.77   (7.3%)                    37.19   (5.9%)  -1.5% (-13% - 12%)
HighSloppyPhrase         12.30   (7.5%)                    12.17   (7.7%)  -1.1% (-15% - 15%)
HighPhrase               29.38   (5.2%)                    29.06   (4.3%)  -1.1% (-10% - 8%)
OrNotHighMed             25.93   (6.2%)                    25.68   (5.5%)  -1.0% (-11% - 11%)
OrNotHighHigh            19.72   (5.9%)                    19.53   (4.9%)  -0.9% (-11% - 10%)
Fuzzy2                   11.30   (3.6%)                    11.24   (5.1%)  -0.6% (-8% - 8%)
PKLookup                218.16   (8.6%)                   217.53   (9.3%)  -0.3% (-16% - 19%)
LowSloppyPhrase          43.09   (5.6%)                    43.00   (3.5%)  -0.2% (-8% - 9%)
MedSpanNear              30.65   (4.4%)                    30.60   (3.2%)  -0.1% (-7% - 7%)
MedSloppyPhrase          21.71   (5.7%)                    21.70   (3.8%)  -0.0% (-8% - 9%)
Wildcard                 14.67   (3.3%)                    14.67   (2.6%)  -0.0% (-5% - 6%)
HighSpanNear              0.64   (4.6%)                     0.64   (5.0%)   0.1% (-9% - 10%)
LowPhrase                21.05   (5.6%)                    21.09   (7.6%)   0.2% (-12% - 14%)
AndHighMed              175.53   (7.2%)                   176.00   (8.2%)   0.3% (-14% - 16%)
Prefix3                  31.24   (3.3%)                    31.37   (2.7%)   0.4% (-5% - 6%)
OrNotHighLow             76.32   (6.3%)                    76.80   (7.7%)   0.6% (-12% - 15%)
OrHighHigh               33.43   (6.4%)                    33.65   (7.6%)   0.7% (-12% - 15%)
AndHighHigh              35.51   (3.1%)                    35.76   (3.1%)   0.7% (-5% - 7%)
IntNRQ                    9.36   (4.4%)                     9.43   (3.7%)   0.7% (-7% - 9%)
HighTerm                 90.42   (7.0%)                    91.40   (5.3%)   1.1% (-10% - 14%)
OrHighLow                71.32   (8.6%)                    72.13   (8.2%)   1.1% (-14% - 19%)
LowSpanNear             107.82   (6.8%)                   109.19   (5.9%)   1.3% (-10% - 14%)
OrHighMed                45.43   (8.8%)                    46.09   (8.8%)   1.5% (-14% - 20%)
MedTerm                 139.24   (7.0%)                   141.28   (8.4%)   1.5% (-13% - 18%)
MedPhrase                96.51   (5.0%)                    98.10   (6.8%)   1.6% (-9% - 14%)
OrHighNotMed             50.88   (6.9%)                    52.13   (8.3%)   2.5% (-11% - 18%)
OrHighNotLow             65.31   (8.9%)                    67.16   (8.8%)   2.8% (-13% - 22%)
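To make the "within noise" reading of these reports concrete, here is a minimal Python sketch that recomputes the Pct diff column and its range from the QPS and StdDev columns. The formula is an assumption: it is not taken from the luceneutil source, it simply shifts each QPS by one standard deviation in opposite directions, which happens to reproduce the numbers reported for the OrHighMed row of the trunk-vs-trunk run. The function names are hypothetical.

```python
def pct_diff_range(base_qps, base_sd_pct, cand_qps, cand_sd_pct):
    """Return (pct_diff, worst, best) in percent.

    Assumed formula (not confirmed against luceneutil): the worst case
    compares candidate minus one std dev against baseline plus one std dev,
    and the best case does the opposite.
    """
    # The report gives std dev as a percentage of the QPS value.
    base_sd = base_qps * base_sd_pct / 100.0
    cand_sd = cand_qps * cand_sd_pct / 100.0

    pct_diff = 100.0 * (cand_qps - base_qps) / base_qps
    worst = 100.0 * ((cand_qps - cand_sd) - (base_qps + base_sd)) / (base_qps + base_sd)
    best = 100.0 * ((cand_qps + cand_sd) - (base_qps - base_sd)) / (base_qps - base_sd)
    return pct_diff, worst, best


def within_noise(base_qps, base_sd_pct, cand_qps, cand_sd_pct):
    """True when zero lies inside the [worst, best] range."""
    _, worst, best = pct_diff_range(base_qps, base_sd_pct, cand_qps, cand_sd_pct)
    return worst < 0.0 < best


if __name__ == "__main__":
    # OrHighMed row from the trunk-vs-trunk report above.
    diff, worst, best = pct_diff_range(74.15, 7.1, 71.24, 8.3)
    print(f"{diff:.1f}% ({worst:.0f}% - {best:.0f}%)")
    print(within_noise(74.15, 7.1, 71.24, 8.3))
```

For the OrHighMed row this gives about -3.9% with a range of roughly -18% to 12%. Every row in both reports has a range that straddles zero in the same way, so neither run shows a difference that stands out from run-to-run variation.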
Btw, I copied the facet config from the nightly Python script, and the index
definition looks like:
    index = comp.newIndex('trunk', WIKI_MEDIUM_10M, facets = (('Date',),),
                          facetDVFormat='Direct')
> Make creation of FixedBitSet in FacetsCollector overridable
> -----------------------------------------------------------
>
> Key: LUCENE-5425
> URL: https://issues.apache.org/jira/browse/LUCENE-5425
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Affects Versions: 4.6
> Reporter: John Wang
> Attachments: facetscollector.patch, facetscollector.patch,
> fixbitset.patch
>
>
> In FacetsCollector, the bits in MatchingDocs are allocated per query.
> For large indexes where maxDoc is large, creating a bitset of maxDoc bits
> will be expensive and would create a lot of garbage.
> The attached patch makes this allocation customizable while maintaining
> current behavior.