gsmiller commented on a change in pull request #600:
URL: https://github.com/apache/lucene/pull/600#discussion_r793022388
##########
File path:
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/IntTaxonomyFacets.java
##########
@@ -62,8 +52,82 @@ protected IntTaxonomyFacets(
}
}
+ /** Rolls up any single-valued hierarchical dimensions. */
+ void rollup() throws IOException {
Review comment:
That's good feedback. I think this is a case of over-optimizing for no good
reason, at the cost of readability. I tried removing the optimization and the
benchmarks show no measurable impact, so I'll revert it. Here are the benchmark
results on wikimedium10m for the latest revision I'm about to post:
```
Task                         QPS baseline   StdDev  QPS candidate   StdDev  Pct diff                   p-value
HighTermTitleBDVSort                57.88  (15.5%)          55.20  (14.6%)     -4.6%  ( -30% -   30%)  0.332
BrowseDayOfYearSSDVFacets           12.99  (19.2%)          12.72  (19.3%)     -2.1%  ( -34% -   45%)  0.733
BrowseMonthSSDVFacets               14.34  (23.3%)          14.08  (22.6%)     -1.9%  ( -38% -   57%)  0.798
PKLookup                           170.69   (3.7%)         168.41   (3.1%)     -1.3%  (  -7% -    5%)  0.222
BrowseDateSSDVFacets                 2.35   (5.6%)           2.32   (5.9%)     -1.3%  ( -12% -   10%)  0.483
HighSpanNear                        10.39   (2.2%)          10.27   (3.5%)     -1.2%  (  -6% -    4%)  0.207
MedSpanNear                         71.39   (2.3%)          70.60   (2.6%)     -1.1%  (  -5% -    3%)  0.155
LowSpanNear                        178.35   (2.1%)         176.55   (2.8%)     -1.0%  (  -5% -    4%)  0.206
HighTerm                          1842.57   (3.5%)        1826.02   (2.8%)     -0.9%  (  -6% -    5%)  0.370
OrHighNotMed                      1002.15   (3.1%)         995.06   (2.9%)     -0.7%  (  -6% -    5%)  0.455
MedTerm                           1995.77   (3.3%)        1982.42   (2.4%)     -0.7%  (  -6% -    5%)  0.469
MedSloppyPhrase                     14.00   (3.2%)          13.91   (3.4%)     -0.6%  (  -7% -    6%)  0.539
BrowseRandomLabelSSDVFacets          9.47   (9.1%)           9.41   (9.6%)     -0.6%  ( -17% -   19%)  0.829
AndHighMed                         207.94   (5.0%)         206.72   (5.2%)     -0.6%  ( -10% -   10%)  0.715
AndHighLow                         724.56   (4.3%)         720.58   (5.3%)     -0.5%  (  -9% -    9%)  0.720
MedPhrase                          168.21   (3.2%)         167.33   (3.0%)     -0.5%  (  -6% -    5%)  0.597
OrHighNotLow                      1210.56   (3.3%)        1206.33   (3.4%)     -0.3%  (  -6% -    6%)  0.741
LowPhrase                          107.57   (2.4%)         107.26   (2.2%)     -0.3%  (  -4% -    4%)  0.684
AndHighHigh                         79.94   (4.9%)          79.83   (4.8%)     -0.1%  (  -9% -   10%)  0.929
AndHighHighDayTaxoFacets            10.15   (2.3%)          10.14   (2.5%)     -0.1%  (  -4% -    4%)  0.881
HighPhrase                         109.56   (2.4%)         109.45   (2.4%)     -0.1%  (  -4% -    4%)  0.896
LowTerm                           2117.59   (2.2%)        2116.15   (2.2%)     -0.1%  (  -4% -    4%)  0.922
OrHighLow                          544.17   (3.0%)         544.31   (3.4%)      0.0%  (  -6% -    6%)  0.980
HighSloppyPhrase                    31.48   (5.0%)          31.50   (3.5%)      0.1%  (  -8% -    9%)  0.961
OrHighNotHigh                      900.76   (3.2%)         901.81   (2.6%)      0.1%  (  -5% -    6%)  0.898
LowSloppyPhrase                    127.89   (1.7%)         128.06   (1.8%)      0.1%  (  -3% -    3%)  0.817
Respell                             89.60   (1.3%)          89.82   (1.0%)      0.2%  (  -2% -    2%)  0.502
IntNRQ                              81.33   (2.4%)          81.60   (2.1%)      0.3%  (  -4% -    4%)  0.637
Wildcard                            96.90   (4.1%)          97.26   (4.7%)      0.4%  (  -8% -    9%)  0.796
OrNotHighHigh                      985.33   (3.2%)         990.06   (2.7%)      0.5%  (  -5% -    6%)  0.609
AndHighMedDayTaxoFacets             54.33   (2.4%)          54.60   (1.9%)      0.5%  (  -3% -    4%)  0.466
OrNotHighMed                       947.59   (4.2%)         953.57   (2.4%)      0.6%  (  -5% -    7%)  0.558
LowIntervalsOrdered                 17.30   (4.0%)          17.42   (2.9%)      0.7%  (  -6% -    7%)  0.552
TermDTSort                         109.45  (21.9%)         110.24  (21.7%)      0.7%  ( -35% -   56%)  0.917
Fuzzy2                              43.72   (1.0%)          44.06   (1.0%)      0.8%  (  -1% -    2%)  0.015
MedTermDayTaxoFacets                49.06   (4.7%)          49.45   (4.7%)      0.8%  (  -8% -   10%)  0.591
MedIntervalsOrdered                 15.98   (4.7%)          16.11   (3.3%)      0.8%  (  -6% -    9%)  0.517
HighIntervalsOrdered                12.35   (6.9%)          12.47   (5.7%)      0.9%  ( -10% -   14%)  0.644
HighTermDayOfYearSort              129.38  (17.4%)         130.99  (15.9%)      1.2%  ( -27% -   41%)  0.813
OrHighMedDayTaxoFacets              12.14   (6.5%)          12.30   (5.3%)      1.3%  (  -9% -   13%)  0.496
Fuzzy1                              52.90   (2.0%)          53.69   (1.9%)      1.5%  (  -2% -    5%)  0.014
OrNotHighLow                      1183.59   (3.0%)        1202.86   (2.4%)      1.6%  (  -3% -    7%)  0.058
OrHighHigh                          30.83   (5.0%)          31.33   (3.7%)      1.6%  (  -6% -   10%)  0.237
BrowseMonthTaxoFacets               26.56  (26.7%)          27.08  (27.3%)      1.9%  ( -41% -   76%)  0.820
Prefix3                            232.96  (13.6%)         237.62  (13.5%)      2.0%  ( -22% -   33%)  0.641
OrHighMed                          180.16   (5.8%)         183.86   (4.4%)      2.1%  (  -7% -   13%)  0.207
BrowseRandomLabelTaxoFacets         17.71  (19.6%)          18.08  (20.5%)      2.1%  ( -31% -   52%)  0.739
BrowseDateTaxoFacets                20.90  (22.8%)          22.28  (24.8%)      6.6%  ( -33% -   70%)  0.380
BrowseDayOfYearTaxoFacets           20.92  (23.1%)          22.38  (25.1%)      7.0%  ( -33% -   71%)  0.359
HighTermMonthSort                   29.10  (17.5%)          31.32  (22.2%)      7.6%  ( -27% -   57%)  0.228
```
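As a rough cross-check of the table above, the rows can be filtered by
p-value. The sketch below is illustrative only: the class `BenchSignificance`,
its `Row` holder, and the 0.05 threshold are assumptions introduced here, not
part of the benchmark tooling, and only a few rows are copied in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene or the benchmark tooling.
final class BenchSignificance {
  /** One benchmark row: task name, percent difference, and p-value. */
  static final class Row {
    final String task;
    final double pctDiff;
    final double pValue;

    Row(String task, double pctDiff, double pValue) {
      this.task = task;
      this.pctDiff = pctDiff;
      this.pValue = pValue;
    }
  }

  /** Returns the rows whose p-value is strictly below alpha. */
  static List<Row> significant(List<Row> rows, double alpha) {
    List<Row> out = new ArrayList<>();
    for (Row r : rows) {
      if (r.pValue < alpha) {
        out.add(r);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // A handful of rows copied from the table; the rest are omitted for brevity.
    List<Row> rows = List.of(
        new Row("HighTermTitleBDVSort", -4.6, 0.332),
        new Row("Fuzzy2", 0.8, 0.015),
        new Row("Fuzzy1", 1.5, 0.014),
        new Row("OrNotHighLow", 1.6, 0.058),
        new Row("BrowseDayOfYearTaxoFacets", 7.0, 0.359),
        new Row("HighTermMonthSort", 7.6, 0.228));
    // 0.05 is an assumed threshold, chosen only for illustration.
    for (Row r : significant(rows, 0.05)) {
      System.out.println(r.task + " " + r.pctDiff + "% p=" + r.pValue);
    }
  }
}
```

With these rows, only Fuzzy2 and Fuzzy1 fall below 0.05, and both differ by
under 2%. Every TaxoFacets task in the table has a p-value above 0.3.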