gsmiller opened a new pull request #600:
URL: https://github.com/apache/lucene/pull/600
# Description
`IntTaxonomyFacets` is intended as an internal implementation detail, but
because it has public visibility it may also be serving as an extension point
for user-created faceting implementations. We should reduce its visibility to
pkg-private.
# Solution
This follows #599 (where this class was deprecated in 9.x) by actually
reducing the visibility. It also cleans up visibility on some of the methods
and removes `increment` altogether in favor of directly accessing the counting
data structures.
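
To illustrate the shape of the `increment` change, here is a minimal sketch. It assumes a counter backed by either a dense array or a sparse map; the class `CountsSketch`, its fields, and its method names are hypothetical stand-ins, not the actual `IntTaxonomyFacets` code. The idea is that a per-call helper re-checks which representation is in use on every iteration, while direct access lets the caller pick the representation once and keep the loop body branch-free.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a facet counter; not Lucene's actual class. */
final class CountsSketch {
  // Exactly one of these is non-null: a dense array or a sparse map.
  final int[] dense;
  final Map<Integer, Integer> sparse;

  CountsSketch(int size, boolean useSparse) {
    dense = useSparse ? null : new int[size];
    sparse = useSparse ? new HashMap<>() : null;
  }

  int get(int ord) {
    return sparse != null ? sparse.getOrDefault(ord, 0) : dense[ord];
  }

  /** Helper style: the null check runs on every call. */
  void increment(int ord, int delta) {
    if (sparse != null) {
      sparse.merge(ord, delta, Integer::sum);
    } else {
      dense[ord] += delta;
    }
  }

  /** Rollup through the helper: the representation check sits inside the loop. */
  void rollupViaHelper(int parent, int[] children) {
    for (int child : children) {
      increment(parent, get(child));
    }
  }

  /** Rollup with direct access: check once, then run a branch-free loop. */
  void rollupDirect(int parent, int[] children) {
    if (sparse != null) {
      for (int child : children) {
        sparse.merge(parent, sparse.getOrDefault(child, 0), Integer::sum);
      }
    } else {
      for (int child : children) {
        dense[parent] += dense[child];
      }
    }
  }
}
```

Both rollup variants produce the same counts; the only difference is where the representation check is evaluated.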
# Tests
All existing tests pass. I also ran `luceneutil` benchmarks
(`wikimedium10m`) to make sure there were no unexpected regressions with this
change, since it alters the way taxo-faceting increments count data structures
during rollup. If anything, I thought there might be a small performance bump
since this eliminates the null check inside the loops. There may be one, given
that the taxo-faceting tasks trend towards an improvement, but none of the
p-values are close to significant, so it could also just be noise.
Regardless, it doesn't look like there are any regressions:
```
Task                        QPS baseline   StdDev  QPS candidate   StdDev  Pct diff             p-value
HighTermTitleBDVSort              111.43  (16.3%)         108.97  (15.0%)  -2.2% ( -28% - 34%)  0.656
TermDTSort                        114.41  (17.8%)         112.00  (14.7%)  -2.1% ( -29% - 36%)  0.683
Prefix3                            56.27  (11.6%)          55.22  (11.7%)  -1.9% ( -22% - 24%)  0.612
Wildcard                           52.32   (7.4%)          51.68   (7.5%)  -1.2% ( -15% - 14%)  0.605
HighTerm                         1744.29   (4.2%)        1723.49   (6.5%)  -1.2% ( -11% - 9%)   0.490
MedTerm                          1731.31   (3.9%)        1712.36   (6.1%)  -1.1% ( -10% - 9%)   0.497
OrHighLow                         496.22   (1.4%)         490.80   (2.0%)  -1.1% ( -4% - 2%)    0.045
LowTerm                          1704.51   (3.8%)        1686.34   (4.0%)  -1.1% ( -8% - 6%)    0.387
MedPhrase                          10.66   (2.6%)          10.56   (2.5%)  -0.9% ( -5% - 4%)    0.238
BrowseMonthSSDVFacets              14.39  (20.3%)          14.27  (21.5%)  -0.9% ( -35% - 51%)  0.894
OrNotHighMed                     1098.53   (4.2%)        1089.42   (4.1%)  -0.8% ( -8% - 7%)    0.524
HighSloppyPhrase                   12.77   (3.2%)          12.67   (3.2%)  -0.8% ( -7% - 5%)    0.423
Fuzzy1                             59.09   (1.2%)          58.62   (1.7%)  -0.8% ( -3% - 2%)    0.099
HighPhrase                        206.39   (2.6%)         204.85   (2.4%)  -0.7% ( -5% - 4%)    0.346
LowPhrase                         301.78   (1.6%)         299.85   (2.2%)  -0.6% ( -4% - 3%)    0.299
OrHighHigh                         40.38   (3.5%)          40.12   (3.8%)  -0.6% ( -7% - 6%)    0.584
OrNotHighLow                      938.33   (2.7%)         932.47   (2.2%)  -0.6% ( -5% - 4%)    0.423
OrHighMed                         124.57   (3.1%)         123.79   (3.3%)  -0.6% ( -6% - 5%)    0.538
HighTermDayOfYearSort             113.47  (20.3%)         112.80  (15.8%)  -0.6% ( -30% - 44%)  0.919
AndHighHigh                        41.58   (4.1%)          41.35   (4.4%)  -0.5% ( -8% - 8%)    0.690
LowSpanNear                        66.95   (2.6%)          66.64   (2.5%)  -0.5% ( -5% - 4%)    0.558
PKLookup                          167.41   (2.1%)         166.72   (2.6%)  -0.4% ( -4% - 4%)    0.577
BrowseRandomLabelSSDVFacets         9.43   (3.3%)           9.39   (3.3%)  -0.4% ( -6% - 6%)    0.705
LowSloppyPhrase                    12.46   (3.3%)          12.41   (2.8%)  -0.4% ( -6% - 5%)    0.694
MedSloppyPhrase                     8.75   (4.1%)           8.72   (3.4%)  -0.4% ( -7% - 7%)    0.748
OrHighMedDayTaxoFacets             13.39   (8.3%)          13.34   (6.2%)  -0.3% ( -13% - 15%)  0.885
Fuzzy2                             72.83   (1.7%)          72.61   (1.7%)  -0.3% ( -3% - 3%)    0.580
AndHighMed                        133.81   (3.0%)         133.43   (3.2%)  -0.3% ( -6% - 6%)    0.769
AndHighLow                       1065.36   (3.0%)        1063.00   (1.3%)  -0.2% ( -4% - 4%)    0.763
Respell                            54.25   (1.8%)          54.13   (1.6%)  -0.2% ( -3% - 3%)    0.682
OrHighNotHigh                    1348.35   (3.8%)        1345.48   (4.1%)  -0.2% ( -7% - 8%)    0.866
OrNotHighHigh                     879.98   (4.3%)         878.28   (4.1%)  -0.2% ( -8% - 8%)    0.886
HighTermMonthSort                  49.02  (18.1%)          48.97  (15.7%)  -0.1% ( -28% - 41%)  0.983
MedSpanNear                        68.05   (2.9%)          67.99   (2.7%)  -0.1% ( -5% - 5%)    0.912
HighSpanNear                       10.77   (2.1%)          10.77   (1.4%)  -0.0% ( -3% - 3%)    0.931
LowIntervalsOrdered                10.86   (3.7%)          10.86   (4.1%)  -0.0% ( -7% - 8%)    0.972
OrHighNotMed                      941.33   (5.0%)         941.16   (4.9%)  -0.0% ( -9% - 10%)   0.991
OrHighNotLow                      927.82   (5.5%)         928.06   (5.7%)   0.0% ( -10% - 11%)  0.988
IntNRQ                            225.06   (1.4%)         225.20   (1.8%)   0.1% ( -3% - 3%)    0.908
AndHighMedDayTaxoFacets            28.48   (2.2%)          28.51   (1.5%)   0.1% ( -3% - 3%)    0.821
BrowseMonthTaxoFacets              31.46   (9.4%)          31.59  (15.0%)   0.4% ( -21% - 27%)  0.920
AndHighHighDayTaxoFacets           10.44   (2.3%)          10.49   (2.8%)   0.5% ( -4% - 5%)    0.555
HighIntervalsOrdered               25.26   (7.1%)          25.38   (7.7%)   0.5% ( -13% - 16%)  0.838
MedIntervalsOrdered                25.71   (6.5%)          25.83   (7.3%)   0.5% ( -12% - 15%)  0.826
BrowseDayOfYearSSDVFacets          13.44  (16.9%)          13.51  (18.2%)   0.5% ( -29% - 42%)  0.930
MedTermDayTaxoFacets               23.72   (4.7%)          23.98   (9.0%)   1.1% ( -12% - 15%)  0.631
BrowseDayOfYearTaxoFacets          29.84  (31.0%)          31.70  (29.5%)   6.2% ( -41% - 96%)  0.514
BrowseRandomLabelTaxoFacets        22.94  (30.3%)          24.37  (29.2%)   6.2% ( -40% - 94%)  0.507
BrowseDateTaxoFacets               29.43  (30.8%)          31.29  (29.3%)   6.3% ( -41% - 95%)  0.506
```
# Checklist
Please review the following and check all that apply:
- [x] I have reviewed the guidelines for [How to
Contribute](https://wiki.apache.org/lucene/HowToContribute) and my code
conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request
title.
- [x] I have given Lucene maintainers
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the `main` branch.
- [x] I have run `./gradlew check`.
- [ ] I have added tests for my changes.