[
https://issues.apache.org/jira/browse/LUCENE-10334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Feng Guo updated LUCENE-10334:
------------------------------
Description:
Previous discussion: [https://github.com/apache/lucene/pull/557]
This issue proposes adding a new BlockReader based on ForUtil to replace the
DirectReader currently used for NumericDocValues.
-*Benchmark based on wiki10m*- (The previous benchmark results were wrong, so I
removed them to avoid misleading readers; see the benchmarks in the comments.)
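To make the proposal concrete, here is a minimal Java sketch of the general idea as I understand it: reading each bit-packed value directly on demand versus decoding a whole block into a buffer and serving later reads from that buffer. It is not the patch from the pull request and does not use Lucene's ForUtil or DirectReader. The class BlockDecodeSketch, its methods, the packing layout, and the block size of 128 are all hypothetical, chosen only for illustration.

```java
import java.util.Arrays;

/** Hypothetical illustration only; none of these names are Lucene APIs. */
public class BlockDecodeSketch {
  static final int BLOCK_SIZE = 128; // assumed block size, for illustration

  /** Packs values into 64-bit words; each value must fit in bitsPerValue bits (1..63). */
  static long[] pack(long[] values, int bitsPerValue) {
    long totalBits = (long) values.length * bitsPerValue;
    long[] packed = new long[(int) ((totalBits + 63) >>> 6)];
    for (int i = 0; i < values.length; i++) {
      long bitPos = (long) i * bitsPerValue;
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      packed[word] |= values[i] << shift;
      if (shift + bitsPerValue > 64) { // value straddles two words
        packed[word + 1] |= values[i] >>> (64 - shift);
      }
    }
    return packed;
  }

  /** Direct style: locate and extract a single value on every call. */
  static long getDirect(long[] packed, int bitsPerValue, int index) {
    long bitPos = (long) index * bitsPerValue;
    int word = (int) (bitPos >>> 6);
    int shift = (int) (bitPos & 63);
    long v = packed[word] >>> shift;
    if (shift + bitsPerValue > 64) {
      v |= packed[word + 1] << (64 - shift);
    }
    return v & ((1L << bitsPerValue) - 1);
  }

  /** Block style: decode `count` consecutive values starting at firstIndex into `out`. */
  static void decodeBlock(long[] packed, int bitsPerValue, int firstIndex, int count, long[] out) {
    long mask = (1L << bitsPerValue) - 1;
    long bitPos = (long) firstIndex * bitsPerValue;
    for (int i = 0; i < count; i++, bitPos += bitsPerValue) {
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      long v = packed[word] >>> shift;
      if (shift + bitsPerValue > 64) {
        v |= packed[word + 1] << (64 - shift);
      }
      out[i] = v & mask;
    }
  }

  /** Serves get() from a decoded buffer and re-decodes only when the block changes. */
  static final class BlockReaderSketch {
    private final long[] packed;
    private final int bitsPerValue;
    private final int valueCount;
    private final long[] buffer = new long[BLOCK_SIZE];
    private int currentBlock = -1;

    BlockReaderSketch(long[] packed, int bitsPerValue, int valueCount) {
      this.packed = packed;
      this.bitsPerValue = bitsPerValue;
      this.valueCount = valueCount;
    }

    long get(int index) {
      int block = index / BLOCK_SIZE;
      if (block != currentBlock) {
        int first = block * BLOCK_SIZE;
        int count = Math.min(BLOCK_SIZE, valueCount - first); // last block may be partial
        decodeBlock(packed, bitsPerValue, first, count, buffer);
        currentBlock = block;
      }
      return buffer[index % BLOCK_SIZE];
    }
  }

  public static void main(String[] args) {
    long[] values = new long[300];
    for (int i = 0; i < values.length; i++) {
      values[i] = (i * 37L) & 0xFFF; // arbitrary 12-bit test data
    }
    long[] packed = pack(values, 12);
    BlockReaderSketch reader = new BlockReaderSketch(packed, 12, values.length);
    long[] viaBlocks = new long[values.length];
    for (int i = 0; i < values.length; i++) {
      viaBlocks[i] = reader.get(i);
    }
    System.out.println("block reads match input: " + Arrays.equals(values, viaBlocks));
  }
}
```

In this sketch the block reader pays the decoding cost once per 128 values when documents are visited in order, which is the access pattern of iterating doc values; purely random access would decode a full block per read and could be slower than the direct style.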
was:
Previous discussion: https://github.com/apache/lucene/pull/557
This issue proposes adding a new BlockReader based on ForUtil to replace the
DirectReader currently used for NumericDocValues.
*Benchmark based on wiki10m*
{code:java}
Task                        QPS baseline   StdDev QPS my_modified_version   StdDev Pct diff                  p-value
OrNotHighHigh                     694.17   (8.2%)                  685.83   (7.0%)    -1.2%    ( -15% - 15%)   0.618
Respell                            75.15   (2.7%)                   74.32   (2.0%)    -1.1%      ( -5% - 3%)   0.146
Prefix3                           220.11   (5.1%)                  217.78   (5.8%)    -1.1%    ( -11% - 10%)   0.541
Wildcard                          129.75   (3.7%)                  128.63   (2.5%)    -0.9%      ( -6% - 5%)   0.383
LowSpanNear                        68.54   (2.1%)                   68.00   (2.4%)    -0.8%      ( -5% - 3%)   0.269
OrNotHighMed                      732.90   (6.8%)                  727.49   (5.3%)    -0.7%    ( -12% - 12%)   0.703
BrowseRandomLabelTaxoFacets     11879.03   (8.6%)                11799.33   (5.5%)    -0.7%    ( -13% - 14%)   0.769
HighSloppyPhrase                    6.87   (2.9%)                    6.83   (2.3%)    -0.6%      ( -5% - 4%)   0.496
OrHighNotMed                      827.54   (9.2%)                  822.94   (8.0%)    -0.6%    ( -16% - 18%)   0.838
MedSpanNear                        18.92   (5.7%)                   18.82   (5.6%)    -0.5%    ( -11% - 11%)   0.759
OrHighMedDayTaxoFacets             10.27   (4.0%)                   10.21   (4.3%)    -0.5%      ( -8% - 8%)   0.676
PKLookup                          207.98   (4.0%)                  206.85   (2.7%)    -0.5%      ( -7% - 6%)   0.621
LowIntervalsOrdered               159.17   (2.3%)                  158.32   (2.2%)    -0.5%      ( -4% - 3%)   0.445
HighSpanNear                        6.32   (4.2%)                    6.28   (4.1%)    -0.5%      ( -8% - 8%)   0.691
MedIntervalsOrdered                85.31   (3.2%)                   84.88   (2.9%)    -0.5%      ( -6% - 5%)   0.607
HighTerm                         1170.55   (5.8%)                 1164.79   (3.9%)    -0.5%      ( -9% - 9%)   0.753
LowSloppyPhrase                    14.54   (3.1%)                   14.48   (2.9%)    -0.4%      ( -6% - 5%)   0.651
HighPhrase                        112.81   (4.4%)                  112.39   (4.1%)    -0.4%      ( -8% - 8%)   0.781
OrNotHighLow                      858.02   (5.9%)                  854.99   (4.8%)    -0.4%    ( -10% - 10%)   0.835
HighIntervalsOrdered               25.08   (2.8%)                   25.00   (2.6%)    -0.3%      ( -5% - 5%)   0.701
MedPhrase                          27.20   (2.1%)                   27.11   (2.9%)    -0.3%      ( -5% - 4%)   0.689
MedTermDayTaxoFacets               81.55   (2.3%)                   81.35   (2.9%)    -0.3%      ( -5% - 5%)   0.762
IntNRQ                             63.36   (2.0%)                   63.21   (2.5%)    -0.2%      ( -4% - 4%)   0.740
Fuzzy2                             73.24   (5.5%)                   73.10   (6.2%)    -0.2%    ( -11% - 12%)   0.916
AndHighMedDayTaxoFacets            76.08   (3.5%)                   75.98   (3.4%)    -0.1%      ( -6% - 7%)   0.905
AndHighHigh                        62.20   (2.0%)                   62.18   (2.4%)    -0.0%      ( -4% - 4%)   0.954
BrowseMonthTaxoFacets           11993.48   (6.7%)                11989.53   (4.8%)    -0.0%    ( -10% - 12%)   0.986
OrHighNotLow                      732.82   (7.2%)                  732.80   (6.2%)    -0.0%    ( -12% - 14%)   0.999
Fuzzy1                             46.43   (5.3%)                   46.45   (6.0%)     0.0%    ( -10% - 11%)   0.989
LowTerm                          1608.25   (6.0%)                 1608.84   (4.9%)     0.0%    ( -10% - 11%)   0.983
OrHighMed                          75.90   (2.3%)                   75.93   (1.8%)     0.0%      ( -3% - 4%)   0.939
LowPhrase                         273.81   (2.9%)                  274.04   (3.3%)     0.1%      ( -5% - 6%)   0.932
AndHighLow                        717.24   (6.1%)                  718.17   (3.3%)     0.1%     ( -8% - 10%)   0.933
AndHighHighDayTaxoFacets           39.63   (2.5%)                   39.69   (2.6%)     0.1%      ( -4% - 5%)   0.862
OrHighHigh                         34.63   (1.8%)                   34.68   (2.0%)     0.1%      ( -3% - 4%)   0.821
MedSloppyPhrase                   158.80   (2.8%)                  159.09   (2.6%)     0.2%      ( -5% - 5%)   0.832
OrHighLow                         257.77   (2.9%)                  258.46   (4.6%)     0.3%      ( -7% - 8%)   0.826
AndHighMed                        133.43   (2.1%)                  133.79   (2.7%)     0.3%      ( -4% - 5%)   0.726
HighTermMonthSort                 145.28  (10.8%)                  145.88  (11.2%)     0.4%    ( -19% - 25%)   0.905
OrHighNotHigh                     834.99   (6.1%)                  839.62   (5.7%)     0.6%    ( -10% - 13%)   0.766
TermDTSort                         83.66   (9.6%)                   84.30  (11.1%)     0.8%    ( -18% - 23%)   0.817
BrowseDayOfYearTaxoFacets       11639.59   (5.1%)                11777.38   (6.0%)     1.2%     ( -9% - 12%)   0.502
MedTerm                          1473.62   (7.4%)                 1493.79   (6.4%)     1.4%    ( -11% - 16%)   0.530
HighTermTitleBDVSort              114.98  (16.7%)                  117.30  (18.8%)     2.0%    ( -28% - 45%)   0.720
HighTermDayOfYearSort             128.29  (17.2%)                  132.83  (22.6%)     3.5%    ( -30% - 52%)   0.577
BrowseDateTaxoFacets               19.25  (20.4%)                   26.77   (3.7%)    39.1%     ( 12% - 79%)   0.000
BrowseRandomLabelSSDVFacets        10.38   (3.5%)                   18.03   (6.8%)    73.7%     ( 61% - 87%)   0.000
BrowseMonthSSDVFacets              15.71   (3.6%)                   34.59  (12.4%)   120.1%   ( 100% - 141%)   0.000
BrowseDayOfYearSSDVFacets          14.31   (3.3%)                   33.54  (12.9%)   134.4%   ( 114% - 155%)   0.000
{code}
*candidate*
{code:java}
PERCENT       CPU SAMPLES   STACK
3.48%         9280          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsPostingsEnum#advance()
3.41%         9082          org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
3.32%         8836          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#nextPosition()
2.72%         7260          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#ordValue()
2.03%         5423          org.apache.lucene.queries.spans.NearSpansOrdered#stretchToOrder()
1.91%         5094          org.apache.lucene.queries.spans.TermSpans#nextStartPosition()
1.90%         5063          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#advance()
1.80%         4787          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader#findFirstGreater()
1.72%         4574          org.apache.lucene.search.PhraseScorer$1#matches()
1.55%         4141          org.apache.lucene.util.PriorityQueue#upHeap()
1.55%         4141          org.apache.lucene.codecs.lucene90.ForUtil#expand8()
1.53%         4073          org.apache.lucene.queries.spans.NearSpansOrdered#nextStartPosition()
1.47%         3929          org.apache.lucene.util.packed.BlockReader#get()
1.39%         3703          org.apache.lucene.search.ConjunctionDISI#doNext()
1.35%         3593          org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3#longValue()
1.32%         3514          jdk.internal.misc.ScopedMemoryAccess#getByteInternal()
1.21%         3236          org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()
1.13%         3003          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#nextDoc()
1.05%         2808          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsPostingsEnum#nextPosition()
1.04%         2780          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#skipPositions()
1.03%         2750          org.apache.lucene.search.BooleanScorer$OrCollector#collect()
1.03%         2732          org.apache.lucene.search.SloppyPhraseMatcher#maxFreq()
0.99%         2627          org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()
0.98%         2610          org.apache.lucene.search.MultiCollector$MultiLeafCollector#collect()
0.89%         2368          org.apache.lucene.search.SloppyPhraseMatcher#initSimple()
0.88%         2350          org.apache.lucene.queries.spans.NearSpansOrdered#advancePosition()
0.87%         2312          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockDocsEnum#advance()
0.84%         2252          org.apache.lucene.util.PriorityQueue#add()
0.82%         2176          org.apache.lucene.queries.spans.SpanScorer#setFreqCurrentDoc()
0.81%         2161          org.apache.lucene.codecs.lucene90.PForUtil#decode()
{code}
*baseline*
{code:java}
PERCENT       CPU SAMPLES   STACK
4.22%         12298         org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
3.25%         9468          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsPostingsEnum#advance()
3.04%         8872          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#nextPosition()
2.26%         6576          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#ordValue()
2.06%         5993          org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()
1.90%         5528          org.apache.lucene.queries.spans.NearSpansOrdered#stretchToOrder()
1.81%         5266          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#advance()
1.75%         5116          org.apache.lucene.queries.spans.TermSpans#nextStartPosition()
1.53%         4469          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader#findFirstGreater()
1.51%         4392          org.apache.lucene.search.PhraseScorer$1#matches()
1.44%         4204          org.apache.lucene.util.PriorityQueue#upHeap()
1.37%         3999          jdk.internal.misc.ScopedMemoryAccess#getByteInternal()
1.37%         3992          org.apache.lucene.codecs.lucene90.ForUtil#expand8()
1.37%         3991          org.apache.lucene.queries.spans.NearSpansOrdered#nextStartPosition()
1.33%         3869          org.apache.lucene.search.ConjunctionDISI#doNext()
1.27%         3688          org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#nextDoc()
1.24%         3606          java.nio.Buffer#scope()
1.23%         3593          org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3#longValue()
1.20%         3491          org.apache.lucene.util.packed.DirectReader$DirectPackedReader20#get()
1.16%         3392          java.nio.Buffer#checkIndex()
1.09%         3186          org.apache.lucene.util.packed.DirectReader$DirectPackedReader4#get()
1.09%         3164          org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()
1.01%         2946          org.apache.lucene.store.ByteBufferGuard#ensureValid()
0.95%         2772          org.apache.lucene.search.BooleanScorer$OrCollector#collect()
0.95%         2766          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#skipPositions()
0.95%         2763          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsPostingsEnum#nextPosition()
0.93%         2699          org.apache.lucene.search.SloppyPhraseMatcher#maxFreq()
0.92%         2678          org.apache.lucene.search.MultiCollector$MultiLeafCollector#collect()
0.87%         2545          org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()
0.85%         2479          org.apache.lucene.search.SloppyPhraseMatcher#initSimple()
{code}
> Introduce a BlockReader based on ForUtil and use it for NumericDocValues
> ------------------------------------------------------------------------
>
> Key: LUCENE-10334
> URL: https://issues.apache.org/jira/browse/LUCENE-10334
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/codecs
> Reporter: Feng Guo
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Previous discussion: [https://github.com/apache/lucene/pull/557]
> This issue proposes adding a new BlockReader based on ForUtil to replace the
> DirectReader currently used for NumericDocValues.
> -*Benchmark based on wiki10m*- (The previous benchmark results were wrong, so I
> removed them to avoid misleading readers; see the benchmarks in the comments.)