zacharymorn commented on pull request #418: URL: https://github.com/apache/lucene/pull/418#issuecomment-966862380
I re-ran the perf test after https://github.com/mikemccand/luceneutil/commit/0550148b67f82d446e07bd0b4fdbde24f1d6228d was merged.

Results from `python3 src/python/localrun.py -source combinedFieldsBig`:

Run 1:
```
Task         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff               p-value
CFQHighHigh          3.69   (1.8%)                     2.49   (6.2%)  -32.5% ( -39% - -24%)  0.000
CFQHighMed           4.95   (2.1%)                     4.19   (5.9%)  -15.5% ( -22% - -7%)   0.000
PKLookup           125.72   (4.5%)                   126.86  (10.3%)    0.9% ( -13% - 16%)   0.719
CFQHighLow          19.92   (2.2%)                    20.80   (9.5%)    4.4% ( -7% - 16%)    0.043
```

Run 2:
```
Task         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff               p-value
CFQHighHigh          3.61   (2.8%)                     2.48   (2.9%)  -31.4% ( -36% - -26%)  0.000
PKLookup           116.67   (7.1%)                   123.97   (5.5%)    6.3% ( -5% - 20%)    0.002
CFQHighMed           4.97   (3.6%)                     5.29   (5.5%)    6.6% ( -2% - 16%)    0.000
CFQHighLow          11.96   (4.5%)                    13.99   (6.5%)   17.0% ( 5% - 29%)     0.000
```

Run 3:
```
Task         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff               p-value
CFQHighHigh          3.51   (4.2%)                     2.44   (6.5%)  -30.5% ( -39% - -20%)  0.000
PKLookup           105.72  (11.9%)                   108.81  (11.2%)    2.9% ( -18% - 29%)   0.424
CFQHighMed          10.85   (4.2%)                    11.60  (11.4%)    6.9% ( -8% - 23%)    0.011
CFQHighLow          15.11   (5.6%)                    16.16   (9.8%)    7.0% ( -7% - 23%)    0.006
```

Results from `python3 src/python/localrun.py -source combinedFieldsUnevenlyWeightedBig`:

Run 1:
```
Task         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff               p-value
PKLookup            93.42  (13.7%)                    88.23  (11.7%)   -5.6% ( -27% - 23%)   0.168
CFQHighMed           4.69  (10.7%)                     5.00  (18.0%)    6.6% ( -20% - 39%)   0.160
CFQHighHigh          4.51  (10.6%)                     5.17  (17.7%)   14.6% ( -12% - 48%)   0.002
CFQHighLow          14.13   (8.5%)                    23.11  (32.3%)   63.5% ( 20% - 114%)   0.000
```

Run 2:
```
Task         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff               p-value
CFQHighMed           4.77   (4.5%)                     4.10   (8.3%)  -14.2% ( -25% - -1%)   0.000
PKLookup            98.99  (12.3%)                   101.47  (12.5%)    2.5% ( -19% - 31%)   0.522
CFQHighHigh          4.88   (5.3%)                     5.98  (11.5%)   22.6% ( 5% - 41%)     0.000
CFQHighLow          11.57   (5.6%)                    18.86  (18.8%)   62.9% ( 36% - 92%)    0.000
```

Run 3:
```
Task         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff               p-value
CFQHighHigh          3.55   (5.1%)                     2.38   (9.0%)  -32.9% ( -44% - -19%)  0.000
PKLookup           101.29   (7.0%)                    94.22  (15.4%)   -7.0% ( -27% - 16%)   0.065
CFQHighLow          15.43   (5.8%)                    16.60  (11.2%)    7.6% ( -8% - 26%)    0.007
CFQHighMed           3.12   (5.1%)                     3.83  (15.0%)   22.7% ( 2% - 45%)     0.000
```

---

For one of the most negatively impacted queries (-42.0%), `CFQHighHigh: at united +combinedFields=titleTokenized^4.0,body^2.0 # freq=2834104 freq=1185528`, the JFR CPU profiling result looks like the following:

```
PERCENT  CPU SAMPLES  STACK
15.82%   13099        org.apache.lucene.sandbox.search.CombinedFieldQuery$1$1#doMergeImpactsPerField()
11.46%   9487         org.apache.lucene.sandbox.search.MultiNormsLeafSimScorer$MultiFieldNormValues#advanceExact()
4.69%    3883         org.apache.lucene.search.DisiPriorityQueue#downHeap()
3.66%    3027         org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()
1.93%    1598         org.apache.lucene.search.DisjunctionDISIApproximation#advance()
1.92%    1590         org.apache.lucene.sandbox.search.CombinedFieldQuery$CombinedFieldScorer#freq()
1.92%    1588         org.apache.lucene.search.TopScoreDocCollector$SimpleTopScoreDocCollector$1#collect()
1.77%    1467         org.apache.lucene.sandbox.search.CombinedFieldQuery$WeightedDisiWrapper#freq()
1.68%    1392         org.apache.lucene.search.DisiPriorityQueue#top()
1.60%    1326         org.apache.lucene.search.DisiPriorityQueue#topList()
1.54%    1276         org.apache.lucene.util.PriorityQueue#downHeap()
1.50%    1243         java.lang.Math#round()
1.45%    1201         org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3#longValue()
1.38%    1145         java.util.HashMap#resize()
1.35%    1115         org.apache.lucene.store.ByteBufferGuard#ensureValid()
1.33%    1100         org.apache.lucene.sandbox.search.CombinedFieldQuery$1$1#mergeImpactsPerField()
1.21%    1001         java.util.HashMap$HashIterator#nextNode()
1.17%    972          java.util.LinkedList#linkLast()
1.13%    934          org.apache.lucene.codecs.lucene90.Lucene90PostingsReader#findFirstGreater()
1.07%    883          org.apache.lucene.sandbox.search.CombinedFieldQuery$CombinedFieldScorer#score()
1.06%    878          org.apache.lucene.store.ByteBufferGuard#getByte()
1.02%    841          org.apache.lucene.sandbox.search.MultiNormsLeafSimScorer#getNormValue()
```

This suggests that a substantial share of CPU time is spent merging impacts within the same field. I suspect this happens when the max score is computed too frequently: a frequent term's skip list would be "dense", and that skip list is also used to determine `upTo` for the max score: https://github.com/apache/lucene/blob/2a9adb81df314ffeb92951bbf2d99fecc94fa581/lucene/core/src/java/org/apache/lucene/search/ImpactsDISI.java#L78-L82
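To illustrate the suspicion about `upTo`, here is a toy model of how dense block boundaries could multiply the number of max-score recomputations. This is not Lucene code: the class `UpToWindowSketch`, the method `countWindows`, and the assumption that each max-score window ends at the nearest block boundary of any term are all hypothetical simplifications, and real skip data does not have fixed-size doc ID spans.

```java
public class UpToWindowSketch {

    /**
     * Toy model, not Lucene code. Each term is assumed to have skip blocks that
     * cover a fixed span of doc IDs (blockSpans[i] > 0). A max-score "window"
     * is assumed to end at the nearest block boundary of any term, and every
     * new window would trigger one max-score computation (and therefore one
     * impact merge).
     *
     * @return the number of windows needed to cover doc IDs [0, maxDoc)
     */
    public static int countWindows(int maxDoc, int[] blockSpans) {
        int windows = 0;
        int doc = 0;
        while (doc < maxDoc) {
            int upTo = maxDoc - 1;
            for (int span : blockSpans) {
                // Last doc ID of the block that contains `doc` for this term.
                int blockEnd = (doc / span + 1) * span - 1;
                upTo = Math.min(upTo, blockEnd);
            }
            windows++;       // one max-score computation per window
            doc = upTo + 1;  // continue from the start of the next window
        }
        return windows;
    }

    public static void main(String[] args) {
        // Two terms with sparse blocks: few windows.
        System.out.println(countWindows(1000, new int[] {100, 250})); // 12
        // One term with dense blocks dominates: many more windows.
        System.out.println(countWindows(1000, new int[] {10, 250}));  // 100
    }
}
```

In this model the densest term alone sets the recomputation rate, which would be consistent with the High/High queries being hit hardest, but the profile above does not by itself confirm that this is the mechanism.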