[ 
https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113489#comment-17113489
 ] 

Michael Sokolov commented on LUCENE-9378:
-----------------------------------------

I updated luceneutil to enable sorting by a BinaryDocValues field over title 
and ran a test across a wide range of tasks, comparing branch_8_4 (before) 
and branch_8_5 (after). In this test, all tasks have a title sort criterion 
applied. Interestingly, BrowseDateTaxoFacets shows a big improvement (+77.3%)! 
But otherwise we see a pretty significant degradation in performance: most 
query tasks lose between roughly 56% and 82% of their QPS.

 
||Task||QPS before||StdDev||QPS after||StdDev||Pct diff||
|MedTerm|9.40|(3.1%)|1.68|(0.4%)|-82.1% ( -83% - -81%)|
|LowTerm|20.17|(1.8%)|3.74|(0.4%)|-81.5% ( -82% - -80%)|
|Wildcard|5.25|(3.3%)|1.02|(0.4%)|-80.6% ( -81% - -79%)|
|Prefix3|12.83|(2.3%)|2.52|(0.4%)|-80.4% ( -81% - -79%)|
|OrHighLow|3.07|(4.1%)|0.71|(0.6%)|-76.9% ( -78% - -75%)|
|HighTerm|2.79|(4.6%)|0.72|(0.5%)|-74.1% ( -75% - -72%)|
|Fuzzy2|19.88|(2.7%)|5.16|(0.5%)|-74.0% ( -75% - -72%)|
|IntNRQ|329.04|(1.4%)|85.42|(0.4%)|-74.0% ( -74% - -73%)|
|AndHighHigh|5.44|(3.1%)|1.52|(0.6%)|-72.1% ( -73% - -70%)|
|AndHighMed|7.85|(2.4%)|2.55|(0.6%)|-67.4% ( -68% - -65%)|
|LowSloppyPhrase|5.11|(2.4%)|1.90|(0.6%)|-62.9% ( -64% - -61%)|
|OrHighHigh|1.47|(4.2%)|0.56|(1.0%)|-61.7% ( -64% - -58%)|
|LowPhrase|8.21|(1.9%)|3.23|(0.6%)|-60.6% ( -61% - -59%)|
|HighSloppyPhrase|1.48|(3.2%)|0.61|(0.9%)|-58.9% ( -61% - -56%)|
|Fuzzy1|112.25|(5.7%)|46.46|(1.1%)|-58.6% ( -61% - -54%)|
|MedSloppyPhrase|2.16|(3.0%)|0.94|(0.7%)|-56.5% ( -58% - -54%)|
|OrHighMed|1.23|(4.4%)|0.54|(1.2%)|-55.9% ( -58% - -52%)|
|MedPhrase|2.87|(2.6%)|1.77|(1.0%)|-38.5% ( -40% - -35%)|
|HighPhrase|0.28|(3.3%)|0.21|(1.9%)|-24.1% ( -28% - -19%)|
|HighIntervalsOrdered|0.48|(4.7%)|0.41|(2.9%)|-16.2% ( -22% - -9%)|
|Respell|99.24|(1.7%)|86.51|(0.8%)|-12.8% ( -15% - -10%)|
|AndHighLow|302.35|(2.5%)|276.95|(2.6%)|-8.4% ( -13% - -3%)|
|BrowseDayOfYearTaxoFacets|4202.04|(3.0%)|4057.48|(2.6%)|-3.4% ( -8% - 2%)|
|BrowseMonthTaxoFacets|4160.07|(2.8%)|4080.02|(2.2%)|-1.9% ( -6% - 3%)|
|BrowseDayOfYearSSDVFacets|3.29|(4.9%)|3.29|(7.1%)|0.0% ( -11% - 12%)|
|BrowseMonthSSDVFacets|3.68|(15.7%)|3.69|(16.9%)|0.3% ( -27% - 39%)|
|BrowseDateTaxoFacets|0.54|(6.3%)|0.96|(5.8%)|77.3% ( 61% - 95%)|
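
As a rough sanity check on the Pct diff column, the relative change can be recomputed from the two QPS columns. The sketch below is not luceneutil code. The helper name `pct_diff` is hypothetical, and it assumes the column is simply (after - before) / before. Because the QPS values in the table are rounded to two decimals, a recomputed figure can differ slightly from the published one, most visibly for small QPS values such as BrowseDateTaxoFacets.

```python
def pct_diff(qps_before: float, qps_after: float) -> float:
    """Relative change in QPS from 'before' to 'after', in percent."""
    return 100.0 * (qps_after - qps_before) / qps_before


# (QPS before, QPS after) copied from a few rows of the table above.
rows = {
    "MedTerm": (9.40, 1.68),
    "IntNRQ": (329.04, 85.42),
    "BrowseDateTaxoFacets": (0.54, 0.96),
}

for task, (before, after) in rows.items():
    # Negative values are slowdowns, positive values are speedups.
    print(f"{task}: {pct_diff(before, after):+.1f}%")
```

For MedTerm this gives about -82.1%, which agrees with the table.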

> Configurable compression for BinaryDocValues
> --------------------------------------------
>
>                 Key: LUCENE-9378
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9378
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Viral Gandhi
>            Priority: Minor
>
> Lucene 8.5.1 includes a change to always [compress 
> BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This 
> caused a ~30% reduction in our red-line QPS (throughput). 
> We think users should be given some way to opt in to this compression 
> feature instead of it always being enabled, which can have a substantial 
> query-time cost, as we saw during our upgrade. [~mikemccand] suggested one 
> possible approach: introduce a *mode* in Lucene84DocValuesFormat (COMPRESSED 
> and UNCOMPRESSED) and allow users to create a custom Codec that subclasses 
> the default Codec and picks the format they want.
> The idea is similar to Lucene50StoredFieldsFormat, which has two modes, 
> Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here is a related issue for adding a benchmark covering BINARY doc values 
> query-time performance - [https://github.com/mikemccand/luceneutil/issues/61]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

