[ 
https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17230701#comment-17230701
 ] 

ASF subversion and git services commented on LUCENE-9378:
---------------------------------------------------------

Commit 06877b2c6e47bc481a79d7bedd8ea4fb099f1b4c in lucene-solr's branch 
refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=06877b2 ]

LUCENE-9378: Make it possible to configure how to trade speed for compression 
on doc values. (#2069)

This adds a switch to `Lucene80DocValuesFormat` which allows to
configure whether to prioritize retrieval speed over compression ratio
or the other way around. When prioritizing retrieval speed, binary doc
values are written using the exact same format as before more aggressive
compression got introduced.

> Configurable compression for BinaryDocValues
> --------------------------------------------
>
>                 Key: LUCENE-9378
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9378
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Viral Gandhi
>            Priority: Major
>         Attachments: hotspots-v76x.png, hotspots-v76x.png, hotspots-v76x.png, 
> hotspots-v76x.png, hotspots-v76x.png, hotspots-v77x.png, hotspots-v77x.png, 
> hotspots-v77x.png, hotspots-v77x.png, image-2020-06-12-22-17-30-339.png, 
> image-2020-06-12-22-17-53-961.png, image-2020-06-12-22-18-24-527.png, 
> image-2020-06-12-22-18-48-919.png, snapshot-v77x.nps, snapshot-v77x.nps, 
> snapshot-v77x.nps, snapshots-v76x.nps, snapshots-v76x.nps, snapshots-v76x.nps
>
>          Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> Lucene 8.5.1 includes a change to always [compress 
> BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This 
> caused (~30%) reduction in our red-line QPS (throughput). 
> We think users should be given some way to opt-in for this compression 
> feature instead of always being enabled which can have a substantial query 
> time cost as we saw during our upgrade. [~mikemccand] suggested one possible 
> approach by introducing a *mode* in Lucene80DocValuesFormat (COMPRESSED and 
> UNCOMPRESSED) and allowing users to create a custom Codec subclassing the 
> default Codec and pick the format they want.
> Idea is similar to Lucene50StoredFieldsFormat which has two modes, 
> Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here's related issues for adding benchmark covering BINARY doc values 
> query-time performance - [https://github.com/mikemccand/luceneutil/issues/61]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to