[ 
https://issues.apache.org/jira/browse/LUCENE-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859757#comment-16859757
 ] 

ASF subversion and git services commented on LUCENE-8834:
---------------------------------------------------------

Commit 97ca9df7ef3733acd4babf10610797e36ac1d996 in lucene-solr's branch 
refs/heads/master from Tim Underwood
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=97ca9df ]

LUCENE-8834: Cache the SortedNumericDocValues.docValueCount() value whenever it 
is used in a loop (#698)



> Cache the SortedNumericDocValues.docValueCount() value whenever it is used in 
> a loop
> ------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8834
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8834
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Tim Underwood
>            Priority: Minor
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> While troubleshooting some multi-valued facet performance problems in Solr, I 
> noticed that caching the SortedNumericDocValues.docValueCount() value when it 
> is used as a loop condition provides a performance improvement.
> Specifically, going from something like this:
> {code:java}
> for (int i = 1; i < longs.docValueCount(); i++) {
>   ...
> }
> {code}
> to this:
> {code:java}
> final int docValueCount = longs.docValueCount();
> for (int i = 1; i < docValueCount; i++) {
>   ...
> }
> {code}
> or this:
> {code:java}
> for (int i = 1, count = longs.docValueCount(); i < count; i++) {
>   ...
> }
> {code}
> provides a faceting performance improvement when trying to facet using doc 
> values on a multi-valued field with more than a few values per document.
> This patch modifies most of the places in Lucene/Solr that were not already 
> using this pattern.
> h2. Unscientific Manual Benchmarks
> I focused on the change to NumericFacets.java and 
> FacetFieldProcessorByHashDV.java since that is what I was specifically trying 
> to improve.
> Details about my setup:
> * Index was created using Lucene/Solr 7.6.0 (I'm in the process of upgrading 
> to 8.1.1)
> * Total Docs: 5,736,951
> * I'm faceting on a single multi-valued field that has 63,070,176 total 
> values indexed (10.99 values per document on average).
> * OpenJDK 11
> h3. Lucene/Solr 7.6.0:
> ||Facet Type||QTime Before Patch||QTime After Patch||
> |Legacy Facets|1042ms|854ms|
> |JSON Facets|823ms|783ms|
> h3. Lucene/Solr 8.1.1 (using the 7.6.0 index):
> ||Facet Type||QTime Before Patch||QTime After Patch||
> |Legacy Facets|1043ms|777ms|
> |JSON Facets|827ms|792ms|
> The reported QTime is simply the lowest QTime I was able to get after 
> repeating the query a few dozen times. This is not very scientific, but it 
> was repeatable (removing the patch increased the times; reapplying the patch 
> decreased the times).
> The patch touches both Lucene and Solr code, which is why I have filed this 
> as a LUCENE issue. I can reorganize and break it apart if needed.
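
The loop rewrite described in the issue above can be illustrated with a small, self-contained sketch. It does not use Lucene: CountingValues is a hypothetical stand-in that only mimics the docValueCount()/nextValue() call shape and counts how often docValueCount() is invoked, and DocValueCountDemo, sumUncached, and sumCached are illustrative names as well. The loops here start at 0 rather than 1 because the sketch reads every value inside the loop. It shows only how many times the method is called in each loop form; it says nothing about the actual speedup, which depends on the concrete doc values implementation and the JIT.

```java
// Hypothetical stand-in for a per-document multi-valued source.
// This is NOT Lucene's SortedNumericDocValues; it only mimics the call shape.
final class CountingValues {
    private final long[] values;
    private int next = 0;
    int countCalls = 0; // number of times docValueCount() has been invoked

    CountingValues(long... values) {
        this.values = values;
    }

    int docValueCount() {
        countCalls++;
        return values.length;
    }

    long nextValue() {
        return values[next++];
    }
}

final class DocValueCountDemo {
    // Re-evaluates docValueCount() on every loop test.
    static long sumUncached(CountingValues longs) {
        long sum = 0;
        for (int i = 0; i < longs.docValueCount(); i++) {
            sum += longs.nextValue();
        }
        return sum;
    }

    // Reads docValueCount() once and reuses the local for the loop bound.
    static long sumCached(CountingValues longs) {
        long sum = 0;
        for (int i = 0, count = longs.docValueCount(); i < count; i++) {
            sum += longs.nextValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        CountingValues a = new CountingValues(3, 5, 8, 13);
        CountingValues b = new CountingValues(3, 5, 8, 13);
        System.out.println("uncached: sum=" + sumUncached(a) + ", docValueCount() calls=" + a.countCalls);
        System.out.println("cached:   sum=" + sumCached(b) + ", docValueCount() calls=" + b.countCalls);
    }
}
```

With n values per document, the first form calls docValueCount() n + 1 times (once per loop test, including the final failing one), while the second calls it exactly once. Both forms compute the same sum.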



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
