[ https://issues.apache.org/jira/browse/LUCENE-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16859757#comment-16859757 ]
ASF subversion and git services commented on LUCENE-8834:
---------------------------------------------------------

Commit 97ca9df7ef3733acd4babf10610797e36ac1d996 in lucene-solr's branch refs/heads/master from Tim Underwood
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=97ca9df ]

LUCENE-8834: Cache the SortedNumericDocValues.docValueCount() value whenever it is used in a loop (#698)

> Cache the SortedNumericDocValues.docValueCount() value whenever it is used in a loop
> ------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8834
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8834
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Tim Underwood
>            Priority: Minor
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> While troubleshooting some multi-valued facet performance problems in Solr I
> noticed that caching the SortedNumericDocValues.docValueCount() value when
> used as a loop condition provides a performance improvement.
> Specifically, going from something like this:
> {code:java}
> for (int i = 1; i < longs.docValueCount(); i++) {
>   ...
> }
> {code}
> to this:
> {code:java}
> final int docValueCount = longs.docValueCount();
> for (int i = 1; i < docValueCount; i++) {
>   ...
> }
> {code}
> or this:
> {code:java}
> for (int i = 1, count = longs.docValueCount(); i < count; i++) {
>   ...
> }
> {code}
> provides a faceting performance improvement when trying to facet using doc
> values on a multi-valued field with more than a few values per document.
> This patch modifies most of the places in Lucene/Solr that were not already
> using this pattern.
> h2. Unscientific Manual Benchmarks
> I focused on the change to NumericFacets.java and
> FacetFieldProcessorByHashDV.java since that is what I was specifically trying
> to improve.
> Details about my setup:
> * Index was created using Lucene/Solr 7.6.0 (I'm in the process of upgrading
> to 8.1.1)
> * Total Docs: 5,736,951
> * I'm faceting on a single multi-valued field that has 63,070,176 total
> values indexed (10.99 values on average per document)
> * OpenJDK 11
> h3. Lucene/Solr 7.6.0:
> ||Facet Type||QTime Before Patch||QTime After Patch||
> |Legacy Facets|1042ms|854ms|
> |JSON Facets|823ms|783ms|
> h3. Lucene/Solr 8.1.1 (using the 7.6.0 index):
> ||Facet Type||QTime Before Patch||QTime After Patch||
> |Legacy Facets|1043ms|777ms|
> |JSON Facets|827ms|792ms|
> The reported QTime is simply the lowest QTime I was able to get after
> repeating the query a few dozen times. So not very scientific, but it was
> repeatable (removing the patch increased the times, reapplying the patch
> decreased the times).
> The patch touches both Lucene and Solr code, which is why I have filed this
> as a LUCENE issue. I can reorganize it and break it apart if needed.
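
The loop rewrite described in the issue can be illustrated with a minimal, self-contained sketch. CountingValues below is a hypothetical stand-in, not Lucene's SortedNumericDocValues; it only counts how often docValueCount() is invoked. The sketch shows the difference in call counts between the two loop shapes. It does not measure timing, and it makes no claim about how the JIT treats the real implementations.

```java
class DocValueCountDemo {

    /** Hypothetical stand-in for a multi-valued doc values source (not Lucene's class). */
    static final class CountingValues {
        private final long[] values;
        private int next = 0;
        int countCalls = 0; // number of times docValueCount() has been invoked

        CountingValues(long... values) {
            this.values = values;
        }

        int docValueCount() {
            countCalls++;
            return values.length;
        }

        long nextValue() {
            return values[next++];
        }
    }

    /** Calls docValueCount() on every evaluation of the loop condition. */
    static long sumUncached(CountingValues longs) {
        long sum = 0;
        for (int i = 0; i < longs.docValueCount(); i++) {
            sum += longs.nextValue();
        }
        return sum;
    }

    /** Reads docValueCount() once and reuses the local copy. */
    static long sumCached(CountingValues longs) {
        long sum = 0;
        for (int i = 0, count = longs.docValueCount(); i < count; i++) {
            sum += longs.nextValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        CountingValues a = new CountingValues(3, 5, 8);
        CountingValues b = new CountingValues(3, 5, 8);
        System.out.println("uncached: sum=" + sumUncached(a) + ", docValueCount() calls=" + a.countCalls);
        System.out.println("cached:   sum=" + sumCached(b) + ", docValueCount() calls=" + b.countCalls);
    }
}
```

With three values, the uncached loop evaluates docValueCount() four times (once per condition check, including the final failing one), while the cached loop evaluates it once. Both produce the same sum.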