[ https://issues.apache.org/jira/browse/SOLR-8922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15233682#comment-15233682 ]

Yonik Seeley commented on SOLR-8922:
------------------------------------

OK, it took me *much* longer to do the benchmarks than planned.  It was 
sometimes difficult to get stable numbers (mostly, I imagine, due to 
variations in how HotSpot optimizes/deoptimizes things).

I took the first patch here and optimized it further, getting rid of some of 
the branches in the inner loop.  On the best runs this did not seem to make 
much of a difference, but it became apparent that slower runs were much more 
frequent before this optimization (on average, it was about 15% better).
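
For context, here is a minimal sketch of the general small-array-then-bitset 
idea behind the collector (*not* the actual patch; class and field names are 
illustrative, and the real DocSetCollector has more to it):

{code:java}
import java.util.BitSet;

// Sketch: buffer doc ids in a small int[] sized for the expected hit
// count, and promote to a BitSet only if that buffer actually overflows,
// so sparse results never pay for a large upfront allocation.
class UpgradeableCollector {
  private final int[] scratch;   // small upfront allocation
  private int count = 0;
  private BitSet bits;           // lazily allocated on overflow
  private final int maxDoc;

  UpgradeableCollector(int expectedHits, int maxDoc) {
    this.scratch = new int[expectedHits];
    this.maxDoc = maxDoc;
  }

  void collect(int doc) {
    if (bits != null) {                   // already promoted
      bits.set(doc);
    } else if (count < scratch.length) {  // common sparse case
      scratch[count++] = doc;
    } else {                              // overflow: promote once, replay
      bits = new BitSet(maxDoc);
      for (int i = 0; i < count; i++) {
        bits.set(scratch[i]);
      }
      bits.set(doc);
    }
  }
}
{code}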

Benchmark of the new patch vs trunk:
10M doc index, 20% chance of a document missing the value for a field.
Queries consisted of many filters (using the filter() support in the query 
syntax)... 50 per request, with a filterCache size of 1 to generally avoid 
cache hits.  The large number of filters per request is simply to make docset 
generation the bottleneck; the request shape is sketched below.
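
For illustration only (field and term names are placeholders, not the actual 
benchmark code), a request with that shape could be built like this:

{code:java}
// Hypothetical sketch: 50 filter() clauses per query so that DocSet
// generation dominates the cost of each request.  With filterCache
// size 1, nearly every clause rebuilds its DocSet from scratch.
StringBuilder q = new StringBuilder("*:*");
for (int i = 0; i < 50; i++) {
  q.append(" AND filter(field_").append(i).append(":term)");
}
// q.toString() then goes out as the q parameter of the request
{code}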

Single-threaded performance (nterms is the number of unique terms in the field 
across the entire index):
|| nterms || perf improvement ||
|10|1.88%|
|100|-1.05%|
|1000|38.25%|
|10000|75.10%|
|100000|88.86%|
|1000000|94.49%|

Single-threaded analysis: one could expect single-threaded performance of the 
patch to be slower... if we promote to a bit set, we have allocated more 
memory and done more work.  Also, with a single thread issuing requests, other 
CPU cores are free to perform GC concurrently.  The fact that the patch was 
*faster* for the field with 10 unique terms is most likely measurement 
inaccuracy.  nterms=10 had the most instability across runs, and nterms=100 
the next most; the standard deviation of the results dropped as nterms 
increased.

Multi-threaded performance (8 threads on a 4-core CPU):
|| nterms || perf improvement ||
|100|14.49%|
|1000|93.36%|
|10000|179.07%|
|100000|216.65%|
|1000000|148.45%|


> DocSetCollector can allocate massive garbage on large indexes
> -------------------------------------------------------------
>
>                 Key: SOLR-8922
>                 URL: https://issues.apache.org/jira/browse/SOLR-8922
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Jeff Wartes
>            Assignee: Yonik Seeley
>         Attachments: SOLR-8922.patch
>
>
> After reaching a point of diminishing returns tuning the GC collector, I 
> decided to take a look at where the garbage was coming from. To my surprise, 
> it turned out that for my index and query set, almost 60% of the garbage was 
> coming from this single line:
> https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49
> This is due to the simple fact that I have 86M documents in my shards. 
> Allocating a scratch array big enough to track a result set 1/64th the size 
> of my index (1.3M entries) is almost certainly excessive, considering my 
> 99.9th percentile hit count is less than 56k.
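
To make the quoted numbers concrete (a back-of-the-envelope sketch; the 86M 
and 1/64th figures come from the description above):

{code:java}
// With maxDoc = 86M, the upfront scratch array is roughly maxDoc/64 ints:
long maxDoc = 86_000_000L;
long scratchInts = maxDoc >> 6;        // = 1,343,750 ints (~1.3M)
long scratchBytes = scratchInts * 4L;  // = 5,375,000 bytes, ~5.4 MB
// ...allocated per query, even when the actual hit count is under 56k.
{code}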


