[ 
https://issues.apache.org/jira/browse/SOLR-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640070#comment-17640070
 ] 

ASF subversion and git services commented on SOLR-16555:
--------------------------------------------------------

Commit 1a964804025f7fddf31cbd54b51e2619decbcae2 in lucene-solr's branch 
refs/heads/branch_8_11 from Kevin Risden
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1a964804025 ]

SOLR-16555: SolrIndexSearcher - FilterCache intersections/andNot should not 
clone bitsets repeatedly (#1184) (#2675)

Co-authored-by: David Smiley <[email protected]>

> SolrIndexSearcher - FilterCache intersections/andNot should not clone bitsets 
> repeatedly
> ----------------------------------------------------------------------------------------
>
>                 Key: SOLR-16555
>                 URL: https://issues.apache.org/jira/browse/SOLR-16555
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query
>            Reporter: Kevin Risden
>            Assignee: Kevin Risden
>            Priority: Major
>              Labels: performance
>             Fix For: main (10.0)
>
>         Attachments: Screenshot 2022-11-16 at 14.52.37.png, Screenshot 
> 2022-11-16 at 14.53.23.png, Screenshot 2022-11-16 at 14.53.35.png, Screenshot 
> 2022-11-17 at 13.03.21.png, Screenshot 2022-11-17 at 13.25.57.png, Screenshot 
> 2022-11-17 at 13.28.06.png
>
>          Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> SolrIndexSearcher takes the bitset from the result and combines it with 
> each of the cached filter queries. Currently this clones the bitset once 
> per filter query. That does not look necessary: the code could instead 
> operate on the bitset itself or on a single mutable copy of it.
> See lines 1219 to 1225:
> https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1219
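
The difference between cloning per filter and mutating a single copy can be sketched as follows. This is only an illustration, not the Solr code: it uses java.util.BitSet as a stand-in for the bitsets used in SolrIndexSearcher, and the class and method names (FilterIntersectionSketch, combineCloningEachStep, combineWithSingleCopy) are hypothetical.

```java
import java.util.BitSet;
import java.util.List;

public class FilterIntersectionSketch {

    /** Allocates a new bitset for every filter, similar to the pattern described above. */
    static BitSet combineCloningEachStep(BitSet base, List<BitSet> include, List<BitSet> exclude) {
        BitSet answer = base;
        for (BitSet filter : include) {
            BitSet copy = (BitSet) answer.clone(); // fresh allocation per filter
            copy.and(filter);
            answer = copy;
        }
        for (BitSet filter : exclude) {
            BitSet copy = (BitSet) answer.clone(); // fresh allocation per filter
            copy.andNot(filter);
            answer = copy;
        }
        return answer;
    }

    /** Clones once, then applies every filter to that single mutable copy. */
    static BitSet combineWithSingleCopy(BitSet base, List<BitSet> include, List<BitSet> exclude) {
        BitSet answer = (BitSet) base.clone(); // the only allocation; inputs stay untouched
        for (BitSet filter : include) {
            answer.and(filter);
        }
        for (BitSet filter : exclude) {
            answer.andNot(filter);
        }
        return answer;
    }

    public static void main(String[] args) {
        BitSet base = new BitSet(16);
        base.set(0, 16);
        BitSet f1 = new BitSet(16);
        f1.set(0, 8);
        BitSet f2 = new BitSet(16);
        f2.set(4, 12);
        BitSet negative = new BitSet(16);
        negative.set(5);

        BitSet a = combineCloningEachStep(base, List.of(f1, f2), List.of(negative));
        BitSet b = combineWithSingleCopy(base, List.of(f1, f2), List.of(negative));
        System.out.println("same result: " + a.equals(b));
        System.out.println("result bits: " + b);
    }
}
```

Both methods produce the same set of bits and leave the inputs unmodified, which matters when the filters come from a shared cache. The second one allocates one bitset per call instead of one per filter.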
> ----
> I've been using async profiler 
> (https://github.com/jvm-profiling-tools/async-profiler) to investigate 
> Solr performance for a client. Originally I looked at CPU in the profile, 
> then found that I could also capture memory allocations during the same 
> run. This led me to find a very large amount of memory allocation over a 
> short period of time.
> Async profiler was run with the following parameters, which capture CPU, 
> memory allocation, and lock information for a 300-second period on a given 
> pid.
> {code:bash}
> /opt/async-profiler/profiler.sh -a -d 300 -o jfr -e cpu,alloc,lock \
>   -f /tmp/profile.jfr PID_GOES_HERE
> {code}
> The resulting JFR file is usually ~100-200MB. I am not going to attach it 
> here, since some of the calls in it include client-specific methods.
> Instead, here are screenshots of the findings from loading the JFR in both 
> IntelliJ and Java Mission Control:
>  !Screenshot 2022-11-17 at 13.28.06.png|width=750! 
> The memory allocated from SolrIndexSearcher#getProcessedFilter is ~60% of 
> total memory allocated during the 5 minute profile period.
>  !Screenshot 2022-11-16 at 14.52.37.png|width=750! 
> This shows that in 5 minutes, ~1TB (yes, that's TB = terabyte, or 1000GB) 
> of memory was allocated from SolrIndexSearcher#getProcessedFilter.
>  !Screenshot 2022-11-16 at 14.53.23.png|width=750! 
> ~680GB was allocated from BitDocSet#intersection
>  !Screenshot 2022-11-16 at 14.53.35.png|width=750! 
> ~315GB was allocated from BitDocSet#andNot
> Based on the CPU profile, the G1 garbage collector is keeping up, which 
> amazes me. Each of these objects is very short-lived.
> This was captured during load testing, so I can give some details about 
> the workload in question:
> * ~30 queries/second
> * ~5 fq parameters per query
> * so ~9000 queries in 5 minutes with ~45000 fq clauses
> * 10GB heap for the Solr instance, with 128GB of RAM on the node; the index 
> fits completely in memory.
> * One shard on the node for testing, with ~23 million documents in the 
> shard, optimized so there are no deletes.
> * Tested with Solr 8.8, but as far as I can tell the code has not changed 
> significantly on the main branch (a slight refactoring of the class 
> hierarchy after SOLR-14256 did not affect intersection/andNot).
> Based on my rough calculations, that is ~24MB of heap per filter query clause 
> (1.06TB/45000) or ~120MB of heap per query (assuming 5 fq per query).
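
The rough calculation above can be reproduced with a few lines of Java. This is a sketch using the numbers quoted in the description (30 queries/second, 300 seconds, 5 fq per query, 1.06TB taken as 1.06e12 bytes). The class and method names are hypothetical, and the bitset size is an added assumption of one bit per document in 64-bit words, not a figure taken from the profile.

```java
public class AllocationEstimate {

    /** Total number of fq clauses evaluated during the profiling window. */
    static long fqClauses(long queriesPerSecond, long seconds, long fqPerQuery) {
        return queriesPerSecond * seconds * fqPerQuery;
    }

    /** Average bytes allocated per unit (clause or query). */
    static double bytesPer(double totalBytes, long units) {
        return totalBytes / units;
    }

    /** Assumed size of a one-bit-per-document bitset backed by 64-bit words. */
    static long bitsetBytes(long numDocs) {
        return ((numDocs + 63) / 64) * 8;
    }

    public static void main(String[] args) {
        long seconds = 300;
        long queries = 30 * seconds;              // ~9000 queries
        long clauses = fqClauses(30, seconds, 5); // ~45000 fq clauses
        double totalBytes = 1.06e12;              // 1.06TB, decimal units

        System.out.println("fq clauses:       " + clauses);
        System.out.println("bytes per clause: " + bytesPer(totalBytes, clauses));
        System.out.println("bytes per query:  " + bytesPer(totalBytes, queries));
        System.out.println("bitset bytes for 23M docs (assumed layout): "
                + bitsetBytes(23_000_000L));
    }
}
```

With these inputs the average comes to about 23.6MB per clause and about 118MB per query, in line with the ~24MB and ~120MB estimates. Under the one-bit-per-document assumption, a single bitset over ~23 million documents would be about 2.9MB.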
> ----
> I loaded the same JFR up in Java Mission Control to see if there were other 
> insights and found TLAB memory allocation details.
>  !Screenshot 2022-11-17 at 13.25.57.png|width=750! 
> Since most of these are large allocations, Java Mission Control helpfully 
> points out that a large share of the allocations (~80%) happen outside of 
> TLAB and that you probably shouldn't do that.
>  !Screenshot 2022-11-17 at 13.03.21.png|width=750! 
> From what I understand, allocations outside of TLAB should in theory be a 
> small portion of all allocations, but here they aren't:
> * https://shipilev.net/jvm/anatomy-quarks/4-tlab-allocation/
> * https://www.opsian.com/blog/jvm-tlabs-important-multicore/
> * https://stackoverflow.com/questions/26351243/allocations-in-new-tlab-vs-allocations-outside-tlab



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
