Jeff Wartes created SOLR-8944:
---------------------------------

             Summary: Improve geospatial garbage generation
                 Key: SOLR-8944
                 URL: https://issues.apache.org/jira/browse/SOLR-8944
             Project: Solr
          Issue Type: Improvement
            Reporter: Jeff Wartes


I’ve been continuing some analysis into JVM garbage sources in my Solr index. 
(5.4, 86M docs/core, 56k 99.9th percentile hit count with my query corpus)

After applying SOLR-8922, I find my biggest source of garbage by a literal 
order of magnitude (by size) is the long[] allocated by FixedBitSet. From the 
backtraces, it appears the biggest source of FixedBitSet creation in my case (by 
two orders of magnitude) is my use of queries that involve geospatial filtering.
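
For a sense of scale, here is a rough sizing sketch in plain Java. It assumes a dense bitmap with one bit per document packed into 64-bit words, and it uses a round maxDoc of 86,000,000 taken from the "86M docs/core" figure above (the real maxDoc will differ, and deleted documents count toward it). The class and method names are illustrative only and are not Lucene APIs.

```java
// Back-of-the-envelope sizing for a dense, FixedBitSet-style bitmap.
// Assumption: one bit per document, stored in 64-bit words (long[]).
final class DenseBitSetSizing {

    /** Number of 64-bit words needed to hold numBits bits (numBits > 0). */
    static int wordsFor(int numBits) {
        return ((numBits - 1) >> 6) + 1;
    }

    /** Payload bytes of the backing long[] (array header not included). */
    static long bytesFor(int numBits) {
        return (long) wordsFor(numBits) * Long.BYTES;
    }

    public static void main(String[] args) {
        int maxDoc = 86_000_000; // assumed round figure, not a measured value
        System.out.println(wordsFor(maxDoc) + " words, " + bytesFor(maxDoc) + " bytes");
    }
}
```

Under those assumptions each dense bitmap costs roughly 10.75 MB of long[] per core, regardless of how few documents the query actually matches.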

Specifically, IntersectsPrefixTreeQuery.getDocIdSet, here:
https://github.com/apache/lucene-solr/blob/569b6ca9ca439ee82734622f35f6b6342c0e9228/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/IntersectsPrefixTreeQuery.java#L60

Has this been considered for optimization? I can think of a few paths:

1. Persistent Object pools - FixedBitSet size is allocated based on maxDoc, 
which presumably changes less frequently than queries are issued. If an 
existing FixedBitSet were not available from a pool, the worst case (create a 
new one) would be no worse than the current behavior. The complication would be 
enforcement around when to return the object to the pool, but it looks like 
this has some lifecycle hooks already.
2. I note that a thing called a SparseFixedBitSet already exists, and puts 
considerable effort into allocating smaller chunks only as necessary. Is this 
not usable for this purpose? How significant is the performance difference?
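
To make option 1 concrete, here is a minimal sketch of what such a pool might look like, written against plain long[] arrays so it stands alone. WordArrayPool, acquire, and release are hypothetical names, not existing Lucene or Solr APIs, and the sketch assumes a single pool per maxDoc that is simply discarded when maxDoc changes. It does not address the harder lifecycle question of when a query is actually done with its bits.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical bounded pool of long[] word arrays, all sized for one maxDoc. */
final class WordArrayPool {
    private final int numWords;   // words needed for maxDoc bits
    private final int maxPooled;  // cap on how many idle arrays are retained
    private final ConcurrentLinkedDeque<long[]> free = new ConcurrentLinkedDeque<>();
    private final AtomicInteger pooled = new AtomicInteger();

    WordArrayPool(int maxDoc, int maxPooled) {
        this.numWords = ((maxDoc - 1) >> 6) + 1;
        this.maxPooled = maxPooled;
    }

    /** Returns a zeroed array; worst case is a fresh allocation, as today. */
    long[] acquire() {
        long[] words = free.pollFirst();
        if (words == null) {
            return new long[numWords];
        }
        pooled.decrementAndGet();
        return words;
    }

    /** Hands an array back; wrong-sized or surplus arrays are left to the GC. */
    void release(long[] words) {
        if (words.length != numWords) {
            return; // sized for a different maxDoc
        }
        if (pooled.incrementAndGet() > maxPooled) {
            pooled.decrementAndGet();
            return; // pool is full
        }
        Arrays.fill(words, 0L); // clear so the next user sees an empty set
        free.addFirst(words);
    }
}
```

One trade-off this makes visible: every reuse pays for a full Arrays.fill over the array, so pooling trades allocation and GC pressure for an explicit clear. Whether that wins would need measuring.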

I'd be happy to spend some time on a patch, but I was hoping for a little more 
data around the current choices before choosing an approach.



