[
https://issues.apache.org/jira/browse/LUCENE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Smiley resolved LUCENE-7211.
----------------------------------
Resolution: Fixed
> Spatial RPT Intersects should use DocIdSetBuilder to save memory/GC
> -------------------------------------------------------------------
>
> Key: LUCENE-7211
> URL: https://issues.apache.org/jira/browse/LUCENE-7211
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial
> Reporter: Jeff Wartes
> Assignee: David Smiley
> Labels: spatialrecursiveprefixtreefieldtype
> Fix For: 6.1
>
> Attachments:
> SOLR-8944-Use-DocIdSetBuilder-instead-of-FixedBitSet.patch
>
>
> I’ve been continuing some analysis into JVM garbage sources in my Solr index.
> (5.4, 86M docs/core, 56k 99.9th percentile hit count with my query corpus)
> After applying SOLR-8922, I find my biggest source of garbage by a literal
> order of magnitude (by size) is the long[] allocated by FixedBitSet. From the
> backtraces, it appears the biggest source of FixedBitSet creation in my case
> (by two orders of magnitude) is my use of queries that involve geospatial
> filtering.
> Specifically, IntersectsPrefixTreeQuery.getDocIdSet, here:
> https://github.com/apache/lucene-solr/blob/569b6ca9ca439ee82734622f35f6b6342c0e9228/lucene/spatial-extras/src/java/org/apache/lucene/spatial/prefix/IntersectsPrefixTreeQuery.java#L60
> Has this been considered for optimization? I can think of a few paths:
> 1. Persistent Object pools - FixedBitSet size is allocated based on maxDoc,
> which presumably changes less frequently than queries are issued. If an
> existing FixedBitSet were not available from a pool, the worst case (create a
> new one) would be no worse than the current behavior. The complication would
> be enforcement around when to return the object to the pool, but it looks
> like this has some lifecycle hooks already.
> 2. I note that a thing called a SparseFixedBitSet already exists, and puts
> considerable effort into allocating smaller chunks only as necessary. Is this
> not usable for this purpose? How significant is the performance difference?
> I'd be happy to spend some time on a patch, but I was hoping for a little
> more data around the current choices before choosing an approach.
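
For a rough sense of scale, the figures quoted above (86M docs/core, 56k hits
at the 99.9th percentile) can be plugged into a back-of-the-envelope
calculation. The sketch below is not Lucene code: the class and method names
are hypothetical, the "one int per hit" layout is only an illustrative
assumption about what a sparse representation might cost, and array headers
are ignored. In practice the bitset is sized per segment rather than per core,
so this approximates the total across segments.

```java
public class BitSetFootprint {
    /** Number of 64-bit words needed for one bit per document. */
    static long denseWords(long maxDoc) {
        return (maxDoc + 63) / 64;
    }

    /** Payload bytes of the long[] backing a dense bitset (array header ignored). */
    static long denseBytes(long maxDoc) {
        return denseWords(maxDoc) * Long.BYTES;
    }

    /** Payload bytes if each hit were instead stored as one int doc ID (assumed layout). */
    static long sparseBytes(long hitCount) {
        return hitCount * Integer.BYTES;
    }

    public static void main(String[] args) {
        long maxDoc = 86_000_000L; // docs per core, from the report
        long hits = 56_000L;       // 99.9th percentile hit count, from the report
        System.out.println("dense long[] bytes: " + denseBytes(maxDoc));
        System.out.println("int-per-hit bytes:  " + sparseBytes(hits));
    }
}
```

Under these assumptions the dense array comes to about 10.75 MB per query,
against roughly 224 KB for the hits themselves, which is consistent with the
long[] dominating the garbage profile when result sets are sparse.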
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)