Hi All,

What I have found from Solr 4.6.0 through 4.7.1 is that memory usage
continues to grow with facet queries.

Originally I saw the issue with 40 facets over 60 collections (distributed
search). Memory usage would spike and Solr would become unresponsive, as in
https://issues.apache.org/jira/browse/SOLR-2855

Then I tried to determine a safe limit at which the search would work
without breaking Solr. What I found instead is that I can break Solr in the
same way with a single facet (on a field with many distinct values) and a
single collection. By holding F5 (reload) in the browser for 10 seconds,
memory usage continues to grow.

e.g.
http://localhost:8000/solr/collection/select?facet=true&facet.mincount=1&q=*:*&facet.threads=5&facet.field=id
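
Roughly, the hold-reload test boils down to firing that same facet query in
a tight loop. A minimal SolrJ sketch (the base URL and collection name are
the placeholders from the example above):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;

    public class FacetReloadTest {
        public static void main(String[] args) throws Exception {
            // Placeholder endpoint -- same as the example URL above.
            HttpSolrServer server =
                new HttpSolrServer("http://localhost:8000/solr/collection");

            SolrQuery q = new SolrQuery("*:*");
            q.setFacet(true);
            q.setFacetMinCount(1);
            q.addFacetField("id");     // extreme: near-unique value per doc
            q.set("facet.threads", 5);

            // Repeat for ~10 seconds, like holding F5 in the browser,
            // while watching heap usage on the Solr side.
            long end = System.currentTimeMillis() + 10_000;
            while (System.currentTimeMillis() < end) {
                server.query(q);
            }
            server.shutdown();
        }
    }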

I realize that faceting on 'id' is extreme, but it highlights the issue:
memory usage continues to grow (leak?) with each new query until Solr
eventually breaks.

This does not happen with the 'old' method, facet.method=enum: memory
usage is stable and Solr is unbreakable under my hold-reload test.
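
For comparison, the enum variant is just the example query above with
facet.method=enum appended:

http://localhost:8000/solr/collection/select?facet=true&facet.mincount=1&q=*:*&facet.threads=5&facet.field=id&facet.method=enum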

This post
http://shal.in/post/285908948/inside-solr-improvements-in-faceted-search-performance
describes the new (now default) facet method and states:
"The structure is thrown away and re-created lazily on a commit. There
might be a few concerns around the garbage accumulated by the (re)-creation
of the many arrays needed for this structure. However, the performance gain
is significant enough to warrant the trade-off."

The wiki http://wiki.apache.org/solr/SimpleFacetParameters#facet.method
says the new/default method 'tends to use less memory'.

I use autoCommit (1 min) on my collections. Does that mean there is a
one-minute window (or longer, if no new docs arrive) in which facet queries
will effectively 'leak'?
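
For reference, the autoCommit in question is a solrconfig.xml block along
these lines (the openSearcher value shown is illustrative):

    <autoCommit>
        <maxTime>60000</maxTime>          <!-- 1 minute -->
        <openSearcher>true</openSearcher> <!-- illustrative; the post above
             says the facet structure is thrown away on commit -->
    </autoCommit>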

Test setup: JDK 1.7.0u40 64-bit; Solr 4.7.1; 3 instances, 64GB each; 17M
docs; 2 replicas.

Cheers,
Damien.
