On 3/20/2014 6:54 PM, Harish Agarwal wrote:
> I'm transitioning my index from a 3.x version to >4.6. I'm running a large
> heap (20G), primarily to accommodate a large facet cache (~5G), but have
> been able to run it on 3.x stably.
>
> On 4.6.0 after stress testing I'm finding that all of my shards are
> spending all of their time in GC. After taking a heap dump and analyzing,
> it appears that org.apache.lucene.util.WeakIdentityMap is using many Gs of
> memory. Does anyone have any insight into which Solr component(s) use this
> and whether this kind of memory consumption is to be expected?

I can't really say what WeakIdentityMap is doing. The only usage I can trace in Lucene is in MMapDirectory, but it doesn't make a lot of sense for that to use a lot of memory, unless it is the source of the memory misreporting that Java 7 seems to do with MMap. See this message in a recent thread on this mailing list:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201403.mbox/%3c53285ca1.9000...@elyograg.org%3E

If you have a lot of facets, one approach for performance is to use facet.method=enum so that your Java heap does not need to be super large. This does not actually reduce the overall system memory requirements. It just shifts the responsibility for caching from Solr to the operating system, and it requires enough memory to hold a majority of the index in the OS disk cache. Ideally, there would be enough RAM for the entire index to fit.

http://wiki.apache.org/solr/SolrPerformanceProblems

Another option for facet memory optimization is docValues. One caveat: It is my understanding that docValues content is the same as what a stored field holds. Depending on your schema definition, this may differ from the indexed values that facets normally use. The docValues feature also helps with sorting.

Thanks,
Shawn
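
P.S. In case it helps, here is a rough sketch of what a facet request using facet.method=enum might look like when built from a script. The host, the core name (collection1), the field name (category), and the helper function itself are all made up for illustration; only the facet, facet.field, and facet.method parameters are actual Solr request parameters.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, field, method="enum"):
    """Build a Solr /select URL that facets on one field.

    base_url and field are hypothetical placeholders; method is passed
    through as the facet.method request parameter.
    """
    params = [
        ("q", "*:*"),              # match all documents
        ("rows", 0),               # only the facet counts are wanted
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", method),  # enum shifts caching away from the field cache
        ("wt", "json"),
    ]
    return base_url + "/select?" + urlencode(params)


# Hypothetical host, core, and field names.
url = build_facet_query("http://localhost:8983/solr/collection1", "category")
print(url)
```

The same parameter can also be set as a default in the request handler configuration, so clients do not have to send it on every request.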