Hmmm, it would be better to open a Solr JIRA and attach this as a patch. Although we've had some folks provide a Git-based rather than an SVN-based patch.
Anyone can open a JIRA, but you must create a signon to do that. It'd get more attention that way.... Best Erick On Tue, Jun 5, 2012 at 2:19 PM, Gregg Donovan <gregg...@gmail.com> wrote: > We've encountered GC spikes at Etsy after adding new > ExternalFileFields a decent number of times. I was always a little > confused by this behavior -- isn't it just one big float[]? why does > that cause problems for the GC? -- but looking at the FileFloatSource > code a little more carefully, I wonder if this is due to using a > WeakHashMap that is only cleaned by GC or manual invocation of a > request handler. > > FileFloatSource stores a WeakHashMap containing <IndexReader,<float[] > or CreationPlaceholder>>. In the code[1], it mentions that the > implementation is modeled after the FieldCache implementation. > However, the FieldCacheImpl adds listeners for IndexReader close > events and uses those to purge its caches. [2] Should we be doing the > same in FileFloatSource? > > Here's a mostly untested patch[3] with a possible implementation. > There are probably better ways to do it (e.g. I don't love using > another WeakHashMap), but I found it tough to hook into the > IndexReader lifecycle without a) relying on classes other than > FileFloatSource b) changing the public API of FIleFloatSource or c) > changing the implementation too much. > > There is a RequestHandler inside of FileFloatSource > (ReloadCacheRequestHandler) that can be used to clear the cache > entirely[4], but this is sub-optimal for us for a few reasons: > > --It clears the entire cache. ExternalFileFields often take some > non-trivial time to load and we prefer to do so during SolrCore > warmups. Clearing the entire cache while serving traffic would likely > cause user-facing requests to timeout. > --It forces an extra commit with its consequent cache cycling, etc.. > > I'm thinking of ways to monitor the size of FileFloatSource's cache to > track its size against GC pause times, but it seems tricky because > even calling WeakHashMap#size() has side-effects. Any ideas? > > Overall, what do you think? Does relying on GC to clean this cache > make sense as a possible cause of GC spikiness? If so, does the patch > [3] look like a decent approach? > > Thanks! > > --Gregg > > [1] https://github.com/apache/lucene-solr/blob/a3914cb5c0243913b827762db2d616ad7cc6801d/solr/core/src/java/org/apache/solr/search/function/FileFloatSource.java#L135 > [2] https://github.com/apache/lucene-solr/blob/1c0eee5c5cdfddcc715369dad9d35c81027bddca/lucene/core/src/java/org/apache/lucene/search/FieldCacheImpl.java#L166 > [3] https://gist.github.com/2876371 > [4] https://github.com/apache/lucene-solr/blob/a3914cb5c0243913b827762db2d616ad7cc6801d/solr/core/src/java/org/apache/solr/search/function/FileFloatSource.java#L310