Hi, I could not provide a stack trace, and IMHO it wouldn't give much useful information anyway. But we've made good progress in the analysis.
We took a deeper look at what happens when an "external file field" request is sent to SOLR:

* SOLR checks whether there is a file for the requested query, e.g. "trousers".
* If so, SOLR loads the "trousers" file and generates a HashMap entry consisting of a FileFloatSource object and a float array whose size equals the number of documents in the SOLR index. Every document matched by the query receives the score provided in the external score file. For every(!) other document, SOLR writes a zero into that float array.
* If SOLR does not find a file for the query, it still generates a HashMap entry with a score of zero for every document.

In our case we have about 8.5 million documents in our index, and one of those arrays occupies about 34 MB of heap space. With e.g. 100 different queries that use an external file field for sorting the result, SOLR occupies about 3.4 GB of heap space.

The problem might be the use of WeakHashMap [1]: as long as the keys are still strongly referenced elsewhere, the garbage collector never discards them, so the map entries with their large arrays are never cleaned up.

What do you think could be a possible solution for this whole problem? (apart from "don't use external file fields" ;)

Regards
Sven

[1]: "A hashtable-based Map implementation with weak keys. An entry in a WeakHashMap will automatically be removed when its key is no longer in ordinary use. More precisely, the presence of a mapping for a given key will not prevent the key from being discarded by the garbage collector, that is, made finalizable, finalized, and then reclaimed. When a key has been discarded its entry is effectively removed from the map, so this class behaves somewhat differently than other Map implementations."

-----Original Message-----
From: mtnes...@gmail.com [mailto:mtnes...@gmail.com] On behalf of Simon Rosenthal
Sent: Wednesday, June 8, 2011 03:56
To: solr-user@lucene.apache.org
Subject: Re: How to deal with many files using solr external file field

Can you provide a stack trace for the OOM exception?
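To illustrate the arithmetic above and the WeakHashMap behaviour, here is a minimal sketch. The 8.5-million-document and 100-query figures are taken from the numbers above; the use of an IndexReader-like object as the map key is an assumption about SOLR's internals, stood in for by a plain Object:

```java
import java.util.Map;
import java.util.WeakHashMap;

public class ExternalFileFieldMemorySketch {
    public static void main(String[] args) throws Exception {
        // One cached score array: 8.5 million docs * 4 bytes per float ~= 34 MB
        long numDocs = 8_500_000L;
        System.out.printf("one array: %d MB%n", numDocs * Float.BYTES / 1_000_000);
        // 100 queries sorted via external file field -> 100 such arrays ~= 3.4 GB
        System.out.printf("100 arrays: %.1f GB%n", 100 * numDocs * Float.BYTES / 1e9);

        // A WeakHashMap only drops an entry once its KEY becomes unreachable.
        // If the key (hypothetically, an open index reader) stays strongly
        // referenced, the large value array hanging off it is never collected:
        Map<Object, float[]> cache = new WeakHashMap<>();
        Object reader = new Object();        // stands in for a strongly held key
        cache.put(reader, new float[1_000]); // value kept alive via the key
        System.gc();
        Thread.sleep(100);                   // give the GC a moment
        System.out.println(cache.containsKey(reader)); // still true: key reachable
    }
}
```

So the weak keys only help once whatever holds the key lets go of it; until then the map behaves like a strong cache.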
On Tue, Jun 7, 2011 at 4:25 PM, Bohnsack, Sven <sven.bohns...@shopping24.de> wrote:
> Hi all,
>
> we're using Solr 1.4 and external file field ([1]) for sorting our
> search results. We have about 40,000 terms for which we use this
> sorting option.
> Currently we're running into massive OutOfMemory problems and are not
> quite sure what the cause is. It seems that the garbage collector stops
> working or some processes are going wild. However, Solr starts to
> allocate more and more RAM until we experience this OutOfMemory
> exception.
>
> We noticed the following:
>
> For some terms one can see in the Solr log that some
> java.io.FileNotFoundExceptions appear when Solr tries to load an
> external file for a term for which there is no such file, e.g. Solr
> tries to load the external score file for "trousers" but there is none
> in the /solr/data folder.
>
> Question: is it possible that those exceptions are responsible for the
> OutOfMemory problem, or could it be due to the large(?) number of 40k
> terms for which we want to sort the result via external file field?
>
> I'm looking forward to your answers, suggestions and ideas :)
>
> Regards
> Sven
>
> [1]:
> http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html
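For readers following along, the setup being discussed looks roughly like this. The field type follows the ExternalFileField javadoc linked above; the concrete field name, key field, and document keys are made-up examples:

```xml
<!-- schema.xml: field type backed by an external score file -->
<fieldType name="file" class="solr.ExternalFileField"
           keyField="id" defVal="0" valType="float"
           indexed="false" stored="false"/>
<field name="trousers_score" type="file"/>
```

The scores themselves then live in a plain text file named after the field (e.g. external_trousers_score in the data directory), one `documentKey=score` line per document; every document not listed falls back to defVal, which matches the "zero for every other document" behaviour described above.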