Hi,

Is anyone running the Lucene trunk/HEAD version in a serious production system? Has anyone noticed any memory leaks?
I'm asking because I recently (bravely) went from 1.9.1 to 2.1-dev (trunk from about a week ago), and all of a sudden my application, which was previously consuming about 1.5GB (-Xmx1500m), now consumes 2.2GB and blows up after it exhausts the whole heap; the GC can't make any more room after the app has run for about 3-6 hours and handled several tens of thousands of queries.

I'd love to go back to 2.0.0, or even to 1.9.1, and run that for a while just to double-check that it really is the Lucene upgrade that is the source of the leak, but unfortunately, because of LUCENE-701 (lockless commits), I can't go back that easily without reindexing... Moreover, I just looked at CHANGES.txt from 1.9.1 to present, and I think the biggest change since then was LUCENE-701. LUCENE-672 (segment merge policy) was also pretty big, but from what I can tell, the memory leak is somewhere in the search part, not the indexing part. There have been a number of other search-time optimizations since 2.0.0, so it's hard to tell what the cause is. Of course, it could turn out to be a leak in my own code, but I'm pretty sure my changes were limited to the removal of deprecated methods, so I could start using 2.1.
Here is the relevant code:

    IndexDescriptor indexDescriptor = getIndexDescriptorFromCache(indexID);
    try {
        // if this is a known index
        if (indexDescriptor != null) {
            cacheHits++;
            // if the index has changed since this Searcher was created, make a new Searcher
            long currentVersion = IndexReader.getCurrentVersion(indexID);
            if (currentVersion > indexDescriptor.lastKnownVersion) {
                hitButChanged++;
                // modified index detected
                indexDescriptor.lastKnownVersion = currentVersion;
                indexDescriptor.searcher = new LuceneSearcher(new File(indexID));
            } else {
                // index not modified, reusing searcher
            }
        }
        // if this is a new index
        else {
            cacheMisses++;
            File indexDir = validateIndex(indexID);
            indexDescriptor = new IndexDescriptor();
            indexDescriptor.indexDir = indexDir;
            indexDescriptor.lastKnownVersion = IndexReader.getCurrentVersion(indexDir);
            indexDescriptor.searcher = new LuceneSearcher(indexDir);
        }
        return cacheIndexDescriptor(indexDescriptor);
    } catch (IOException e) {
        throw new SearcherException("Cannot open index: " + indexID, e);
    }

So this is just caching of "IndexDescriptor" objects, which have "LuceneSearcher" objects in them. The cache is a small LRU cache with a max size of 37. The app actually consists of a few tens of thousands of Lucene indices, so this small cache results in only a 20% cache hit ratio.

And then the LuceneSearcher constructor looks like this:

    LuceneSearcher(File indexDir) throws IOException {
        _indexDir = FSDirectory.getDirectory(indexDir, false);
        _searcher = new IndexSearcher(_indexDir);
    }

This _searcher (IndexSearcher) is then used in the various search methods of this class. There are no close() calls anywhere; in other words, I don't explicitly close IndexSearchers, I just let them get garbage collected. This has been working well for 1-2 years, and I only started exhausting the JVM heap about a week ago, when I went from 1.9.1 to 2.1-dev.

Any other overly brave/crazy souls out there running the bleeding-edge version in a production environment?
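Since nothing ever calls close(), an evicted searcher keeps its file handles and buffers until the GC happens to collect it. A minimal sketch of one way to make that deterministic is to close entries at the moment the LRU cache evicts them. This is a hypothetical illustration, not the app's actual cache: ClosingLruCache is an invented name, and plain java.io.Closeable stands in for the searcher type, since the same pattern would apply to anything with a close() method:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an LRU cache that closes entries as it evicts
// them, so resources are released at eviction time instead of at some
// unpredictable GC/finalization point.
class ClosingLruCache<K, V extends Closeable> extends LinkedHashMap<K, V> {
    private final int maxSize;

    ClosingLruCache(int maxSize) {
        super(16, 0.75f, true); // accessOrder=true gives LRU behavior
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > maxSize) {
            try {
                eldest.getValue().close(); // release resources on eviction
            } catch (IOException e) {
                // log and keep going; eviction itself must not fail
            }
            return true; // tell LinkedHashMap to drop the eldest entry
        }
        return false;
    }
}
```

With something like this, each searcher would be closed when it falls out of the cache, which also bounds the number of simultaneously open index directories to the cache size.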
This is running on FedoraCore3 under JDK 1.5_09 (latest 1.5).

Thanks,
Otis