ARGH! I should have done this long before. When I run your test application (using Lucene.Net 2.9.4 in trunk) in a loop, I see no memory leakage.
private static void Main(string[] args) { for (int i = 0; i < 10; i++) { Main2(null); Console.WriteLine(">>> MEM: " + GC.GetTotalMemory(true)); } } DIGY. From: Ayende Rahien [mailto:aye...@ayende.com] Sent: Thursday, August 25, 2011 10:23 AM To: Itamar Syn-Hershko Cc: Digy; lucene-net-user@lucene.apache.org Subject: Re: [Lucene.Net] Memory leaks Itamar, You need to have some way of knowing that the item that you have there has a Close method, no? The only method that IDisposable have is Dispose(), IsDisposing and stuff like that isn't part of it. The CTL need some way to close its values when it is done, but in order to be able to do so, we need an interface to call it. A lot of the stuff in Lucene implements Closable, which is basically IDisposable On Thu, Aug 25, 2011 at 10:18 AM, Itamar Syn-Hershko <ita...@code972.com> wrote: Whats wrong in putting a call to Close() in Close() ? IDisposable in itself requires explicit request from the client (and can be called on dtor), so it is no different from IDisposable in that regard. And if you are calling Close() from client code, you might just as well do the rest of the cleaning up there as well by calling subsequent Close()-es from there. The only benefit of IDisposable is for lengthy disposing (the IsDisposing or however thats called), and I'm pretty sure we can live without it here. On Thu, Aug 25, 2011 at 10:09 AM, Ayende Rahien <aye...@ayende.com> wrote: The code looks good. What worries me is what happen when I have an object that has CTL that reside inside the CTL. That would mean that the second CTL would never be freed. Most specifically, look at FieldsReaderLocal, used in SegmentReader to hold a FieldsReader. The problem is that FieldsReader itself holds a CloseableThreadLocal (fieldsStreamTL). Let us consider the case of a SegmentReader that is used in two threads, then is closed. Its thread local variables are closed, but nothing tells the FieldsReader that _it_ is closed, so the fieldsStreamTL is never closed. The value will only be cleaned when you the thread is closed, and that may never happen (think thread pool threads, which are almost never allowed to die). That is why I made the thing implement IDisposable, because you have multiple levels of CTL. On Wed, Aug 24, 2011 at 11:05 PM, Digy <digyd...@gmail.com> wrote: I created a new version of CloseableThreadLocal for 2.9.4g branch which passes unit tests and is more similar to java version. I implemented java's "ThreadLocal" class with two different approach. One uses Thread.AllocateDataSlot (which is tooo slow) and the other one uses the same technique in current implementation. Propably It will not solve the memory-leakage problem, but It may be a starting point to work on. http://people.apache.org/~digy/CloseableThreadLocal.cs http://people.apache.org/~digy/CloseableThreadLocal.java (v 3.0.2) DIGY From: Ayende Rahien [mailto:aye...@ayende.com] Sent: Wednesday, August 24, 2011 4:26 PM To: Itamar Syn-Hershko Cc: lucene-net-user@lucene.apache.org; Digy Subject: Re: [Lucene.Net] Memory leaks That make sense, and explains how it works. Basically, there is no rooted hard reference anywhere to the values. It looks like the actual problem is happening when you have instances in a thread that is no longer used, the _values_ aren't weak references, and they are leaking. On Wed, Aug 24, 2011 at 3:41 PM, Itamar Syn-Hershko <ita...@code972.com> wrote: I moved this to a private conversation by accident, so back on the mailing list now. There seem to be a fundamental difference between the original Java impl and Lucene.NET's, which may as well be causing the leak. In the Java version ThreadLocal is a weak reference, and a hard reference is persisted as well along with the thread holding it (using a synchronized map). This allows to easily identify stale refs - if the thread holding the weak ref is not alive anymore, the hard ref is removed and the GC can process it. In Lucene.NET's implementation only a static map of weak references is kept. If I read this right, unlike Java's implementation where hard references are kept and dereferenced from within the ThreadLocal when the thread is dead, here there is nothing that will tell the GC to reclaim the memory referenced by the Weak ref. The Weak-Hard combination that is missing here is probably whats causing all this mess. Ayende's solution which uses explicit disposal is virtually the missing piece but in different clothing, since it requires changes to the core. If the .NET version was made to match Java's that probably won't be necessary. Before I rewrite CloseableThreadLocal to match the Java implementation - is there a reason it was made this way? On Wed, Aug 24, 2011 at 10:51 AM, Ayende Rahien <aye...@ayende.com> wrote: Itamar, Can you try writing the same code in Java and see if we get the same results? On Tue, Aug 23, 2011 at 8:58 PM, Itamar Syn-Hershko <ita...@code972.com> wrote: How come the Java version doesn't suffer from this issue? if your answer is going to be due to differences with the JVM, then IDisposable may well be our best option here On Tue, Aug 23, 2011 at 8:51 PM, Digy <digyd...@gmail.com> wrote: Exactly, but I couldn't find a nice solution yet. Any help welcomed J Btw, Can you open a new issue for this ? Other people may also want to work on it. https://issues.apache.org/jira/browse/LUCENENET Thanks, DIGY From: Ayende Rahien [mailto:aye...@ayende.com] Sent: Tuesday, August 23, 2011 8:32 PM To: Digy Cc: Itamar Syn-Hershko Subject: Re: [Lucene.Net] Memory leaks I think that the main issue is that you have per thread state, but no way of releasing this per thread state when you are closing the searcher. On Tue, Aug 23, 2011 at 7:59 PM, Digy <digyd...@gmail.com> wrote: Yes, IndexSearcher is expected to be thread safe. I am not saying, there is a misuse of it. Instead I try to understand the real cause of it. IDisposable solution seems to hide the real bug. DIGY From: Ayende Rahien [mailto:aye...@ayende.com] Sent: Tuesday, August 23, 2011 7:48 PM To: Digy Cc: Itamar Syn-Hershko Subject: Re: [Lucene.Net] Memory leaks Digy, IndexSearcher is thread safe, right? >From the docs (http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/Index Searcher.html): NOTE: IndexSearcher instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently. If your application requires external synchronization, you should not synchronize on the IndexSearcher instance; use your own (non-Lucene) objects instead. The problem is that when you use an index searcher in a different thread then the one you close it on, you are going to leak some memory. On Tue, Aug 23, 2011 at 7:35 PM, Digy <digyd...@gmail.com> wrote: If I change the searching code as below, It works as expected for (int i = 0; i < 150; i++) { ExecuteInParallel(() => { var searchers = new[] { new IndexSearcher(directories[0], true), new IndexSearcher(directories[1], true), }; { var topDocs = searchers[0].Search(new TermQuery(new Term("val", "test")), 15); for (int j = 0; j < topDocs.totalHits; j++) { searchers[0].Doc(j); } topDocs = searchers[1].Search(new TermQuery(new Term("val", "test")), 15); for (int j = 0; j < topDocs.totalHits; j++) { searchers[1].Doc(j); } TryAddSlot(slotsField, allSlotsFromAllThreads); }; foreach (var indexSearcher in searchers) { indexSearcher.Close(); } }); } Since IndexSearcher is closed in the thread where caching is done. I have to think about it some more but I don't think that adding IDisposables to some classes would be a good solution. It is more related to threading issues. Anyway, I am working on it. DIGY From: Ayende Rahien [mailto:aye...@ayende.com] Sent: Tuesday, August 23, 2011 6:05 PM To: digy digy Cc: Itamar Syn-Hershko Subject: Re: [Lucene.Net] Memory leaks It appears that we need to use searching as well to reproduce this. Here is another version of the program. I apologize for the code, but I basically needed to ensure that we would get something resembling what we have in RavenDB, which is a multi threaded access pattern. On Tue, Aug 23, 2011 at 5:43 PM, digy digy <digyd...@gmail.com> wrote: If I add GC.Collect(); after simpleAnalyzer.Close(); It outputs zeroes. Am I missing something? DIGY On Tue, Aug 23, 2011 at 5:07 PM, Itamar Syn-Hershko <ita...@code972.com> wrote: See attached. The key to this problem is multi-threading. In this sample app, some of the threads do not dispose of their resources properly, even though Close is called correctly. The suggested patch is confirmed to fix this behavior, but there may be other pit-falls our use cases didn't reveal. Itamar. On Tue, Aug 23, 2011 at 4:19 PM, digy digy <digyd...@gmail.com> wrote: Hi Itamar, Can you send a simple self-contained test case showing the bug? DIGY On Tue, Aug 23, 2011 at 2:41 PM, Itamar Syn-Hershko <ita...@code972.com> wrote: Hi, We recently discovered another memory leak within Lucene.NET, that is related to LUCENENET-358 (CloseableThreadLocal memory leak in LocalDataStoreSlot). As it turns out, disposables weren't disposed properly. With RavenDB, that caused memory consumption to always grow. We ended up modifying some core files to assure proper disposal, and things are looking better now. We are still testing, and will follow up if more fixes will be required. Here are the relevant diffs in our fork of Lucene.NET: https://github.com/ayende/ravendb/commit/a5031049599968f85e733f22a1ed568c097 03c30 https://github.com/ayende/ravendb/commit/bd9ca1ab154732d27985c2d237917af8bbd 5fbf3 The full discussion and bug hunting is here: http://groups.google.com/group/ravendb/browse_thread/thread/9c6340987407aece /1d86d6a61175233c I'm also attaching a diff between our fork to the Lucene.Net_2_9_2_RC2 tag. Will be happy to see this incorporated into 2.9.4 (and for it to be released already...). Itamar.