Correction to my previous message: there is a single TermInfosReader per segment, not per index.

-----Original Message-----
From: Robert Engels [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 22, 2006 11:37 AM
To: java-dev@lucene.apache.org
Subject: RE: [jira] Created: (LUCENE-529) TermInfosReader and other +
instance ThreadLocal => transient/odd memory leaks =>
OutOfMemoryException


There is only a single TermInfosReader per index. In order to share this
instance with multiple threads, and avoid the overhead of creating new
enumerators for each request, the enumerator for the thread is stored in a
thread local. Normally, in a server application, threads are pooled, so new
threads are not constantly created and destroyed, and the memory leak is
insignificant.

The same reasoning holds true for the SegmentReader class.


-----Original Message-----
From: Andy Hind (JIRA) [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 22, 2006 11:07 AM
To: java-dev@lucene.apache.org
Subject: [jira] Created: (LUCENE-529) TermInfosReader and other +
instance ThreadLocal => transient/odd memory leaks =>
OutOfMemoryException


TermInfosReader and other + instance ThreadLocal => transient/odd memory leaks 
=>  OutOfMemoryException 
--------------------------------------------------------------------------------------------------------

         Key: LUCENE-529
         URL: http://issues.apache.org/jira/browse/LUCENE-529
     Project: Lucene - Java
        Type: Bug
  Components: Index  
    Versions: 1.9    
 Environment: Lucene 1.4.3 with 1.5.0_04 JVM or newer; will apply to 1.9 code
    Reporter: Andy Hind


TermInfosReader uses an instance-level ThreadLocal for enumerators.
This is a transient/odd memory leak in Lucene 1.4.3-1.9 and applies to current
JVMs; it is not just an old JVM issue as described in the finalizer of the 1.9
code.

There is also an instance-level thread local in SegmentReader, which will
have the same issue.
There may be other uses which also need to be fixed.

I don't understand the intended use for these variables.....however

Each ThreadLocal has its own hashcode used for lookup; see the ThreadLocal
source code. Each instance of TermInfosReader creates its own instance of
the thread local. All this does is create a per-thread entry on each thread
when it accesses the thread local. Setting it to null in the finalizer will set
it to null on one thread, the finalizer thread, where it has never been
created.  There is no point to this :-(
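
To illustrate the point about the finalizer, here is a minimal sketch (not
Lucene code; the class and method names are made up for the example). A plain
worker thread stands in for the finalizer thread. Setting the thread local to
null from that thread leaves the value seen by the original thread untouched.

```java
// Hypothetical demo class; it is not part of Lucene.
class ThreadLocalPerThreadDemo {
    // Stands in for the instance-level ThreadLocal held by TermInfosReader.
    private final ThreadLocal<Object> enumerators = new ThreadLocal<>();

    // Stores a value for the calling thread, nulls the thread local from a
    // second thread, then returns what the calling thread still sees.
    Object valueAfterForeignClear(Object value) {
        enumerators.set(value);
        // This thread plays the role of the finalizer thread.
        Thread other = new Thread(() -> enumerators.set(null));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return enumerators.get();
    }

    // True if the calling thread's value survived the other thread's set(null).
    static boolean callerStillSeesValue() {
        Object marker = new Object();
        return new ThreadLocalPerThreadDemo().valueAfterForeignClear(marker) == marker;
    }
}
```

The call to set(null) on the second thread only touches that thread's own
table, which is why the same call in a finalizer cannot release values cached
by the searching threads.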

I assume there is a good concurrency reason why an instance variable can not be 
used...

I have not used multi-threaded searching, but I have used a lot of threads each 
making searchers and searching.
1.4.3 has a clear memory leak caused by this thread local. This use case above 
is definitely solved by setting the thread local to null in the close(). This 
at least has a chance of being on the correct thread :-) 
I know reusing Searchers would help but that is my choice and I will get to 
that later .... 

Now you want to know why....

Thread locals are stored in a table of entries. Each entry is a *weak reference*
to the key (here the ThreadLocal owned by the TermInfosReader instance) and a
*simple*, i.e. strong, *reference* to the thread local value. When the instance
is GCed its key becomes null.
This is now a stale entry in the table.
Stale entries are cleared up in an ad hoc way and until they are cleared up the 
value will not be garbage collected.
Until the instance is GCed it is a valid key and its presence may cause the 
table to expand.
See the ThreadLocal code.
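
A rough model of such an entry, as a sketch only: the Entry class below is
written for this example and merely mimics the weak-key/strong-value shape
described above; it is not the JDK's internal class. Calling clear() on the
reference simulates the collector clearing the key, so the outcome does not
depend on GC timing.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class; it models the entry shape, not the real table.
class StaleEntryDemo {
    // Weak reference to the key, ordinary (strong) reference to the value.
    static final class Entry extends WeakReference<Object> {
        Object value;

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }
    }

    // True if the value is still strongly held after the key has been cleared.
    static boolean valueSurvivesKey() {
        Object key = new Object();
        byte[] payload = new byte[1024];
        Entry entry = new Entry(key, payload);
        entry.clear(); // simulates the GC clearing the weak key
        // The entry is now stale: no key, but the value is still reachable.
        return entry.get() == null && entry.value == payload;
    }
}
```

Until something removes the stale entry from the table, the payload stays
reachable through it and cannot be collected.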

So if you have lots of threads, all creating thread locals rapidly, you can get 
each thread holding a large table of thread locals which all contain many stale 
entries and preventing some objects from being garbage collected. 
The limited GC of the thread local table is not enough to save you from running 
out of memory.  

Summary:
========
- remove finalizer()
- set the thread local to null in close() 
  - values will be available for gc 
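
A sketch of what the summary proposes, under stated assumptions: the class
below is hypothetical and is not the actual TermInfosReader, and a
StringBuilder stands in for the real enumerator type. It has no finalizer, and
close() nulls the thread local for the thread that calls it.

```java
// Hypothetical reader; the names and the StringBuilder stand-in are
// assumptions made for this example, not Lucene's API.
class PerThreadCacheReader {
    private final ThreadLocal<StringBuilder> enumerators = new ThreadLocal<>();

    // Returns the calling thread's cached enumerator, creating it on first use.
    StringBuilder getEnumerator() {
        StringBuilder cached = enumerators.get();
        if (cached == null) {
            cached = new StringBuilder();
            enumerators.set(cached);
        }
        return cached;
    }

    // Releases the calling thread's cached value so it can be collected.
    // On Java 5 and later, enumerators.remove() would also drop the entry.
    void close() {
        enumerators.set(null);
    }

    // True if the calling thread currently has a cached enumerator.
    boolean hasCachedEnumerator() {
        return enumerators.get() != null;
    }
}
```

Note that close() only clears the value for the thread that calls it; values
cached by other threads are unaffected, which matches the "chance of being on
the correct thread" caveat above.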

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

