For every IndexReader that is opened
- there is one SegmentReader for every segment in the index 
   - with its thread local
   - for each of these there is a TermInfosReader + its thread local.

So I get 2 * (number of index segments) thread locals.

I am creating index readers for a main index and for transactional
updates, and layering the two. At the moment this is an issue: under
stress testing on Tomcat with thread pooling and a fairly large, changing
index, it blows up after being left running for a few hours.

Thread locals are also used in other areas of the app.

It would be better if threads were created and destroyed!

It is certainly not insignificant for me and gives a JVM that creeps up
in size pretty steadily over time.

I have fixed this issue locally in the code and it works.

Regards

Andy

 


-----Original Message-----
From: Robert Engels [mailto:[EMAIL PROTECTED] 
Sent: 22 March 2006 17:46
To: java-dev@lucene.apache.org
Subject: RE: [jira] Created: (LUCENE-529) TermInfosReader and other +
instance ThreadLocal => transient/odd memory leaks =>
OutOfMemoryException

There was a small mistake - there is a single TermInfoReader per
segment.

-----Original Message-----
From: Robert Engels [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 22, 2006 11:37 AM
To: java-dev@lucene.apache.org
Subject: RE: [jira] Created: (LUCENE-529) TermInfosReader and other +
instance ThreadLocal => transient/odd memory leaks =>
OutOfMemoryException


There is only a single TermInfoReader per index. In order to share this
instance with multiple threads, and avoid the overhead of creating new
enumerators for each request, the enumerator for the thread is stored in
a thread local. Normally, in a server application, threads are pooled,
so new threads are not constantly created and destroyed, so the memory
leak is insignificant.
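
The pattern described above can be sketched roughly as follows. This is a minimal illustration in modern Java, not Lucene's actual code: the class SharedReaderSketch, its nested Enumerator and the demo() method are all hypothetical names. It shows one shared reader handing each thread its own cached enumerator through an instance-level ThreadLocal.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a shared reader; not Lucene's actual class.
public class SharedReaderSketch {
    // Counts how many enumerators have been created across all threads.
    static final AtomicInteger created = new AtomicInteger();

    // Hypothetical per-thread cursor; a term enumerator would play this role.
    static final class Enumerator {
        final int id = created.incrementAndGet();
    }

    // Instance-level thread local: one slot per (reader instance, thread) pair.
    private final ThreadLocal<Enumerator> enumerators = new ThreadLocal<>();

    // Returns the calling thread's enumerator, creating it on first use.
    Enumerator getEnumerator() {
        Enumerator e = enumerators.get();
        if (e == null) {
            e = new Enumerator();
            enumerators.set(e);
        }
        return e;
    }

    // Two threads each ask the same reader twice; returns the number of
    // enumerators that were actually created.
    static int demo() {
        created.set(0);
        SharedReaderSketch reader = new SharedReaderSketch();
        Runnable work = () -> {
            reader.getEnumerator();
            reader.getEnumerator(); // second call reuses the cached one
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("enumerators created: " + demo());
    }
}
```

Each thread creates its enumerator once and reuses it afterwards, which is the saving being described. The cost is that every reader instance adds an entry to the thread local table of every thread that touches it.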

The same reasoning holds true for the SegmentReader class.


-----Original Message-----
From: Andy Hind (JIRA) [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 22, 2006 11:07 AM
To: java-dev@lucene.apache.org
Subject: [jira] Created: (LUCENE-529) TermInfosReader and other +
instance ThreadLocal => transient/odd memory leaks =>
OutOfMemoryException


TermInfosReader and other + instance ThreadLocal => transient/odd memory
leaks =>  OutOfMemoryException 
--------------------------------------------------------------------------------------------------------

         Key: LUCENE-529
         URL: http://issues.apache.org/jira/browse/LUCENE-529
     Project: Lucene - Java
        Type: Bug
  Components: Index  
    Versions: 1.9    
 Environment: Lucene 1.4.3 with 1.5.0_04 JVM or newer......will apply to
1.9 code 
    Reporter: Andy Hind


TermInfosReader uses an instance level ThreadLocal for enumerators.
This is a transient/odd memory leak in lucene 1.4.3-1.9 and applies to
current JVMs, 
not just an old JVM issue as described in the finalizer of the 1.9 code.

There is also an instance level thread local in SegmentReader....which
will have the same issue.
There may be other uses which also need to be fixed.

I don't understand the intended use for these variables.....however

Each ThreadLocal has its own hashcode used for look up, see the
ThreadLocal source code. Each instance of TermInfosReader will be
creating an instance of the thread local. All this does is create an
instance variable on each thread when it accesses the thread local.
Setting it to null in the finaliser will set it to null on one thread,
the finalizer thread, where it has never been created.  There is no
point to this :-(
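
A small sketch of why clearing from the finalizer thread has no effect on the threads that actually hold values. The class and method names are hypothetical, and a plain worker thread stands in for the finalizer thread:

```java
// Hypothetical demo class; a worker thread stands in for the finalizer thread.
public class ForeignThreadClearSketch {

    // Returns what the owning thread still sees after a different thread
    // has called set(null) on the same ThreadLocal.
    static String valueAfterForeignClear() {
        ThreadLocal<String> local = new ThreadLocal<>();
        local.set("enumerator"); // stored in the calling thread's own table

        // set(null) here only touches the other thread's table,
        // where no value was ever stored.
        Thread other = new Thread(() -> local.set(null));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return local.get(); // the calling thread's value is untouched
    }

    public static void main(String[] args) {
        System.out.println(valueAfterForeignClear());
    }
}
```

The calling thread still gets its original value back, because each thread keeps its own table of thread local values.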

I assume there is a good concurrency reason why an instance variable can
not be used...

I have not used multi-threaded searching, but I have used a lot of
threads each making searchers and searching.
1.4.3 has a clear memory leak caused by this thread local. This use case
above is definitely solved by setting the thread local to null in the
close(). This at least has a chance of being on the correct thread :-) 
I know reusing Searchers would help but that is my choice and I will get
to that later .... 

Now you want to know why....

Thread locals are stored in a table of entries. Each entry is a *weak
reference* to the key (here the ThreadLocal owned by the TermInfosReader
instance) and a *simple reference* to the thread local value. When the
instance is GCed its key becomes null. 
This is now a stale entry in the table.
Stale entries are cleared up in an ad hoc way and until they are cleared
up the value will not be garbage collected.
Until the instance is GCed it is a valid key and its presence may cause
the table to expand.
See the ThreadLocal code.

So if you have lots of threads, all creating thread locals rapidly, you
can get each thread holding a large table of thread locals which all
contain many stale entries and preventing some objects from being
garbage collected. 
The limited GC of the thread local table is not enough to save you from
running out of memory.  

Summary:
========
- remove finalizer()
- set the thread local to null in close() 
  - values will be available for gc 
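
One possible reading of this summary, as a minimal sketch rather than the actual patch: the class ClosableReaderSketch and its methods are hypothetical, and "set the thread local to null" is assumed here to mean calling set(null) on it from close().

```java
// Hypothetical reader; illustrates clearing the thread local in close()
// instead of in a finalizer. Not Lucene's actual code.
public class ClosableReaderSketch implements AutoCloseable {
    private final ThreadLocal<Object> enumerators = new ThreadLocal<>();

    // Returns the calling thread's cached enumerator, creating it on first use.
    Object getEnumerator() {
        Object e = enumerators.get();
        if (e == null) {
            e = new Object(); // placeholder for a real enumerator
            enumerators.set(e);
        }
        return e;
    }

    boolean hasEnumeratorOnThisThread() {
        return enumerators.get() != null;
    }

    // No finalizer. close() drops the value held for the calling thread
    // so that it becomes eligible for garbage collection.
    // ThreadLocal.remove() (Java 5+) is an alternative that also removes
    // the table entry itself.
    @Override
    public void close() {
        enumerators.set(null);
    }

    // Returns true if the value was present before close() and gone after.
    static boolean demo() {
        ClosableReaderSketch reader = new ClosableReaderSketch();
        reader.getEnumerator();
        boolean before = reader.hasEnumeratorOnThisThread();
        reader.close();
        boolean after = reader.hasEnumeratorOnThisThread();
        return before && !after;
    }

    public static void main(String[] args) {
        System.out.println("cleared on close: " + demo());
    }
}
```

This only releases the value for the thread that calls close(). Values cached by other threads stay in their tables until those stale entries are cleaned up.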

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

