Hi,

I _think_ Robert is right and ThreadLocals are not the problem (I tried getting 
rid of them and replacing them with an instance variable last week, but ran into 
problems with multi-threaded unit tests).
What I'm seeing while profiling (and in production) is the accumulation of 
these:

org.apache.lucene.search.FieldCacheImpl$Entry
org.apache.lucene.search.FieldCacheImpl$CreationPlaceholder

This is related to http://issues.apache.org/jira/browse/LUCENE-651 (the commit 
for that patch also happens to coincide with when I started seeing the leak).

The number of those two FieldCache* instances seems to be about 2 x the number 
of unique IndexReaders (indices) that have been searched so far.

I reverted that LUCENE-651 patch locally, and now I am *not* seeing the number 
of FieldCacheImpl$Entry instances grow any more! Hm, hm, hm, hm. :)  
FieldCacheImpl$CreationPlaceholder came with LUCENE-651, so with the patch 
reverted I'm not seeing any of those at all, of course.

I _think_ the reason I see this leak and most other people don't is that I'm 
running an application with LOTS of frequently-changing indices (look at 
http://simpy.com ).  The caching that LUCENE-651 added to FieldCacheImpl uses 
IndexReaders as keys in a WeakHashMap *and* another HashMap.  Even though I 
close my IndexSearchers and they close their underlying IndexReaders, I think 
FieldCacheImpl is holding onto them, thus not allowing them to be GCed (see the 
sketch below the numbers).  Here is what the dump of my production heap is 
showing right now, for instance:

$ jmap -histo:live `jps | grep Server | awk '{print $1}'` | grep SegmentReader
 46:      6630      583440  org.apache.lucene.index.SegmentReader

$ jmap -histo:live `jps | grep Server | awk '{print $1}'` | grep MultiReader
110:       735       41160  org.apache.lucene.index.MultiReader

$ jmap -histo:live `jps | grep Server | awk '{print $1}'` | grep FieldCache
 74:      7444      178656  org.apache.lucene.search.FieldCacheImpl$Entry
 85:      7434      118944  org.apache.lucene.search.FieldCacheImpl$CreationPlaceholder

6630 + 735 = 7365 ~ 7434
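
For anyone who hasn't looked at the patch, here is a minimal sketch of the 
caching scheme as I understand it (my own simplification, not the actual 
FieldCacheImpl source).  The names are illustrative; the point is that the 
per-reader entries stay alive exactly as long as the reader key is strongly 
reachable from anywhere else, for example that second, plain HashMap:

import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/** Simplified, illustrative sketch; not the real FieldCacheImpl code. */
class PerReaderCacheSketch {

    // outer map: reader -> (entry key -> cached value)
    private final Map readerCache = new WeakHashMap();

    synchronized Object get(Object reader, Object entryKey) {
        Map innerCache = (Map) readerCache.get(reader);
        if (innerCache == null) {
            innerCache = new HashMap();
            readerCache.put(reader, innerCache);
        }
        // The real code parks a CreationPlaceholder here while a value is
        // being computed; either way, cached entries only become collectable
        // once the reader itself is unreachable.
        return innerCache.get(entryKey);
    }
}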


But, like I said, I'm closing IndexSearchers (and they are closing their 
IndexReaders; a sketch of that pattern follows the numbers below), so only 114 
of them are open at the moment:

$ jmap -histo:live `jps | grep Server | awk '{print $1}'` | grep IndexSearcher
337:       114        2736  org.apache.lucene.search.IndexSearcher
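
Here is roughly that per-index pattern (a simplified sketch, not the actual 
application code).  Because the searcher is opened from a path, close() also 
closes the IndexReader it opened internally:

import java.io.IOException;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

/** Simplified sketch of the open/search/close pattern described above. */
class SearchOnce {
    static int searchOnce(String indexPath, Query query) throws IOException {
        IndexSearcher searcher = new IndexSearcher(indexPath);
        try {
            Hits hits = searcher.search(query);
            return hits.length(); // consume results before the searcher is closed
        } finally {
            searcher.close(); // also closes the IndexReader the searcher opened
        }
    }
}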

So, I think that LUCENE-651 introduced a bug.
I think the reason Robert and others are not seeing this is that few people are 
1) using 2.1-dev AND 2) running apps with LOTS of indices.  If I had an app 
with a single index or just a handful of indices, I would probably never have 
noticed this leak.

Can anyone else have a look at the patch for LUCENE-651 (or just at 
FieldCacheImpl) and see if there is a quick fix?
Also, wouldn't it be better if each IndexReader had a reference to its own copy 
of the FieldCache, which it could kill when it is closed?  What we have now is 
a chunky FieldCacheImpl that holds references to IndexReaders, doesn't know 
when they are closed, and thus can't remove them and their data from its 
internal caches.
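
Something like this is what I have in mind, sketched as a wrapper reader since 
I can't patch IndexReader in an email.  This is purely hypothetical: the 
ReaderCache interface and its purge() method are made up for illustration and 
do not exist in the current FieldCache API:

import java.io.IOException;
import org.apache.lucene.index.FilterIndexReader;
import org.apache.lucene.index.IndexReader;

/** Hypothetical illustration only; ReaderCache.purge() is a made-up hook. */
class CachePurgingReader extends FilterIndexReader {

    interface ReaderCache {            // stand-in for a per-reader cache handle
        void purge(IndexReader reader); // drop every entry keyed by this reader
    }

    private final ReaderCache cache;

    CachePurgingReader(IndexReader in, ReaderCache cache) {
        super(in);
        this.cache = cache;
    }

    protected void doClose() throws IOException {
        cache.purge(this);  // evict this reader's cache entries before closing
        super.doClose();
    }
}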

Thanks,
Otis


----- Original Message ----
From: Mark Miller <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org
Sent: Sunday, December 17, 2006 9:22:05 PM
Subject: Re: ThreadLocal leak (was Re: Leaking org.apache.lucene.index.* 
objects)

I think that Otis originally said that the problem came up when he 
started using the latest build off of the trunk. I don't believe he 
experienced the problem on 2.0. Anyone on the bleeding edge not noticing 
the leak? I am going to be deploying with 2.trunk soon and am very 
interested <G>.

robert engels wrote:
> Our search server also used long-lived threads. No memory problems 
> whatsoever.
>
> On Dec 17, 2006, at 3:51 PM, Paul Smith wrote:
>
>>
>> On 16/12/2006, at 6:15 PM, Otis Gospodnetic wrote:
>>
>>> Moving to java-dev, I think this belongs here.
>>> I've been looking at this problem some more today and reading about 
>>> ThreadLocals.  It's easy to misuse them and end up with memory 
>>> leaks, apparently... and I think we may have this problem here.
>>>
>>> The problem here is that ThreadLocals are tied to Threads, and I 
>>> think the assumption in TermInfosReader and SegmentReader is that 
>>> (search) Threads are short-lived: they come in, scan the index, do 
>>> the search, return and die.  In this scenario, their ThreadLocals go 
>>> to heaven with them, too, and memory is freed up.
>>
>> Otis, we have an index server being served inside Tomcat, where an 
>> Application instance makes a search request via vanilla HTTP post, so 
>> our connector threads definitely do stay alive for quite a while.  
>> We're using Lucene 2.0, and our index server is THE most stable of 
>> all our components, up for over a month (before being taken down for 
>> updates), searching hundreds of variously sized indexes of up to 7GB 
>> each, serving 1-2 requests/second during peak usage.
>>
>> No memory leak spotted at our end, but I'm watching this thread with 
>> interest! :)
>>
>> cheers,
>>
>> Paul Smith
>
>
