Chris, if possible, could you try out this patch to see if it fixes
the leak you're seeing? Thanks!
Mike
Michael McCandless (JIRA) wrote:
[ https://issues.apache.org/jira/browse/LUCENE-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-1383:
---------------------------------------
Attachment: LUCENE-1383.patch
Attached patch. All tests pass.
The patch adds o.a.l.util.CloseableThreadLocal. It's a wrapper
around ThreadLocal that stores each value inside a WeakReference, but
also holds a strong reference to the value (to ensure GC doesn't
reclaim it) until you call the close method. Once close is called,
GC is free to reclaim all values you had stored, regardless of how
long it takes ThreadLocal's implementation to actually release its
references.
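As a rough illustration of the idea described above (this is a
hypothetical sketch, not the code in the attached patch; the class
name CloseableThreadLocalSketch, the use of generics, and the
WeakHashMap keyed by Thread are all assumptions made for the
example):

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch only; NOT the implementation in LUCENE-1383.patch.
final class CloseableThreadLocalSketch<T> {

    // The ThreadLocal only ever sees a WeakReference, so a stale entry
    // in its internal map cannot keep the value alive by itself.
    private volatile ThreadLocal<WeakReference<T>> local =
        new ThreadLocal<WeakReference<T>>();

    // Strong references that keep the values alive until close().
    // Assumption: keyed weakly by Thread so that entries for threads
    // that have died can be dropped.
    private Map<Thread, T> hardRefs = new WeakHashMap<Thread, T>();

    /** Returns the calling thread's value, or null if unset or closed. */
    public T get() {
        ThreadLocal<WeakReference<T>> l = local; // single volatile read
        if (l == null) {
            return null; // already closed
        }
        WeakReference<T> ref = l.get();
        return ref == null ? null : ref.get();
    }

    /** Stores a value for the calling thread. */
    public synchronized void set(T value) {
        if (local == null) {
            throw new IllegalStateException("already closed");
        }
        local.set(new WeakReference<T>(value));
        hardRefs.put(Thread.currentThread(), value);
    }

    /**
     * Drops every strong reference. After this, only WeakReferences
     * remain inside ThreadLocal's map, so GC can reclaim the values
     * without waiting for ThreadLocal to clean up its stale entries.
     */
    public synchronized void close() {
        hardRefs = null;
        local = null;
    }
}
```

In this sketch a value set by a thread stays reachable through
hardRefs until close() is called; after close(), get() returns null.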
There are a couple of places in Lucene where I left the current
usage of ThreadLocal as is.
First, Analyzer.java uses ThreadLocal to hold reusable token
streams. There is no "close" called for Analyzer, so unless we are
willing to add a finalizer to call CloseableThreadLocal.close() I
think we can leave it.
Second, some of the contrib/benchmark tasks use ThreadLocal to store
a per-thread DateFormat, which should use very little memory.
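For reference, the per-thread DateFormat pattern looks roughly like
the sketch below. This is not the contrib/benchmark code; the class
name PerThreadDateFormat, the date pattern, and the UTC time zone are
assumptions chosen for the example. The pattern exists because
SimpleDateFormat is not thread-safe, so each thread gets its own
small instance.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical example; not taken from contrib/benchmark.
final class PerThreadDateFormat {

    // Each thread lazily creates and then reuses its own formatter.
    private static final ThreadLocal<DateFormat> FORMAT =
        new ThreadLocal<DateFormat>() {
            @Override
            protected DateFormat initialValue() {
                // Pattern and time zone are assumptions for this sketch.
                SimpleDateFormat f =
                    new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
                f.setTimeZone(TimeZone.getTimeZone("UTC"));
                return f;
            }
        };

    private PerThreadDateFormat() {}

    /** Formats the date with the calling thread's own DateFormat. */
    static String format(Date date) {
        return FORMAT.get().format(date);
    }
}
```

The only state retained per thread here is one formatter object.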
Workaround ThreadLocal's "leak"
-------------------------------
Key: LUCENE-1383
URL: https://issues.apache.org/jira/browse/LUCENE-1383
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: 1.9, 2.0.0, 2.1, 2.2, 2.3, 2.3.1, 2.3.2
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 2.4
Attachments: LUCENE-1383.patch
Java's ThreadLocal is dangerous to use because it can take a
surprisingly long time to release references to the values you
store in it. Even when a ThreadLocal instance itself is GC'd, hard
references to the values you had stored in it can easily be kept for
quite some time afterward.
While this is not technically a "memory leak", because eventually
(when the underlying Map that stores the values cleans up its "stale"
references) the hard reference will be cleared and GC can proceed,
its end behavior is no different from a memory leak: under the right
conditions you can easily tie up far more memory than you'd expect,
and then hit an unexpected OOM error despite allocating an extremely
large heap to your JVM.
Lucene users have hit this many times. Here's the most recent
thread:
http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200809.mbox/%3C6e3ae6310809091157j7a9fe46bxcc31f6e63305fcdc%40mail.gmail.com%3E
And here's another:
http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200807.mbox/%3CF5FC94B2-E5C7-40C0-8B73-E12245B91CEE%40mikemccandless.com%3E
And then there's LUCENE-436 and LUCENE-529 at least.
A Google search for "ThreadLocal leak" yields many compelling hits.
Sun does this for performance reasons, but I think it's a terrible
trap and we should work around it in Lucene.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]