Hello Frederic,
I'm CCing java-dev@lucene.apache.org as Michael McCandless has been
very helpful on IRC in discussing the ThreadLocal implication, and it
would be nice you could provide first-hand information.

There's a good reading to start from at
http://issues.apache.org/jira/browse/LUCENE-1383
Basically your proposal is having a problem which is that when you
close the ThreadLocal it's only going to cleanup the resources stored
by the current thread, not by others; setting the reference to null
also won't help:
Quoting the TLocal source's comment:
    "* However, since reference queues are not
     * used, stale entries are guaranteed to be removed only when
     * the table starts running out of space."

About your issues:
> 1. A ThreadLocal object should normally be a singleton used has key to the 
> thread map. Here it is repeatedly created and destroy!
It's only built in the constructor, and destroyed on close. So it's
lifecycle is linked to the Analyzer / FieldCache using it, probably a
long time, or the appropriate time to cleanup things.

> 2. Setting "t = null;" is not affecting the garbage collection of the 
> ThreadLocal map since t is the key (hard ref) of the thread map.
Well "t" is unfortunately being reused as a variable name: "t = null;"
is clearing the reference to the threadlocal, which really is the key
of the map used by the threadlocal and referenced by the current
Thread instance, and TLocal uses weak *Keys* not values (and the key
is the TLocal itself).

> 3. There are no call to t.remove() which will really clean the Map entry.
You could add one, but it would only cleanup the garbage from the
current thread, so it's ok but not enough. The current impl is making
sure all stuff is collected by wrapping it all in weak values.
Actually some stuff is not collected: the WeakReferences themselves,
but pointing to going-to-be-collected stuff. These WeakReferences are
going to be removed when the TLocal table is full, and should be
harmless (?).
As you pointed out, since Lucene 3 it's releasing what is possible to
release eagerly, but it's a very small slight optimization: you still
need the weak/hardref trick to clean the other values.

> 4. A ThreadLocal Map is already a WeakReference for the value.
No, it's on the keys: a collected ThreadLocal will be cleaned up for.
eventually :-/

> 5. Leaving objects on a ThreadLocal after it is out of your control is bad 
> practice. Another task may reuse the Thread and found dirty objects there.
Agree, but having weak values it's not a big issue. Also it's not
meant to be used by faint hearted, just people writing their own
Analyzer could have this wrong :)

> 6. We found (in all our tests) the hardRef Map to be completely unnecessary 
> in Lucene 2.4.1, but here I'm lacking more in depth knowledge of the 
> lifecycle of the objects added to this CloseableThreadLocal.
Well as it's being used as a cache functionality will be the same,
performance should be worse. AFAIK all TokenFilters are able to
rebuild what they need when get() returns null, you might have a
problem on the unlikely case of
org.apache.lucene.util.CloseableThreadLocal:68 having the assertion
fail, but again not affecting functionality (assuming assertions are
disabled).

A "vanilla" ThreadLocal is obviously faster than this, but then we end
up reverting LUCENE-1383 and so introducing more pressure on the GC.

It would be very interesting to find out why your implementation is
performing better? Maybe because in your case Analyzers are used by
one thread at a time, and so you're not leaking memory?
Could you tell more about this to lucene-dev directly?

Regards,
Sanne

2010/1/6 Frederic Simon <fr...@jfrog.org>:
> Thanks Emmanuel,
> Yes the main issue is that the hardRef map in this class was forcing all the
> objects to go to the Old generation space in the JVM GC, instead of staying
> at a ThreadLocal level. So, all objects put in the CloseableThreadLocal were
> GC only on full GC. On heavy lucene usage, it generated around 500Mb of heap
> for each 5 secs until full GC kicks in. Our problem is that we really a lot
> on SoftReference for our cache and so this Lucene behavior is really bad for
> us (Customer feedback:
> http://old.nabble.com/What's-the-memory-requirements-for-2.1.3--to27026622.html#a27026622
> ).
> With my class all objects stay in young gen and so the performance boost for
> us was huge.
>
> The issues with the class:
>
> A ThreadLocal object should normally be a singleton used has key to the
> thread map. Here it is reapeatdly created and destroy!
> Setting "t = null;" is not affecting the garbage collection of the
> ThreadLocal map since t is the key (hard ref) of the thread map.
> There are no call to t.remove() which will really clean the Map entry.
> A ThreadLocal Map is already a WeakReference for the value.
> Leaving objects on a ThreadLocal after it is out of your control is bad
> practice. Another task may reuse the Thread and found dirty objects there.
> We found (in all our tests) the hardRef Map to be completely unnecessary in
> Lucene 2.4.1, but here I'm lacking more in depth knowledge of the lifecycle
> of the objects added to this CloseableThreadLocal.
>
> In Lucene 3.0.0 I saw that they are now calling t.remove(), but the hardRef
> Map is still present.
> Hope this help,
> Fred.
> On Wed, Jan 6, 2010 at 11:08 AM, Emmanuel Bernard <emman...@hibernate.org>
> wrote:
>>
>> Hey Sanne,
>> Fred is one of the founder of Artifactory / jFrog and he has had a few
>> perf issues with Lucene.
>> Particularly the CloseableThreadLocal class was very memory intensive wo
>> real reason (at least in Java 5).
>>
>> Can you check out Fred's proposal and if you think it makes sense, push it
>> as a patch to the Lucene community. You are more of a public face than Fred
>> or me on the Lucene code, so the chances of acceptations will be higher :)
>>
>> The culprit class is
>>
>> http://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_4_1/src/java/org/apache/lucene/util/CloseableThreadLocal.java
>>
>> and the proposed change is to reuse raw Thread locals
>>
>>
>> http://subversion.jfrog.org/artifactory/public/externals/lucene/jfrog/lucene-core/src/main/java/org/apache/lucene/util/CloseableThreadLocal.java
>>
>> I will let Fred comment on some of the behaviors he has seen while
>> profiling and why the change makes sense.
>>
>> Cheers
>>
>> Emmanuel
>
>
> --
> Co. Founder and Chief Architect
> JFrog Ltd
> http://www.jfrog.org/
> http://twitter.com/freddy33
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

Reply via email to