Looks fine, and there's the only problem: all this does not work. Possibly, this is JVM issue, possibly, there's another reason, but when I reproduce the described situation with the web-app, its classloader cannot be unloaded, and the profiler shows that the cause is that ThreadLocal. When I apply the workaround (run code which uses Lucene in a Thread which dies shortly after executing the code), the problem disappears.
I'm not holding references to this ThreadLocal anywhere in my code, it's package-local field in package-local class of Lucene. Robert Engels wrote: > > You are mistaken - Yonik's comment in that thread is correct > (although it is not just at table resize - any time a ThreadLocal is > added, and any time the ThreadLocal is not found in its direct hash > it might clear entries). > > The ThreadLocals map only has a WeakReference to the ThreadLocal, so > if that is the only reference, it will be GC'd - eventually, and then > it will be cleared as new ThreadLocals are created. > > With a static reference, the thread can reference the ThreadLocal at > any time, and thus the WeakReference will not be cleared. > > If the object is VERY large, and new ThreadLocals are not created it > could cause a problem, but I don't think that is the case with > Lucene, as the objects stored in ThreadLocals are designed to live > for the life of the SegmentReader/IndexReader and thread. > > > On Jul 12, 2008, at 2:12 AM, Roman Puchkovskiy wrote: > >> >> Well, possibly I'm mistaken, but it seems that this affects non- >> static fields >> too. Please see >> http://www.nabble.com/ThreadLocal-in-SegmentReader-to18306230.html >> where the >> use case is described in the details. >> In short: it seems that the scope of ThreadLocals does not matter. >> What >> really matters is that they are referenced by ThreadLocals map in >> the thread >> which is still alive. >> >> >> Robert Engels wrote: >>> >>> This is only an issue for static ThreadLocals ... >>> >>> On Jul 11, 2008, at 11:32 PM, Roman Puchkovskiy wrote: >>> >>>> >>>> The problem here is not because ThreadLocal instances are not GC'd >>>> (they are >>>> GC'd, and your test shows this clearly). >>>> But even one instance which is not removed from its Thread is >>>> enough to >>>> prevent the classloader from being unloaded, and that's the problem. >>>> >>>> >>>> Michael McCandless-2 wrote: >>>>> >>>>> OK, I created a simple test to test this (attached). The test just >>>>> runs 10 threads, each one creating a 100 KB byte array which is >>>>> stored >>>>> into a ThreadLocal, and then periodically the ThreadLocal is >>>>> replaced >>>>> with a new one. This is to test whether GC of a ThreadLocal, even >>>>> though the thread is still alive, in fact leads to GC of the >>>>> objects >>>>> held in the ThreadLocal. >>>>> >>>>> Indeed on Sun JRE 1.4, 1.5 and 1.6 it appears that the objects >>>>> are in >>>>> fact properly collected. >>>>> >>>>> So this is not a leak but rather a "delayed collection" issue. >>>>> Java's >>>>> GC is never guaranteed to be immediate, and apparently when using >>>>> ThreadLocals it's even less immediate than "normal". In the >>>>> original >>>>> issue, if other things create ThreadLocals, then eventually >>>>> Lucene's >>>>> unreferenced ThreadLocals would be properly collected. >>>>> >>>>> So I think we continue to use non-static ThreadLocals in Lucene... >>>>> >>>>> Mike >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> robert engels wrote: >>>>> >>>>>> Once again, these are "static" thread locals. A completely >>>>>> different >>>>>> issue. Since the object is available statically, the weak >>>>>> reference >>>>>> cannot be cleared so stale entries will never be cleared as >>>>>> long as >>>>>> the thread is alive. >>>>>> >>>>>> On Jul 9, 2008, at 4:46 PM, Adrian Tarau wrote: >>>>>> >>>>>>> Just a few examples of "problems" using ThreadLocals. >>>>>>> >>>>>>> http://opensource.atlassian.com/projects/hibernate/browse/ >>>>>>> HHH-2481 >>>>>>> http://www.theserverside.com/news/thread.tss?thread_id=41473 >>>>>>> >>>>>>> Once again, I'm not pointing to Lucene SegmentReader as a "bad" >>>>>>> implementation, and maybe the current "problems" of ThreadLocals >>>>>>> are not a problem for SegmentReader but it seems safer to use >>>>>>> ThreadLocals to pass context information which is cleared when >>>>>>> the >>>>>>> call exits instead of storing long-lived objects. >>>>>>> >>>>>>> >>>>>>> robert engels wrote: >>>>>>>> >>>>>>>> Aside from the pre-1.5 thread local "perceived leak", there >>>>>>>> are no >>>>>>>> issues with ThreadLocals if used properly. >>>>>>>> >>>>>>>> There is no need for try/finally blocks, unless you MUST release >>>>>>>> resources immediately, usually this is not the case, which is >>>>>>>> why >>>>>>>> a ThreadLocal is used in the first place. >>>>>>>> >>>>>>>> From the ThreadLocalMap javadoc... >>>>>>>> >>>>>>>> /** >>>>>>>> * ThreadLocalMap is a customized hash map suitable only for >>>>>>>> * maintaining thread local values. No operations are >>>>>>>> exported >>>>>>>> * outside of the ThreadLocal class. The class is package >>>>>>>> private to >>>>>>>> * allow declaration of fields in class Thread. To help >>>>>>>> deal >>>>>>>> with >>>>>>>> * very large and long-lived usages, the hash table entries >>>>>>>> use >>>>>>>> * WeakReferences for keys. However, since reference queues >>>>>>>> are not >>>>>>>> * used, stale entries are guaranteed to be removed only >>>>>>>> when >>>>>>>> * the table starts running out of space. >>>>>>>> */ >>>>>>>> >>>>>>>> /** >>>>>>>> * Heuristically scan some cells looking for stale >>>>>>>> entries. >>>>>>>> * This is invoked when either a new element is >>>>>>>> added, or >>>>>>>> * another stale one has been expunged. It performs a >>>>>>>> * logarithmic number of scans, as a balance between no >>>>>>>> * scanning (fast but retains garbage) and a number of >>>>>>>> scans >>>>>>>> * proportional to number of elements, that would >>>>>>>> find all >>>>>>>> * garbage but would cause some insertions to take O(n) >>>>>>>> time. >>>>>>>> * >>>>>>>> * @param i a position known NOT to hold a stale entry. >>>>>>>> The >>>>>>>> * scan starts at the element after i. >>>>>>>> * >>>>>>>> * @param n scan control: <tt>log2(n)</tt> cells are >>>>>>>> scanned, >>>>>>>> * unless a stale entry one is found, in which case >>>>>>>> * <tt>log2(table.length)-1</tt> additional cells are >>>>>>>> scanned. >>>>>>>> * When called from insertions, this parameter is the >>>>>>>> number >>>>>>>> * of elements, but when from replaceStaleEntry, it >>>>>>>> is the >>>>>>>> * table length. (Note: all this could be changed to be >>>>>>>> either >>>>>>>> * more or less aggressive by weighting n instead of >>>>>>>> just >>>>>>>> * using straight log n. But this version is simple, >>>>>>>> fast, >>>>>>>> and >>>>>>>> * seems to work well.) >>>>>>>> * >>>>>>>> * @return true if any stale entries have been removed. >>>>>>>> */ >>>>>>>> >>>>>>>> >>>>>>>> The instance ThreadLocals (and what the refer to) will be GC'd >>>>>>>> when the containing Object is GC'd. >>>>>>>> >>>>>>>> There IS NO MEMORY LEAK in ThreadLocal. If the ThreadLocal >>>>>>>> refers >>>>>>>> to an object that has native resources (e.g. file handles), >>>>>>>> it may >>>>>>>> not be released until other thread locals are created by the >>>>>>>> thread (or the thread terminates). >>>>>>>> >>>>>>>> You can avoid this "delay" by calling remove(), but in most >>>>>>>> applications it should never be necessary - unless a very >>>>>>>> strange >>>>>>>> usage... >>>>>>>> >>>>>>>> On Jul 9, 2008, at 2:37 PM, Adrian Tarau wrote: >>>>>>>> >>>>>>>>> From what I know, storing objects in ThreadLocal is safe as >>>>>>>>> long >>>>>>>>> as you release the object within a try {} finall {} block or >>>>>>>>> store objects which are independent of the rest of the code(no >>>>>>>>> dependencies).Otherwise it can get pretty tricky(memory leaks, >>>>>>>>> classloader problems) after awhile. >>>>>>>>> >>>>>>>>> It is pretty convenient to pass HTTP request information with a >>>>>>>>> ThreadLocal in a servlet(but you should cleanup the variable >>>>>>>>> before leaving the servlet) but I'm not sure how safe it is in >>>>>>>>> this case. >>>>>>>>> >>>>>>>>> robert engels wrote: >>>>>>>>>> Using synchronization is a poor/invalid substitute for thread >>>>>>>>>> locals in many cases. >>>>>>>>>> >>>>>>>>>> The point of the thread local in these referenced cases is too >>>>>>>>>> allow streaming reads on a file descriptor. if you use a >>>>>>>>>> shared >>>>>>>>>> file descriptor/buffer you are going to continually invalidate >>>>>>>>>> the buffer. >>>>>>>>>> >>>>>>>>>> On Jul 8, 2008, at 5:12 AM, Michael McCandless wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Well ... SegmentReader uses ThreadLocal to hold a thread- >>>>>>>>>>> private instance of TermVectorsReader, to avoid synchronizing >>>>>>>>>>> per-document when loading term vectors. >>>>>>>>>>> >>>>>>>>>>> Clearing this ThreadLocal value per call to SegmentReader's >>>>>>>>>>> methods that load term vectors would defeat its purpose. >>>>>>>>>>> >>>>>>>>>>> Though, of course, we then synchronize on the underlying file >>>>>>>>>>> (when using FSDirectory), so perhaps we are really not saving >>>>>>>>>>> much by using ThreadLocal here. But we are looking to relax >>>>>>>>>>> that low level synchronization with LUCENE-753. >>>>>>>>>>> >>>>>>>>>>> Maybe we could make our own ThreadLocal that just uses a >>>>>>>>>>> HashMap, which we'd have to synchronize on when getting the >>>>>>>>>>> per- >>>>>>>>>>> thread instances. Or, go back to sharing a single >>>>>>>>>>> TermVectorsReader and synchronize per-document. >>>>>>>>>>> >>>>>>>>>>> Jason has suggested moving to a model where you ask the >>>>>>>>>>> IndexReader for an object that can return term vectors / >>>>>>>>>>> stored >>>>>>>>>>> fields / etc, and then you interact with that many times to >>>>>>>>>>> retrieve each doc. We could then synchronize only on >>>>>>>>>>> retrieving that object, and provide a thread-private >>>>>>>>>>> instance. >>>>>>>>>>> >>>>>>>>>>> It seems like we should move away from using ThreadLocal in >>>>>>>>>>> Lucene and do "normal" synchronization instead. >>>>>>>>>>> >>>>>>>>>>> Mike >>>>>>>>>>> >>>>>>>>>>> Adrian Tarau wrote: >>>>>>>>>>> >>>>>>>>>>>> Usually ThreadLocal.remove() should be called at the end >>>>>>>>>>>> (in a >>>>>>>>>>>> finally block), before the current call leaves your code. >>>>>>>>>>>> >>>>>>>>>>>> Ex : if during searching ThreadLocal is used, every search >>>>>>>>>>>> (..) >>>>>>>>>>>> method should cleanup any ThreadLocal variables, or even >>>>>>>>>>>> deeper in the implementation. When the call leaves Lucene >>>>>>>>>>>> any >>>>>>>>>>>> used ThreadLocal should be cleaned up. >>>>>>>>>>>> >>>>>>>>>>>> Michael McCandless wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> ThreadLocal, which we use in several places in Lucene, >>>>>>>>>>>>> causes >>>>>>>>>>>>> a leak in app servers because the classloader never fully >>>>>>>>>>>>> deallocates Lucene's classes because the ThreadLocal is >>>>>>>>>>>>> holding strong references. >>>>>>>>>>>>> >>>>>>>>>>>>> Yet, ThreadLocal is very convenient for avoiding >>>>>>>>>>>>> synchronization. >>>>>>>>>>>>> >>>>>>>>>>>>> Does anyone have any ideas on how to solve this w/o falling >>>>>>>>>>>>> back to "normal" synchronization? >>>>>>>>>>>>> >>>>>>>>>>>>> Mike >>>>>>>>>>>>> >>>>>>>>>>>>> Begin forwarded message: >>>>>>>>>>>>> >>>>>>>>>>>>>> From: "Yonik Seeley" <[EMAIL PROTECTED]> >>>>>>>>>>>>>> Date: July 7, 2008 3:30:28 PM EDT >>>>>>>>>>>>>> To: [EMAIL PROTECTED] >>>>>>>>>>>>>> Subject: Re: ThreadLocal in SegmentReader >>>>>>>>>>>>>> Reply-To: [EMAIL PROTECTED] >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Jul 7, 2008 at 2:43 PM, Michael McCandless >>>>>>>>>>>>>> <[EMAIL PROTECTED]> wrote: >>>>>>>>>>>>>>> So now I'm confused: the SegmentReader itself should no >>>>>>>>>>>>>>> longer be reachable, >>>>>>>>>>>>>>> assuming you are not holding any references to your >>>>>>>>>>>>>>> IndexReader. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Which means the ThreadLocal instance should no longer be >>>>>>>>>>>>>>> reachable. >>>>>>>>>>>>>> >>>>>>>>>>>>>> It will still be referenced from the Thread(s) >>>>>>>>>>>>>> ThreadLocalMap >>>>>>>>>>>>>> The key (the ThreadLocal) will be weakly referenced, >>>>>>>>>>>>>> but the >>>>>>>>>>>>>> values >>>>>>>>>>>>>> (now stale) are strongly referenced and won't be actually >>>>>>>>>>>>>> removed >>>>>>>>>>>>>> until the table is resized (under the Java6 impl at >>>>>>>>>>>>>> least). >>>>>>>>>>>>>> Nice huh? >>>>>>>>>>>>>> >>>>>>>>>>>>>> -Yonik >>>>>>>>>>>>>> >>>>>>>>>>>>>> ---------------------------------------------------------- >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> --------- >>>>>>>>>>>>>> To unsubscribe, e-mail: java-user- >>>>>>>>>>>>>> [EMAIL PROTECTED] >>>>>>>>>>>>>> For additional commands, e-mail: java-user- >>>>>>>>>>>>>> [EMAIL PROTECTED] >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> ----------------------------------------------------------- >>>>>>>>>>>>> -- >>>>>>>>>>>>> -------- >>>>>>>>>>>>> To unsubscribe, e-mail: java-dev- >>>>>>>>>>>>> [EMAIL PROTECTED] >>>>>>>>>>>>> For additional commands, e-mail: java-dev- >>>>>>>>>>>>> [EMAIL PROTECTED] >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> ------------------------------------------------------------ >>>>>>>>>>>> -- >>>>>>>>>>>> ------- >>>>>>>>>>>> To unsubscribe, e-mail: java-dev- >>>>>>>>>>>> [EMAIL PROTECTED] >>>>>>>>>>>> For additional commands, e-mail: java-dev- >>>>>>>>>>>> [EMAIL PROTECTED] >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> ------------------------------------------------------------- >>>>>>>>>>> -- >>>>>>>>>>> ------ >>>>>>>>>>> To unsubscribe, e-mail: java-dev- >>>>>>>>>>> [EMAIL PROTECTED] >>>>>>>>>>> For additional commands, e-mail: java-dev- >>>>>>>>>>> [EMAIL PROTECTED] >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -------------------------------------------------------------- >>>>>>>>>> -- >>>>>>>>>> ----- >>>>>>>>>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>>>>>>>>> For additional commands, e-mail: java-dev- >>>>>>>>>> [EMAIL PROTECTED] >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> --------------------------------------------------------------- >>>>>>>>> -- >>>>>>>>> ---- >>>>>>>>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>>>>>>>> For additional commands, e-mail: java-dev- >>>>>>>>> [EMAIL PROTECTED] >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> ------------------------------------------------------------------- >>>>> -- >>>>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>>>> For additional commands, e-mail: [EMAIL PROTECTED] >>>>> >>>> >>>> -- >>>> View this message in context: http://www.nabble.com/Fwd%3A- >>>> ThreadLocal-in-SegmentReader-tp18326720p18416022.html >>>> Sent from the Lucene - Java Developer mailing list archive at >>>> Nabble.com. >>>> >>>> >>>> -------------------------------------------------------------------- >>>> - >>>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>>> For additional commands, e-mail: [EMAIL PROTECTED] >>>> >>> >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>> For additional commands, e-mail: [EMAIL PROTECTED] >>> >>> >>> >> >> -- >> View this message in context: http://www.nabble.com/Fwd%3A- >> ThreadLocal-in-SegmentReader-tp18326720p18416841.html >> Sent from the Lucene - Java Developer mailing list archive at >> Nabble.com. >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [EMAIL PROTECTED] >> For additional commands, e-mail: [EMAIL PROTECTED] >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/Fwd%3A-ThreadLocal-in-SegmentReader-tp18326720p18437679.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]