[ 
https://issues.apache.org/jira/browse/ABDERA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Berry updated ABDERA-153:
-------------------------------

    Attachment: ExtensionFactoryMap.patch

The patch file for the memory leak in WeakHashMap

> Serious memory leak in ExtensionFactoryMap
> ------------------------------------------
>
>                 Key: ABDERA-153
>                 URL: https://issues.apache.org/jira/browse/ABDERA-153
>             Project: Abdera
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>         Environment: N/A
>            Reporter: Chris Berry
>            Priority: Critical
>         Attachments: ExtensionFactoryMap.patch
>
>
> Greetings,
> Some time ago we reported a very serious memory leak in Abdera 0.3.0.
> It was reported to the abdera users list on Nov 7, 2007 as "memory leak".
> That message is replayed below...
> At the time we could workaround the bug in our client, but over time this 
> became impossible, and we had to hack ExtensionFactoryMap.
> Attached is a patch to org.apache.abdera.factory.ExtensionFactoryMap
> This patch simply removes the problem -- the WeakHashMap.
> This change has a negligible effect on performance (as measured in load 
> testing) 
> But without it, abdera 0.3.0 will get OutOfMemoryExceptions quite quickly 
> under load, rendering it unusable.
> BTW: this code has been in Production for many weeks...
> We were hoping that you could apply this patch to 0.3.0??
> Or better;  produce 0.3.1
> We are not able to upgrade to 0.4.0 yet, and would rather not have a hacked 
> copy of 0.3.0.
> Also, others not able to upgrade yet (since the upgrade is not entirely 
> transparent) may benefit from the patch
> I will also open a JIRA on this.
> Thanks,
> -- Chris 
>       From:   [EMAIL PROTECTED]
>       Subject:        memory leak
>       Date:   November 5, 2007 3:49:24 PM CST
>       To:     [EMAIL PROTECTED]
>       Reply-To:       [EMAIL PROTECTED]
>                     Hey All -
> I'm working on a data service based on Abdera (working with Chris Berry, 
> who's a regular on these lists...)  When we were running our first battery of 
> serious load testing on our system, we encountered memory-leaky behavior, and 
> a profiler showed us that we were indeed leaking hundreds of megabytes a 
> minute, all traceable back to the wrappers field on 
> org.apache.abdera.factory.ExtensionFactoryMap.  This field is a map from 
> elements to their wrappers, if any.  At first, I was puzzled by the memory 
> leak, as the field is initialized thusly:
>       this.wrappers = Collections.synchronizedMap( new 
> WeakHashMap<Element,Element>());
> clearly, the implementor took care to make sure that this cache would not 
> leak by making it a WeakHashMap, which generally guarantees that the map 
> itself will not keep a key and its corresponding entry from being garbage 
> collected.  I dug throughout our application code to find if we were actually 
> holding other references to these objects, and I googled for anyone having 
> problems with esoteric interactions between Collections.synchronizedMap and 
> WeakHashMaps - found nothing there.  Then I went back to square one and 
> re-read the WeakHashMap javadoc very carefully.  Here's the relevant section:
>       Implementation note: The value objects in a WeakHashMap are held by 
> ordinary strong references. Thus care should be taken to ensure that value 
> objects do not strongly refer to their own keys, either directly or 
> indirectly, since that will prevent the keys from being discarded. Note that 
> a value object may refer indirectly to its key via the WeakHashMap itself; 
> that is, a value object may strongly refer to some other key object whose 
> associated value object, in turn, strongly refers to the key of the first 
> value object. One way to deal with this is to wrap values themselves within 
> WeakReferences before inserting, as in: m.put(key, new WeakReference(value)), 
> and then unwrapping upon each get.
> This is why there is a memory leak - the map is a mapping from elements to 
> their wrappers - by the very nature of the object being a wrapper of the 
> element, it will usually have a strong reference to the element itself, which 
> is the key! You can verify that Abdera wrappers, in general, will do this by 
> looking at org.apache.abdera.model.ElementWrapper, which takes the element 
> being wrapped as its constructor argument, and holds a strong reference to it 
> as an instance variable.
> This map is an optimization to memoize the calls to getElementWrapper() and 
> not reconstruct them more than is necessary - it is not needed for abdera to 
> function properly, so we have temporarily worked around the problem in our 
> own application like so - we used to acquire our FOMFactory by calling 
> abdera.getFactory() on our org.apache.abdera.Abdera instance, and re-using 
> that singleton throughout our application.  Now we construct a new FOMFactory 
> with new FOMFactory(abdera) once per request to the server, and since the 
> only appreciable state on the factory is this map itself, this is a valid 
> work-around.
> I'd initially planned to really fix this issue and submit a patch along with 
> this message, but digging a little deeper, I'm not sure that the correct fix 
> is crystal clear...  We could do as the javadoc above suggests, and wrap the 
> values with WeakReferences to plug the leak, or we could use a LinkedHashMap 
> configured as an LRU cache to just bound the cache, so it can't grow out of 
> control - but right now, I don't think that either of those solutions would 
> be correct, because it seems to me that none of the objects in the hierarchy 
> rooted at FOMElement define equals() and/or hashCode() methods, so all of the 
> objects are cached based on their actual object identity.  It seems that in 
> the all likely use cases, instances of FOMElement and its descendants are 
> re-parsed on every request to a server running abdera, and so what we will 
> see is cache misses virtually 100% of the time, so even though we'd have 
> plugged the leak, strictly speaking, we would have ignored the underlying 
> issue that we're caching data on every request that will be fundamentally 
> unable to be retrieved on subsequent requests.  This is based only on my 
> reading over the code for a few hours, so I could be missing something, and I 
> also might be forgetting about a use case that demands and makes proper use 
> of this memoization, but as it stands right now, my recommended fix would 
> probably be to just cut out the cache altogether, and allow for wrappers to 
> get constructed fresh every time they are requested.  One more possibility is 
> that the cache is actually a useful optimization, but only during the scope 
> of one request - in which case the "work-around" we are using now is actually 
> the best practice, and the fix would be to remove the factory instance on the 
> Abdera class...
> I'd like to hear from the Abdera developers what their thoughts are on this 
> issue, and what the best resolution is likely to be.  This is no longer a 
> pressing issue for our team, but it is potentially a time bomb waiting to 
> blow up for any project dependent on Abdera.
> thanks!  (and thanks for Abdera, generally - we're easily a year ahead of 
> where we'd be on this project without it!)
> -Bryon (long-time listener, first-time caller)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to