[ https://issues.apache.org/jira/browse/ABDERA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Berry updated ABDERA-153: ------------------------------- Attachment: ExtensionFactoryMap.patch The patch file for the memory leak in WeakHashMap > Serious memory leak in ExtensionFactoryMap > ------------------------------------------ > > Key: ABDERA-153 > URL: https://issues.apache.org/jira/browse/ABDERA-153 > Project: Abdera > Issue Type: Bug > Affects Versions: 0.3.0 > Environment: N/A > Reporter: Chris Berry > Priority: Critical > Attachments: ExtensionFactoryMap.patch > > > Greetings, > Some time ago we reported a very serious memory leak in Abdera 0.3.0. > It was reported to the abdera users list on Nov 7, 2007 as "memory leak". > That message is replayed below... > At the time we could workaround the bug in our client, but over time this > became impossible, and we had to hack ExtensionFactoryMap. > Attached is a patch to org.apache.abdera.factory.ExtensionFactoryMap > This patch simply removes the problem -- the WeakHashMap. > This change has a negligible effect on performance (as measured in load > testing) > But without it, abdera 0.3.0 will get OutOfMemoryExceptions quite quickly > under load, rendering it unusable. > BTW: this code has been in Production for many weeks... > We were hoping that you could apply this patch to 0.3.0?? > Or better; produce 0.3.1 > We are not able to upgrade to 0.4.0 yet, and would rather not have a hacked > copy of 0.3.0. > Also, others not able to upgrade yet (since the upgrade is not entirely > transparent) may benefit from the patch > I will also open a JIRA on this. > Thanks, > -- Chris > From: [EMAIL PROTECTED] > Subject: memory leak > Date: November 5, 2007 3:49:24 PM CST > To: [EMAIL PROTECTED] > Reply-To: [EMAIL PROTECTED] > Hey All - > I'm working on a data service based on Abdera (working with Chris Berry, > who's a regular on these lists...) When we were running our first battery of > serious load testing on our system, we encountered memory-leaky behavior, and > a profiler showed us that we were indeed leaking hundreds of megabytes a > minute, all traceable back to the wrappers field on > org.apache.abdera.factory.ExtensionFactoryMap. This field is a map from > elements to their wrappers, if any. At first, I was puzzled by the memory > leak, as the field is initialized thusly: > this.wrappers = Collections.synchronizedMap( new > WeakHashMap<Element,Element>()); > clearly, the implementor took care to make sure that this cache would not > leak by making it a WeakHashMap, which generally guarantees that the map > itself will not keep a key and its corresponding entry from being garbage > collected. I dug throughout our application code to find if we were actually > holding other references to these objects, and I googled for anyone having > problems with esoteric interactions between Collections.synchronizedMap and > WeakHashMaps - found nothing there. Then I went back to square one and > re-read the WeakHashMap javadoc very carefully. Here's the relevant section: > Implementation note: The value objects in a WeakHashMap are held by > ordinary strong references. Thus care should be taken to ensure that value > objects do not strongly refer to their own keys, either directly or > indirectly, since that will prevent the keys from being discarded. Note that > a value object may refer indirectly to its key via the WeakHashMap itself; > that is, a value object may strongly refer to some other key object whose > associated value object, in turn, strongly refers to the key of the first > value object. One way to deal with this is to wrap values themselves within > WeakReferences before inserting, as in: m.put(key, new WeakReference(value)), > and then unwrapping upon each get. > This is why there is a memory leak - the map is a mapping from elements to > their wrappers - by the very nature of the object being a wrapper of the > element, it will usually have a strong reference to the element itself, which > is the key! You can verify that Abdera wrappers, in general, will do this by > looking at org.apache.abdera.model.ElementWrapper, which takes the element > being wrapped as its constructor argument, and holds a strong reference to it > as an instance variable. > This map is an optimization to memoize the calls to getElementWrapper() and > not reconstruct them more than is necessary - it is not needed for abdera to > function properly, so we have temporarily worked around the problem in our > own application like so - we used to acquire our FOMFactory by calling > abdera.getFactory() on our org.apache.abdera.Abdera instance, and re-using > that singleton throughout our application. Now we construct a new FOMFactory > with new FOMFactory(abdera) once per request to the server, and since the > only appreciable state on the factory is this map itself, this is a valid > work-around. > I'd initially planned to really fix this issue and submit a patch along with > this message, but digging a little deeper, I'm not sure that the correct fix > is crystal clear... We could do as the javadoc above suggests, and wrap the > values with WeakReferences to plug the leak, or we could use a LinkedHashMap > configured as an LRU cache to just bound the cache, so it can't grow out of > control - but right now, I don't think that either of those solutions would > be correct, because it seems to me that none of the objects in the hierarchy > rooted at FOMElement define equals() and/or hashCode() methods, so all of the > objects are cached based on their actual object identity. It seems that in > the all likely use cases, instances of FOMElement and its descendants are > re-parsed on every request to a server running abdera, and so what we will > see is cache misses virtually 100% of the time, so even though we'd have > plugged the leak, strictly speaking, we would have ignored the underlying > issue that we're caching data on every request that will be fundamentally > unable to be retrieved on subsequent requests. This is based only on my > reading over the code for a few hours, so I could be missing something, and I > also might be forgetting about a use case that demands and makes proper use > of this memoization, but as it stands right now, my recommended fix would > probably be to just cut out the cache altogether, and allow for wrappers to > get constructed fresh every time they are requested. One more possibility is > that the cache is actually a useful optimization, but only during the scope > of one request - in which case the "work-around" we are using now is actually > the best practice, and the fix would be to remove the factory instance on the > Abdera class... > I'd like to hear from the Abdera developers what their thoughts are on this > issue, and what the best resolution is likely to be. This is no longer a > pressing issue for our team, but it is potentially a time bomb waiting to > blow up for any project dependent on Abdera. > thanks! (and thanks for Abdera, generally - we're easily a year ahead of > where we'd be on this project without it!) > -Bryon (long-time listener, first-time caller) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.