memory leak

Bryon Jacob Mon, 05 Nov 2007 13:49:59 -0800

Hey All -

I'm working on a data service based on Abdera (working with Chris Berry, who's a regular on these lists...). When we were running our first battery of serious load testing on our system, we encountered memory-leaky behavior, and a profiler showed us that we were indeed leaking hundreds of megabytes a minute, all traceable back to the wrappers field on org.apache.abdera.factory.ExtensionFactoryMap. This field is a map from elements to their wrappers, if any. At first, I was puzzled by the memory leak, as the field is initialized thusly:

this.wrappers = Collections.synchronizedMap(new WeakHashMap<Element,Element>());

Clearly, the implementor took care to make sure that this cache would not leak by making it a WeakHashMap, which generally guarantees that the map itself will not keep a key and its corresponding entry from being garbage collected. I dug throughout our application code to find if we were actually holding other references to these objects, and I googled for anyone having problems with esoteric interactions between Collections.synchronizedMap and WeakHashMaps - found nothing there. Then I went back to square one and re-read the WeakHashMap javadoc very carefully. Here's the relevant section:

Implementation note: The value objects in a WeakHashMap are held by ordinary strong references. Thus care should be taken to ensure that value objects do not strongly refer to their own keys, either directly or indirectly, since that will prevent the keys from being discarded. Note that a value object may refer indirectly to its key via the WeakHashMap itself; that is, a value object may strongly refer to some other key object whose associated value object, in turn, strongly refers to the key of the first value object. One way to deal with this is to wrap values themselves within WeakReferences before inserting, as in: m.put(key, new WeakReference(value)), and then unwrapping upon each get.

This is why there is a memory leak - the map is a mapping from elements to their wrappers - by the very nature of the object being a wrapper of the element, it will usually have a strong reference to the element itself, which is the key! You can verify that Abdera wrappers, in general, will do this by looking at org.apache.abdera.model.ElementWrapper, which takes the element being wrapped as its constructor argument, and holds a strong reference to it as an instance variable.
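The effect can be reproduced without Abdera. The following is only a minimal sketch: DemoElement, DemoWrapper and WeakMapLeakDemo are hypothetical stand-ins invented for illustration, not the real Abdera classes. It assumes nothing beyond what is described above: a synchronized WeakHashMap from element to wrapper, where the wrapper holds a strong reference to the element it wraps.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for an element (not the Abdera Element interface).
class DemoElement {
    final String name;
    DemoElement(String name) { this.name = name; }
}

// Hypothetical stand-in for a wrapper: it keeps the wrapped element in a field.
class DemoWrapper extends DemoElement {
    // Strong reference back to the wrapped element, which is also the map key.
    final DemoElement wrapped;
    DemoWrapper(DemoElement wrapped) {
        super(wrapped.name);
        this.wrapped = wrapped;
    }
}

class WeakMapLeakDemo {
    // Same shape as the wrappers field: weak keys, strongly held values.
    static final Map<DemoElement, DemoElement> wrappers =
            Collections.synchronizedMap(new WeakHashMap<DemoElement, DemoElement>());

    static void cacheWrapperFor(String name) {
        DemoElement element = new DemoElement(name);
        wrappers.put(element, new DemoWrapper(element));
        // 'element' goes out of scope here; afterwards only the map can reach it.
    }

    // Fills the map, requests a collection, and reports how many entries survive.
    static int sizeAfterGc(int count) {
        wrappers.clear();
        for (int i = 0; i < count; i++) {
            cacheWrapperFor("entry-" + i);
        }
        // System.gc() is only a hint, but no collector may clear a key that is
        // still strongly reachable through map -> entry -> value -> key.
        System.gc();
        return wrappers.size();
    }

    public static void main(String[] args) {
        System.out.println("entries still cached: " + sizeAfterGc(1000));
    }
}
```

Because every key stays strongly reachable through its own value, none of the 1000 entries can ever be expunged, no matter how often the collector runs.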

This map is an optimization to memoize the calls to getElementWrapper() and not reconstruct the wrappers more than is necessary - it is not needed for Abdera to function properly, so we have temporarily worked around the problem in our own application. We used to acquire our FOMFactory by calling abdera.getFactory() on our org.apache.abdera.Abdera instance, and re-used that singleton throughout our application. Now we construct a new FOMFactory with new FOMFactory(abdera) once per request to the server, and since the only appreciable state on the factory is this map itself, this is a valid work-around.

I'd initially planned to really fix this issue and submit a patch along with this message, but digging a little deeper, I'm not sure that the correct fix is crystal clear... We could do as the javadoc above suggests, and wrap the values with WeakReferences to plug the leak, or we could use a LinkedHashMap configured as an LRU cache to just bound the cache, so it can't grow out of control - but right now, I don't think that either of those solutions would be correct, because it seems to me that none of the objects in the hierarchy rooted at FOMElement define equals() and/or hashCode() methods, so all of the objects are cached based on their actual object identity.

It seems that in all likely use cases, instances of FOMElement and its descendants are re-parsed on every request to a server running Abdera, and so what we will see is cache misses virtually 100% of the time. Even though we'd have plugged the leak, strictly speaking, we would have ignored the underlying issue: we're caching data on every request that will be fundamentally unable to be retrieved on subsequent requests.

This is based only on my reading over the code for a few hours, so I could be missing something, and I also might be forgetting about a use case that demands and makes proper use of this memoization, but as it stands right now, my recommended fix would probably be to just cut out the cache altogether, and allow for wrappers to get constructed fresh every time they are requested. One more possibility is that the cache is actually a useful optimization, but only during the scope of one request - in which case the "work-around" we are using now is actually the best practice, and the fix would be to remove the factory instance on the Abdera class...
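To make the identity-keying concern concrete, here is a rough sketch of the LinkedHashMap-as-LRU option combined with keys that define no equals() or hashCode(). ParsedElement, LruCache and IdentityCacheDemo are hypothetical names invented for this illustration and have nothing to do with the Abdera code base; the "requests" are simulated by a plain loop that creates a fresh key object each time, the way a re-parse would.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical element type with no equals()/hashCode(): lookups use object identity.
class ParsedElement {
    final String xml;
    ParsedElement(String xml) { this.xml = xml; }
}

// Bounded cache: LinkedHashMap in access order, dropping the eldest entry
// once the size exceeds maxEntries.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, which gives LRU eviction
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

class IdentityCacheDemo {
    // Simulates 'requests' requests that each re-parse the same document
    // and then consult the cache. Returns {hits, final cache size}.
    static int[] run(int requests, int maxEntries) {
        Map<ParsedElement, String> cache = new LruCache<>(maxEntries);
        int hits = 0;
        for (int i = 0; i < requests; i++) {
            // A fresh instance per request, even though the content is identical.
            ParsedElement element = new ParsedElement("<entry/>");
            if (cache.get(element) != null) {
                hits++;
            } else {
                cache.put(element, "wrapper for " + element.xml);
            }
        }
        return new int[] { hits, cache.size() };
    }

    public static void main(String[] args) {
        int[] result = run(1000, 100);
        System.out.println("hits=" + result[0] + ", cached entries=" + result[1]);
    }
}
```

In this sketch the cache stays bounded at 100 entries, so the leak is gone, but it never produces a single hit, because no two requests ever present the same key object. Note that a LinkedHashMap is not thread-safe on its own; a shared cache like this would still need external synchronization.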

I'd like to hear from the Abdera developers what their thoughts are on this issue, and what the best resolution is likely to be. This is no longer a pressing issue for our team, but it is potentially a time bomb waiting to blow up for any project dependent on Abdera.

thanks! (and thanks for Abdera, generally - we're easily a year ahead of where we'd be on this project without it!)


-Bryon (long-time listener, first-time caller)
