Hi Ludger,

I recall our discussion [1] from last year. This is roughly the solution I had in mind.
If you haven't already, I would kindly ask you to fax in an Individual Contributor License Agreement (CLA) [2] to Apache. It's a project requirement [3] for accepting new features and other enhancements. Signing a CLA will cover your improvements to the DOM as well as future contributions to Xerces (and other Apache projects).

Thanks.

[1] http://mail-archives.apache.org/mod_mbox/xerces-j-dev/200703.mbox/[EMAIL PROTECTED]
[2] http://www.apache.org/licenses/icla.txt
[3] http://xml.apache.org/xerces2-j/charter.html

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]

"Ludger Buenger" <[EMAIL PROTECTED]> wrote on 04/10/2008 08:21:02 AM:

> Besides your two possible solutions, I also see a third way to solve
> this, which also settles what should happen if a node gets discarded
> while its user data still persists:
>
> 3) Use a WeakHashMap
> (http://java.sun.com/j2se/1.4.2/docs/api/java/util/WeakHashMap.html)
> instead of a HashMap for storing the user data hash table inside
> DocumentImpl.
>
> A WeakHashMap maintains only a weak reference to the key (i.e. the
> node), keeping it garbage collectable, and thus will also free the
> user data if the node is garbage collected.
>
> On Monday I already opened a JIRA issue and provided a patch for this, see
> https://issues.apache.org/jira/browse/XERCESJ-1298
>
> Best regards,
>
> Ludger Bünger
>
> --
> Dipl.-Inf. Ludger Bünger
> Product Development
> - - - - - - - - - - - - - - - -
> RealObjects GmbH
> Altenkesseler Str. 17/B4
> 66115 Saarbrücken, Germany
> Tel +49 (0)681 98579 0
> Fax +49 (0)681 98579 29
> http://www.realobjects.com
> [EMAIL PROTECTED]
>
> -----Original Message-----
> From: Christian Roth [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 10 April 2008 12:55
> To: Xerces-J Developers
> Subject: setUserData() implementation possible source of memory leak
>
> Affects: Xerces-J 2.9.1
>
> ** Summary **
> Once a user data object has been attached to a Node, that Node object
> will never be eligible for garbage collection (even if removed from the
> tree) unless its parent Document is garbage collected. This happens even
> if the user data is removed from that Node as described in the
> documentation.
>
> ** Description of cause **
> User data is stored on the Document object for all Nodes, not on the
> individual Nodes themselves. When user data is first attached to a Node,
> a new entry in the Hashtable is created with the Node object itself as
> key and a secondary Hashtable as value, into which the user data KV
> pairs are stored.
>
> When user data is deleted, the respective KV pairs are removed from the
> secondary Hashtable as expected. However, even when the secondary
> Hashtable for a Node becomes empty, its top-level Hashtable entry in the
> Document is not removed along with it. Since that entry's key still
> refers to the node, the latter cannot be garbage collected.
>
> The following pseudo code will leak Node n until the Document doc is
> discarded:
>
> {
>     Node n = doc.createElement( "a" );
>     n.setUserData( "key", "value", null );
>     n.setUserData( "key", null, null );
> }
>
> ** Possible remedies **
> I can see two ways of solving the issue:
>
> 1) Do not use the node itself as the key in the primary Hashtable.
> 2) Discard a Node's entry in the primary Hashtable as soon as its
>    secondary (key-) Hashtable becomes empty.
>
> In my tests while tracking down this issue, I settled on solution 2).
>
> ** Suggested patch **
> I'm therefore suggesting the following patch to
> org.apache.xerces.dom.CoreDocumentImpl.java, setUserData() (#2297):
>
> After removing a key from the Node's secondary Hashtable t, we check if
> it is now empty. If it is, no user data is attached to that Node any
> longer, and we remove its entry in the primary Hashtable 'userData',
> thereby removing the incoming reference to it so that it becomes
> eligible for garbage collection.
>
> Here's the complete function code with my suggested patch applied,
> bracketed by
>
>     // -> added CR
>     ...
>     // <-
>
> --snip--
> public Object setUserData(Node n, String key,
>         Object data, UserDataHandler handler) {
>     if (data == null) {
>         if (userData != null) {
>             Hashtable t = (Hashtable) userData.get(n);
>             if (t != null) {
>                 Object o = t.remove(key);
>
>                 // -> added CR
>                 if( t.size() == 0 )
>                     userData.remove( n );
>                 // <-
>
>                 if (o != null) {
>                     UserDataRecord r = (UserDataRecord) o;
>                     return r.fData;
>                 }
>             }
>         }
>         return null;
>     }
>     else {
>         Hashtable t;
>         if (userData == null) {
>             userData = new Hashtable();
>             t = new Hashtable();
>             userData.put(n, t);
>         }
>         else {
>             t = (Hashtable) userData.get(n);
>             if (t == null) {
>                 t = new Hashtable();
>                 userData.put(n, t);
>             }
>         }
>         Object o = t.put(key, new UserDataRecord(data, handler));
>         if (o != null) {
>             UserDataRecord r = (UserDataRecord) o;
>             return r.fData;
>         }
>         return null;
>     }
> }
> --snip--
>
> Kind regards
> Christian Roth
>
> --
> Christian Roth (CTO) * Phone: +49 (0)89 89 04 32 95
> infinity-loop GmbH * Neideckstr. 25 * 81249 München * Germany
> HRB 136 783 (AG München) * Managing Director: Dr. Stefan Hermann
> Web: http://www.infinity-loop.de
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
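
For readers of the archive, the two remedies discussed in this thread (dropping a node's entry once its secondary table is empty, and keeping the primary table in a WeakHashMap) can be combined in a small sketch. This is an illustration under stated assumptions, not the code in CoreDocumentImpl and not the patch attached to XERCESJ-1298: the class UserDataStore, the method hasEntryFor and the plain Object owner are hypothetical names, and the UserDataHandler argument is left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical stand-in for the per-document user data bookkeeping. */
final class UserDataStore {

    // Primary table: owner (e.g. a Node) -> that owner's key/value table.
    // WeakHashMap holds its keys weakly, so an entry alone does not keep
    // the owner reachable.
    private final Map<Object, Map<String, Object>> userData = new WeakHashMap<>();

    /** Sets a value or, when data is null, removes it; returns the previous value or null. */
    Object setUserData(Object owner, String key, Object data) {
        if (data == null) {
            Map<String, Object> t = userData.get(owner);
            if (t == null) {
                return null;
            }
            Object old = t.remove(key);
            // Remedy 2 from the thread: drop the primary entry as soon as
            // the secondary table is empty.
            if (t.isEmpty()) {
                userData.remove(owner);
            }
            return old;
        }
        // Create the secondary table on first use, then store the value.
        return userData.computeIfAbsent(owner, k -> new HashMap<>()).put(key, data);
    }

    /** Returns the value stored under key for owner, or null if there is none. */
    Object getUserData(Object owner, String key) {
        Map<String, Object> t = userData.get(owner);
        return t == null ? null : t.get(key);
    }

    /** Illustrative helper: true while the primary table still has an entry for owner. */
    boolean hasEntryFor(Object owner) {
        return userData.containsKey(owner);
    }
}
```

One caveat with the weak-key approach: WeakHashMap keeps its values strongly reachable, so user data that itself refers back to the node would still keep that node alive. The explicit removal on empty keeps the primary table tidy even while the node is still referenced elsewhere.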
