Note to self: Check ML archives first, then ask! http://marc.theaimsgroup.com/?t=107034266700002&r=1&w=2
On 26.06.2005 16:02:29 Jeremias Maerki wrote:
> Ok, that's a different concern and it's valid. I don't buy the
> performance argument, though, especially since the intern() method will
> also cost something. Does anyone have any numbers?
>
> On 26.06.2005 15:33:08 Andreas L. Delmelle wrote:
> > > -----Original Message-----
> > > From: Glen Mazza [mailto:[EMAIL PROTECTED]
> > >
> >
> > Hi Glen,
> >
> > <snip />
> > >
> > > Another option: validateChildNode() is called from only one place,
> > > FOTreeBuilder.startElement(). At that point, we can also feed vcN() the
> > > parameter "namespaceURI.intern()" instead of just "namespaceURI". This
> > > could be slightly faster for some VCN()'s that compare against multiple
> > > URI's--but I would think .intern() is much slower than .equals() for the
> > > reason given above.
> >
> > The other idea to consider is the impact on memory. The string values of
> > interned strings are only stored once. There is indeed the overhead of a
> > call to .intern(), but the workings of that method will be nearly as
> > optimized as the .equals() method. Look up the string value in a hashtable:
> > if it doesn't exist, add it and return a reference to the newly stored
> > value. If it does, just return a reference to the value that's already
> > stored in that hashtable.
> > The source string value is immediately discarded in any case, only the
> > reference is kept.
> >
> > The benefit of interning can be most appreciated in cases where the strings'
> > lengths are long enough --exceed the size of a reference-- AND the number of
> > them being created is large enough. Both are the case for a lot of namespace
> > URIs, and node or attribute names.
> > This is precisely the reason why the SAX parser feature for string
> > interning -- http://xml.org/sax/features/string-interning -- defaults to
> > 'true' in Xerces-J (and can't be set to 'false').
> >
> > To put it quite bluntly: my concern is that we would essentially be adapting
> > our code to make it possible for people to use it to waste resources. Feed
> > it interned strings and it will work. Why would one really want to create a
> > separate string object for all occurrences of a given namespace URI in a
> > random document, and at the very same time expect us to take into account
> > that they didn't intern those strings themselves...
> >
> > I still think Nils would gain more by manipulating his setup so that these
> > types of strings are already interned, more than we would gain by changing
> > our code to allow for them to be 'just' strings.
> >
> >
> > Cheers,
> >
> > Andreas
> >
>
> Jeremias Maerki

Jeremias Maerki
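
For anyone reading this in the archives, here is a minimal sketch of the trade-off discussed above: identity comparison (==) only works when the caller hands over interned strings, while .equals() works either way. The class InternDemo, the constant FO_URI and the helper methods are hypothetical names made up for this illustration; they are not FOP code. The last method simply asks whatever SAX parser the JDK provides for the state of its string-interning feature; what it reports depends on the parser in use.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical demo class; not part of FOP.
class InternDemo {
    // String literals are interned by the JVM, so this constant is interned.
    static final String FO_URI = "http://www.w3.org/1999/XSL/Format";

    // Identity check: cheap, but only correct if the argument is interned.
    static boolean isFoByIdentity(String namespaceURI) {
        return namespaceURI == FO_URI;
    }

    // Value check: correct for any String, interned or not.
    static boolean isFoByEquals(String namespaceURI) {
        return FO_URI.equals(namespaceURI);
    }

    // Ask the default SAX parser whether it reports interned strings.
    static void reportSaxInterning() {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            boolean interning = reader.getFeature(
                "http://xml.org/sax/features/string-interning");
            System.out.println("string-interning feature: " + interning);
        } catch (SAXException | javax.xml.parsers.ParserConfigurationException e) {
            // Some parsers may not recognize or support the feature.
            System.out.println("feature not available: " + e);
        }
    }

    public static void main(String[] args) {
        // A distinct String object with the same characters, as a caller
        // that does not intern its strings might supply.
        String fresh = new String(FO_URI);

        System.out.println(isFoByIdentity(fresh));          // false
        System.out.println(isFoByEquals(fresh));            // true
        System.out.println(isFoByIdentity(fresh.intern())); // true

        reportSaxInterning();
    }
}
```

This does not answer the request for numbers; it only shows why code that compares with == silently fails for non-interned input, which is the behaviour the thread is arguing about.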