Ok, that's a different concern and it's valid. I don't buy the
performance argument, though, especially since the intern() method will
also cost something. Does anyone have any numbers?
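
For anyone who wants to produce such numbers, here is a rough sketch of how
the two approaches could be timed against each other. It is only an
illustration: the class and method names are made up for this example and
are not FOP code, and a naive System.nanoTime() loop like this is easily
distorted by JIT warm-up and GC, so treat the output as a hint at best.

```java
class InternVsEquals {
    // The XSL-FO namespace URI; as a literal it is already in the JVM string pool.
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    // Hypothetical helper: intern each input, then compare by reference.
    static int countWithIntern(String[] uris) {
        int hits = 0;
        for (String uri : uris) {
            if (uri.intern() == FO_NS) {
                hits++;
            }
        }
        return hits;
    }

    // Hypothetical helper: compare by content, no interning.
    static int countWithEquals(String[] uris) {
        int hits = 0;
        for (String uri : uris) {
            if (FO_NS.equals(uri)) {
                hits++;
            }
        }
        return hits;
    }

    // Builds n distinct String objects with the same content,
    // i.e. the "non-interned input" case discussed in this thread.
    static String[] freshCopies(int n) {
        String[] out = new String[n];
        for (int i = 0; i < n; i++) {
            out[i] = new String(FO_NS);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] uris = freshCopies(1000000);
        // Repeat a few rounds so later rounds run on warmed-up code.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            int a = countWithIntern(uris);
            long t1 = System.nanoTime();
            int b = countWithEquals(uris);
            long t2 = System.nanoTime();
            System.out.println("round " + round
                    + ": intern+== " + (t1 - t0) / 1000 + " us (" + a + " hits)"
                    + ", equals " + (t2 - t1) / 1000 + " us (" + b + " hits)");
        }
    }
}
```

A proper harness (JMH, for instance) would give more trustworthy figures;
the loop above only shows what would have to be compared.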

On 26.06.2005 15:33:08 Andreas L. Delmelle wrote:
> > -----Original Message-----
> > From: Glen Mazza [mailto:[EMAIL PROTECTED]
> >
> Hi Glen,
> <snip />
> >
> > Another option:  validateChildNode() is called from only one place,
> > FOTreeBuilder.startElement().  At that point, we can also feed it the
> > parameter "namespaceURI.intern()" instead of just "namespaceURI".  This
> > could be slightly faster for validateChildNode() implementations that
> > compare against multiple URIs--but I would think .intern() is much
> > slower than .equals() for the reason given above.
> The other idea to consider is the impact on memory. The string values of
> interned strings are only stored once. There is indeed the overhead of a
> call to .intern(), but the workings of that method will be nearly as
> optimized as the .equals() method. Look up the string value in a hashtable:
> if it doesn't exist, store it there and return a reference to the new
> entry. If it does, just return the reference to the value that's already
> stored in that hashtable.
> The source string value is immediately discarded in any case, only the
> reference is kept.
> The benefit of interning can be most appreciated in cases where the strings'
> lengths are long enough --exceed the size of a reference-- AND the number of
> them being created is large enough. Both are the case for a lot of namespace
> URIs, and node or attribute names.
> This is precisely the reason why the SAX parser feature for string
> interning --http://xml.org/sax/features/string-interning -- defaults to
> 'true' in Xerces-J (and can't be set to 'false').
> To put it quite bluntly: my concern is that we would essentially be adapting
> our code to make it possible for people to use it to waste resources. Feed
> it interned strings and it will work. Why would one really want to create a
> separate string object for all occurrences of a given namespace URI in a
> random document, and at the very same time expect us to take into account
> that they didn't intern those strings themselves...
> I still think Nils would gain more by manipulating his setup so that these
> types of strings are already interned, more than we would gain by changing
> our code to allow for them to be 'just' strings.
> Cheers,
> Andreas
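
To make the lookup described in the quoted message concrete, here is a
minimal sketch. The toyIntern() method and its HashMap pool are purely
illustrative assumptions that mirror the description above; they are not
how the JVM actually implements String.intern(), and none of this is FOP
code.

```java
import java.util.HashMap;
import java.util.Map;

class InternSketch {
    // Toy pool standing in for the hashtable from the description above.
    static final Map<String, String> POOL = new HashMap<>();

    // Hypothetical helper: return the pooled instance for s,
    // adding s to the pool if no equal string is stored yet.
    static String toyIntern(String s) {
        String existing = POOL.get(s);
        if (existing != null) {
            return existing;   // already stored: hand back that reference
        }
        POOL.put(s, s);        // not stored yet: keep this instance
        return s;
    }

    public static void main(String[] args) {
        // Two separate objects with identical content, as a parser that
        // does not intern its strings might deliver them.
        String a = new String("http://www.w3.org/1999/XSL/Format");
        String b = new String("http://www.w3.org/1999/XSL/Format");

        System.out.println(a == b);                        // false
        System.out.println(a.equals(b));                   // true
        System.out.println(a.intern() == b.intern());      // true
        System.out.println(toyIntern(a) == toyIntern(b));  // true
    }
}
```

Once both sides are interned, a reference comparison (==) is enough, which
is what code that assumes interned namespace URIs relies on.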

Jeremias Maerki
