> -----Original Message-----
> From: Glen Mazza [mailto:[EMAIL PROTECTED]
>

Hi Glen,

<snip />
>
> Another option:  validateChildNode() is called from only one place,
> FOTreeBuilder.startElement().  At that point, we can also feed vcN() the
> parameter "namespaceURI.intern()" instead of just "namespaceURI".  This
> could be slightly faster for some VCN()'s that compare against multiple
> URI's--but I would think .intern() is much slower than .equals() for the
> reason given above.

The other thing to consider is the impact on memory. The value of an
interned string is stored only once. There is indeed the overhead of a
call to .intern(), but that method will be nearly as well optimized as
.equals(). It looks up the string value in a hashtable: if the value is
already there, it just returns the reference to the stored copy; if it
isn't, it adds the string to the table and returns that reference.
When the value was already present, which is the common case, the source
string can be discarded immediately; only the reference is kept.
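
To make the lookup concrete, here is a minimal sketch of what .intern()
buys us. The class and helper names are made up for illustration, and the
FO namespace URI is just a convenient sample value; nothing here is taken
from the FOP sources.

```java
class InternDemo {
    // Sample value only: string literals and constants are interned by the JVM.
    static final String FO_URI = "http://www.w3.org/1999/XSL/Format";

    // Hypothetical helper: builds an equal but distinct String object at run
    // time, the way a non-interning producer of SAX events might.
    static String freshCopy(String s) {
        return new String(s.toCharArray());
    }

    public static void main(String[] args) {
        String a = freshCopy(FO_URI);
        String b = freshCopy(FO_URI);

        System.out.println(a == b);                   // false: two separate objects
        System.out.println(a.equals(b));              // true: same characters
        System.out.println(a.intern() == b.intern()); // true: one canonical copy
        System.out.println(a.intern() == FO_URI);     // true: matches the constant
    }
}
```

Once both sides are interned, a plain reference comparison is enough, and
the duplicate copies become eligible for garbage collection.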

The benefit of interning is most noticeable when the strings are long
enough (longer than a reference) AND enough of them are being created.
Both are the case for a lot of namespace URIs, and node or attribute names.
This is precisely the reason why the SAX parser feature for string
interning --http://xml.org/sax/features/string-interning -- defaults to
'true' in Xerces-J (and can't be set to 'false').
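
For anyone who wants to check what their own setup reports, a small
sketch along these lines should do. It uses only the standard JAXP/SAX
calls; the class and method names are invented for this example, and the
answer depends on whichever parser JAXP picks up, so no output is shown.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

class InterningFeatureCheck {
    static final String STRING_INTERNING =
            "http://xml.org/sax/features/string-interning";

    // Returns "true", "false", or "unknown" if the reader does not
    // recognize or cannot report the feature.
    static String query(XMLReader reader) {
        try {
            return String.valueOf(reader.getFeature(STRING_INTERNING));
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return "unknown";
        }
    }

    // Asks whatever parser JAXP resolves by default; any setup failure
    // is also reported as "unknown".
    static String queryDefaultParser() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return query(factory.newSAXParser().getXMLReader());
        } catch (Exception e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("string-interning: " + queryDefaultParser());
    }
}
```

If this reports anything other than true, the names and URIs handed to
the ContentHandler cannot be assumed to be interned.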

To put it quite bluntly: my concern is that we would essentially be adapting
our code to make it possible for people to use it to waste resources. Feed
it interned strings and it will work. Why would one really want to create a
separate string object for all occurrences of a given namespace URI in a
random document, and at the very same time expect us to take into account
that they didn't intern those strings themselves...

I still think Nils would gain more by adjusting his setup so that these
types of strings are already interned than we would gain by changing
our code to allow for them to be 'just' strings.


Cheers,

Andreas
