Hi Dave,

The strings didn't need to be interned for Xerces' internals to work
correctly (though the code has since evolved to depend on that). It's
just cheaper to do the intern once and cache it in the SymbolTable than to
do it later, possibly multiple times, at the API layer. Some history here
[1] if you're interested.
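
To illustrate the "intern once and cache" idea, here is a minimal sketch of a
bucket-and-chain table that interns a string only when it is first added.
It is a simplified, hypothetical stand-in (the class name SimpleSymbolTable
and the fixed bucket count are assumptions), not the actual Xerces
SymbolTable code.

```java
// Hypothetical, simplified symbol table; NOT the real org.apache.xerces.util.SymbolTable.
final class SimpleSymbolTable {

    // One link in a bucket's chain.
    private static final class Entry {
        final String symbol;
        final Entry next;

        Entry(String symbol, Entry next) {
            // Intern exactly once, at insertion time.
            this.symbol = symbol.intern();
            this.next = next;
        }
    }

    // Fixed bucket count, chosen arbitrarily for the sketch (no rehashing).
    private final Entry[] buckets = new Entry[101];

    // Returns the cached, interned instance equal to s, adding it if absent.
    String addSymbol(String s) {
        int index = (s.hashCode() & 0x7FFFFFFF) % buckets.length;
        for (Entry e = buckets[index]; e != null; e = e.next) {
            if (e.symbol.equals(s)) {
                return e.symbol; // already interned, no second intern() call
            }
        }
        Entry created = new Entry(s, buckets[index]);
        buckets[index] = created;
        return created.symbol;
    }

    public static void main(String[] args) {
        SimpleSymbolTable table = new SimpleSymbolTable();
        String first = table.addSymbol(new String("element"));
        String second = table.addSymbol(new String("element"));
        System.out.println(first == second);    // same cached instance
        System.out.println(first == "element"); // same as the interned literal
    }
}
```

Later lookups of an equal string only pay for the hash and the equals() walk;
String.intern() is called once per distinct symbol.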

Thanks.

[1] http://issues.apache.org/jira/browse/XERCESJ-6

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: [EMAIL PROTECTED]
E-mail: [EMAIL PROTECTED]

"Dave Brosius" <[EMAIL PROTECTED]> wrote on 01/09/2008 10:27:20 PM:

> Clearly based on your response, and the fact that the Soft referenced table
> also interns, i completely misunderstood (and still do) what the SymbolTable
> class is used for.
>
> I guess i'll have to take another attempt at understanding what it is being
> used for.
>
>
> ----- Original Message -----
> From: "Michael Glavassevich" <[EMAIL PROTECTED]>
> To: <[email protected]>
> Sent: Wednesday, January 09, 2008 4:16 PM
> Subject: Re: Interning strategy
>
>
> > Hi Dave,
> >
> > It's being interned for the application. This allows your SAX content
> > handler to compare the names of elements, attributes, etc... using
> > reference comparison [1] instead of equals() for better performance.
> > There's an alternate implementation of the SymbolTable [2] which is more
> > sensitive to memory usage. It allows interned strings to be garbage
> > collected if they're only reachable through the SymbolTable.
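
The reference comparison described above might look roughly like this in a
SAX content handler. This is a minimal sketch, assuming the JDK's bundled
JAXP parser (which is derived from Xerces) and the standard string-interning
feature URI; the class and method names (InternedNameCounter, countItems)
are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that counts <item> elements using == instead of equals().
final class InternedNameCounter extends DefaultHandler {

    // String literals are interned by the JVM, so this constant can be
    // compared by reference against names the parser has interned.
    private static final String ITEM = "item";

    int items;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Reference comparison: only valid because string interning is enabled.
        if (qName == ITEM) {
            items++;
        }
    }

    static int countItems(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Ask the parser to report interned names; a parser that cannot
        // honor this setting is expected to throw here.
        reader.setFeature("http://xml.org/sax/features/string-interning", true);
        InternedNameCounter handler = new InternedNameCounter();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.items;
    }
}
```

If the feature were not enabled (or not supported by the parser in use), the
== test could silently miss matches, so falling back to equals() is the safe
choice whenever interning cannot be guaranteed.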
> >
> > Thanks.
> >
> > [1] http://xerces.apache.org/xerces2-j/features.html#string-interning
> > [2] http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SoftReferenceSymbolTable.html
> >
> > Michael Glavassevich
> > XML Parser Development
> > IBM Toronto Lab
> > E-mail: [EMAIL PROTECTED]
> > E-mail: [EMAIL PROTECTED]
> >
> > "Dave Brosius" <[EMAIL PROTECTED]> wrote on 01/09/2008 01:01:06 AM:
> >
> >> Greetings, i was perusing old mailing list emails, and stumbled onto the
> >> following email sent some time ago :)
> >>
> >> Luckily, from a quick perusal of the code, it appears that the email still
> >> applies.
> >>
> >> I have a question about the implementation of SymbolTable
> >>
> >> As expected, it appears to me that it does hashing to find a bucket, then
> >> walks the chain of pointers from the bucket to find a string that is
> >> 'equals'.
> >>
> >> Only if it doesn't exist is a new one added. All of this makes sense.
> >>
> >> The question i have then, is why when you add an entry
> >>
> >> public Entry(String symbol, Entry next) {
> >>     this.symbol = symbol.intern();
> >>     characters = new char[symbol.length()];
> >>     symbol.getChars(0, characters.length, characters, 0);
> >>     this.next = next;
> >> }
> >>
> >> does the code intern the string? Isn't the point of this class to stop
> >> pollution of the constant pool and perm gen? (besides allowing for
> >> alternate hashing?)
> >> Given that the one String that lives in the SymbolTable is returned, i
> >> would think intern is redundant.
> >>
> >> thanks,
> >> dave
> >>
> >> ----- Original Message -----
> >> From: "Michael Glavassevich" <[EMAIL PROTECTED]>
> >> To: <[email protected]>
> >> Sent: Sunday, July 24, 2005 11:57 AM
> >> Subject: Re: Interning strategy
> >>
> >>
> >> Elliotte Harold <[EMAIL PROTECTED]> wrote on 07/22/2005 09:35:02 PM:
> >>
> >> > Suppose I turn on interning in the parser by setting the SAX property
> >> > http://xml.org/sax/features/string-interning to true. Will Xerces
> >> > simply invoke the String.intern() method on the strings it creates or
> >> > does it do something fancier like maintaining its own pool of string
> >> > constants and reuse those?
> >>
> >> It maintains a pool. See org.apache.xerces.util.SymbolTable, specifically
> >> the addSymbol() methods.
> >>
> >> > --
> >> > Elliotte Rusty Harold  [EMAIL PROTECTED]
> >> > XML in a Nutshell 3rd Edition Just Published!
> >> > http://www.cafeconleche.org/books/xian3/
> >> > http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> >> > For additional commands, e-mail: [EMAIL PROTECTED]
> >> >
> >>
> >> Michael Glavassevich
> >> XML Parser Development
> >> IBM Toronto Lab
> >> E-mail: [EMAIL PROTECTED]
> >> E-mail: [EMAIL PROTECTED]
> >>
> >


