So, while I don't have my access yet, Colm asked me if I'd take a look
at the == vs equals() issue (relevant bugs: 40897[1], 45637[2], 46681[3])
My executive summary is that clearly, as things stand, the current code
favors optimization over correctness. Rarely is this a good thing.
Colm notes[4] that the reliance on intern'ed strings (and thus the
ability to use ==) occurs sporadically throughout the code and not just
within the ElementChecker implementations. He specifically mentioned
that the various C14N implementations, and indeed the == is used about 6
times there for string comparison.
My recommendation then is two fold:
- Ensure that nothing other than namespace bits are compared via ==.
I don't know that this occurs but the code should definitely be reviewed
to ensure that.
- Create a new "NamespaceEqualityChecker" that provides methods for
checking the various bits of a namespace (URIs, prefixes) and use it
anywhere that either == or equals() is used today. Implementations
based on == and equals() would be provided with the default
implementation being equals()-based. A configuration option should then
be made available to control which impl gets used. Additionally, it
might even be possible to add some smarts that could detect known "good"
parsers that use interning and automatically use the == based
implementation.
I do not recommend changing any part of the code without addressing the
whole codebase (i.e. all the =='s need to be fixed or no change should
be made) because of the possibility of creating new, unwanted, effects.
The current functionality is undesirable but better the devil you know.
I think that this should be addressed in the upcoming 1.4.4 release. If
quick consensus can be reached I'm willing to do the work with a window
of time I have available over the next 2-3 weeks.
[1] https://issues.apache.org/bugzilla/show_bug.cgi?id=40897
[2] https://issues.apache.org/bugzilla/show_bug.cgi?id=45637
[3] https://issues.apache.org/bugzilla/show_bug.cgi?id=46681
[4] https://issues.apache.org/bugzilla/show_bug.cgi?id=45637#c1
--
Chad La Joie
http://itumi.biz
trusted identities, delivered