Hi Chad,

> My recommendation then is two fold:
>  - Ensure that nothing other than namespace bits are compared via ==. I
> don't know that this occurs but the code should definitely be reviewed to
> ensure that.

Agreed.

>  - Create a new "NamespaceEqualityChecker" that provides methods for
> checking the various bits of a namespace (URIs, prefixes) and use it
> anywhere that either == or equals() is used today.  Implementations based on
> == and equals() would be provided with the default implementation being
> equals()-based.  A configuration option should then be made available to
> control which impl gets used.

Sounds good. Have you thought about backwards compatibility issues
though, in relation to the ElementCheckerImpl? We don't want to break
backwards compatibility for a minor release if at all possible.

> I think that this should be addressed in the upcoming 1.4.4 release.  If
> quick consensus can be reached I'm willing to do the work with a window of
> time I have available over the next 2-3 weeks.

That would be great! It would be interesting to run some benchmarks
when you're done to compare both options.

Colm.


On Mon, Aug 2, 2010 at 3:11 PM, Chad La Joie <laj...@itumi.biz> wrote:
> So, while I don't have my access yet, Colm asked me if I'd take a look at
> the == vs equals() issue (relevant bugs: 40897[1], 45637[2], 46681[3])
>
> My executive summary is that clearly, as things stand, the current code
> favors optimization over correctness.  Rarely is this a good thing.
>
> Colm notes[4] that the reliance on intern'ed strings (and thus the ability
> to use ==) occurs sporadically throughout the code and not just within the
> ElementChecker implementations.  He specifically mentioned that the various
> C14N implementations, and indeed the == is used about 6 times there for
> string comparison.
>
> My recommendation then is two fold:
>  - Ensure that nothing other than namespace bits are compared via ==. I
> don't know that this occurs but the code should definitely be reviewed to
> ensure that.
>
>  - Create a new "NamespaceEqualityChecker" that provides methods for
> checking the various bits of a namespace (URIs, prefixes) and use it
> anywhere that either == or equals() is used today.  Implementations based on
> == and equals() would be provided with the default implementation being
> equals()-based.  A configuration option should then be made available to
> control which impl gets used.  Additionally, it might even be possible to
> add some smarts that could detect known "good" parsers that use interning
> and automatically use the == based implementation.
>
> I do not recommend changing any part of the code without addressing the
> whole codebase (i.e. all the =='s need to be fixed or no change should be
> made) because of the possibility of creating new, unwanted, effects.  The
> current functionality is undesirable but better the devil you know.
>
> I think that this should be addressed in the upcoming 1.4.4 release.  If
> quick consensus can be reached I'm willing to do the work with a window of
> time I have available over the next 2-3 weeks.
>
> [1] https://issues.apache.org/bugzilla/show_bug.cgi?id=40897
> [2] https://issues.apache.org/bugzilla/show_bug.cgi?id=45637
> [3] https://issues.apache.org/bugzilla/show_bug.cgi?id=46681
> [4] https://issues.apache.org/bugzilla/show_bug.cgi?id=45637#c1
> --
> Chad La Joie
> http://itumi.biz
> trusted identities, delivered
>

Reply via email to