Hello Chad, What command line options did you use? My testings were more reliable if use 100 warms-up let the jit run its magic, and then go for the timed test. Also are you running both tests in the same invocation if you do, the second will be handicap, as the first one will be just inline the second will have a switch to see if it is one interface or the other.
Regards, Raul On Mon, Aug 9, 2010 at 4:19 PM, Chad La Joie <laj...@itumi.biz> wrote: > So, I have some unexpected results from this work. > > I implemented a helper class that checked the equality of element local > names, attribute local names, namespace URIs, and namespace prefixes (i.e. > everything that Xerces always interns). Then I made sure to replace all == > != and equals() that I could find with the appropriate call. > > To test, I picked the Canonicalizer20010315ExclusiveTest test case and made > two alterations to the test22*excl methods: > - do one c14n operation out the timing loop just to make sure all the > classes are in memory, constants are loaded, etc. > - in a 100 iteration loop, create a new canonicalizer, canonicalize a DOM > tree, and time it using nanosecond time > > I did this for the example2_2_1.xml[1], example2_2_2.xml[2], example > 2_2_3.xml[3] input files (test221excl, test221excl, test223excl > respectively). > > Here are the results, measured in nanosecond timing. "total" indicates the > total time spent in all 100 runs, i.e. the summation of each of the 100 > results. > > test221excl: > equals() == > min 101000 99000 > max 123000 191000 > median 103000 105000 > avg 103760 106540 > total 10376000 10654000 > > test222excl: > equals() == > min 99000 101000 > max 192000 128000 > median 100000 108000 > avg 102110 108480 > total 10211000 10848000 > > test223excl (an XPath nodeset canonicalization) > equals() == > min 254000 248000 > max 290000 353000 > median 266000 265000 > avg 266820 265800 > total 26682000 26580000 > > So, what these numbers appear to suggest is that, in fact, equals() is more > often faster than ==. This seems counter-intuitive unless the JVM has > specialized optimization for the String.equals() method. > > Can anyone see where my testing is likely to be flawed? > > [1] > http://svn.apache.org/viewvc/xml/security/trunk/data/org/apache/xml/security/c14n/inExcl/example2_2_1.xml?revision=350494&view=markup > [2] > http://svn.apache.org/viewvc/xml/security/trunk/data/org/apache/xml/security/c14n/inExcl/example2_2_2.xml?revision=350494&view=markup > [3] > http://svn.apache.org/viewvc/xml/security/trunk/data/org/apache/xml/security/c14n/inExcl/example2_2_3.xml?revision=350915&view=markup > > On 8/2/10 10:11 AM, Chad La Joie wrote: > >> So, while I don't have my access yet, Colm asked me if I'd take a look >> at the == vs equals() issue (relevant bugs: 40897[1], 45637[2], 46681[3]) >> >> My executive summary is that clearly, as things stand, the current code >> favors optimization over correctness. Rarely is this a good thing. >> >> Colm notes[4] that the reliance on intern'ed strings (and thus the >> ability to use ==) occurs sporadically throughout the code and not just >> within the ElementChecker implementations. He specifically mentioned >> that the various C14N implementations, and indeed the == is used about 6 >> times there for string comparison. >> >> My recommendation then is two fold: >> - Ensure that nothing other than namespace bits are compared via ==. I >> don't know that this occurs but the code should definitely be reviewed >> to ensure that. >> >> - Create a new "NamespaceEqualityChecker" that provides methods for >> checking the various bits of a namespace (URIs, prefixes) and use it >> anywhere that either == or equals() is used today. Implementations based >> on == and equals() would be provided with the default implementation >> being equals()-based. A configuration option should then be made >> available to control which impl gets used. Additionally, it might even >> be possible to add some smarts that could detect known "good" parsers >> that use interning and automatically use the == based implementation. >> >> I do not recommend changing any part of the code without addressing the >> whole codebase (i.e. all the =='s need to be fixed or no change should >> be made) because of the possibility of creating new, unwanted, effects. >> The current functionality is undesirable but better the devil you know. >> >> I think that this should be addressed in the upcoming 1.4.4 release. If >> quick consensus can be reached I'm willing to do the work with a window >> of time I have available over the next 2-3 weeks. >> >> [1] https://issues.apache.org/bugzilla/show_bug.cgi?id=40897 >> [2] https://issues.apache.org/bugzilla/show_bug.cgi?id=45637 >> [3] https://issues.apache.org/bugzilla/show_bug.cgi?id=46681 >> [4] https://issues.apache.org/bugzilla/show_bug.cgi?id=45637#c1 >> > > -- > Chad La Joie > http://itumi.biz > trusted identities, delivered >