Okay, I certainly have a number of SAML documents lying around so I'll
try with those as well. And, of course, I'll report back the results I get.
On 8/10/10 4:46 AM, Raul Benito wrote:
As the original author of the change from equals to == on interned
namespaces, I can tell you that originally, on 1.4 and 1.5 and with my data
(the verification of a SAML/Liberty AuthnReq in a multi-threaded
test, with the old Juice JCE provider), the change was 10% to 20% faster.
SAML is one of the real examples of signing, and it has some URLs with
common prefixes and the same length.
The Juice provider also helps to get rid of the signing/digest cost (a
verification is two c14n runs, one of the signed part and one of the
signature), but I think just a c14n is a good way to measure it.
Also take into account that the == vs equals debate is more of a memory
and cache workload problem: if we have to iterate over every char again
and again just to find that the strings are not equal, we trash the cache.
(That's why I used multiple threads, to simulate a server decoding
requests with more or less the same code, but at different times and
under different "workloads".)
Nevertheless, if you have tested with a more modern JRE and the .equals
code is behaving better, just go ahead and kiss goodbye to the ==.
Clive, using .hashCode for strings in this case is not a big
speed-up, as it is going to go through all the chars of the string,
trashing the cache again, and multiplying and adding the result into an
integer, instead of failing at the first different char or just reducing
to a boolean.
Regards,
On Tue, Aug 10, 2010 at 2:37 AM, Clive Brettingham-Moore
<xml...@brettingham-moore.net> wrote:
Have to agree .equals is the way to go, since correctness of == is too
reliant on what must be considered implementation optimisations in the
parser.
Benchmarking on the JVM is notoriously difficult, but it does look like
there is no gross difference, which should kill any objections to doing
it correctly.
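
To make the correctness concern concrete, here is a small sketch of how == depends on both strings having been interned. The class name, method name and the way the second string is built are made up for illustration; nothing here is taken from the library or the parser.

```java
final class InternIdentityDemo {
    // True only when both references point at the very same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // String literals are interned by the JVM.
        String literal = "http://www.w3.org/2000/09/xmldsig#";
        // Built at run time (as a parser might do), so this is a distinct
        // object with the same content.
        String built = new StringBuilder("http://www.w3.org/2000/09/")
                .append("xmldsig#").toString();

        System.out.println(sameObject(literal, built));          // false
        System.out.println(literal.equals(built));               // true
        System.out.println(sameObject(literal, built.intern())); // true
    }
}
```

So == only gives the right answer as long as every producer of namespace strings interns them, which is exactly the reliance on parser behaviour mentioned above.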
Since I recently spent far too long researching this for an unrelated
problem, I'll add my 10c to the detailed discussion.
On 10/08/10 01:23, Chad La Joie wrote:
> Not necessarily, there are a number of not-equal checks in there that
> should, in theory, perform better if you use == only. In such a
> case, the use of != will just be a single check while !equals() will
> result in a char-by-char comparison.
Actually, after the identity check, the next thing String.equals tests is
length equality, so character comparison will only be reached if the
strings are the same length.
Since the char-by-char comparison returns on the first mismatch,
only same-length strings with shared prefixes will show the expected
slowness (namespace URIs are likely to share prefixes, but I think are
not particularly likely to be the same length unless actually equal).
Thus String.equals is only likely to be slow when comparing long strings
that are distinct objects but equal in content (so the intern or
alternative string pooling techniques needed for == also benefit .equals,
without all the nasty loopholes: even if .equals is occasionally slow, at
least it is always right).
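
The order of checks described above can be sketched roughly as follows. This is an illustration of the logic, not the JDK source; the class and both method names are hypothetical, and charsCompared assumes non-null arguments.

```java
final class EqualsOrderSketch {
    // Same order of checks as described above: identity, length, then chars.
    static boolean sketchEquals(String a, String b) {
        if (a == b) return true;                     // same object: done
        if (a == null || b == null) return false;
        if (a.length() != b.length()) return false;  // no chars touched
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) return false; // first mismatch
        }
        return true;
    }

    // Counts how many character positions the loop above would examine.
    static int charsCompared(String a, String b) {
        if (a == b || a.length() != b.length()) return 0;
        int n = 0;
        for (int i = 0; i < a.length(); i++) {
            n++;
            if (a.charAt(i) != b.charAt(i)) break;
        }
        return n;
    }

    public static void main(String[] args) {
        String ns = "urn:a:1";
        System.out.println(charsCompared(ns, ns));                    // 0
        System.out.println(charsCompared(ns, "urn:a:10"));            // 0
        System.out.println(charsCompared(ns, "urn:a:2"));             // 7
        System.out.println(charsCompared(ns, new String("urn:a:1"))); // 7
    }
}
```

The last two lines are the slow cases: same length with a shared prefix, and equal content held in two different objects.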
In circumstances where repeated tests are done with many length and prefix
matches, adding a hash code inequality test ((s1.hashCode() ==
s2.hashCode()) && s1.equals(s2)) could prevent practically all
char-by-char checks for !equal cases (but if the same strings are never
used repeatedly, the hash code calculation could be an issue; NB intern
results in hash calculation for all strings anyway). Pooling is still
needed to speed up matches for equality, though.
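
As a minimal sketch of that hash pre-check (the class and method names are hypothetical): it only pays off when the same String objects are compared repeatedly, because String caches its hash code after the first call. Equal hashes do not prove equality, so the .equals call has to stay.

```java
final class HashGuardedEquals {
    // Cheap int comparison first; fall back to equals only on a hash match.
    static boolean guardedEquals(String s1, String s2) {
        return s1.hashCode() == s2.hashCode() && s1.equals(s2);
    }

    public static void main(String[] args) {
        String a = "urn:a:1";
        String b = "urn:a:2";
        // Same length and prefix, different hash: equals is never called.
        System.out.println(guardedEquals(a, b));                     // false
        // Equal content in distinct objects still needs the char loop.
        System.out.println(guardedEquals(a, new String("urn:a:1"))); // true
        // "Aa" and "BB" share a hash code, so equals must decide.
        System.out.println(guardedEquals("Aa", "BB"));               // false
    }
}
```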
Re VM options, I would feel -server is definitely the right test bed,
both because of the more aggressive JIT and because the code is
likely to see its heaviest real-world use in -server VMs.
--
Chad La Joie
http://itumi.biz
trusted identities, delivered