I agree with your decision based on the tests. Using == for string comparison would be risky and would bring little gain.
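To make the risk concrete, here is a minimal sketch (not code from the library; the class and method names are made up for illustration, and the XML Signature namespace URI is only used as sample data). It assumes a parser that hands back a String that was not interned, which is exactly the case where == gives the wrong answer:

```java
public class NamespaceCompare {
    // A string literal; the JVM interns literals automatically.
    static final String DSIG_NS = "http://www.w3.org/2000/09/xmldsig#";

    /** Identity comparison: true only if both references are the same object. */
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    /** Value comparison: true whenever the character content matches. */
    static boolean sameValue(String a, String b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        // Stand-in for a parser that returns a fresh, non-interned String.
        String fromParser = new String(DSIG_NS.toCharArray());

        System.out.println(sameReference(DSIG_NS, fromParser));          // false
        System.out.println(sameValue(DSIG_NS, fromParser));              // true
        System.out.println(sameReference(DSIG_NS, fromParser.intern())); // true
    }
}
```

So == is only correct if every string on both sides is guaranteed to be interned, and that depends on the parser, not on our code.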
Eric

On Tue, Aug 24, 2010 at 2:11 PM, Chad La Joie <laj...@itumi.biz> wrote:

> Okay, I'll prepare a patch for you by the end of the week.
>
>
> On 8/24/10 2:23 PM, Colm O hEigeartaigh wrote:
>
>> Sounds fine to me.
>>
>> Colm.
>>
>> On Mon, Aug 23, 2010 at 8:55 PM, Chad La Joie<laj...@itumi.biz> wrote:
>>
>>> Okay, getting back to this.
>>>
>>> I tried my tests again this time with:
>>> - a 7.5MB SAML metadata document (so lots of comparisons)
>>> - 100 warm up runs then 100 timed runs
>>> - an explicit GC between each run to keep it from happening during the
>>>   runs since the DOMs were so large
>>>
>>> No real difference in results. equals() was faster.
>>>
>>> So, at this point, I can't see any reason to do anything other than
>>> equals(). It's the actual correct way of doing the comparison in that it
>>> will always return the proper result and the JVM definitely seems to be
>>> optimizing its use.
>>>
>>> On 8/10/10 7:53 AM, Chad La Joie wrote:
>>>
>>>>
>>>> Okay, I certainly have a number of SAML documents lying around so I'll
>>>> try with those as well. And, of course, I'll report back the results I
>>>> get.
>>>>
>>>> On 8/10/10 4:46 AM, Raul Benito wrote:
>>>>
>>>>>
>>>>> As the original author of the changes of equals to == in intern
>>>>> namespaces, I can tell that original in 1.4 and 1.5 and with my data
>>>>> (that was the verification of a SAML/Liberty AuthnReq in a multi thread
>>>>> tests, and the old Juice JCE provider). The change was 10% to 20%
>>>>> faster.
>>>>> The SAML is one of the real example of signing and has some url with
>>>>> common prefixes and same length url.
>>>>> The Juice provider also helps to get rid of the signing/digest cost (a
>>>>> verification is two c14n one of the signing part and c14n of the
>>>>> signature), but i think just a c14n is a good way of measure it.
>>>>> Also take into account that the == vs equals debate is more a memory
>>>>> workload cache problem, if we have to iterate over and over every char
>>>>> just to see if it is not equals, we trash the cache (That's why i used
>>>>> the multi thread to simulate a server decoding requests with more or
>>>>> less the same code, but in different times and different "workload")
>>>>> Nevertheless if you have test with a more modern jre and the code
>>>>> .equals is behaving better, just go ahead and kiss goodbye to the ==.
>>>>>
>>>>> Clive, using the .hashCode for strings in this case is not a big
>>>>> speed-up as it is going to go through all the chars of the string,
>>>>> trashing cache again, and multiplying and adding the result to an
>>>>> integer, instead of a fail in the first different char or just
>>>>> summarize to a boolean.
>>>>>
>>>>> Regards,
>>>>>
>>>>>
>>>>> On Tue, Aug 10, 2010 at 2:37 AM, Clive Brettingham-Moore
>>>>> <xml...@brettingham-moore.net> wrote:
>>>>>
>>>>> Have to agree .equals is the way to go, since correctness of == is too
>>>>> reliant on what must be considered implementation optimisations in the
>>>>> parser.
>>>>>
>>>>> Benchmarking in JVM is notoriously difficult, but it does look like
>>>>> there is no gross difference, which should kill any objections to doing
>>>>> it correctly.
>>>>>
>>>>> Since I recently spend far to long researching this for an unrelated
>>>>> problem I'll add my 10c to the detail discussion.
>>>>>
>>>>> On 10/08/10 01:23, Chad La Joie wrote:
>>>>>
>>>>>> Not necessarily, there are a number of not equal checks in there that
>>>>>> should, in theory, perform better if you only use == only. In such a
>>>>>> case, the use of != will just be a single check while !equals() will
>>>>>> result in a char-by-char comparison.
>>>>>>
>>>>>
>>>>> Actually, the next thing String.equals tests is length equality - so
>>>>> character comparison will only be reached if the strings are the same
>>>>> length.
>>>>>
>>>>> Since the char by char comparison returns on the first mismatch, then
>>>>> only same length strings with shared prefixes will show the expected
>>>>> slowness. (namespace URIs are likely to share prefixes, but I think are
>>>>> not particularly likely to be the same length, unless actually
>>>>> equal)...
>>>>> thus String.equals is only likely to be slow where comparing long
>>>>> distinct but equal strings (so intern or alternative string pooling
>>>>> techniques needed for == benefit .equals without all the nasty
>>>>> loopholes: even if .equals is occasionally slow, at least it is always
>>>>> right).
>>>>>
>>>>> In circumstances where doing repeated tests with many length and prefix
>>>>> matches, adding a hash code inequality test
>>>>> ((s1.hashCode()==s2.hashCode())&&s1.equals(s2)) could prevent
>>>>> practically all char-by-char checks for !equal cases (but if the same
>>>>> strings are never repeatedly used, the hash code calculation could be an
>>>>> issue; nb intern results in hash calculation for all strings anyway)...
>>>>> pooling is still needed to speed up matches for equality though.
>>>>>
>>>>> Re VM options I would feel -server is definitely the right test bed,
>>>>> both because of the more aggressive JIT, and also because the code is
>>>>> likely to see heaviest real world cases in -server VMs.
>>>>>
>>>>>
>>>>>
>>>>
>>> --
>>> Chad La Joie
>>> http://itumi.biz
>>> trusted identities, delivered
>>>
>>>
>>
> --
> Chad La Joie
> http://itumi.biz
> trusted identities, delivered
>
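P.S. For anyone reading the archive later, the hash-code pre-check Clive describes above could look roughly like the sketch below. The class and method names are hypothetical, and the identity and null checks are my additions rather than part of his expression. As Raul points out, the first hashCode() call on a String still walks every char, so this would only pay off when the same strings are compared repeatedly (String caches the computed hash).

```java
public class HashGuardedEquals {
    /**
     * Value comparison with a cheap hash inequality test in front.
     * Strings with different hash codes cannot be equal, so most
     * non-matching pairs are rejected before any char-by-char work.
     */
    static boolean hashGuardedEquals(String s1, String s2) {
        if (s1 == s2) {
            return true;            // same object (or both null)
        }
        if (s1 == null || s2 == null) {
            return false;           // exactly one side is null
        }
        // equals() is still required: equal hashes do not imply equal strings.
        return s1.hashCode() == s2.hashCode() && s1.equals(s2);
    }

    public static void main(String[] args) {
        String a = "http://www.w3.org/2000/09/xmldsig#";
        String b = new String(a.toCharArray());   // equal content, different object
        String c = "http://www.w3.org/2001/04/xmlenc#";

        System.out.println(hashGuardedEquals(a, b)); // true
        System.out.println(hashGuardedEquals(a, c)); // false
    }
}
```

Unlike ==, this still returns the right answer when the strings are not interned.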