I agree with your decision based on the tests. Using == for string comparison
would be risky and would bring little gain.

Eric

On Tue, Aug 24, 2010 at 2:11 PM, Chad La Joie <laj...@itumi.biz> wrote:

> Okay, I'll prepare a patch for you by the end of the week.
>
>
> On 8/24/10 2:23 PM, Colm O hEigeartaigh wrote:
>
>> Sounds fine to me.
>>
>> Colm.
>>
>> On Mon, Aug 23, 2010 at 8:55 PM, Chad La Joie <laj...@itumi.biz> wrote:
>>
>>> Okay, getting back to this.
>>>
>>> I tried my tests again this time with:
>>>  - a 7.5MB SAML metadata document (so lots of comparisons)
>>>  - 100 warm up runs then 100 timed runs
>>>  - an explicit GC between each run to keep it from happening during the
>>>    runs, since the DOMs were so large
>>>
>>> No real difference in results. equals() was faster.
>>>
>>> So, at this point, I can't see any reason to do anything other than
>>> equals().  It's the actual correct way of doing the comparison in that it
>>> will always return the proper result and the JVM definitely seems to be
>>> optimizing its use.
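
The measurement procedure described above (warm-up runs, then timed runs, with an explicit GC between runs) could be sketched roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, the actual test harness used in the thread is not shown, and System.gc() is just a request that the JVM may ignore.

```java
// Hypothetical harness; names are illustrative and not taken from the thread.
final class CompareBench {
    // Runs `work` `warmups` times untimed, then `timed` times under a timer.
    // Returns the mean wall-clock time per timed run in nanoseconds.
    static long timeRuns(int warmups, int timed, Runnable work) {
        for (int i = 0; i < warmups; i++) {
            work.run();                      // give the JIT a chance to compile the hot paths
        }
        long total = 0;
        for (int i = 0; i < timed; i++) {
            System.gc();                     // request a collection outside the timed region (a hint only)
            long start = System.nanoTime();
            work.run();
            total += System.nanoTime() - start;
        }
        return total / timed;
    }
}
```

A call such as CompareBench.timeRuns(100, 100, task) would mirror the 100/100 split mentioned above, where task is whatever canonicalization or comparison workload is being measured.
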
>>>
>>> On 8/10/10 7:53 AM, Chad La Joie wrote:
>>>
>>>>
>>>> Okay, I certainly have a number of SAML documents lying around so I'll
>>>> try with those as well. And, of course, I'll report back the results I
>>>> get.
>>>>
>>>> On 8/10/10 4:46 AM, Raul Benito wrote:
>>>>
>>>>>
>>>>> As the original author of the change from equals to == for interned
>>>>> namespaces, I can tell you that originally, on 1.4 and 1.5 and with my
>>>>> data (the verification of a SAML/Liberty AuthnReq in a multi-threaded
>>>>> test, with the old Juice JCE provider), the change was 10% to 20%
>>>>> faster.
>>>>> SAML is one of the real-world examples of signing, and it has URLs with
>>>>> common prefixes and URLs of the same length.
>>>>> The Juice provider also helps to get rid of the signing/digest cost (a
>>>>> verification is two c14n passes, one of the signed part and one of the
>>>>> signature), but I think a single c14n is a good way to measure it.
>>>>> Also take into account that the == vs equals debate is more of a memory
>>>>> and cache workload problem: if we have to iterate over every char again
>>>>> and again just to find that the strings are not equal, we thrash the
>>>>> cache. (That's why I used multiple threads, to simulate a server decoding
>>>>> requests with more or less the same code, but at different times and
>>>>> under different "workloads".)
>>>>> Nevertheless, if you have tested with a more modern JRE and the code
>>>>> with .equals behaves better, just go ahead and kiss the == goodbye.
>>>>>
>>>>> Clive, using .hashCode for strings in this case is not a big
>>>>> speed-up, as it has to go through all the chars of the string,
>>>>> thrashing the cache again, and multiply and add the result into an
>>>>> integer, instead of failing at the first differing char or just
>>>>> reducing to a boolean.
>>>>>
>>>>> Regards,
>>>>>
>>>>>
>>>>> On Tue, Aug 10, 2010 at 2:37 AM, Clive Brettingham-Moore
>>>>> <xml...@brettingham-moore.net>
>>>>> wrote:
>>>>>
>>>>> Have to agree .equals is the way to go, since correctness of == is too
>>>>> reliant on what must be considered implementation optimisations in the
>>>>> parser.
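
To make the correctness concern concrete, here is a minimal sketch of how == can fail for strings that are equal in value. The class and method names are hypothetical, and the XML Signature namespace URI is used only as a sample value; the sketch assumes a parser that hands back strings it has not interned.

```java
// Hypothetical demo; names are illustrative and not taken from the library under discussion.
final class InternDemo {
    // A compile-time constant, so the JVM keeps it in the string intern pool.
    static final String DSIG_NS = "http://www.w3.org/2000/09/xmldsig#";

    // Builds an equal-valued String at run time, as a non-interning parser might.
    static String parsedCopy(String s) {
        return new String(s.toCharArray());
    }

    public static void main(String[] args) {
        String fromParser = parsedCopy(DSIG_NS);
        System.out.println(fromParser == DSIG_NS);          // false: two different objects
        System.out.println(fromParser.equals(DSIG_NS));     // true: same characters
        System.out.println(fromParser.intern() == DSIG_NS); // true: intern() returns the pooled instance
    }
}
```

The == comparison is only safe if every string reaching it has been interned, which is exactly the parser behaviour the message above says cannot be relied on.
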
>>>>>
>>>>> Benchmarking in JVM is notoriously difficult, but it does look like
>>>>> there is no gross difference, which should kill any objections to doing
>>>>> it correctly.
>>>>>
>>>>> Since I recently spent far too long researching this for an unrelated
>>>>> problem, I'll add my 10c to the detailed discussion.
>>>>>
>>>>> On 10/08/10 01:23, Chad La Joie wrote:
>>>>>
>>>>>> Not necessarily, there are a number of not-equal checks in there that
>>>>>> should, in theory, perform better if you use only ==. In such a
>>>>>> case, the use of != will just be a single check while !equals() will
>>>>>> result in a char-by-char comparison.
>>>>>>
>>>>>
>>>>> Actually, after the identity check, the next thing String.equals tests
>>>>> is length equality, so character comparison will only be reached if the
>>>>> strings are the same length.
>>>>>
>>>>> Since the char-by-char comparison returns on the first mismatch,
>>>>> only same-length strings with shared prefixes will show the expected
>>>>> slowness (namespace URIs are likely to share prefixes, but I think they
>>>>> are not particularly likely to be the same length unless actually
>>>>> equal).
>>>>> Thus String.equals is only likely to be slow when comparing long strings
>>>>> that are distinct objects but equal in value. So the intern or
>>>>> alternative string pooling techniques needed for == also benefit .equals,
>>>>> without all the nasty loopholes: even if .equals is occasionally slow,
>>>>> at least it is always right.
>>>>>
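
The order of checks described above can be sketched as follows. This is a simplified model, not the JDK source: the class and method names are hypothetical, and the method returns how many character pairs were examined so that the cost of each case is visible.

```java
// Hypothetical model of the check order; not the actual java.lang.String code.
final class EqualsSketch {
    // Returns the number of character pairs examined when comparing a and b.
    static int charsCompared(String a, String b) {
        if (a == b) {
            return 0;                       // same object: no characters are read
        }
        if (a.length() != b.length()) {
            return 0;                       // different lengths: no characters are read
        }
        int examined = 0;
        for (int i = 0; i < a.length(); i++) {
            examined++;
            if (a.charAt(i) != b.charAt(i)) {
                break;                      // stop at the first mismatch
            }
        }
        return examined;
    }
}
```

Under this model the expensive cases are two same-length strings that differ only near the end, and two distinct objects holding the same value; both read close to the full length.
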
>>>>> In circumstances where repeated tests hit many length and prefix
>>>>> matches, adding a hash code inequality test ((s1.hashCode() ==
>>>>> s2.hashCode()) && s1.equals(s2)) could prevent practically all
>>>>> char-by-char checks for !equal cases (but if the same strings are never
>>>>> used repeatedly, the hash code calculation could be an issue; NB intern
>>>>> results in hash calculation for all strings anyway). Pooling is still
>>>>> needed to speed up matches for equality, though.
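
The hash code guard mentioned above might look like the sketch below. The class and method names are hypothetical. It relies on java.lang.String caching its hash code after the first computation, so the guard is cheap only when the same String objects are compared repeatedly.

```java
// Hypothetical helper; names are illustrative and not taken from the thread.
final class HashGuard {
    // Rejects most unequal pairs on an int comparison before falling back to equals().
    static boolean sameValue(String s1, String s2) {
        return s1.hashCode() == s2.hashCode() && s1.equals(s2);
    }
}
```

The equals() call is still required because different strings can share a hash code ("Aa" and "BB" are a well-known pair), and equal strings always pay for the full comparison unless they are the same object.
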
>>>>>
>>>>> Re VM options, I feel -server is definitely the right test bed,
>>>>> both because of the more aggressive JIT and because the code is
>>>>> likely to see its heaviest real-world use in -server VMs.
>>>>>
>>>>>
>>>>>
>>>>
>>> --
>>> Chad La Joie
>>> http://itumi.biz
>>> trusted identities, delivered
>>>
>>>
>>
> --
> Chad La Joie
> http://itumi.biz
> trusted identities, delivered
>
