Re: RFR [9] 8151384: Examine sun.misc.ASCIICaseInsensitiveComparator

Chris Hegarty Wed, 09 Mar 2016 07:36:56 -0800

On 9 Mar 2016, at 14:43, Peter Levart <peter.lev...@gmail.com> wrote:


> 
> On 03/09/2016 02:44 PM, Chris Hegarty wrote:
>> On 9 Mar 2016, at 13:03, Claes Redestad <claes.redes...@oracle.com> wrote:
>> 
>>> On 2016-03-09 13:17, Peter Levart wrote:
>>>>> When digging through old history to try to find out why 
>>>>> java.util.jar.Attributes
>>>>> was ever using ASCIICaseInsensitiveComparator, it was not clear that
>>>>> performance was the motivation.
>>>> I guess looking-up a manifest attribute is not a performance critical 
>>>> operation, you are right.
>>> Could this be an old startup optimization, since first call to 
>>> String.toLowerCase/toUpperCase will initialize and pull in java.util.Locale 
>>> and friends? If so it's probably not effective any more.
>>> 
>>> Coincidentally - due to a recent regression - we're currently spending 
>>> quite a bit of time parsing manifests of all jar files on the classpath, 
>>> making ASCIICaseInsensitiveComparator show up prominently in some startup 
>>> profiles.
>> Not any more ( it is no longer with us )!!
>> 
>> Interesting… let me know if you see issues once this change makes its
>> way into a promoted build, or during your performance investigations.
>> 
>> BTW. I am not against doing something “smarter” for Attributes.hashCode.
>> I just didn’t think it was relevant, or performance sensitive, any more.
>> 
>> -Chris
> 
> Hi Chris,
> 
> I have another concern. Let's say Attributes keys are LATIN1. So for 
> comparison, the StringLatin1.compareToCI is used:
> 
>    public static int compareToCI(byte[] value, byte[] other) {
>        int len1 = value.length;
>        int len2 = other.length;
>        int lim = Math.min(len1, len2);
>        for (int k = 0; k < lim; k++) {
>            if (value[k] != other[k]) {
>                char c1 = (char) CharacterDataLatin1.instance.toUpperCase(getChar(value, k));
>                char c2 = (char) CharacterDataLatin1.instance.toUpperCase(getChar(other, k));
>                if (c1 != c2) {
>                    c1 = (char) CharacterDataLatin1.instance.toLowerCase(c1);
>                    c2 = (char) CharacterDataLatin1.instance.toLowerCase(c2);
>                    if (c1 != c2) {
>                        return c1 - c2;
>                    }
>                }
>            }
>        }
>        return len1 - len2;
>    }
> 
> comparing this with Name.hashCode:
> 
>        public int hashCode() {
>            if (hashCode == -1) {
>                hashCode = name.toLowerCase(Locale.ROOT).hashCode();
>            }
>            return hashCode;
>        }
> 
> 
> ...is it possible that for some pair of keys, compareToCI would result in 0, 
> but hashCode(s) would differ? For example, the uppercased keys would be the 
> same, but the .toLowerCase(Locale.ROOT) not? Maybe not for LATIN1 keys, but 
> what if one uses non-latin1 keys (StringUTF16.compareToCI is similar)?

Can this really happen? ASCIICaseInsensitiveComparator was asserting that 
string characters were ASCII, so this situation would have triggered an assert
with the old code, no?

-Chris.
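
For anyone who wants to probe Peter's question directly, here is a minimal sketch rather than anything taken from the JDK sources. The class and method names (CaseInsensitiveKeyCheck, lowerCaseHash, consistent) are hypothetical; it only uses String.compareToIgnoreCase and String.toLowerCase(Locale.ROOT) to mirror the comparison and the hashCode quoted above. ASCII keys should always come out consistent. The U+0130 pair is a case where, as far as I know, the comparison returns 0 but the lower-cased strings differ.

```java
import java.util.Locale;

public class CaseInsensitiveKeyCheck {

    // Hash computed the same way as the Name.hashCode quoted above.
    static int lowerCaseHash(String s) {
        return s.toLowerCase(Locale.ROOT).hashCode();
    }

    // True when the pair honours the usual contract:
    // keys that compare as equal must also produce equal hashes.
    static boolean consistent(String a, String b) {
        return a.compareToIgnoreCase(b) != 0
                || lowerCaseHash(a) == lowerCaseHash(b);
    }

    public static void main(String[] args) {
        // Plain ASCII attribute names, differing only in case.
        System.out.println(consistent("Main-Class", "MAIN-CLASS"));

        // Non-ASCII probe: LATIN CAPITAL LETTER I WITH DOT ABOVE vs. 'i'.
        System.out.println(consistent("\u0130", "i"));
    }
}
```

Whether the second case can ever reach Attributes depends on key validation: the Attributes.Name documentation describes names as restricted to the ASCII set [0-9a-zA-Z_-], which would rule such keys out before either method is called.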
