Re: RFR [9] 8151384: Examine sun.misc.ASCIICaseInsensitiveComparator

Peter Levart Wed, 09 Mar 2016 07:39:17 -0800

Hi Chris,

So what do you think of providing a hashCode for java.util.jar.Attributes.Name that is obviously consistent with its equals method (and not dependent on the good will of Unicode tables) *and* also providing it as a public API to serve others, like for example:


http://cr.openjdk.java.net/~plevart/jdk9-dev/String.CASE_INSENSITIVE_HASHER/webrev.01/
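
A minimal sketch of the idea, not the code in the webrev above: assuming the goal is a hash that agrees with String.equalsIgnoreCase, each char can be folded the same way that method compares chars (upper-case first, then lower-case) before it is mixed into the hash. The class and method names here are hypothetical and are not a JDK API.

```java
class CaseInsensitiveHashSketch {

    // Hypothetical helper (not part of the JDK): hashes a string after
    // folding each char with toLowerCase(toUpperCase(c)), the same per-char
    // fold that String.equalsIgnoreCase falls back to. Two strings that are
    // equalsIgnoreCase therefore produce the same sequence of folded chars.
    static int caseInsensitiveHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            char folded = Character.toLowerCase(Character.toUpperCase(s.charAt(i)));
            h = 31 * h + folded;
        }
        return h;
    }

    public static void main(String[] args) {
        String a = "Manifest-Version";
        String b = "MANIFEST-VERSION";
        System.out.println(a.equalsIgnoreCase(b));
        System.out.println(caseInsensitiveHash(a) == caseInsensitiveHash(b));
    }
}
```

Because the fold is per char and involves no Locale, the result does not depend on locale-sensitive or multi-char case mappings, which is the property asked for above.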

Regards, Peter


On 03/09/2016 03:43 PM, Peter Levart wrote:

On 03/09/2016 02:44 PM, Chris Hegarty wrote:
On 9 Mar 2016, at 13:03, Claes Redestad <[email protected]> wrote:
On 2016-03-09 13:17, Peter Levart wrote:
When digging through old history to try to find out why java.util.jar.Attributes
was ever using ASCIICaseInsensitiveComparator, it was not clear that
performance was the motivation.
I guess looking up a manifest attribute is not a performance-critical operation, you are right.
Could this be an old startup optimization, since the first call to String.toLowerCase/toUpperCase will initialize and pull in java.util.Locale and friends? If so, it's probably not effective anymore.
Coincidentally - due to a recent regression - we're currently spending quite a bit of time parsing manifests of all jar files on the classpath, making ASCIICaseInsensitiveComparator show up prominently in some startup profiles.
Not any more ( it is no longer with us )!!

Interesting… let me know if you see any issues once this change makes its
way into a promoted build, or during your performance investigations.

BTW. I am not against doing something “smarter” for Attributes.hashCode.
I just didn’t think it was relevant, or performance sensitive, any more.

-Chris
Hi Chris,
I have another concern. Let's say Attributes keys are LATIN1. So for comparison, the StringLatin1.compareToCI is used:
    public static int compareToCI(byte[] value, byte[] other) {
        int len1 = value.length;
        int len2 = other.length;
        int lim = Math.min(len1, len2);
        for (int k = 0; k < lim; k++) {
            if (value[k] != other[k]) {
                char c1 = (char)CharacterDataLatin1.instance.toUpperCase(getChar(value, k));
                char c2 = (char)CharacterDataLatin1.instance.toUpperCase(getChar(other, k));
                if (c1 != c2) {
                    c1 = (char)CharacterDataLatin1.instance.toLowerCase(c1);
                    c2 = (char)CharacterDataLatin1.instance.toLowerCase(c2);
                    if (c1 != c2) {
                        return c1 - c2;
                    }
                }
            }
        }
        return len1 - len2;
    }

comparing this with Name.hashCode:

        public int hashCode() {
            if (hashCode == -1) {
                hashCode = name.toLowerCase(Locale.ROOT).hashCode();
            }
            return hashCode;
        }
...is it possible that for some pair of keys, compareToCI would result in 0, but hashCode(s) would differ? For example, the uppercased keys would be the same, but the .toLowerCase(Locale.ROOT) not? Maybe not for LATIN1 keys, but what if one uses non-latin1 keys (StringUTF16.compareToCI is similar)?
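
One way to probe this question is a brute-force scan, sketched below under stated assumptions: it only looks at single-char strings in the BMP, it groups chars by the per-char fold toLowerCase(toUpperCase(c)), and it reports pairs that are equalsIgnoreCase yet hash differently after toLowerCase(Locale.ROOT). The class and method names are hypothetical, and what it prints depends on the Unicode tables of the JDK it runs on, so no expected output is given.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class FoldConsistencyCheck {

    // Per-char fold: upper-case first, then lower-case.
    static char fold(char c) {
        return Character.toLowerCase(Character.toUpperCase(c));
    }

    // Returns pairs {first, other} of single-char strings that compare equal
    // ignoring case but whose toLowerCase(Locale.ROOT) hash codes differ.
    // Each char is compared against the first char seen with the same fold.
    static List<String[]> findMismatches() {
        Map<Character, String> firstSeen = new HashMap<>();
        List<String[]> mismatches = new ArrayList<>();
        for (int i = Character.MIN_VALUE; i <= Character.MAX_VALUE; i++) {
            char c = (char) i;
            if (Character.isSurrogate(c)) {
                continue; // lone surrogates are not meaningful keys
            }
            String s = String.valueOf(c);
            String first = firstSeen.putIfAbsent(fold(c), s);
            if (first != null
                    && first.equalsIgnoreCase(s)
                    && first.toLowerCase(Locale.ROOT).hashCode()
                       != s.toLowerCase(Locale.ROOT).hashCode()) {
                mismatches.add(new String[] { first, s });
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        for (String[] pair : findMismatches()) {
            System.out.printf("U+%04X vs U+%04X%n",
                    (int) pair[0].charAt(0), (int) pair[1].charAt(0));
        }
    }
}
```

Any pair it prints would be a concrete case of equal keys with different hash codes; an empty result would only cover single-char keys, not longer strings or supplementary code points.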
Regards, Peter
