Re: Code Review Request: 7171415: java.net.URI.equals/hashCode not consistent for some URIs

Kurchi Subhra Hazra Tue, 08 Jan 2013 20:04:31 -0800

On 1/8/13 6:55 PM, Vitaly Davidovich wrote:

Also, I'm not sure how hot this method is in practice but allocatingStringBuilder seems a bit heavy (maybe it gets partially escapeanalyzed out though). Can you instead loop over all the chars in thestring and build up the hash code as you go along? If you see a % thenyou handle next 2 chars specially, like you do now. Or are you tryingto take advantage of String intrinsic support in the JIT? I guess ifperf is a concern you can write a micro benchmark comparing theapproaches ...

That did occur to me, but I guess we have to be consistent with thevalue that people have already been using, and that means I haveto duplicate the code in String.hashCode() (that is what the originalimplementation was calling) - I was trying to avoid that. Also,String.hashCode() uses its internal char[] - whereas charAt() willinvolve several additional bound checks - butusing toCharArray() may be better. Let me take another look at this, andget back with another webrev.

Sent from my phone

On Jan 8, 2013 9:45 PM, "Vitaly Davidovich" <vita...@gmail.com<mailto:vita...@gmail.com>> wrote:


    Hi Kurchi,

    In the hash method, I suggest you move handling of strings with %
    into a separate method to keep the hash method small for common
    case (no %).  Otherwise, there's a chance this method won't get
    inlined anymore due to its (newly increased) size.

- Yep, will do.

    Also, I realize toLower does the same thing, but why does toUpper
    return an int whereas it's really a char? Might be cleaner to
    declare return type as char and do the cast inside the method as
    needed.

- I followed the format of toLower(). But I agree this way it will becleaner.


Thanks a lot,
Kurchi

    Thanks

    Sent from my phone

    On Jan 8, 2013 8:20 PM, "Kurchi Hazra"
    <kurchi.subhra.ha...@oracle.com
    <mailto:kurchi.subhra.ha...@oracle.com>> wrote:

        Hi,

            According to RFC 3986[1], hexadecimal digits encoded by a
        '%' should be case-insensitive, for example,%A2 and %a2 should be
        considered equal. Although, URI.equals() does take this into
        consideration, the implementation of URI.hashCode() does not and
        returns different hashcodes for two URIs that are similar in
        all respects except for the case of the percent-encoded
        hexadecimal digits.
        This fix attempts to construct a normalized string from the
        string representing a component before calculating its hashCode.
        I converted to upper case for the normalization(and not lower
        case) as required by [1].

            For testing the fix, I added an additional test scenario
        to an existing test (jdk/test/java/net/URI/Test.java). While I
        was there, I also made
        minor changes to the test so that it does not produce rawtype
        and other lint warnings.

        Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7171415
        Webrev: http://cr.openjdk.java.net/~khazra/7171415/webrev.00/
        <http://cr.openjdk.java.net/%7Ekhazra/7171415/webrev.00/>

        URI.compareTo() still suffers from the same problem - I am not
        sure if it should be dealt with as a separate bug.

        Thanks,
        Kurchi

        [1] http://tools.ietf.org/html/rfc3986#section-6.2.2.1

Re: Code Review Request: 7171415: java.net.URI.equals/hashCode not consistent for some URIs

Reply via email to