On 1/8/13 6:55 PM, Vitaly Davidovich wrote:

Also, I'm not sure how hot this method is in practice but allocating StringBuilder seems a bit heavy (maybe it gets partially escape analyzed out though). Can you instead loop over all the chars in the string and build up the hash code as you go along? If you see a % then you handle next 2 chars specially, like you do now. Or are you trying to take advantage of String intrinsic support in the JIT? I guess if perf is a concern you can write a micro benchmark comparing the approaches ...

That did occur to me, but I guess we have to be consistent with the value that people have already been using, and that means I have to duplicate the code in String.hashCode() (that is what the original implementation was calling) - I was trying to avoid that. Also, String.hashCode() uses its internal char[] - whereas charAt() will involve several additional bound checks - but using toCharArray() may be better. Let me take another look at this, and get back with another webrev.

Sent from my phone

On Jan 8, 2013 9:45 PM, "Vitaly Davidovich" <vita...@gmail.com <mailto:vita...@gmail.com>> wrote:

    Hi Kurchi,

    In the hash method, I suggest you move handling of strings with %
    into a separate method to keep the hash method small for common
    case (no %).  Otherwise, there's a chance this method won't get
    inlined anymore due to its (newly increased) size.

- Yep, will do.

    Also, I realize toLower does the same thing, but why does toUpper
    return an int whereas it's really a char? Might be cleaner to
    declare return type as char and do the cast inside the method as
    needed.

- I followed the format of toLower(). But I agree this way it will be cleaner.

Thanks a lot,
Kurchi



    Thanks

    Sent from my phone

    On Jan 8, 2013 8:20 PM, "Kurchi Hazra"
    <kurchi.subhra.ha...@oracle.com
    <mailto:kurchi.subhra.ha...@oracle.com>> wrote:

        Hi,

            According to RFC 3986[1], hexadecimal digits encoded by a
        '%' should be case-insensitive, for example,%A2 and %a2 should be
        considered equal. Although, URI.equals() does take this into
        consideration, the implementation of URI.hashCode() does not and
        returns different hashcodes for two URIs that are similar in
        all respects except for the case of the percent-encoded
        hexadecimal digits.
        This fix attempts to construct a normalized string from the
        string representing a component before calculating its hashCode.
        I converted to upper case for the normalization(and not lower
        case) as required by [1].

            For testing the fix, I added an additional test scenario
        to an existing test (jdk/test/java/net/URI/Test.java). While I
        was there, I also made
        minor changes to the test so that it does not produce rawtype
        and other lint warnings.

        Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7171415
        Webrev: http://cr.openjdk.java.net/~khazra/7171415/webrev.00/
        <http://cr.openjdk.java.net/%7Ekhazra/7171415/webrev.00/>

        URI.compareTo() still suffers from the same problem - I am not
        sure if it should be dealt with as a separate bug.

        Thanks,
        Kurchi

        [1] http://tools.ietf.org/html/rfc3986#section-6.2.2.1


Reply via email to