Re: Tune String's hashCode() + equals()

Ulf Zibis Thu, 04 Mar 2010 13:39:05 -0800

Much thanks for your effort.

Am 04.03.2010 21:31, schrieb Marek Kozieł:

@Ulf
Few explanations:
1.

Intersting alternative, but I'm afraid, this is against the spec.
Shifting all 0's to 1 would break String's hash definition:  h = 31 * h + 
val[i++].

Yes it does, any way i think spec is to tight here. Do we really need
hash of each value even if String have length like 600000?
there is noting good coming from it in my opinion.
Did any one saw at least one code relaying on that ?
Btw. Same come with .compare

See discussion on project Coin list, subject "Benefit from computingString Hash at compile time?"

2.
private static final long isHashBijection= 0x8000000000000000L;
should be fine


Now I understand how it should work.

Your algorithm will guarantee only 2^63 bijectional values. So stringsof lenght >= 4 can't have bijectional hashes as 4 chars count 2^64variations.

3.
Second sample would work only if hash would be set in constructor so
even 0 would be valid hash.

4.
Maybe I'm wrong but if most cases when hash should be count is String
concatenation then we could make it +/-

a.hash+b.hash*powderOf31[a.length()]

so it would not consume so much time.


Interesting idea. I suggest to file an RFE.

5.
Intern string do not need hash codes co comparing cos they have same
address, so first loop would return true if they are equal, after this
we need only to check if they are not equal:

if (isIntern()&&  anotherString.isIntern()) return false;


You are right, but
    if (h1 != 0 && h2 != 0 && h1 != h2) return false;

would perform same (if already computed internal hash would beback-propagated to the Java object).

6.
Have in mind that adding additional fields to string might be not an
option, because memory lost this way may have great impact on average
application efficiency.


+ would break compatibility if objects are serialized.

-Ulf

Re: Tune String's hashCode() + equals()

Reply via email to