[
https://issues.apache.org/jira/browse/AVRO-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191935#comment-17191935
]
Kyoungha Min commented on AVRO-2905:
------------------------------------
Im very sorry for the late reply.
I created [PR|https://github.com/apache/avro/pull/955]
> org.apache.avro.util.Utf8 bug in `hashCode`
> -------------------------------------------
>
> Key: AVRO-2905
> URL: https://issues.apache.org/jira/browse/AVRO-2905
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.10.0
> Reporter: Kyoungha Min
> Priority: Major
> Fix For: 1.10.1
>
>
> Bug : utf8 hash reset is not correctly done, as a result, returns different
> hash when the values are logically equal.
>
> cause commit :
>
> [https://github.com/apache/avro/commit/d5d1cd70e920f45468bf696d3668b496a9adbab3#diff-7001731cdafe4fc24f86e71c7e8a0b61]
> [https://github.com/apache/avro/commit/8e345c48e8f9708c27af63754965777314dac85c#diff-7001731cdafe4fc24f86e71c7e8a0b61]
>
> reproduce :
> {code:java}
> org.apache.avro.util.Utf8 utf8 = new org.apache.avro.util.Utf8();
> utf8.set("hello");
> System.out.println(utf8.hashCode()); // 99162322
> utf8.set("hello");
> System.out.println(utf8.hashCode()); // -1538739392
> {code}
>
> Suggested fix.
> # get rid of `hasHash` field. In my opinion, it leads to more bug-prone by
> adding additional variable to take care of. `hash != 0` should be enough to
> check the presence of hash. Redundant computation for `0 hash` elements
> should be negligible.
> # reset `hash = 0` on every mutation call.
> # not needed, but I want to see some minor optimization like avoiding memory
> access by creating local variable (int hash = 0; .... this.hash = hash).
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)