[
https://issues.apache.org/jira/browse/STORM-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635303#comment-14635303
]
ASF GitHub Bot commented on STORM-139:
--------------------------------------
Github user d2r commented on a diff in the pull request:
https://github.com/apache/storm/pull/641#discussion_r35120287
--- Diff:
storm-core/src/jvm/backtype/storm/grouping/PartialKeyGrouping.java ---
@@ -65,7 +66,11 @@ public void prepare(WorkerTopologyContext context,
GlobalStreamId stream, List<I
List<Object> selectedFields = outFields.select(fields,
values);
ByteBuffer out = ByteBuffer.allocate(selectedFields.size()
* 4);
for (Object o: selectedFields) {
- out.putInt(o.hashCode());
+ if (o instanceof Object[]) {
--- End diff --
Yes, we need to catch the primitive array types as well somehow.
Here `o` may not be a List. I think we need to check not only `instanceof
Object[]` but all of the other primitive array types as well. Probably
something like this?
```Java
for (Object o: selectedFields) {
if (o instanceof Object[]) {
out.putInt(Arrays.deepHashCode((Object[])o));
} else if (o instanceof byte[]) {
out.putInt(Arrays.hashCode((byte[]) o));
} else if (o instanceof short[]) {
out.putInt(Arrays.hashCode((short[]) o));
} else if (o instanceof int[]) {
out.putInt(Arrays.hashCode((int[]) o));
} else if (o instanceof long[]) {
out.putInt(Arrays.hashCode((long[]) o));
} else if (o instanceof char[]) {
out.putInt(Arrays.hashCode((char[]) o));
} else if (o instanceof float[]) {
out.putInt(Arrays.hashCode((float[]) o));
} else if (o instanceof double[]) {
out.putInt(Arrays.hashCode((double[]) o));
} else if (o instanceof boolean[]) {
out.putInt(Arrays.hashCode((boolean[]) o));
} else if (o != null) {
out.putInt(o.hashCode());
} else {
out.putInt(0);
}
```
This seems to be what
[Arrays#deepHashCode](http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/Arrays.java#3709)
does.
What do you think?
> hashCode does not work for byte[]
> ---------------------------------
>
> Key: STORM-139
> URL: https://issues.apache.org/jira/browse/STORM-139
> Project: Apache Storm
> Issue Type: Bug
> Reporter: James Xu
> Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/245
> Storm should use a different hashCode method when getting the hash for a
> byte[] array, since the default one uses the object identity. Should check
> the behavior on other arrays as well
> ----------
> xiaokang: I tested byte[] and other arrays. The hashCode of array is the
> array object identity.
> I alse tested that java.util.Arrays.hashCode(xx[]) is based of the array
> element's hash code. It maybe ok change the list-hash-code function of
> tuple.clj to fix the problem.
> ----------
> Sirwellington: you may want to read this:
> http://martin.kleppmann.com/2012/06/18/java-hashcode-unsafe-for-distributed-systems.html
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)