[
https://issues.apache.org/jira/browse/S4-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174929#comment-13174929
]
Matthieu Morel commented on S4-30:
----------------------------------
Thanks a lot, Quoc, that's very useful!
What happens is that the calculated hash value is truncated to a _signed_
32-bit number (contrary to what I initially assumed).
I'm not sure about the rationale for truncating to 32 bits, and I don't see an
optimized way to guarantee a non-negative value when casting to int. Maybe
somebody has one?
In the meantime, we could simply use Math.abs (slower, but simple) and
probably replace:
{code}return rv & 0xffffffffL;{code}
with
{code}return Math.abs((int)(rv & 0xffffffffL));{code}
...so that we get a non-negative value when we cast to an integer. One caveat:
Math.abs(Integer.MIN_VALUE) is still negative, so this holds for every input
except one whose low 32 bits are exactly 0x80000000.
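As a possible answer to the question above, here is a minimal sketch of one
branch-free alternative: mask off the sign bit before the cast instead of
calling Math.abs. The class and method names (NonNegativeHash,
toNonNegativeInt, absOfLow32) are hypothetical and are not part of S4; rv
stands for the raw 64-bit hash value. Note that masking maps inputs to
different values than Math.abs does, so it is not a drop-in equivalent.

```java
public final class NonNegativeHash {
    private NonNegativeHash() {}

    /** Keeps only the low 31 bits, so the int cast can never be negative. */
    public static int toNonNegativeInt(long rv) {
        return (int) (rv & 0x7fffffffL);
    }

    /** The Math.abs variant from the comment, kept for comparison. */
    public static int absOfLow32(long rv) {
        return Math.abs((int) (rv & 0xffffffffL));
    }

    public static void main(String[] args) {
        // Low 32 bits are exactly Integer.MIN_VALUE: the edge case for Math.abs.
        long rv = 0x80000000L;
        System.out.println(absOfLow32(rv));        // still negative
        System.out.println(toNonNegativeInt(rv));  // never negative
    }
}
```

The mask costs one bit of hash range (31 bits instead of 32), which should not
matter when the result is only reduced modulo a partition count.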
We might also add regression tests such as those from Twitter's utility library:
https://github.com/twitter/util/blob/master/util-hashing/src/test/scala/com/twitter/hashing/KeyHasherSpec.scala
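To illustrate why a negative hash breaks the partitioner expression quoted in
the issue description below, here is a small sketch. Java's % operator keeps
the sign of the dividend, so a negative hash yields a negative partition id.
The class and method names are hypothetical; Math.floorMod is shown only as
one defensive option on the partitioner side, not as what S4 does.

```java
public final class PartitionSketch {
    private PartitionSketch() {}

    /** Same shape as the expression in the issue: the remainder keeps the sign of hash. */
    public static int partitionWithRemainder(long hash, int partitionCount) {
        return (int) (hash % partitionCount);
    }

    /** floorMod always returns a value in [0, partitionCount) for a positive count. */
    public static int partitionWithFloorMod(long hash, int partitionCount) {
        return (int) Math.floorMod(hash, (long) partitionCount);
    }

    public static void main(String[] args) {
        long negativeHash = -7L; // stand-in for a hash that came out negative
        System.out.println(partitionWithRemainder(negativeHash, 4)); // negative id
        System.out.println(partitionWithFloorMod(negativeHash, 4));  // valid id
    }
}
```

A regression test could assert that the partition id stays within
[0, partitionCount) for the keys listed in the issue.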
> DefaultHasher hashes keys to negative number
> --------------------------------------------
>
> Key: S4-30
> URL: https://issues.apache.org/jira/browse/S4-30
> Project: Apache S4
> Issue Type: Bug
> Affects Versions: 0.4
> Environment: All - Windows and Linux
> Reporter: Quoc Nguyen
> Priority: Blocker
>
> DefaultHasher uses HashAlgorithm hashAlgorithm = HashAlgorithm.FNV1_64_HASH;
> which hashes key strings such as 118+18233, 118+17360, 118+17258, 118+18147
> and 118+18121, and many more, to negative values. The DefaultPartitioner
> (int partitionId = (int) (hasher.hash(stringValue) % partitionCount);) then
> assigns those keys to an incorrect partition.
> Workaround:
> None. If a stream contains those keys, they are dropped because the
> partitioner cannot partition them correctly.