[ https://issues.apache.org/jira/browse/HIVE-19501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472950#comment-16472950 ]
Gopal V edited comment on HIVE-19501 at 5/12/18 7:10 AM:
---------------------------------------------------------

The short and int versions are nearly identical, except that they need to account for the sign bit when widening from int -> long, by masking off the sign-extended upper 32 bits:

{code}
  public static long hash64(int data) {
    // reverseBytes returns an int; the mask (0x00000000FFFFFFFFL) clears the
    // upper 32 bits that sign extension sets when that int is widened to long
    long k1 = Integer.reverseBytes(data) & (-1L >>> 32);
    int length = Integer.BYTES;
    long hash = DEFAULT_SEED;
    k1 *= C1;
    k1 = Long.rotateLeft(k1, R1);
    k1 *= C2;
    hash ^= k1;
    // finalization
    hash ^= length;
    hash = fmix64(hash);
    return hash;
  }
{code}

was (Author: gopalv):
    The short and int versions are nearly identical, except they need to account for the sign-bit during the cast up from int -> long by doing

{code}
long k = Integer.reverseBytes(data) & (-1L >>> 32);
int length = Integer.BYTES;
{code}

> Fix HyperLogLog to be threadsafe
> --------------------------------
>
>                 Key: HIVE-19501
>                 URL: https://issues.apache.org/jira/browse/HIVE-19501
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Priority: Major
>
> Not sure if this is an issue in reality or not, but there are 3 static fields
> in HyperLogLog which are rewritten during operation; if multiple threads are
> calculating HLL in the same JVM, there is a theoretical chance that they
> might overwrite each other's values...
> static fields:
> https://github.com/apache/hive/blob/8028ce8a4cf5a03e2998c33e032a511fae770b47/standalone-metastore/src/main/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLog.java#L65
> usage:
> https://github.com/apache/hive/blob/8028ce8a4cf5a03e2998c33e032a511fae770b47/standalone-metastore/src/main/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLog.java#L216

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
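
As a small illustration of the sign-bit point in the comment above, the sketch below shows what happens to the widened value with and without the `& (-1L >>> 32)` mask. The class and method names (`SignMaskDemo`, `widenUnmasked`, `widenMasked`) are hypothetical and exist only for this example; they are not part of Hive.

```java
// Hypothetical demo class; not part of Hive.
class SignMaskDemo {

    // Widening without a mask: if the byte-reversed int is negative,
    // sign extension fills the upper 32 bits of the long with ones.
    static long widenUnmasked(int data) {
        return Integer.reverseBytes(data);
    }

    // Widening with the mask from the comment: -1L >>> 32 is
    // 0x00000000FFFFFFFFL, so only the low 32 bits survive.
    static long widenMasked(int data) {
        return Integer.reverseBytes(data) & (-1L >>> 32);
    }

    public static void main(String[] args) {
        int data = 0x000000FF; // reverseBytes yields 0xFF000000, a negative int
        System.out.println(Long.toHexString(widenUnmasked(data))); // ffffffffff000000
        System.out.println(Long.toHexString(widenMasked(data)));   // ff000000
    }
}
```

The masked form gives the same result as `Integer.toUnsignedLong(Integer.reverseBytes(data))`, which may read more clearly on Java 8 and later.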