Teddy Choi created HIVE-20873:
---------------------------------

             Summary: Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
                 Key: HIVE-20873
                 URL: https://issues.apache.org/jira/browse/HIVE-20873
             Project: Hive
          Issue Type: Improvement
            Reporter: Teddy Choi
            Assignee: Teddy Choi

VectorHashKeyWrapperTwoLong computes its hash code with only a few bit-shift and XOR operations. This keeps the computation time short, but it produces more hash collisions, and group-by operations become very slow on large data sets. It needs Murmur hash or another better hash function to reduce hash collisions.
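
To illustrate the trade-off described above, here is a minimal Java sketch. It is not Hive's code: the class TwoLongHashSketch, the method shiftXorHash (a hypothetical stand-in for a cheap shift/XOR hash) and the way murmurStyleHash combines the two keys are all assumptions made for illustration. Only fmix64 is a known building block, the 64-bit finalizer from MurmurHash3. The sketch counts how many distinct hash codes each function yields for a grid of small key pairs, which is roughly the shape of data a group-by on two low-cardinality long columns would see.

```java
import java.util.HashSet;
import java.util.Set;

public class TwoLongHashSketch {

    // Hypothetical cheap hash built from XOR and one shift.
    // Illustrative only; this is NOT the actual Hive implementation.
    public static int shiftXorHash(long a, long b) {
        long x = a ^ b;
        return (int) (x ^ (x >>> 32));
    }

    // 64-bit finalizer (fmix64) from MurmurHash3: spreads every input bit
    // across the whole output word.
    public static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // Assumed way of combining two longs: mix the first key, fold in the
    // second, mix again, then fold the 64-bit result down to 32 bits.
    public static int murmurStyleHash(long a, long b) {
        long h = fmix64(a);
        h = fmix64(h * 31 + b);
        return (int) (h ^ (h >>> 32));
    }

    // Counts distinct hash codes over all pairs (a, b) with 0 <= a, b < n.
    public static int distinctHashes(boolean useMurmur, int n) {
        Set<Integer> seen = new HashSet<>();
        for (long a = 0; a < n; a++) {
            for (long b = 0; b < n; b++) {
                seen.add(useMurmur ? murmurStyleHash(a, b) : shiftXorHash(a, b));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int n = 256; // 256 * 256 = 65536 distinct key pairs
        System.out.println("key pairs:        " + (n * n));
        System.out.println("shift/XOR hashes: " + distinctHashes(false, n));
        System.out.println("murmur-style:     " + distinctHashes(true, n));
    }
}
```

With this stand-in, the shift/XOR variant maps all 65536 key pairs onto only 256 hash codes, because a ^ b never leaves the low 8 bits, while the murmur-style variant keeps nearly all pairs distinct. The extra multiplications cost a few cycles per key, which may well be repaid by shorter collision chains in the hash table, though that would need to be measured against the real workload.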