Rui Li created HIVE-8017:
----------------------------

             Summary: Use HiveKey instead of BytesWritable as key type of the pair RDD [Spark Branch]
                 Key: HIVE-8017
                 URL: https://issues.apache.org/jira/browse/HIVE-8017
             Project: Hive
          Issue Type: Bug
          Components: Spark
            Reporter: Rui Li
            Assignee: Rui Li

HiveKey should be used as the key type of the pair RDD because it carries the hash code used for partitioning. BytesWritable partitions correctly in simple cases, but more complicated ones, e.g. joins and bucketed tables, have to use {{HiveKey.hashCode}}.
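
To illustrate the difference, here is a minimal, self-contained sketch. It does not use the real Hadoop or Hive classes: SketchBytesKey, SketchHiveKey and PartitionSketch are hypothetical stand-ins, and the trailing "tag" byte in the sample keys is only an assumed example of serialized content that should not influence partitioning. The point is that a key hashed from its bytes and a key carrying a precomputed hash code can land in different partitions.

```java
import java.util.Arrays;

// Hypothetical stand-in for a key whose hash is derived from its serialized bytes.
// This is NOT the real org.apache.hadoop.io.BytesWritable.
class SketchBytesKey {
    protected final byte[] bytes;

    SketchBytesKey(byte[] bytes) {
        this.bytes = bytes.clone();
    }

    @Override
    public int hashCode() {
        // Every serialized byte contributes to the hash.
        return Arrays.hashCode(bytes);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SketchBytesKey
                && Arrays.equals(bytes, ((SketchBytesKey) o).bytes);
    }
}

// Hypothetical stand-in for a key that carries a hash code computed elsewhere,
// for example from the partitioning columns only. This is NOT the real HiveKey.
class SketchHiveKey extends SketchBytesKey {
    private final int partitionHash;

    SketchHiveKey(byte[] bytes, int partitionHash) {
        super(bytes);
        this.partitionHash = partitionHash;
    }

    @Override
    public int hashCode() {
        // Return the precomputed hash instead of hashing the bytes.
        return partitionHash;
    }
}

class PartitionSketch {
    // Hash-partitioner-style assignment: non-negative hash modulo partition count.
    static int partitionFor(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        // Assumption: same join-column bytes {1, 2, 3}, different trailing tag byte.
        byte[] left = {1, 2, 3, 0};
        byte[] right = {1, 2, 3, 1};
        int joinHash = Arrays.hashCode(new byte[] {1, 2, 3});

        System.out.println("bytes-based: "
                + partitionFor(new SketchBytesKey(left), 4) + " vs "
                + partitionFor(new SketchBytesKey(right), 4));
        System.out.println("carried hash: "
                + partitionFor(new SketchHiveKey(left, joinHash), 4) + " vs "
                + partitionFor(new SketchHiveKey(right, joinHash), 4));
    }
}
```

In this sketch the two bytes-based keys hash differently because their serialized forms differ, while the two keys carrying the same precomputed hash are always assigned to the same partition, which is the behaviour rows sharing a join key would need.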