Gopal V created HIVE-8349:
-----------------------------
Summary: DISTRIBUTE BY should work with tez auto-parallelism enabled
Key: HIVE-8349
URL: https://issues.apache.org/jira/browse/HIVE-8349
Project: Hive
Issue Type: Bug
Reporter: Gopal V
The current implementation of DISTRIBUTE BY does not work when Tez
auto-parallelism is turned on, because of hashCode distribution issues.
In the DISTRIBUTE BY case, the key is actually zero bytes, and partitioning is
driven only by the hashCode. This adversely affects the uniform hashing
implementation.
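
A rough sketch of why a zero-byte key is a problem for any partitioner that
hashes the serialized key bytes. This is not Hive or Tez code: the class name,
both partition functions, and the use of Arrays.hashCode as the byte hash are
assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ZeroByteKeyDemo {

    // Hypothetical partitioner that hashes the serialized key bytes.
    static int partitionFromKeyBytes(byte[] keyBytes, int numPartitions) {
        return (Arrays.hashCode(keyBytes) & Integer.MAX_VALUE) % numPartitions;
    }

    // Hypothetical partitioner that uses a hash code carried next to the key.
    static int partitionFromHashCode(int hashCode, int numPartitions) {
        return (hashCode & Integer.MAX_VALUE) % numPartitions;
    }

    // Every row has an empty key, so every row hashes to the same partition.
    static int distinctPartitionsByKeyBytes(int rows, int numPartitions) {
        Set<Integer> used = new HashSet<>();
        byte[] emptyKey = new byte[0];
        for (int i = 0; i < rows; i++) {
            used.add(partitionFromKeyBytes(emptyKey, numPartitions));
        }
        return used.size();
    }

    // Row i stands in for a distinct DISTRIBUTE BY column value whose
    // hash code is carried separately from the (empty) key.
    static int distinctPartitionsByHashCode(int rows, int numPartitions) {
        Set<Integer> used = new HashSet<>();
        for (int i = 0; i < rows; i++) {
            used.add(partitionFromHashCode(Integer.hashCode(i), numPartitions));
        }
        return used.size();
    }

    public static void main(String[] args) {
        int rows = 100;
        int numPartitions = 4;
        System.out.println("hashing key bytes: "
                + distinctPartitionsByKeyBytes(rows, numPartitions)
                + " of " + numPartitions + " partitions used");
        System.out.println("using carried hashCode: "
                + distinctPartitionsByHashCode(rows, numPartitions)
                + " of " + numPartitions + " partitions used");
    }
}
```

In this toy model, hashing the empty key sends all rows to a single partition,
while the carried hashCode spreads them across all four.
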
Ideally, the edge would switch from the ordered KV input to an unordered
partitioned edge, which should speed up processing massively.