SORT BY is always sending data to only the first reducer even if there are
multiple reducers
--------------------------------------------------------------------------------------------
Key: HIVE-432
URL: https://issues.apache.org/jira/browse/HIVE-432
Project: Hadoop Hive
Issue Type: Bug
Components: Query Processor
Reporter: Zheng Shao
Priority: Critical
When we generate the ReduceSinkOperator for SORT BY, the partition columns are
empty, so every row gets a hash value of 0 and all rows go to the first
reducer.
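The effect described above can be illustrated with a minimal sketch. This is a simplified model, not Hive's actual code: the class PartitionSketch and the methods hashOf and reducerFor are hypothetical names, and the "hash modulo number of reducers" rule is an assumption modeled on the usual Hadoop hash-partitioning scheme.

```java
import java.util.HashSet;
import java.util.Set;

public class PartitionSketch {
    // Hypothetical helper: combine the hash codes of the partition-column
    // values. With no partition columns the loop never runs and the hash is 0.
    public static int hashOf(Object[] partitionValues) {
        int hash = 0;
        for (Object v : partitionValues) {
            hash = 31 * hash + (v == null ? 0 : v.hashCode());
        }
        return hash;
    }

    // Assumed rule: non-negative hash modulo the number of reducers.
    public static int reducerFor(Object[] partitionValues, int numReducers) {
        return (hashOf(partitionValues) & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int numReducers = 4;
        String[] rows = {"a", "b", "c", "d", "e", "f", "g", "h"};
        Set<Integer> emptyColumns = new HashSet<>();
        Set<Integer> keyedColumns = new HashSet<>();
        for (String row : rows) {
            // Empty partition columns: every row maps to the same reducer.
            emptyColumns.add(reducerFor(new Object[0], numReducers));
            // Row value used as the partition column: rows spread out.
            keyedColumns.add(reducerFor(new Object[] {row}, numReducers));
        }
        System.out.println("empty partition columns -> reducers " + emptyColumns);
        System.out.println("keyed partition columns -> reducers " + keyedColumns);
    }
}
```

With empty partition columns the first set only ever contains reducer 0, regardless of how many reducers are configured; once a column value feeds the hash, the rows are spread over several reducers.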
Until this bug is fixed, please use "CLUSTER BY" instead of "SORT BY" so that
the data is distributed across multiple reducers.