SORT BY is always sending data to only the first reducer even if there are 
multiple reducers
--------------------------------------------------------------------------------------------

                 Key: HIVE-432
                 URL: https://issues.apache.org/jira/browse/HIVE-432
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Query Processor
            Reporter: Zheng Shao
            Priority: Critical


When we generate the ReduceSinkOperator, the partition columns are empty, which 
means every row gets a hash value of 0, so all rows go to the first reducer, 
no matter how many reducers are configured.
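
As a rough illustration of why an empty partition column list sends everything 
to one reducer, here is a minimal Python sketch. It is not Hive's code: the 
function names, the multiply-by-31 hash combination, and the sample rows are 
all assumptions made for the example. The only point it relies on is that a 
hash built from zero columns stays at its initial value of 0.

```python
def partition_hash(row, partition_columns):
    """Combine the hashes of the partition columns (hypothetical scheme).

    With an empty column list the loop never runs and the result is 0.
    """
    h = 0
    for col in partition_columns:
        h = 31 * h + hash(row[col])
    return h


def reducer_for(row, partition_columns, num_reducers):
    """Pick a reducer index by taking the non-negative hash modulo the count."""
    return (partition_hash(row, partition_columns) & 0x7FFFFFFF) % num_reducers


# Hypothetical sample rows, for illustration only.
rows = [{"id": i, "name": f"user{i}"} for i in range(8)]

# The buggy case described above: no partition columns at all.
sort_by_targets = {reducer_for(r, [], 4) for r in rows}

# With a key column available for partitioning, rows spread out.
cluster_by_targets = {reducer_for(r, ["id"], 4) for r in rows}

print(sort_by_targets)
print(cluster_by_targets)
```

In this sketch the first set contains only reducer 0, while the second covers 
all four reducers because each row's key contributes to the hash.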

Until this bug is fixed, please use "CLUSTER BY" instead of "SORT BY" so that 
the data is distributed across multiple reducers.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
