[ https://issues.apache.org/jira/browse/HIVE-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HIVE-3841:
------------------------------
    Attachment: HIVE-3841.D7671.1.patch

navis requested code review of "HIVE-3841 [jira] Sampling in previous MR for range partitioning of next RS".

Reviewers: JIRA

DPAL-1945 Sampling in previous MR for range partitioning of next RS

Currently Hive enforces a single reducer for the ORDER BY clause, which can be a performance bottleneck. If sampling could be done on the ordering key at the previous MR stage, multiple reducers could be assigned to it.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D7671

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/HiveTotalOrderPartitioner.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/PartitionSampler.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SampleMerger.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplingOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/SamplingContext.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/18381/

To: JIRA, navis

> Sampling in previous MR for range partitioning of next RS
> ---------------------------------------------------------
>
>                 Key: HIVE-3841
>                 URL: https://issues.apache.org/jira/browse/HIVE-3841
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3841.D7671.1.patch
>
>
> Currently Hive enforces a single reducer for the ORDER BY clause, which can be
> a performance bottleneck.
> If sampling could be done on the ordering key at the previous MR stage, multiple
> reducers could be assigned to it.
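
For readers unfamiliar with the idea in the description, here is a minimal Java sketch of sample-based range partitioning. It is not taken from the attached patch: the class name RangePartitionSketch and the methods splitPoints and partitionFor are hypothetical, keys are assumed to be plain strings, and the sample is assumed to fit in memory. Split points are chosen from a sorted sample of the ordering key, and each row is routed to the reducer whose key range contains it, so concatenating the reducer outputs in index order yields a total order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the code in HIVE-3841.D7671.1.patch.
public class RangePartitionSketch {

    // Pick (numReducers - 1) split points at evenly spaced positions
    // of the sorted sample of ordering keys.
    static List<String> splitPoints(List<String> samples, int numReducers) {
        List<String> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < numReducers; i++) {
            splits.add(sorted.get(i * sorted.size() / numReducers));
        }
        return splits;
    }

    // Reducer index for a key: the number of split points that are <= key.
    // Keys below the first split go to reducer 0, and so on.
    static int partitionFor(String key, List<String> splits) {
        int partition = 0;
        for (String split : splits) {
            if (key.compareTo(split) >= 0) {
                partition++;
            }
        }
        return partition;
    }

    public static void main(String[] args) {
        // Keys sampled during the previous stage (made-up values).
        List<String> samples = Arrays.asList("d", "a", "h", "f", "b", "c", "g", "e");
        List<String> splits = splitPoints(samples, 3);
        System.out.println(splits);                    // prints [c, f]
        System.out.println(partitionFor("a", splits)); // prints 0
        System.out.println(partitionFor("e", splits)); // prints 1
        System.out.println(partitionFor("z", splits)); // prints 2
    }
}
```

Because every key sent to reducer i sorts before every key sent to reducer i + 1, each reducer can sort its own range independently. How evenly the work is spread depends on how well the sample represents the real key distribution.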