-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7126/
-----------------------------------------------------------

Review request for hive.


Description
-------

This optimizer exploits intra-query correlations and merges multiple correlated MapReduce jobs into one job. I opened a new request because I have been working on hive-git.


This addresses bug HIVE-2206.
https://issues.apache.org/jira/browse/HIVE-2206


Diffs
-----

common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 5efae89
ql/src/java/org/apache/hadoop/hive/ql/exec/BaseReduceSinkOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationCompositeOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationLocalSimulativeReduceSinkOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationReducerDispatchOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java 283d0b6
ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java e3ed13a
ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java f0c35e7
ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 0c22141
ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java a2caeed
ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a40630
ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java dffdd7b
ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizer.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizerUtils.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 6bc5fe4
ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 67d3a99
ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 8bacd3d
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java a65b0e4
ql/src/java/org/apache/hadoop/hive/ql/plan/BaseReduceSinkDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationCompositeDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationLocalSimulativeReduceSinkDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationReducerDispatchDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 5f38bf2
ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 16eb125
ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9a95efd
ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 142f040
ql/src/test/results/compiler/plan/groupby1.q.xml 4382252
ql/src/test/results/compiler/plan/groupby2.q.xml eef669c
ql/src/test/results/compiler/plan/groupby3.q.xml 9743480
ql/src/test/results/compiler/plan/groupby5.q.xml 8e07860

Diff: https://reviews.apache.org/r/7126/diff/


Testing
-------

I was not able to test TestHBaseMinimrCliDriver, TestHBaseCliDriver, TestHBaseNegativeCliDriver, testSynchronized in TestEmbeddedHiveMetaStore, testSynchronized in TestRemoteHiveMetaStore, testSynchronized in TestSetUGIOnBothClientServer, testSynchronized in TestSetUGIOnOnlyClient, testSynchronized in TestSetUGIOnOnlyServer, and testNegativeCliDriver_local_mapred_error_cache in TestNegativeCliDriver. This patch should pass all other tests.

When the optimizer is enabled (it is currently disabled by default), several test cases fail:
- 1 case is optimized by the optimizer.
- 1 case is not suitable for this correlation optimizer.
- 2 cases fail because of potential bugs in trunk.
- The other failures are parsing cases (XML plans). They are caused by my minor changes in SemanticAnalyzer, since several redundant operators are generated for the correlation optimizer.

Overall, these failures are not very relevant to the patch. Please see https://issues.apache.org/jira/browse/HIVE-2206?focusedCommentId=13456171&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13456171 for details.


Thanks,

Yin Huai