-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7126/
-----------------------------------------------------------
(Updated Sept. 24, 2012, 3:53 p.m.)
Review request for hive.
Changes
-------
bug fix
Description
-------
This optimizer exploits intra-query correlations and merges multiple correlated
MapReduce jobs into one job. I opened a new review request because I have been
working on hive-git.
This addresses bug HIVE-2206.
https://issues.apache.org/jira/browse/HIVE-2206
Diffs (updated)
-----
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2693663
ql/src/java/org/apache/hadoop/hive/ql/exec/BaseReduceSinkOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationCompositeOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationLocalSimulativeReduceSinkOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationReducerDispatchOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java 283d0b6
ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 8669051
ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 5f08519
ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 0c22141
ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 919a140
ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a40630
ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 1469325
ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizer.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizerUtils.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 40dd949
ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java f292131
ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 8bacd3d
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 33ce6ca
ql/src/java/org/apache/hadoop/hive/ql/plan/BaseReduceSinkDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationCompositeDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationLocalSimulativeReduceSinkDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationReducerDispatchDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 5f38bf2
ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 16eb125
ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9a95efd
ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 142f040
ql/src/test/queries/clientpositive/correlationoptimizer1.q PRE-CREATION
ql/src/test/queries/clientpositive/correlationoptimizer2.q PRE-CREATION
ql/src/test/queries/clientpositive/correlationoptimizer3.q PRE-CREATION
ql/src/test/queries/clientpositive/correlationoptimizer4.q PRE-CREATION
ql/src/test/queries/clientpositive/correlationoptimizer5.q PRE-CREATION
ql/src/test/results/clientpositive/correlationoptimizer1.q.out PRE-CREATION
ql/src/test/results/clientpositive/correlationoptimizer2.q.out PRE-CREATION
ql/src/test/results/clientpositive/correlationoptimizer3.q.out PRE-CREATION
ql/src/test/results/clientpositive/correlationoptimizer4.q.out PRE-CREATION
ql/src/test/results/clientpositive/correlationoptimizer5.q.out PRE-CREATION
ql/src/test/results/compiler/plan/groupby1.q.xml 4382252
ql/src/test/results/compiler/plan/groupby2.q.xml eef669c
ql/src/test/results/compiler/plan/groupby3.q.xml 9743480
ql/src/test/results/compiler/plan/groupby5.q.xml 8e07860
Diff: https://reviews.apache.org/r/7126/diff/
Testing
-------
I could not verify TestHBaseMinimrCliDriver, TestHBaseCliDriver,
TestHBaseNegativeCliDriver, testSynchronized in TestEmbeddedHiveMetaStore,
testSynchronized in TestRemoteHiveMetaStore, testSynchronized in
TestSetUGIOnBothClientServer, testSynchronized in TestSetUGIOnOnlyClient,
testSynchronized in TestSetUGIOnOnlyServer, or
testNegativeCliDriver_local_mapred_error_cache in TestNegativeCliDriver, because
trunk itself fails these tests on my machine. Trunk also returns rows in a
different order for queries skewjoinopt1.q to skewjoinopt5.q, skewjoinopt10.q,
skewjoinopt15.q to skewjoinopt17.q, and skewjoinopt19.q to skewjoinopt20.q, so
I could not verify these queries on my machine either. All other tests pass.
Thanks,
Yin Huai