-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/
-----------------------------------------------------------
Review request for Tajo.
Bugs: TAJO-601
https://issues.apache.org/jira/browse/TAJO-601
Repository: tajo
Description
-------
Currently, distinct aggregation queries are executed as follows:
* the first stage: it just shuffles tuples by hashing grouping keys.
* the second stage: it sorts them and executes sort aggregation.
This approach needs only two stages. However, because the first stage does no
aggregation, every input tuple is shuffled, which leads to large intermediate
data during the shuffle phase.
This kind of query can be rewritten as a nested query with two aggregation levels:
[Original query]
----------
SELECT grp1, grp2, count(*) as total, count(distinct grp3) as distinct_col
from rel1 group by grp1, grp2;
----------
[Rewritten query]
----------
SELECT grp1, grp2, sum(cnt) as total, count(grp3) as distinct_col from (
  SELECT grp1, grp2, grp3, count(*) as cnt from rel1 group by grp1, grp2, grp3
) tmp1 group by grp1, grp2;
----------
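As a quick sanity check of the rewrite, the sketch below runs both forms of the query against an in-memory SQLite database and compares the results. SQLite is only a stand-in for Tajo here, and the sample rows are made up for illustration, so this checks the SQL semantics of the rewrite (including a NULL grp3 value), not Tajo's planner.

```python
import sqlite3

# Hypothetical sample data for rel1; not taken from the Tajo test resources.
ROWS = [
    ("a", "x", 1),
    ("a", "x", 1),
    ("a", "x", 2),
    ("a", "y", 3),
    ("b", "x", None),  # NULL grp3: counted by count(*), ignored by count(distinct grp3)
    ("b", "x", 4),
    ("b", "x", 4),
]

ORIGINAL = """
SELECT grp1, grp2, count(*) AS total, count(DISTINCT grp3) AS distinct_col
FROM rel1 GROUP BY grp1, grp2 ORDER BY grp1, grp2
"""

REWRITTEN = """
SELECT grp1, grp2, sum(cnt) AS total, count(grp3) AS distinct_col FROM (
  SELECT grp1, grp2, grp3, count(*) AS cnt FROM rel1 GROUP BY grp1, grp2, grp3
) tmp1 GROUP BY grp1, grp2 ORDER BY grp1, grp2
"""

def run_both(rows=ROWS):
    """Load rows into a fresh in-memory table and run both query forms."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rel1 (grp1 TEXT, grp2 TEXT, grp3 INTEGER)")
    conn.executemany("INSERT INTO rel1 VALUES (?, ?, ?)", rows)
    original = conn.execute(ORIGINAL).fetchall()
    rewritten = conn.execute(REWRITTEN).fetchall()
    conn.close()
    return original, rewritten

original, rewritten = run_both()
print(original == rewritten)
```

The inner query collapses duplicate (grp1, grp2, grp3) combinations into one row each, so a plain count(grp3) in the outer query counts distinct non-NULL values, and sum(cnt) restores the original count(*).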
I expect this rewrite to significantly reduce the intermediate data volume and
the query response time in most cases.
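To illustrate where the reduction would come from, here is a rough sketch that counts the rows crossing the shuffle under each plan. It assumes that, with the rewrite, each input split can pre-aggregate on (grp1, grp2, grp3) before shuffling; that is an assumption about the execution, not something this description states. The function names and the sample splits are hypothetical.

```python
from collections import Counter

def shuffled_rows_current(splits):
    """Current plan: every input tuple is shuffled by its grouping keys."""
    return sum(len(split) for split in splits)

def shuffled_rows_rewritten(splits):
    """Rewritten plan (assumed): each split pre-aggregates on
    (grp1, grp2, grp3) and ships one (key, cnt) row per distinct key."""
    return sum(len(Counter(split)) for split in splits)

# Two hypothetical input splits of (grp1, grp2, grp3) tuples.
splits = [
    [("a", "x", 1)] * 4 + [("a", "x", 2)] * 3,
    [("a", "x", 1)] * 2 + [("b", "y", 7)] * 5,
]

print(shuffled_rows_current(splits), shuffled_rows_rewritten(splits))  # prints "14 4"
```

When every tuple has a unique (grp1, grp2, grp3) combination, both counts are equal and the rewrite saves nothing on the shuffle, which is consistent with the "in most cases" qualifier.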
Diffs
-----
tajo-common/src/main/java/org/apache/tajo/util/TUtil.java cc694d4
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/EvalTreeUtil.java da05739
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumDoubleDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloat.java 10fd720
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloatDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/ExprsVerifier.java b14c448
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f7c0bfa
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java 624518b
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PreLogicalPlanVerifier.java 6dac031
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java efa1e05
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java f390b52
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java 91f658d
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java a0c0eeb
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java 399903c
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java e5f7fb4
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/ProjectionPushDownRule.java 633d0c1
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java ae6d5eb
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java 3c30e38
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/TestEvalTreeUtil.java d756242
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java 1f80bce
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java 053c028
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java 2d3124d
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct.sql 6fe604e
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct2.sql 6bf8a8a
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation1.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation2.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation3.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation4.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation5.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithHaving1.sql PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct.result f2ad32a
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct2.result 9164120
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation1.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation2.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation3.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation4.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation5.result PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithHaving1.result PRE-CREATION
Diff: https://reviews.apache.org/r/18210/diff/
Testing
-------
mvn clean install
Thanks,
Hyunsik Choi