-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18210/
-----------------------------------------------------------
(Updated Feb. 20, 2014, 2:28 p.m.)
Review request for Tajo.
Changes
-------
Rebased against the latest revision.
Bugs: TAJO-601
https://issues.apache.org/jira/browse/TAJO-601
Repository: tajo
Description
-------
Currently, distinct aggregation queries are executed in two stages:
* the first stage only shuffles tuples by hashing the grouping keys.
* the second stage sorts them and executes sort aggregation.
This approach runs queries that include distinct aggregation functions in only
two stages, but it produces large intermediate data during the shuffle phase,
because every input tuple is shuffled without any pre-aggregation.
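The two stages can be sketched roughly as below. This is only an illustrative
sketch, not the actual Tajo operators: the Row and Result types, the
partitionOf and sortAggregate helpers, and the assumption that the sort order
includes the distinct column (so duplicates become adjacent) are all
hypothetical. It also assumes non-null column values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;

class SortAggregationSketch {
    // Hypothetical stand-in for a tuple of rel1 (non-null values assumed).
    record Row(String grp1, String grp2, String grp3) {}

    // One output row: count(*) and count(distinct grp3) per (grp1, grp2).
    record Result(String grp1, String grp2, long total, long distinctCol) {}

    // Stage 1 (assumed shape): route every tuple by hashing the grouping keys.
    // No aggregation happens here, so all input tuples are shuffled.
    static int partitionOf(Row r, int numPartitions) {
        return Math.floorMod(Objects.hash(r.grp1(), r.grp2()), numPartitions);
    }

    // Stage 2: sort one partition, then aggregate in a single pass.
    static List<Result> sortAggregate(List<Row> partition) {
        List<Row> sorted = new ArrayList<>(partition);
        // Sorting on grp3 as well makes equal grp3 values adjacent within a group.
        sorted.sort(Comparator.comparing(Row::grp1)
                .thenComparing(Row::grp2)
                .thenComparing(Row::grp3));

        List<Result> out = new ArrayList<>();
        Row prev = null;
        long total = 0;
        long distinct = 0;
        for (Row r : sorted) {
            boolean newGroup = prev == null
                    || !prev.grp1().equals(r.grp1())
                    || !prev.grp2().equals(r.grp2());
            if (newGroup) {
                // Emit the finished group before starting the next one.
                if (prev != null) {
                    out.add(new Result(prev.grp1(), prev.grp2(), total, distinct));
                }
                total = 0;
                distinct = 0;
            }
            total++;
            // A new distinct value starts whenever grp3 differs from its predecessor.
            if (newGroup || !prev.grp3().equals(r.grp3())) {
                distinct++;
            }
            prev = r;
        }
        if (prev != null) {
            out.add(new Result(prev.grp1(), prev.grp2(), total, distinct));
        }
        return out;
    }
}
```

In this sketch the number of tuples reaching sortAggregate equals the number of
input tuples, which is the intermediate-data cost described above.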
This kind of query can be rewritten as two nested aggregation queries:
[Original query]
----------
SELECT grp1, grp2, count(*) as total, count(distinct grp3) as distinct_col from
rel1 group by grp1, grp2;
----------
[Rewritten query]
----------
SELECT grp1, grp2, sum(cnt) as total, count(grp3) as distinct_col from (
SELECT grp1, grp2, grp3, count(*) as cnt from rel1 group by grp1, grp2, grp3)
tmp1 group by grp1, grp2;
----------
I expect this rewrite to significantly reduce the intermediate data volume and
query response time in most cases, since the inner query collapses duplicate
(grp1, grp2, grp3) combinations before the outer aggregation.
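As a rough sanity check of the rewrite, the following sketch evaluates both
forms in memory and compares them. It is not Tajo code: the Row and Agg types
and the originalQuery, innerQuery, and rewrittenQuery helpers are hypothetical
names used only for illustration, and non-null column values are assumed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DistinctRewriteSketch {
    // Hypothetical stand-in for a tuple of rel1 (non-null values assumed).
    record Row(String grp1, String grp2, String grp3) {}

    // Aggregates for one (grp1, grp2) group: count(*) and count(distinct grp3).
    record Agg(long total, long distinctCol) {}

    // [Original query] evaluated directly over all input rows.
    static Map<List<String>, Agg> originalQuery(List<Row> rel1) {
        Map<List<String>, Long> totals = new HashMap<>();
        Map<List<String>, Set<String>> distinct = new HashMap<>();
        for (Row r : rel1) {
            List<String> key = List.of(r.grp1(), r.grp2());
            totals.merge(key, 1L, Long::sum);
            distinct.computeIfAbsent(key, k -> new HashSet<>()).add(r.grp3());
        }
        Map<List<String>, Agg> result = new HashMap<>();
        for (List<String> key : totals.keySet()) {
            result.put(key, new Agg(totals.get(key), distinct.get(key).size()));
        }
        return result;
    }

    // Inner query: group by (grp1, grp2, grp3) with count(*) as cnt.
    static Map<List<String>, Long> innerQuery(List<Row> rel1) {
        Map<List<String>, Long> cnt = new HashMap<>();
        for (Row r : rel1) {
            cnt.merge(List.of(r.grp1(), r.grp2(), r.grp3()), 1L, Long::sum);
        }
        return cnt;
    }

    // Outer query: group by (grp1, grp2) with sum(cnt) and count(grp3).
    static Map<List<String>, Agg> rewrittenQuery(List<Row> rel1) {
        Map<List<String>, Agg> result = new HashMap<>();
        for (Map.Entry<List<String>, Long> e : innerQuery(rel1).entrySet()) {
            List<String> k = e.getKey();
            List<String> key = List.of(k.get(0), k.get(1));
            // Each inner row carries exactly one distinct grp3 value for its group.
            result.merge(key, new Agg(e.getValue(), 1L),
                    (a, b) -> new Agg(a.total() + b.total(),
                                      a.distinctCol() + b.distinctCol()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rel1 = List.of(
                new Row("a", "x", "1"), new Row("a", "x", "1"), new Row("a", "x", "2"),
                new Row("b", "y", "3"), new Row("b", "y", "3"));
        System.out.println("same result: " + originalQuery(rel1).equals(rewrittenQuery(rel1)));
        System.out.println("input rows: " + rel1.size()
                + ", rows after inner aggregation: " + innerQuery(rel1).size());
    }
}
```

On this toy input the inner aggregation turns 5 input rows into 3, which
suggests where the reduction in intermediate data would come from. The actual
saving depends on how many duplicate (grp1, grp2, grp3) combinations the data
contains.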
Diffs (updated)
-----
tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/SortSpec.java
3ef73d5c5385b40fcfb3b0ecbbc35b783224c760
tajo-common/src/main/java/org/apache/tajo/util/TUtil.java
cc694d43f42f68945cf53a7b8b9bbdca97a4f205
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/EvalTreeUtil.java
da05739b8feff0e04b1762f8000b1f3818c773a2
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/eval/FunctionEval.java
0555bdec8aff6fa79c02b640c81ad55d4666b90a
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumDoubleDistinct.java
PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloat.java
10fd7205f29c82adf87816737598ce762ee0ebc9
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumFloatDistinct.java
PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumIntDistinct.java
PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/function/builtin/SumLongDistinct.java
PRE-CREATION
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/ExprsVerifier.java
b14c448ee5b3ce0dfca67c6a9b942f1803cc91f9
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java
f7c0bfab78cb3416e7a2ed263cc362917023e3ca
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
67f56303e04787bf950c4a9a703faec58fb74cd4
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java
7d5e2fc7e085cc36527383a208277384035263e7
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PreLogicalPlanVerifier.java
6dac031218c650b9c1c86811b4552fe6d82da0c1
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/enforce/Enforcer.java
dd46996eca7eb9c38f87d97813f5dcc7220429ed
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/DataChannel.java
9f5c6bf9dd7b549308724ce1e8044aff1630cef1
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java
f390b52f378a2d7e84e40876df4a4b416af912ef
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/global/MasterPlan.java
91f658dab395620f5a891f51407b3676b07a8fa5
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
791781e526c54f216152e935682bc2c3147a9e0c
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java
53a1c24197c40c77153f79f90c05882c90aae957
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/FilterPushDownRule.java
399903c66bb8a62074facd0bbbe9b3b8e891c067
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/PartitionedTableRewriter.java
e5f7fb40414e0b2e2e40bccebe24069ee4d9301b
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/rewrite/ProjectionPushDownRule.java
633d0c1857533b02c4ecc6913c740fd2e3722845
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMaster.java
ae6d5ebb97f8c4287ffd11262b2932d2f8b1250c
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryMasterManagerService.java
3c30e3854abaa891f72b368144942164e5dffab7
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
56c26797aad1dbe95945567961e9425fef72fa96
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/eval/TestEvalTreeUtil.java
d7562426647a6a9d6aae5207a67ddcdd03d0ee3a
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/query/TestGroupByQuery.java
1f80bce23c74e3abdcbf9bc0553ec30244d6bd93
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java
053c02833e80dd931807fa6314965e687d7b26c0
tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestGlobalPlanner.java
2d3124d7e9d7853b0f872eee1016cbae504c9c6b
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct.sql
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testCountDistinct2.sql
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation3.sql
PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation4.sql
PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregation5.sql
PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithHaving1.sql
PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/queries/TestGroupByQuery/testDistinctAggregationWithUnion1.sql
PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct.result
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testCountDistinct2.result
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation3.result
PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation4.result
PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregation5.result
PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithHaving1.result
PRE-CREATION
tajo-core/tajo-core-backend/src/test/resources/results/TestGroupByQuery/testDistinctAggregationWithUnion1.result
PRE-CREATION
tajo-storage/src/main/java/org/apache/tajo/storage/RawFile.java
c3a7525154e0f36d51dcca211949f21f57a9f1c8
Diff: https://reviews.apache.org/r/18210/diff/
Testing
-------
mvn clean install
Thanks,
Hyunsik Choi