Re: Review Request 17905: TAJO-593: outer groupby and groupby in derived table causes only one shuffle output number

Hyunsik Choi Tue, 11 Feb 2014 03:31:23 -0800

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17905/
-----------------------------------------------------------


(Updated Feb. 11, 2014, 8:30 p.m.)


Review request for Tajo.


Bugs: TAJO-593
    https://issues.apache.org/jira/browse/TAJO-593


Repository: tajo


Description
-------

See the following query case:

{code:sql}
select count(*) from (select l_orderkey, l_partkey, count(*) from lineitem 
group by l_orderkey, l_partkey) t1;
{code}

In this case, SubQuery::calculateShuffleOutputNum() is called twice to choose 
the number of shuffle outputs. Each time, it looks for a GroupByNode to 
determine the number of grouping keys. Here is the bug: 
SubQuery::calculateShuffleOutputNum() always picks the topmost GroupByNode. In 
most cases this works well, but an outer groupby combined with a groupby in a 
derived table causes the problem. In this case, we must use the bottommost 
groupby node. In fact, that is always the correct choice.

This patch fixes SubQuery::calculateShuffleOutputNum() to use the bottommost 
groupby node.
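
To illustrate the difference between the two lookups, here is a minimal sketch. 
It is not the code in this patch: PlanNode, GroupByFinder, and their methods 
are hypothetical stand-ins for Tajo's plan classes, and the plan is modeled as 
a simple single-child chain. It only assumes that the outer count(*) becomes a 
groupby node with no grouping keys sitting above the derived table's groupby.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified plan node. NOT Tajo's real LogicalNode/GroupByNode.
class PlanNode {
    final String type;               // e.g. "GROUP_BY", "TABLE_SUBQUERY", "SCAN"
    final List<String> groupingKeys; // empty unless this is a keyed GROUP_BY
    final PlanNode child;            // null for a leaf; single-child chain only

    PlanNode(String type, List<String> groupingKeys, PlanNode child) {
        this.type = type;
        this.groupingKeys = groupingKeys;
        this.child = child;
    }
}

// Hypothetical helper contrasting the two lookup strategies.
class GroupByFinder {
    // Behaviour described as the bug: stop at the first GROUP_BY from the root.
    static PlanNode findTopmostGroupBy(PlanNode root) {
        for (PlanNode n = root; n != null; n = n.child) {
            if ("GROUP_BY".equals(n.type)) {
                return n;
            }
        }
        return null;
    }

    // Behaviour described as the fix: keep walking and remember the last GROUP_BY.
    static PlanNode findBottommostGroupBy(PlanNode root) {
        PlanNode found = null;
        for (PlanNode n = root; n != null; n = n.child) {
            if ("GROUP_BY".equals(n.type)) {
                found = n;
            }
        }
        return found;
    }

    // Assumed plan shape for the query in the description:
    // GROUP_BY(no keys) -> TABLE_SUBQUERY(t1) -> GROUP_BY(2 keys) -> SCAN(lineitem)
    static PlanNode examplePlan() {
        List<String> noKeys = Collections.emptyList();
        PlanNode scan = new PlanNode("SCAN", noKeys, null);
        PlanNode inner = new PlanNode("GROUP_BY",
                Arrays.asList("l_orderkey", "l_partkey"), scan);
        PlanNode derived = new PlanNode("TABLE_SUBQUERY", noKeys, inner);
        return new PlanNode("GROUP_BY", noKeys, derived);
    }

    public static void main(String[] args) {
        PlanNode plan = examplePlan();
        System.out.println("topmost keys:    " + findTopmostGroupBy(plan).groupingKeys);
        System.out.println("bottommost keys: " + findBottommostGroupBy(plan).groupingKeys);
    }
}
```

In this sketch the topmost lookup sees a groupby with no grouping keys, while 
the bottommost lookup sees the two keys l_orderkey and l_partkey. If the number 
of shuffle outputs is derived from the grouping keys, this would be consistent 
with the single shuffle output reported in TAJO-593.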


Diffs
-----

  
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PlannerUtil.java
 b59cddafadd0c254aaef97c482cacab6ca4742c1 
  
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java
 83a593a3cd858c96bdde935306a51f545f8971cf 

Diff: https://reviews.apache.org/r/17905/diff/


Testing (updated)
-------

mvn clean install


Thanks,

Hyunsik Choi
