> On April 19, 2014, 4:18 a.m., Hyunsik Choi wrote:
> > tajo-core/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java, 
> > line 817
> > <https://reviews.apache.org/r/20478/diff/2/?file=562166#file562166line817>
> >
> >     The original code may consider multiple ScanNodes that result from 
> > multiple small broadcast tables, but the change only considers the first 
> > ScanNode. Is this valid?

Hi Hyunsik

Thank you for your review. I agree with you.
Actually, I made this change to fix the TAJO-750 bug, but this code could 
cause unexpected issues.
So I'll restore the original SubQuery code now.

Cheers
Jaehwa


- Jung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20478/#review40842
-----------------------------------------------------------


On April 18, 2014, 3:09 p.m., Jung JaeHwa wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20478/
> -----------------------------------------------------------
> 
> (Updated April 18, 2014, 3:09 p.m.)
> 
> 
> Review request for Tajo.
> 
> 
> Bugs: TAJO-748
>     https://issues.apache.org/jira/browse/TAJO-748
> 
> 
> Repository: tajo
> 
> 
> Description
> -------
> 
> I found that inline views don't run as expected with multiple joins, as follows:
> 
> *Environment*
> * DataSet: TPC-DS 
> * tajo.dist-query.join.broadcast.auto : false
> 
> *Case: 1*
> {code:xml}
> SELECT COUNT(*)
> FROM (
>   SELECT cs.cs_item_sk as cs_item_sk,
>   cs.cs_ext_discount_amt as cs_ext_discount_amt
>   FROM catalog_sales cs
>   JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>   WHERE d.d_date between '2000-01-27' and '2000-04-27'
> ) cs1
> JOIN item i ON (i.i_item_sk = cs1.cs_item_sk);
> {code}
> 
> - actual result: 4163848
> - expected result: 4163848
> 
> *Case: 2*
> {code:xml}
> select count(*)
> from item i
> JOIN (SELECT cs2.cs_item_sk as cs_item_sk,
>                           1.3 * avg(cs_ext_discount_amt) as avg_cs_ext_discount_amt
>            FROM (SELECT cs.cs_item_sk as cs_item_sk,
>                                         cs.cs_ext_discount_amt as cs_ext_discount_amt
>                         FROM catalog_sales cs
>                         JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>                         WHERE d.d_date between '2000-01-27' and '2000-04-27') cs2
>                         GROUP BY cs2.cs_item_sk) tmp1
> ON (i.i_item_sk = tmp1.cs_item_sk);
> {code}
> 
> - actual result: 102000
> - expected result: 102000
> 
> *Case: 3*
> {code:xml}
> SELECT COUNT(*)
> FROM (SELECT cs.cs_item_sk as cs_item_sk,
>                              cs.cs_ext_discount_amt as cs_ext_discount_amt
>              FROM catalog_sales cs
>              JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>              WHERE d.d_date between '2000-01-27' and '2000-04-27') cs1
> JOIN item i ON (i.i_item_sk = cs1.cs_item_sk)
> JOIN (SELECT cs2.cs_item_sk as cs_item_sk,
>                           1.3 * avg(cs_ext_discount_amt) as avg_cs_ext_discount_amt
>            FROM (SELECT cs.cs_item_sk as cs_item_sk,
>                                         cs.cs_ext_discount_amt as cs_ext_discount_amt
>                         FROM catalog_sales cs
>                         JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>                         WHERE d.d_date between '2000-01-27' and '2000-04-27') cs2
>                         GROUP BY cs2.cs_item_sk) tmp1
> ON (i.i_item_sk = tmp1.cs_item_sk)
> WHERE i.i_manufact_id = 436;
> {code}
> 
> - actual result: 80
> - expected result: 4586
> 
> *Case: 4*
> {code:xml}
> SELECT COUNT(*)
> FROM (SELECT cs.cs_item_sk as cs_item_sk,
>                              cs.cs_ext_discount_amt as cs_ext_discount_amt
>              FROM catalog_sales cs
>              JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>              WHERE d.d_date between '2000-01-27' and '2000-04-27') cs1
> JOIN item i ON (i.i_item_sk = cs1.cs_item_sk)
> JOIN (SELECT cs2.cs_item_sk as cs_item_sk,
>                           1.3 * avg(cs_ext_discount_amt) as avg_cs_ext_discount_amt
>            FROM (SELECT cs.cs_item_sk as cs_item_sk,
>                                         cs.cs_ext_discount_amt as cs_ext_discount_amt
>                         FROM catalog_sales cs
>                         JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>                         WHERE d.d_date between '2000-01-27' and '2000-04-27') cs2
>                         GROUP BY cs2.cs_item_sk) tmp1
> ON (i.i_item_sk = tmp1.cs_item_sk)
> {code}
> 
> - actual result: 71147
> - expected result: 4163848
> 
> For reference, I obtained the expected results using Hive.
> 
> 
> Diffs
> -----
> 
>   tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java bf2bf7d 
>   tajo-core/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java 63b50ac 
>   tajo-core/src/test/java/org/apache/tajo/engine/query/TestJoinBroadcast.java 89519ef 
>   tajo-core/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java ab31c8d 
>   tajo-core/src/test/java/org/apache/tajo/master/querymaster/TestQueryUnitStatusUpdate.java 07b4ac5 
>   tajo-core/src/test/resources/queries/TestNetTypes/testJoin.sql ec4f8e6 
>   tajo-core/src/test/resources/results/TestJoinBroadcast/testBroadcastSubquery2.result 14c2211 
> 
> Diff: https://reviews.apache.org/r/20478/diff/
> 
> 
> Testing
> -------
> 
> mvn clean install
> 
> 
> Thanks,
> 
> Jung JaeHwa
> 
>
