> On April 19, 2014, 4:18 a.m., Hyunsik Choi wrote:
> > tajo-core/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java, line 817
> > <https://reviews.apache.org/r/20478/diff/2/?file=562166#file562166line817>
> >
> > The original code may handle multiple ScanNodes produced by multiple
> > small broadcast tables, but the change considers only the first ScanNode.
> > Is it valid?

Hi Hyunsik,

Thank you for your review.
I agree with you. Actually, I made that change to fix the TAJO-750 bug, but this code could cause unexpected issues. So I'll revert the SubQuery change now.

Cheers,
Jaehwa


- Jung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20478/#review40842
-----------------------------------------------------------


On April 18, 2014, 3:09 p.m., Jung JaeHwa wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20478/
> -----------------------------------------------------------
>
> (Updated April 18, 2014, 3:09 p.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-748
>     https://issues.apache.org/jira/browse/TAJO-748
>
>
> Repository: tajo
>
>
> Description
> -------
>
> I found that inline views do not return the expected results in queries with multiple joins, as follows:
>
> *Environment*
> * DataSet: TPC-DS
> * tajo.dist-query.join.broadcast.auto : false
>
> *Case: 1*
> {code:xml}
> SELECT COUNT(*)
> FROM (
>   SELECT cs.cs_item_sk as cs_item_sk,
>          cs.cs_ext_discount_amt as cs_ext_discount_amt
>   FROM catalog_sales cs
>   JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>   WHERE d.d_date between '2000-01-27' and '2000-04-27'
> ) cs1
> JOIN item i ON (i.i_item_sk = cs1.cs_item_sk);
> {code}
>
> - actual result: 4163848
> - expected result: 4163848
>
> *Case: 2*
> {code:xml}
> select count(*)
> from item i
> JOIN (SELECT cs2.cs_item_sk as cs_item_sk,
>              1.3 * avg(cs_ext_discount_amt) as avg_cs_ext_discount_amt
>       FROM (SELECT cs.cs_item_sk as cs_item_sk,
>                    cs.cs_ext_discount_amt as cs_ext_discount_amt
>             FROM catalog_sales cs
>             JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>             WHERE d.d_date between '2000-01-27' and '2000-04-27') cs2
>       GROUP BY cs2.cs_item_sk) tmp1
> ON (i.i_item_sk = tmp1.cs_item_sk);
> {code}
>
> - actual result: 102000
> - expected result: 102000
>
> *Case: 3*
> {code:xml}
> SELECT COUNT(*)
> FROM (SELECT cs.cs_item_sk as cs_item_sk,
>              cs.cs_ext_discount_amt as cs_ext_discount_amt
>       FROM catalog_sales cs
>       JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>       WHERE d.d_date between '2000-01-27' and '2000-04-27') cs1
> JOIN item i ON (i.i_item_sk = cs1.cs_item_sk)
> JOIN (SELECT cs2.cs_item_sk as cs_item_sk,
>              1.3 * avg(cs_ext_discount_amt) as avg_cs_ext_discount_amt
>       FROM (SELECT cs.cs_item_sk as cs_item_sk,
>                    cs.cs_ext_discount_amt as cs_ext_discount_amt
>             FROM catalog_sales cs
>             JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>             WHERE d.d_date between '2000-01-27' and '2000-04-27') cs2
>       GROUP BY cs2.cs_item_sk) tmp1
> ON (i.i_item_sk = tmp1.cs_item_sk)
> WHERE i.i_manufact_id = 436;
> {code}
>
> - actual result: 80
> - expected result: 4586
>
> *Case: 4*
> {code:xml}
> SELECT COUNT(*)
> FROM (SELECT cs.cs_item_sk as cs_item_sk,
>              cs.cs_ext_discount_amt as cs_ext_discount_amt
>       FROM catalog_sales cs
>       JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>       WHERE d.d_date between '2000-01-27' and '2000-04-27') cs1
> JOIN item i ON (i.i_item_sk = cs1.cs_item_sk)
> JOIN (SELECT cs2.cs_item_sk as cs_item_sk,
>              1.3 * avg(cs_ext_discount_amt) as avg_cs_ext_discount_amt
>       FROM (SELECT cs.cs_item_sk as cs_item_sk,
>                    cs.cs_ext_discount_amt as cs_ext_discount_amt
>             FROM catalog_sales cs
>             JOIN date_dim d ON (d.d_date_sk = cs.cs_sold_date_sk)
>             WHERE d.d_date between '2000-01-27' and '2000-04-27') cs2
>       GROUP BY cs2.cs_item_sk) tmp1
> ON (i.i_item_sk = tmp1.cs_item_sk)
> {code}
>
> - actual result: 71147
> - expected result: 4163848
>
> For reference, I obtained the expected results using Hive.
>
>
> Diffs
> -----
>
>   tajo-core/src/main/java/org/apache/tajo/engine/planner/global/GlobalPlanner.java bf2bf7d
>   tajo-core/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java 63b50ac
>   tajo-core/src/test/java/org/apache/tajo/engine/query/TestJoinBroadcast.java 89519ef
>   tajo-core/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java ab31c8d
>   tajo-core/src/test/java/org/apache/tajo/master/querymaster/TestQueryUnitStatusUpdate.java 07b4ac5
>   tajo-core/src/test/resources/queries/TestNetTypes/testJoin.sql ec4f8e6
>   tajo-core/src/test/resources/results/TestJoinBroadcast/testBroadcastSubquery2.result 14c2211
>
> Diff: https://reviews.apache.org/r/20478/diff/
>
>
> Testing
> -------
>
> mvn clean install
>
>
> Thanks,
>
> Jung JaeHwa
>
>
