> On Jan. 8, 2016, 9:43 p.m., pengcheng xiong wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java, line 219
> > <https://reviews.apache.org/r/42081/diff/2/?file=1188760#file1188760line219>
> >
> >     Could you please be more specific on the reason why "if TS[0] branch for src1 is not optimized, then there is no need to continue processing TS[6] branch"? Thanks.

The operator tree for the query (select max(value) from src1 union all select max(value) from src2) that is passed to StatsOptimizer after the UnionRemove optimization is:

TS[0]->SEL[1]->GBY[2]->RS[3]->GBY[4]->FS[17] --- for subquery src1
TS[6]->SEL[7]->GBY[8]->RS[9]->GBY[10]->FS[18] --- for subquery src2

It has two top operators, TS[0] for table src1 and TS[6] for table src2. If the TS[0] branch (for subquery src1) is not optimized but the TS[6] branch (for subquery src2) is, the existing code sets the TS[6] branch result as the FetchTask in ParseContext, and the entire query is not compiled any further into MR tasks (in SemanticAnalyzer.analyzeInternal, step 9). The union query then returns only the row from TS[6] (the subquery src2), which is obviously wrong. So for a union query, if any one of its subqueries cannot be optimized by StatsOptimizer, the whole query should not be optimized and should fall back to the regular plan.

I hope this makes it a little clearer. Thanks

- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42081/#review113549
-----------------------------------------------------------


On Jan. 8, 2016, 8:16 p.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42081/
> -----------------------------------------------------------
>
> (Updated Jan. 8, 2016, 8:16 p.m.)
>
>
> Review request for hive, Ashutosh Chauhan, pengcheng xiong, and Xuefu Zhang.
>
>
> Bugs: HIVE-12788
>     https://issues.apache.org/jira/browse/HIVE-12788
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Adds StatsOptimizer support for union with aggregate functions. Otherwise, it
> always returns one row.
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/StatsOptimizer.java 03c1c3f
>   ql/src/test/queries/clientpositive/union_remove_26.q PRE-CREATION
>   ql/src/test/results/clientpositive/union_remove_26.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/42081/diff/
>
>
> Testing
> -------
>
> 1. Manual tests for some particular cases
> 2. Submitted to precommit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
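
For readers following the thread, the all-or-nothing rule from the reply (if any union branch cannot be answered from statistics, none of them should be) can be sketched roughly as below. This is only an illustration and not the code in StatsOptimizer.java: the class UnionStatsSketch, the method tryOptimizeAll, and the use of plain strings for top operators and result rows are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from Hive.
class UnionStatsSketch {

    /**
     * Tries to answer every branch from statistics. Returns the rows of all
     * branches only if each branch could be answered; returns empty as soon
     * as one branch cannot, so the caller falls back to the regular plan.
     */
    static Optional<List<String>> tryOptimizeAll(
            List<String> topOperators,
            Function<String, Optional<String>> answerFromStats) {
        List<String> rows = new ArrayList<>();
        for (String ts : topOperators) {
            Optional<String> row = answerFromStats.apply(ts);
            if (!row.isPresent()) {
                // One branch failed: discard partial results instead of
                // returning a result built from the other branches only.
                return Optional.empty();
            }
            rows.add(row.get());
        }
        return Optional.of(rows);
    }

    public static void main(String[] args) {
        // Assumed setup: only the src2 branch has usable statistics.
        Map<String, String> stats = new HashMap<>();
        stats.put("TS[6]", "max(value) of src2");

        List<String> tops = new ArrayList<>();
        tops.add("TS[0]");
        tops.add("TS[6]");

        Optional<List<String>> result =
                tryOptimizeAll(tops, ts -> Optional.ofNullable(stats.get(ts)));
        System.out.println(result.isPresent()
                ? "answer from stats: " + result.get()
                : "fall back to regular plan");
    }
}
```

Returning an empty Optional rather than a partial list is the point of the sketch: the caller never sees a result that covers only some of the union branches.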