You need to push the Filter into the Union. Otherwise the Aggregate is on top of a Filter, not a Union. Use FilterSetOpTransposeRule.
Julian > On Jun 2, 2016, at 9:54 AM, Ravikumar CS <[email protected]> wrote: > > Hi, > > I am trying to come up with an optimized relational expression which does > predicate push downs > > & partial aggregates in preparation for a distributed execution of the > query. > > SQL of interest: > > SELECT col1, SUM(col2) FROM ( SELECT col2, col2 FROM Orders1 > > UNION ALL > > SELECT col1, col2 > FROM Orders2 > > ) WHERE col1=1 GROUP > BY col1; > > > Relational Expression - Initial: > > LogicalAggregate(group=[{0}], EXPR$1=[AVG($1)]) > > LogicalFilter(condition=[=($0, 1)]) > > LogicalUnion(all=[true]) > > LogicalProject(id=[$0], units=[$2]) > > LogicalTableScan(table=[[SALES, Orders1]]) > > LogicalProject(id=[$0], units=[$2]) > > LogicalTableScan(table=[[SALES, Orders2]]) > > > Query Optimization rules applied: > > HepProgram program = new HepProgramBuilder() > > .addRuleInstance(AggregateUnionAggregateRule.INSTANCE) > > .addRuleInstance(AggregateUnionTransposeRule.INSTANCE) > > .addRuleInstance(AggregateReduceFunctionsRule.INSTANCE) > > .build(); > > HepPlanner planner = new HepPlanner(program); > > planner.setRoot(oldLogicalPlan); > > > Relational Expression after Query Optimization: RelNode newLogicalPlan = > planner.findBestExp(); > > LogicalProject(id=[$0], EXPR$1=[CAST(/($1, $2)):INTEGER NOT NULL]) > > LogicalAggregate(group=[{0}], agg#0=[$SUM0($1)], agg#1=[COUNT()]) > > LogicalFilter(condition=[=($0, 1)]) > > LogicalUnion(all=[true]) > > LogicalProject(id=[$0], units=[$2]) > > LogicalTableScan(table=[[SALES, Orders1]]) > > LogicalProject(id=[$0], units=[$2]) > > LogicalTableScan(table=[[SALES, Orders2]]) > > > Questions: > > 1. Are there any rules to push the predicates(col1=1) down to the sub > queries ? > > 2. How can I rewrite the query such that the partial aggregates are > computed within each union(as below)? > > Tried AggregateUnionTransposeRule and AggregateJoinTransposeRule. May be > I am missed something. > > SELECT col1, SUM(partialCol2) AS c > > FROM ( SELECT col1, SUM(col2) AS partialCol2 FROM Orders1 where > col1=1 GROUP BY deptno > > UNION ALL > > SELECT col1, SUM(col2) AS partialCol2 FROM Orders2 > where col1=1 GROUP BY deptno > > ) GROUP BY col1;
