Maybe it’s the AVG function in your query. That cannot be implemented directly, 
only if decomposed into SUM and COUNT.

Remember, the Aggregate operator ONLY groups data into groups and computes 
aggregate functions; it doesn’t perform any kind of operations on the input or 
output values.

> On Oct 24, 2017, at 8:16 AM, George Theodorakis <[email protected]> 
> wrote:
> 
> Hello,
> 
> I have tried to create my own operators with their respective cost model.
> Simple cases work, but when I have certain aggregation examples something
> goes wrong.
> 
> I have this complex query:
> 
> *with R as (*
> *select L.rowtime , L.productid , L.orderid , L.customerid*
> *from*
> *(select rowtime, customerid, avg(units) as globalAvgLoad*
> *from s.orders*
> *group by rowtime, customerid, hop(rowtime, interval '1' second, interval
> '1' hour) ) as G,*
> *(select rowtime, productid, orderid, customerid, avg(units) as
> localAvgLoad*
> *from s.orders*
> *group by rowtime, productid, orderid, customerid, hop(rowtime, interval
> '1' second, interval '1' hour) )as L*
> *where L.customerid = G.customerid and L.localAvgLoad > G.globalAvgLoad*
> *) *
> 
> *select rowtime, customerid, count(*) *
> *from R*
> *group by rowtime, customerid;*
> 
> When I *use* the rule *AggregateReduceFunctionsRule* along with other
> rules, Volcano Planner returns to me the expected plan for execution, which
> is
> 
> SaberAggregateRel(group=[{0, 3}], EXPR$2=[COUNT()]): rowcount = 1875.0,
> cumulative cost = {22859.375 rows, 12337.5 cpu, 0.0 io, 84.375 rate, 3.0
> memory, 1.0 window, 146.22222222222223 R}, id = 266
> SaberJoinRel(condition=[AND(=($3, $6), >($4, $7))], joinType=[inner]):
> rowcount = 18750.0, cumulative cost = {20750.0 rows, 12000.0 cpu, 0.0 io,
> 75.0 rate, 2.0 memory, 1.0 window, 160.0 R}, id = 264
>   SaberAggrCalcRel: rowcount = 500.0, cumulative cost = {1000.0 rows,
> 2000.0 cpu, 0.0 io, 500.0 rate, 0.0 memory, 1.0 window, 4.0 R}, id = 270
>     SaberTableScanRel(table=[[s, orders]]): rowcount = 500.0, cumulative
> cost = {500.0 rows, 0.0 cpu, 0.0 io, 500.0 rate, 0.0 memory, 1.0 window,
> 0.0 R}, id = 138
>   SaberAggrCalcRel: rowcount = 500.0, cumulative cost = {1000.0 rows,
> 2000.0 cpu, 0.0 io, 500.0 rate, 0.0 memory, 1.0 window, 4.0 R}, id = 268
>     SaberTableScanRel(table=[[s, orders]]): rowcount = 500.0, cumulative
> cost = {500.0 rows, 0.0 cpu, 0.0 io, 500.0 rate, 0.0 memory, 1.0 window,
> 0.0 R}, id = 138
> 
> 
> When I *omit* *AggregateReduceFunctionsRule*, I get:
> 
> Exception in thread "main"
> org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
> [rel#98:Subset#9.LOGICAL.[]] could not be implemented; planner state:
> 
> Root: rel#98:Subset#9.LOGICAL.[]
> Original rel:
> 
> Sets:
> Set#0, type: RecordType(TIMESTAMP(0) rowtime, INTEGER orderid, INTEGER
> productid, INTEGER units, INTEGER customerid)
> rel#79:Subset#0.NONE.[0], best=rel#14, importance=0.4782969000000001
> rel#14:LogicalTableScan.NONE.[[0]](table=[s, orders]), rowcount=500.0,
> cumulative cost={huge}
> rel#115:Subset#0.LOGICAL.[0], best=rel#118, importance=0.531441
> rel#118:SaberTableScanRel.LOGICAL.[[0]](table=[s, orders]), rowcount=500.0,
> cumulative cost={500.0 rows, 0.0 cpu, 0.0 io, 500.0 rate, 0.0 memory, 1.0
> window, 0.0 R}
> Set#1, type: RecordType(TIMESTAMP(0) rowtime, INTEGER customerid,
> TIMESTAMP(0) $f2, INTEGER units)
> rel#81:Subset#1.NONE.[], best=rel#80, importance=0.531441
> rel#80:LogicalProject.NONE.[[0],
> [2]](input=rel#79:Subset#0.NONE.[0],rowtime=$0,customerid=$4,$f2=HOP($0,
> 1000, 3600000),units=$3), rowcount=500.0, cumulative cost={huge}
> rel#111:Subset#1.LOGICAL.[], best=rel#116, importance=0.5904900000000001
> rel#116:SaberProjectRel.LOGICAL.[[0],
> [2]](input=rel#115:Subset#0.LOGICAL.[0],rowtime=$0,customerid=$4,$f2=HOP($0,
> 1000, 3600000),units=$3), rowcount=500.0, cumulative cost={1000.0 rows,
> 2000.0 cpu, 0.0 io, 500.0 rate, 0.0 memory, 1.0 window, 4.0 R}
> Set#2, type: RecordType(TIMESTAMP(0) rowtime, INTEGER customerid,
> TIMESTAMP(0) $f2, INTEGER globalAvgLoad)
> rel#83:Subset#2.NONE.[], best=rel#82, importance=0.5904900000000001
> rel#82:LogicalAggregate.NONE.[](input=rel#81:Subset#1.NONE.[],group={0, 1,
> 2},globalAvgLoad=AVG($3)), rowcount=50.0, cumulative cost={huge}
> rel#107:Subset#2.LOGICAL.[], best=rel#112, importance=0.6561
> rel#112:SaberAggregateRel.LOGICAL.[](input=rel#111:Subset#1.LOGICAL.[],group={0,
> 1, 2},globalAvgLoad=AVG($3)), rowcount=50.0, cumulative cost={1056.25 rows,
> 4250.0 cpu, 0.0 io, 562.5 rate, 1.0 memory, 1.0 window, 7.555555555555555 R}
> Set#3, type: RecordType(TIMESTAMP(0) rowtime, INTEGER customerid, INTEGER
> globalAvgLoad)
> rel#85:Subset#3.NONE.[], best=rel#84, importance=0.6561
> rel#84:LogicalProject.NONE.[](input=rel#83:Subset#2.NONE.[],rowtime=$0,customerid=$1,globalAvgLoad=$3),
> rowcount=50.0, cumulative cost={huge}
> rel#104:Subset#3.LOGICAL.[], best=rel#108, importance=0.7290000000000001
> rel#108:SaberProjectRel.LOGICAL.[](input=rel#107:Subset#2.LOGICAL.[],rowtime=$0,customerid=$1,globalAvgLoad=$3),
> rowcount=50.0, cumulative cost={1106.25 rows, 6500.0 cpu, 0.0 io, 562.5
> rate, 1.0 memory, 1.0 window, 11.555555555555555 R}
> Set#4, type: RecordType(TIMESTAMP(0) rowtime, INTEGER productid, INTEGER
> orderid, INTEGER customerid, TIMESTAMP(0) $f4, INTEGER units)
> rel#87:Subset#4.NONE.[], best=rel#86, importance=0.531441
> rel#86:LogicalProject.NONE.[[0],
> [4]](input=rel#79:Subset#0.NONE.[0],rowtime=$0,productid=$2,orderid=$1,customerid=$4,$f4=HOP($0,
> 1000, 3600000),units=$3), rowcount=500.0, cumulative cost={huge}
> rel#113:Subset#4.LOGICAL.[], best=rel#117, importance=0.5904900000000001
> rel#117:SaberProjectRel.LOGICAL.[[0],
> [4]](input=rel#115:Subset#0.LOGICAL.[0],rowtime=$0,productid=$2,orderid=$1,customerid=$4,$f4=HOP($0,
> 1000, 3600000),units=$3), rowcount=500.0, cumulative cost={1000.0 rows,
> 2000.0 cpu, 0.0 io, 500.0 rate, 0.0 memory, 1.0 window, 4.0 R}
> Set#5, type: RecordType(TIMESTAMP(0) rowtime, INTEGER productid, INTEGER
> orderid, INTEGER customerid, TIMESTAMP(0) $f4, INTEGER localAvgLoad)
> rel#89:Subset#5.NONE.[], best=rel#88, importance=0.5904900000000001
> rel#88:LogicalAggregate.NONE.[](input=rel#87:Subset#4.NONE.[],group={0, 1,
> 2, 3, 4},localAvgLoad=AVG($5)), rowcount=50.0, cumulative cost={huge}
> rel#109:Subset#5.LOGICAL.[], best=rel#114, importance=0.6561
> rel#114:SaberAggregateRel.LOGICAL.[](input=rel#113:Subset#4.LOGICAL.[],group={0,
> 1, 2, 3, 4},localAvgLoad=AVG($5)), rowcount=50.0, cumulative cost={1056.25
> rows, 4250.0 cpu, 0.0 io, 562.5 rate, 1.0 memory, 1.0 window,
> 7.555555555555555 R}
> Set#6, type: RecordType(TIMESTAMP(0) rowtime, INTEGER productid, INTEGER
> orderid, INTEGER customerid, INTEGER localAvgLoad)
> rel#91:Subset#6.NONE.[], best=rel#90, importance=0.6561
> rel#90:LogicalProject.NONE.[](input=rel#89:Subset#5.NONE.[],rowtime=$0,productid=$1,orderid=$2,customerid=$3,localAvgLoad=$5),
> rowcount=50.0, cumulative cost={huge}
> rel#105:Subset#6.LOGICAL.[], best=rel#110, importance=0.7290000000000001
> rel#110:SaberProjectRel.LOGICAL.[](input=rel#109:Subset#5.LOGICAL.[],rowtime=$0,productid=$1,orderid=$2,customerid=$3,localAvgLoad=$5),
> rowcount=50.0, cumulative cost={1106.25 rows, 6500.0 cpu, 0.0 io, 562.5
> rate, 1.0 memory, 1.0 window, 11.555555555555555 R}
> Set#7, type: RecordType(TIMESTAMP(0) rowtime, INTEGER customerid, INTEGER
> globalAvgLoad, TIMESTAMP(0) rowtime0, INTEGER productid, INTEGER orderid,
> INTEGER customerid0, INTEGER localAvgLoad)
> rel#93:Subset#7.NONE.[], best=rel#92, importance=0.7290000000000001
> rel#92:LogicalJoin.NONE.[](left=rel#85:Subset#3.NONE.[],right=rel#91:Subset#6.NONE.[],condition=AND(=($6,
> $1), >($7, $2)),joinType=inner), rowcount=187.5, cumulative cost={huge}
> rel#102:Subset#7.LOGICAL.[], best=rel#106, importance=0.81
> rel#106:SaberJoinRel.LOGICAL.[](left=rel#104:Subset#3.LOGICAL.[],right=rel#105:Subset#6.LOGICAL.[],condition=AND(=($6,
> $1), >($7, $2)),joinType=inner), rowcount=187.5, cumulative cost={2400.0
> rows, 22000.0 cpu, 0.0 io, 84.375 rate, 4.0 memory, 1.0 window,
> 260.74074074074076 R}
> Set#8, type: RecordType(TIMESTAMP(0) rowtime, INTEGER customerid)
> rel#95:Subset#8.NONE.[], best=rel#94, importance=0.81
> rel#94:LogicalProject.NONE.[](input=rel#93:Subset#7.NONE.[],rowtime=$3,customerid=$6),
> rowcount=187.5, cumulative cost={huge}
> rel#100:Subset#8.LOGICAL.[], best=rel#103, importance=0.9
> rel#103:SaberProjectRel.LOGICAL.[](input=rel#102:Subset#7.LOGICAL.[],rowtime=$3,customerid=$6),
> rowcount=187.5, cumulative cost={2587.5 rows, Infinity cpu, 0.0 io,
> 1.7976931348623157E308 rate, 1.7976931348623157E308 memory,
> 1.3482698511467366E307 window, 1.7976931348623157E308 R}
> Set#9, type: RecordType(TIMESTAMP(0) rowtime, INTEGER customerid, BIGINT
> EXPR$2)
> rel#97:Subset#9.NONE.[], best=rel#96, importance=0.9
> rel#96:LogicalAggregate.NONE.[](input=rel#95:Subset#8.NONE.[],group={0,
> 1},EXPR$2=COUNT()), rowcount=18.75, cumulative cost={huge}
> rel#98:Subset#9.LOGICAL.[], best=null, importance=1.0
> rel#99:AbstractConverter.LOGICAL.[](input=rel#97:Subset#9.NONE.[],convention=LOGICAL,sort=[]),
> rowcount=18.75, cumulative cost={Infinity rows, Infinity cpu, Infinity io,
> 1.7976931348623157E308 rate, Infinity memory, 1.7976931348623157E308
> window, 1.7976931348623157E308 R}
> rel#101:SaberAggregateRel.LOGICAL.[](input=rel#100:Subset#8.LOGICAL.[],group={0,
> 1},EXPR$2=COUNT()), rowcount=18.75, cumulative cost={2608.59375 rows,
> Infinity cpu, 0.0 io, 1.7976931348623157E308 rate, Infinity memory,
> 1.3482698511467366E307 window, 1.7976931348623157E308 R}
> 
> 
> at
> org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:443)
> at
> org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:293)
> at
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:666)
> at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:368)
> at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:313)
> at calcite.planner.SaberPlanner.getLogicalPlan(SaberPlanner.java:215)
> at calcite.Tester.main(Tester.java:234)
> 
> Sorry for the long email...
> 
> Thank you!

Reply via email to