weijie.tong created DRILL-5913:
----------------------------------
Summary: DrillReduceAggregatesRule
Key: DRILL-5913
URL: https://issues.apache.org/jira/browse/DRILL-5913
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.11.0, 1.9.0
Reporter: weijie.tong
Sample query:
{code:java}
select stddev_samp(cast(employee_id as int)) as col1,
       sum(cast(employee_id as int)) as col2
from cp.`employee.json`
{code}
Error output:
{code:java}
org.apache.drill.exec.rpc.RpcException:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
AssertionError: Type mismatch:
rel rowtype:
RecordType(INTEGER $f0, INTEGER $f1, BIGINT NOT NULL $f2, INTEGER $f3) NOT NULL
equivRel rowtype:
RecordType(INTEGER $f0, INTEGER $f1, BIGINT NOT NULL $f2, BIGINT $f3) NOT NULL
[Error Id: f5114e62-a57b-46b1-afe8-ae652f390896 on localhost:31010]
(org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception
during fragment initialization: Internal error: Error while applying rule
DrillReduceAggregatesRule, args
[rel#29:LogicalAggregate.NONE.ANY([]).[](input=rel#28:Subset#3.NONE.ANY([]).[],group={},agg#0=SUM($1),agg#1=SUM($0),agg#2=COUNT($0),agg#3=$SUM0($0))]
org.apache.drill.exec.work.foreman.Foreman.run():294
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745
Caused By (java.lang.AssertionError) Internal error: Error while applying
rule DrillReduceAggregatesRule, args
[rel#29:LogicalAggregate.NONE.ANY([]).[](input=rel#28:Subset#3.NONE.ANY([]).[],group={},agg#0=SUM($1),agg#1=SUM($0),agg#2=COUNT($0),agg#3=$SUM0($0))]
org.apache.calcite.util.Util.newInternal():792
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():251
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():811
{code}
Cause: on its first match, DrillReduceAggregatesRule reduces
stddev_samp(cast(employee_id as int)) to sum($0), sum($1), count($0), and
reduces sum(cast(employee_id as int)) to sum0($0). On its second match, the
rule also reduces stddev_samp's sum($0) to sum0($0). This second sum0($0) has
a different data type from the first one: one is INTEGER, the other is BIGINT.
Calcite's addAggCall method ignores the data type when it compares aggregate
calls, so it treats the two as the same call. As a result, the BIGINT sum0($0)
is replaced by the INTEGER sum0($0), which produces the row type mismatch shown
above.
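
The deduplication problem can be illustrated with a small standalone sketch.
This is not Calcite's or Drill's code: the AggCall class and the
addIgnoringType / addCheckingType helpers are hypothetical stand-ins, and the
order in which the INTEGER and BIGINT calls are registered is an assumption
made only for the demonstration. It shows how a lookup that compares only the
function and its argument reuses a slot whose type differs from the requested
one, while a lookup that also compares the type keeps both calls.

```java
import java.util.ArrayList;
import java.util.List;

public class AggCallDedupSketch {

    /** Hypothetical stand-in for an aggregate call: function, input column, result type. */
    static final class AggCall {
        final String function;
        final int argIndex;
        final String type;

        AggCall(String function, int argIndex, String type) {
            this.function = function;
            this.argIndex = argIndex;
            this.type = type;
        }
    }

    /** Reuses an existing slot when function and argument match; the type is ignored. */
    static int addIgnoringType(List<AggCall> calls, AggCall call) {
        for (int i = 0; i < calls.size(); i++) {
            AggCall c = calls.get(i);
            if (c.function.equals(call.function) && c.argIndex == call.argIndex) {
                return i; // slot reused even if c.type differs from call.type
            }
        }
        calls.add(call);
        return calls.size() - 1;
    }

    /** Reuses an existing slot only when function, argument and type all match. */
    static int addCheckingType(List<AggCall> calls, AggCall call) {
        for (int i = 0; i < calls.size(); i++) {
            AggCall c = calls.get(i);
            if (c.function.equals(call.function)
                    && c.argIndex == call.argIndex
                    && c.type.equals(call.type)) {
                return i;
            }
        }
        calls.add(call);
        return calls.size() - 1;
    }

    public static void main(String[] args) {
        // Assumed registration order: the INTEGER call first, then the BIGINT call.
        AggCall intSum0 = new AggCall("$SUM0", 0, "INTEGER");
        AggCall bigintSum0 = new AggCall("$SUM0", 0, "BIGINT");

        List<AggCall> loose = new ArrayList<>();
        addIgnoringType(loose, intSum0);
        int looseSlot = addIgnoringType(loose, bigintSum0);
        System.out.println("type ignored: BIGINT call mapped to slot " + looseSlot
                + " of type " + loose.get(looseSlot).type);

        List<AggCall> strict = new ArrayList<>();
        addCheckingType(strict, intSum0);
        int strictSlot = addCheckingType(strict, bigintSum0);
        System.out.println("type checked: BIGINT call mapped to slot " + strictSlot
                + " of type " + strict.get(strictSlot).type);
    }
}
```

With the type ignored, the BIGINT call ends up pointing at the INTEGER slot,
which mirrors the INTEGER $f3 versus BIGINT $f3 mismatch in the assertion.
Including the type in the comparison is one possible direction for a fix; the
sketch does not claim that this is how the issue should be resolved in Calcite
or Drill.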