somu-imply opened a new pull request, #15237:
URL: https://github.com/apache/druid/pull/15237
Consider this query
```
with t AS (SELECT distinct(m2) as mo, APPROX_COUNT_DISTINCT(m1) / (
TIMESTAMPDIFF(DAY,min( __time ), CURRENT_TIMESTAMP ) + 0.0000000001) as
trend_score
FROM "foo"
GROUP BY 1
ORDER BY trend_score DESC
LIMIT 2)
select mo, (MAX(trend_score)) from t
where mo > 2
GROUP BY 1
ORDER BY 2 DESC
```
This currently fails to plan with the stack trace
```
Caused by: org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There
are not enough rules to produce a node with desired properties:
convention=DRUID, sort=[1 DESC].
Missing conversion is LogicalSort[convention: NONE -> DRUID]
There is 1 empty subset: rel#126:RelSubset#7.DRUID.[1 DESC], the relevant
part of the original plan is as follows
124:LogicalSort(sort0=[$1], dir0=[DESC], fetch=[1001:BIGINT])
120:LogicalFilter(subset=[rel#128:RelSubset#5.NONE.[]], condition=[>($0,
2)])
118:LogicalSort(subset=[rel#119:RelSubset#4.NONE.[1 DESC]], sort0=[$1],
dir0=[DESC], fetch=[2])
116:LogicalProject(subset=[rel#117:RelSubset#3.NONE.[]], mo=[$0],
trend_score=[/($1, +(CAST(/INT(Reinterpret(-(2023-10-17 23:46:28.328, $2)),
86400000)):INTEGER NOT NULL, 1E-10:DECIMAL(11, 10)))])
114:LogicalAggregate(subset=[rel#115:RelSubset#2.NONE.[]],
group=[{0}], agg#0=[APPROX_COUNT_DISTINCT($1)], agg#1=[MIN($2)])
112:LogicalProject(subset=[rel#113:RelSubset#1.NONE.[]], mo=[$5],
m1=[$4], __time=[$0])
63:LogicalTableScan(subset=[rel#111:RelSubset#0.NONE.[]],
table=[[druid, foo]])
Root: rel#126:RelSubset#7.DRUID.[1 DESC]
Original rel:
LogicalSort(subset=[rel#126:RelSubset#7.DRUID.[1 DESC]], sort0=[$1],
dir0=[DESC], fetch=[1001:BIGINT]): rowcount = 1.0, cumulative cost = {1.0 rows,
20.0 cpu, 0.0 io}, id = 124
LogicalFilter(subset=[rel#128:RelSubset#5.NONE.[]], condition=[>($0, 2)]):
rowcount = 1.0, cumulative cost = {1.0 rows, 2.0 cpu, 0.0 io}, id = 120
LogicalSort(subset=[rel#119:RelSubset#4.NONE.[1 DESC]], sort0=[$1],
dir0=[DESC], fetch=[2]): rowcount = 2.0, cumulative cost = {2.0 rows, 200.0
cpu, 0.0 io}, id = 118
LogicalProject(subset=[rel#117:RelSubset#3.NONE.[]], mo=[$0],
trend_score=[/($1, +(CAST(/INT(Reinterpret(-(2023-10-17 23:46:28.328, $2)),
86400000)):INTEGER NOT NULL, 1E-10:DECIMAL(11, 10)))]): rowcount = 10.0,
cumulative cost = {10.0 rows, 20.0 cpu, 0.0 io}, id = 116
LogicalAggregate(subset=[rel#115:RelSubset#2.NONE.[]], group=[{0}],
agg#0=[APPROX_COUNT_DISTINCT($1)], agg#1=[MIN($2)]): rowcount = 10.0,
cumulative cost = {12.5 rows, 0.0 cpu, 0.0 io}, id = 114
LogicalProject(subset=[rel#113:RelSubset#1.NONE.[]], mo=[$5],
m1=[$4], __time=[$0]): rowcount = 100.0, cumulative cost = {100.0 rows, 300.0
cpu, 0.0 io}, id = 112
LogicalTableScan(subset=[rel#111:RelSubset#0.NONE.[]],
table=[[druid, foo]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0
cpu, 0.0 io}, id = 63
at
org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:718)
~[calcite-core-1.35.0.jar:1.35.0]
at
org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:391)
~[calcite-core-1.35.0.jar:1.35.0]
at
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:533)
~[calcite-core-1.35.0.jar:1.35.0]
at
org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:317)
~[calcite-core-1.35.0.jar:1.35.0]
at
org.apache.calcite.tools.Programs$SequenceProgram.run(Programs.java:337)
~[calcite-core-1.35.0.jar:1.35.0]
at
org.apache.druid.sql.calcite.planner.CalcitePlanner.transform(CalcitePlanner.java:452)
~[druid-sql-29.0.0-SNAPSHOT.jar:29.0.0-SNAPSHOT]
at
org.apache.druid.sql.calcite.planner.QueryHandler.planWithDruidConvention(QueryHandler.java:590)
~[druid-sql-29.0.0-SNAPSHOT.jar:29.0.0-SNAPSHOT]
at
org.apache.druid.sql.calcite.planner.QueryHandler$SelectHandler.planForDruid(QueryHandler.java:738)
~[druid-sql-29.0.0-SNAPSHOT.jar:29.0.0-SNAPSHOT]
at
org.apache.druid.sql.calcite.planner.QueryHandler.plan(QueryHandler.java:220)
~[druid-sql-29.0.0-SNAPSHOT.jar:29.0.0-SNAPSHOT]
... 84 more
```
In this PR we fix planning by removing the Aggregate remove rule which plans
these queries as scan over group by with order by on a non-time column which
Druid does not support.
This PR has:
- [ ] been self-reviewed.
- [ ] using the [concurrency
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
(Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [ ] added comments explaining the "why" and the intent of the code
wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]