[
https://issues.apache.org/jira/browse/CALCITE-4202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188076#comment-17188076
]
Julian Hyde commented on CALCITE-4202:
--------------------------------------
I support this. Even if the actual cost is the same (say if Druid does its own
optimization) it is still worth tweaking the cost model to generate plans that
are likely 'better' for humans reading them, and to achieve plan stability.
Those are worthy goals in their own right.
Be sure to make it clear in the cost formulas what your goal is. We don't want
future maintainers slavishly maintaining the current behavior if it no longer
makes sense.
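To illustrate what I mean by stating the goal in the formula, here is a
minimal sketch. It is not the Druid adapter's actual cost code: the class,
the method name and the PROJECT_PENALTY constant are all hypothetical, and
the base cost of 100 is an arbitrary stand-in. It only shows a tie-breaking
factor whose purpose is written down next to it.

```java
public class ProjectionCostSketch {
  /** Hypothetical penalty per projected field; kept small so it only breaks ties. */
  public static final double PROJECT_PENALTY = 0.01;

  /**
   * Scales a base cost by the number of projected fields.
   *
   * Goal: prefer plans with fewer intermediate projections so that plans are
   * easier to read and stable across rule orderings. This is NOT a claim that
   * Druid executes such plans faster. If that goal no longer applies, this
   * factor can be changed or removed.
   */
  public static double adjustedCost(double baseCost, int projectedFieldCount) {
    return baseCost * (1.0 + PROJECT_PENALTY * projectedFieldCount);
  }

  public static void main(String[] args) {
    double base = 100.0;                    // arbitrary stand-in for the shared cost
    double choice1 = adjustedCost(base, 2); // projects=[$5, $13]
    double choice2 = adjustedCost(base, 1); // projects=[$5]
    System.out.println(choice2 < choice1);  // prints "true"
  }
}
```

With a comment like that, a future maintainer can tell that the factor is a
tie-breaker for readability and stability, and can drop it without worrying
about a hidden performance reason.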
> Refine Druid cost-model to capture differences in intermediate projections
> ---------------------------------------------------------------------------
>
> Key: CALCITE-4202
> URL: https://issues.apache.org/jira/browse/CALCITE-4202
> Project: Calcite
> Issue Type: Improvement
> Components: druid-adapter
> Reporter: Stamatis Zampetakis
> Priority: Major
>
> The planner generates equivalent DruidQuery expressions with exactly the same
> cost. Most of the time the expressions differ only in the number of
> intermediate projections.
> For example, running the following query
> {code:sql}
> select distinct "countryName"
> from "wiki"
> where "page" = 'Jeremy Corbyn'
> {code}
> via {{DruidAdapterIT#testSelectDistinctWiki}} generates among others the
> following alternatives during optimization.
> +Choice 1+
> {noformat}
> rel#184:DruidQuery.BINDABLE.[](table=[wiki,
> wiki],intervals=[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z],filter==($13,
> 'Jeremy Corbyn'),projects=[$5, $13],groups={0},aggs=[])
> {noformat}
> +Choice 2+
> {noformat}
> rel#108:DruidQuery.BINDABLE.[](table=[wiki,
> wiki],intervals=[1900-01-09T00:00:00.000Z/2992-01-10T00:00:00.000Z],filter==($13,
> 'Jeremy Corbyn'),projects=[$5],groups={0},aggs=[])
> {noformat}
> Using the debugger we can see that the cost of the two plans is exactly the
> same (although they are different) which means that the one that was
> generated first will dominate the other. Clearly in this case the second
> choice is a better plan.
> Performance-wise the difference may not be that big, but refining the cost is
> beneficial at least for plan stability. Currently the final plan depends on
> the order in which the rules are applied.
> The goal of this jira is to refine Druid's cost model so that choice 2
> becomes cheaper than choice 1 outlined above.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)