[
https://issues.apache.org/jira/browse/CALCITE-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004696#comment-17004696
]
Vladimir Sitnikov commented on CALCITE-1842:
--------------------------------------------
Oh, I thought makeCost(rowCount=Util.nLogN(rowCount), ...) was there for a good
reason.
Note: it is hard (impossible?) to compare a cost with multiple units involved.
In other words, we can't really compare `(rows=10, cpu=20)` vs `(rows=20,
cpu=10)`.
I always thought we kept everything important in the `rows` field, so costs
always compare on that, while "cpu" and "io" were not really useful :(
For instance, EnumerableHashJoin puts all of its costing into the ROWS field
and even leaves CPU empty:
[https://github.com/apache/calcite/blob/0341e97835200e1d2d7cc582ded98d3aaa829c01/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableHashJoin.java#L138]
The net result is that Calcite refrains from using EnumerableHashJoin :(
[~julianhyde], do you have any clue here? Should we revisit costing to make
sure "rows" always returns the number of rows, and CPU work is reflected in CPU?
Frankly speaking, I have no idea how to do that, because there's a trivial
counter-example: a Project node almost always costs nothing (it is not really
computation-intensive); however, the mere presence of a Project node in the
plan adds to the **rows** costing component, which makes the plan look "less
effective" even when the alternative uses one less projection but does much
more CPU work.
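To make the counter-example concrete, here is a small sketch with hypothetical numbers (the row count, the 1000x CPU factor, and the `rowsOnlyCheaper` helper are all assumptions for illustration, not Calcite code): under a rows-only comparison, a plan with a cheap extra Project loses to a plan that does far more CPU work per row.

```java
// Hypothetical illustration of the counter-example above (not Calcite code).
public class ProjectCounterExample {
    /**
     * Mimics a rows-only cost comparison: returns true if plan A is considered
     * cheaper than plan B. The cpu arguments are deliberately ignored.
     */
    static boolean rowsOnlyCheaper(double rowsA, double cpuA,
                                   double rowsB, double cpuB) {
        return rowsA <= rowsB; // cpu plays no role in the decision
    }

    public static void main(String[] args) {
        double n = 1_000_000;               // assumed input row count
        // Plan A: Scan + cheap Project; the Project adds ~n to ROWS.
        double planARows = n + n, planACpu = n;
        // Plan B: Scan only, but with 1000x the CPU work per row.
        double planBRows = n, planBCpu = 1000 * n;
        // Rows-only comparison prefers the CPU-heavy plan B.
        System.out.println(rowsOnlyCheaper(planBRows, planBCpu,
                                           planARows, planACpu)); // prints "true"
    }
}
```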
Note: the current VolcanoCost does NOT use CPU for comparison purposes:
[https://github.com/apache/calcite/blob/571731b80a58eb095ebac7123285c375e7afff90/core/src/main/java/org/apache/calcite/plan/volcano/VolcanoCost.java#L98-L103]
It uses just the ROWS part.
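A minimal sketch of that behavior (the `SketchCost` class is made up for illustration; it is not Calcite's actual `VolcanoCost`, it only mirrors the linked comparison logic): a cost with fewer rows is considered cheaper regardless of an arbitrarily large CPU component.

```java
// Simplified stand-in for a rows-only cost comparison (not Calcite code).
final class SketchCost {
    final double rows, cpu, io;

    SketchCost(double rows, double cpu, double io) {
        this.rows = rows;
        this.cpu = cpu;
        this.io = io;
    }

    // Mirrors the linked VolcanoCost lines: only 'rows' decides the order.
    boolean isLe(SketchCost other) {
        return this.rows <= other.rows;
    }

    public static void main(String[] args) {
        SketchCost a = new SketchCost(10, 1_000_000, 0); // few rows, huge cpu
        SketchCost b = new SketchCost(20, 10, 0);        // more rows, tiny cpu
        // 'a' is considered cheaper despite its enormous CPU cost.
        System.out.println(a.isLe(b)); // prints "true"
        System.out.println(b.isLe(a)); // prints "false"
    }
}
```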
> Wrong order of inputs for makeCost() call in Sort.computeSelfCost()
> -------------------------------------------------------------------
>
> Key: CALCITE-1842
> URL: https://issues.apache.org/jira/browse/CALCITE-1842
> Project: Calcite
> Issue Type: Bug
> Components: core
> Reporter: JD Zheng
> Assignee: Julian Hyde
> Priority: Major
> Fix For: 1.14.0
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Original code in Sort.java:
> {code:java}
> @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
>     RelMetadataQuery mq) {
>   // Higher cost if rows are wider discourages pushing a project through a
>   // sort.
>   double rowCount = mq.getRowCount(this);
>   double bytesPerRow = getRowType().getFieldCount() * 4;
>   return planner.getCostFactory().makeCost(
>       Util.nLogN(rowCount) * bytesPerRow, rowCount, 0);
> }
> {code}
> The last line should be
> {code:java}
> return planner.getCostFactory().makeCost(
>     rowCount /* rowCount */,
>     Util.nLogN(rowCount) * bytesPerRow /* cpu */,
>     0 /* io */);
> {code}
> The wrong order makes the planner choose the wrong physical plan. For
> example, if a Druid query has a limit of 10 with 10+ dimensions, the
> optimizer will choose not to push the "limit" down to Druid and will instead
> scan the entire data source in Druid.
> The fix is very easy and the gain is huge, as the performance of the wrong
> plan is really bad. Hope it will be picked up in the next release.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)