[
https://issues.apache.org/jira/browse/CALCITE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116911#comment-17116911
]
Haisheng Yuan commented on CALCITE-3972:
----------------------------------------
The fact that Sort can participate in rule matching is the culprit.
Sort changes the relation's physical property, but doesn't change the logical
property.
Take
[testSortJoinTranspose2|https://github.com/apache/calcite/commit/0715f5b55f363a58e3dd8c20caac0024e19be413#diff-de15ea9da479ca31d38de70365967392R4070]
as an example:
{code:java}
Before:
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10])
  LogicalSort(sort0=[$10], dir0=[ASC])
    LogicalJoin(condition=[=($7, $9)], joinType=[right])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalProject(DEPTNO=[$0], NAME=[$1])
        LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
After:
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10])
  LogicalSort(sort0=[$10], dir0=[ASC])
    LogicalJoin(condition=[=($7, $9)], joinType=[right])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalSort(sort0=[$1], dir0=[ASC])
        LogicalProject(DEPTNO=[$0], NAME=[$1])
          LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
If we combine the sort with any operator in the original plan, the logical
properties are all the same. After the rule fires, LogicalJoin has a new
right input. Even if the sort in the right input could be in the same RelSet as
LogicalProject (unfortunately it isn't), the join's right input has changed (the
logical join now requests a collation on its right input), so the join's digest
changes. The join is then treated as a completely new join, and every logical
transformation that can apply to it is applied again.
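The digest effect described above can be illustrated with a toy model. This is a minimal sketch, not Calcite code: ToyNode, ToyRegistry and DigestDemo are hypothetical names, and the digest is simply assumed to be built from a node's description plus its inputs' digests, which is a simplification of how a planner identifies equivalent nodes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy node (hypothetical, not a Calcite class): digest = description + input digests. */
final class ToyNode {
  final String description;
  final List<ToyNode> inputs;

  ToyNode(String description, ToyNode... inputs) {
    this.description = description;
    this.inputs = Arrays.asList(inputs);
  }

  /** Builds the digest recursively, so any change in an input changes this digest too. */
  String digest() {
    StringBuilder sb = new StringBuilder(description).append('(');
    for (int i = 0; i < inputs.size(); i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(inputs.get(i).digest());
    }
    return sb.append(')').toString();
  }
}

/** Toy registry (hypothetical): a node with an unseen digest counts as brand new. */
final class ToyRegistry {
  private final Map<String, ToyNode> byDigest = new HashMap<>();

  /** Returns true if the node is new, i.e. it would be queued for rule matching again. */
  boolean register(ToyNode node) {
    return byDigest.putIfAbsent(node.digest(), node) == null;
  }
}

final class DigestDemo {
  public static void main(String[] args) {
    ToyNode emp = new ToyNode("Scan[EMP]");
    ToyNode dept = new ToyNode("Project", new ToyNode("Scan[DEPT]"));
    ToyRegistry registry = new ToyRegistry();

    // The original join is new the first time and a duplicate the second time.
    System.out.println(registry.register(new ToyNode("Join[right]", emp, dept)));
    System.out.println(registry.register(new ToyNode("Join[right]", emp, dept)));

    // Wrapping the right input in a sort changes the join's digest,
    // so the join is registered as a whole new node.
    ToyNode sortedDept = new ToyNode("Sort[$1]", dept);
    System.out.println(registry.register(new ToyNode("Join[right]", emp, sortedDept)));
  }
}
```

In this sketch the join itself is logically unchanged, yet the registry treats it as new purely because its right input differs, which is the effect that causes the repeated rule applications.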
Although the rule above only applies to outer joins, the same problem occurs
with SortProjectTransposeRule.
Now back to the problem in JDBCTest.testJoinManyWays(). One of
JoinPushThroughJoinRule's operands is RelNode.class, which means any
new node in the join's input RelSet will trigger the rule. But in this rule we
don't care what the exact RelNode is; we just want the whole group as a
placeholder. Any new logical sort, physical sort, or abstract converter will
trigger a match of JoinPushThroughJoinRule, which is entirely
unnecessary.
If we change
{code:java}
operand(RelNode.class, any())),
{code}
to
{code:java}
operandJ(RelNode.class, null, n -> !n.isEnforcer(), any())),
{code}
it will achieve the same effect as generating EnumerableSort directly, while
still generating LogicalSort in RelCollationTraitDef, and without affecting rules
like SortProjectTranspose, SortJoinTranspose, and SortJoinCopy.
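The effect of adding the predicate to the operand can be sketched with a toy model. This is not Calcite's operand machinery: ToyRel, ToyOperand and OperandDemo are hypothetical names, and the enforcer flags below are assumptions that mirror the node kinds listed in this comment (logical sort, physical sort, abstract converter).

```java
import java.util.Arrays;
import java.util.function.Predicate;

/** Toy stand-in for a relational node (hypothetical); only the enforcer flag matters here. */
final class ToyRel {
  final String name;
  final boolean enforcer;

  ToyRel(String name, boolean enforcer) {
    this.name = name;
    this.enforcer = enforcer;
  }

  boolean isEnforcer() {
    return enforcer;
  }
}

/** Toy rule operand (hypothetical): matches every node in a set that passes its predicate. */
final class ToyOperand {
  private final Predicate<ToyRel> predicate;

  private ToyOperand(Predicate<ToyRel> predicate) {
    this.predicate = predicate;
  }

  /** Analogue of an operand on RelNode.class with no predicate: every node matches. */
  static ToyOperand anyNode() {
    return new ToyOperand(n -> true);
  }

  /** Analogue of adding the predicate n -> !n.isEnforcer() to the operand. */
  static ToyOperand nonEnforcer() {
    return new ToyOperand(n -> !n.isEnforcer());
  }

  /** Counts how many nodes of the given set would trigger a rule match. */
  long countMatches(ToyRel... relSet) {
    return Arrays.stream(relSet).filter(predicate).count();
  }
}

final class OperandDemo {
  public static void main(String[] args) {
    // Assumed contents of one join input set after trait conversion.
    ToyRel[] relSet = {
        new ToyRel("LogicalProject", false),
        new ToyRel("LogicalSort", true),
        new ToyRel("EnumerableSort", true),
        new ToyRel("AbstractConverter", true),
    };
    System.out.println("unfiltered matches: " + ToyOperand.anyNode().countMatches(relSet));
    System.out.println("filtered matches:   " + ToyOperand.nonEnforcer().countMatches(relSet));
  }
}
```

In this sketch the filtered operand matches only the one node that is not an enforcer, while the sorts and the converter remain in the set and are still available to the sort-related rules.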
The total number of JoinPushThroughJoinRule attempts drops from 9000 to 900,
a 90% reduction. This in turn greatly reduces the number of ProjectMergeRule
attempts, because every join reorder generates at least one new LogicalProject
in Calcite.
Now the rule count is:
{code:java}
Rules                                              Attempts    Time (us)
ProjectMergeRule:force_mode                          14,064    2,680,177
EnumerableProjectRule(in:NONE,out:ENUMERABLE)           974      271,608
JoinPushThroughJoinRule:left                            449      209,768
JoinPushThroughJoinRule:right                           449        2,949
AggregatePullUpConstantsRule                            291       17,947
AggregateProjectMergeRule                               277       83,288
ProjectFilterTransposeRule                              207       30,300
EnumerableJoinRule(in:NONE,out:ENUMERABLE)              108       70,179
EnumerableMergeJoinRule(in:NONE,out:ENUMERABLE)         108       46,111
JoinPushExpressionsRule                                 108       10,807
{code}
> Allow RelBuilder to create RelNode with convention and use it for trait
> convert
> -------------------------------------------------------------------------------
>
> Key: CALCITE-3972
> URL: https://issues.apache.org/jira/browse/CALCITE-3972
> Project: Calcite
> Issue Type: Bug
> Reporter: Xiening Dai
> Assignee: Xiening Dai
> Priority: Major
> Time Spent: 2h
> Remaining Estimate: 0h
>
> 1. Provide Convention.transformRelBuilder() to transform an existing
> RelBuilder into one with a specific convention.
> 2. RelBuilder provides a withRelFactories() method to let the caller swap the
> underlying RelFactories and create a new builder.
> 3. Use the new interface in RelCollationTraitDef for converting into
> RelCollation traits.
> We can avoid ~1/3 of total rule firings in an N-way join case with this change.