asolimando commented on code in PR #4562:
URL: https://github.com/apache/calcite/pull/4562#discussion_r2390182429
##########
core/src/main/java/org/apache/calcite/rel/rules/IntersectToSemiJoinRule.java:
##########
@@ -112,33 +114,50 @@ protected IntersectToSemiJoinRule(Config config) {
for (int i = 1; i < inputs.size(); i++) {
RelNode next = inputs.get(i);
- List<RexNode> conditions = new ArrayList<>();
- int fieldCount = current.getRowType().getFieldCount();
- for (int j = 0; j < fieldCount; j++) {
- RelDataType leftFieldType =
current.getRowType().getFieldList().get(j).getType();
- RelDataType rightFieldType =
next.getRowType().getFieldList().get(j).getType();
- RelDataType leastFieldType =
leastRowType.getFieldList().get(j).getType();
+ // cast columns of the join inputs to the least types (global)
+ final RelNode leftCasted = projectJoinInput(builder, leastRowType,
current);
+ final RelNode rightCasted = projectJoinInput(builder, leastRowType,
next);
+
+ builder.clear();
+ builder.push(leftCasted).push(rightCasted);
- conditions.add(
+ // compute the join condition over plain fields from the projections of left/right inputs
+ final int fieldCount = leastRowType.getFieldCount();
+ final List<RexNode> joinPredicates = new ArrayList<>(fieldCount);
+ for (int j = 0; j < fieldCount; j++) {
+ joinPredicates.add(
builder.isNotDistinctFrom(
- rexBuilder.makeCast(leastFieldType,
- rexBuilder.makeInputRef(leftFieldType, j)),
- rexBuilder.makeCast(leastFieldType,
- rexBuilder.makeInputRef(rightFieldType, j + fieldCount))));
+ builder.field(2, 0, j),
+ builder.field(2, 1, j)));
}
- RexNode condition = RexUtil.composeConjunction(rexBuilder, conditions);
- builder.push(next)
- .join(JoinRelType.SEMI, condition);
+ final RexNode condition = RexUtil.composeConjunction(rexBuilder,
joinPredicates);
+ builder.join(JoinRelType.SEMI, condition);
current = builder.peek();
}
- builder.distinct()
- .convert(leastRowType, true);
+ builder.distinct().convert(leastRowType, true);
call.transformTo(builder.build());
}
+ private RelNode projectJoinInput(
+ RelBuilder builder, RelDataType leastRowType, RelNode joinInput) {
+ builder.clear();
+ builder.push(joinInput);
+
+ final int fieldCount = joinInput.getRowType().getFieldCount();
+ final List<String> names = leastRowType.getFieldNames();
+ final List<RexNode> joinKeys = new ArrayList<>(fieldCount);
+ final RexBuilder rexBuilder = builder.getRexBuilder();
+ for (int j = 0; j < fieldCount; j++) {
+ final RelDataType leastType =
leastRowType.getFieldList().get(j).getType();
+ joinKeys.add(rexBuilder.makeCast(leastType, builder.field(j)));
+ }
+
+ return builder.project(joinKeys, names).build();
Review Comment:
Without the projections, `JoinPushExpressionsRule` still pushes down the
operands of the join conditions once for each semi-join we have, which leads
to the same repeated `A0` expression as before.
```
for (int i = 1; i < inputs.size(); i++) {
RelNode next = inputs.get(i);
- // cast columns of the join inputs to the least types (global)
- final RelNode leftCasted = projectJoinInput(builder, leastRowType,
current);
- final RelNode rightCasted = projectJoinInput(builder, leastRowType,
next);
- builder.push(leftCasted).push(rightCasted);
+ builder.clear();
+ builder.push(current).push(next);
- // compute the join condition over plain fields from the projections of left/right inputs
+ // cast columns of the join inputs to the least types (global)
final int fieldCount = leastRowType.getFieldCount();
final List<RexNode> joinPredicates = new ArrayList<>(fieldCount);
for (int j = 0; j < fieldCount; j++) {
- joinPredicates.add(
- builder.isNotDistinctFrom(
- builder.field(2, 0, j),
- builder.field(2, 1, j)));
+ final RelDataType leastType =
leastRowType.getFieldList().get(j).getType();
+ final RexNode leftKey = rexBuilder.makeCast(leastType,
builder.field(2, 0, j), true, false);
+ final RexNode rightKey = rexBuilder.makeCast(leastType,
builder.field(2, 1, j), true, false);
+ joinPredicates.add(builder.isNotDistinctFrom(leftKey, rightKey));
}
```
You can verify this by changing the loop body as shown above: planner.iq then
goes back to the previous state with `A0=[$t1]`.
`JoinPushExpressionsRule` works on each join separately, so it has no global
view of the predicates it has already pushed down, and it is probably not
worth trying to enforce that either. This leaves us with little choice on how
to tackle this: either we add a project, or we accept redundant fields such as
`A`, `A0`, etc.
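To make the argument concrete, here is a toy model in plain Java. It is not
Calcite code: the class, the method names and the `t0`/`t1`/`t2` inputs are all
hypothetical, and it only counts which cast expressions get materialized under
the assumption that a per-join rewrite cannot see what earlier joins did.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (hypothetical names, not Calcite code) of cast placement in a chain of semi-joins. */
public class CastPlacementSketch {

  /**
   * Casts are left inside each join condition, and a rewrite that looks at
   * one join at a time materializes both operands for every join.
   */
  static List<String> castsPushedPerJoin(List<String> inputs) {
    List<String> computed = new ArrayList<>();
    // A semi-join returns only the columns of its left input, so the left key
    // of every join in the chain still originates from the first input.
    String left = inputs.get(0);
    for (int i = 1; i < inputs.size(); i++) {
      // Assumption of this model: the rewrite has no memory of earlier joins,
      // so the left key is cast again each time.
      computed.add("CAST(" + left + ".a)");
      computed.add("CAST(" + inputs.get(i) + ".a)");
    }
    return computed;
  }

  /** Each input is projected to the least type once; the joins compare plain fields. */
  static List<String> castsProjectedOnce(List<String> inputs) {
    List<String> computed = new ArrayList<>();
    for (String input : inputs) {
      computed.add("CAST(" + input + ".a)");
    }
    return computed;
  }

  public static void main(String[] args) {
    List<String> inputs = List.of("t0", "t1", "t2");
    System.out.println("per join:       " + castsPushedPerJoin(inputs));
    System.out.println("projected once: " + castsProjectedOnce(inputs));
  }
}
```

In this model the key of the first input is cast once per semi-join in the
first variant and exactly once in the second, which is the trade-off described
above.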
Am I overlooking a better alternative?