yujun777 commented on code in PR #62742:
URL: https://github.com/apache/doris/pull/62742#discussion_r3361257768
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpression.java:
##########
@@ -269,6 +271,85 @@ public <T extends Expression> Optional<Pair<List<T>,
LogicalProject<Plan>>> rewr
return Optional.of(Pair.of(newTargetsBuilder.build(), new
LogicalProject<>(projects, plan.child(0))));
}
+ private Optional<JoinRewriteResult>
rewriteJoinExpressions(LogicalJoin<Plan, Plan> join,
+ Collection<Expression> targets) {
+ Map<Expression, Integer> volatileExpressionCounter =
Maps.newLinkedHashMap();
+ Map<Expression, Set<Slot>> volatileExpressionSlots =
Maps.newLinkedHashMap();
+ for (Expression target : targets) {
+ target.foreach(e -> {
+ Expression expr = (Expression) e;
+ if (expr.isVolatile()) {
+ volatileExpressionCounter.merge(expr, 1, Integer::sum);
+ Set<Slot> volatileInputSlots = expr.getInputSlots();
+ volatileExpressionSlots
+ .computeIfAbsent(expr, ignored ->
Sets.newLinkedHashSet())
+ .addAll(volatileInputSlots.isEmpty() ?
target.getInputSlots() : volatileInputSlots);
+ }
+ });
+ }
+
+ ImmutableList.Builder<NamedExpression> leftAliases =
ImmutableList.builder();
+ ImmutableList.Builder<NamedExpression> rightAliases =
ImmutableList.builder();
+ Map<Expression, Slot> replaceMap = Maps.newHashMap();
+ Set<Slot> leftOutputSet = join.left().getOutputSet();
+ Set<Slot> rightOutputSet = join.right().getOutputSet();
+ for (Entry<Expression, Integer> entry :
volatileExpressionCounter.entrySet()) {
+ if (entry.getValue() <= 1) {
+ continue;
+ }
+ Set<Slot> inputSlots = volatileExpressionSlots.get(entry.getKey());
+ Set<Slot> volatileInputSlots = entry.getKey().getInputSlots();
+ if (!volatileInputSlots.isEmpty()
+ && !leftOutputSet.containsAll(inputSlots)
+ && !rightOutputSet.containsAll(inputSlots)) {
+ continue;
+ }
+ ExprId exprId = StatementScopeIdGenerator.newExprId();
+ String functionName = entry.getKey() instanceof Function
+ ? ((Function) entry.getKey()).getName() : "volatile";
+ Alias alias = new Alias(exprId, entry.getKey(), "$_" +
functionName + "_" + exprId.asInt() + "_$");
+ replaceMap.put(alias.child(0), alias.toSlot());
+ // Join can not add a project at join-pair scope, but repeated
volatile expressions
+ // still need one materialized value. Slot-free volatile functions
use the containing
+ // conjunct's slots to choose a side, so t2.k + rand() can project
rand() on the right.
+ // Volatile functions with input slots use their own slots to
avoid projecting
+ // volatile_udf(t2.k) on the left only because its containing
conjunct also uses t1.
+ // Volatile functions whose own slots span both join children
cannot be projected into
+ // either child, so they are not rewritten here.
+ // Put right-only expressions on the right child; otherwise keep
the previous
+ // left-child behavior as the conservative default.
+ if (!inputSlots.isEmpty() &&
rightOutputSet.containsAll(inputSlots)) {
+ rightAliases.add(alias);
+ } else {
Review Comment:
One possible refinement is to handle inner/cross join specially: move the
volatile ON predicate to a Filter above the join, then let
AddProjectForVolatileExpression materialize the volatile expression at
join-output scope. That would be more accurate for inner/cross join.
But this still does not solve outer/semi/anti/mark join, where moving ON
predicates above the join is not generally equivalent. So we would end up with
different volatile-expression semantics for different join types, while the
original problem of duplicated volatile evaluation still exists for the
remaining join types.
Considering this tradeoff, I think it is better to keep the current
implementation simple and consistent for now. The current approach is not
perfect for all join semantics, but it avoids the more serious
duplicated-evaluation problem such as expanded BETWEEN predicates.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]