yujun777 commented on code in PR #64335:
URL: https://github.com/apache/doris/pull/64335#discussion_r3503178264


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SimplifyAggGroupBy.java:
##########
@@ -81,7 +81,56 @@ protected static boolean 
isBinaryArithmeticSlot(TreeNode<Expression> expr) {
         if (!supportedFunctions.contains(expr.getClass())) {
             return false;
         }
-        return ExpressionUtils.isSlotOrCastOnSlot(expr.child(0)).isPresent() 
&& expr.child(1) instanceof Literal
-                || 
ExpressionUtils.isSlotOrCastOnSlot(expr.child(1)).isPresent() && expr.child(0) 
instanceof Literal;
+
+        // Float/double arithmetic: precision loss for all operations
+        if (expr.child(0).getDataType().isFloatLikeType()
+                || expr.child(1).getDataType().isFloatLikeType()) {
+            return false;
+        }
+
+        Expression slotExpr;
+        Literal literal;
+        if (expr.child(0) instanceof Literal) {
+            literal = (Literal) expr.child(0);
+            slotExpr = expr.child(1);
+        } else if (expr.child(1) instanceof Literal) {
+            literal = (Literal) expr.child(1);
+            slotExpr = expr.child(0);
+        } else {
+            return false;
+        }
+
+        if (!canExtractSlot(slotExpr)) {
+            return false;
+        }
+
+        return checkLiteral(expr, literal);
     }
+
+    @VisibleForTesting
+    protected static boolean checkLiteral(Expression expr, Literal literal) {
+        if (literal.isNullLiteral()) {
+            return false;
+        }

Review Comment:
   Integer arithmetic overflow in Doris triggers an error (ARITHMETIC_OVERFLOW) 
rather than silent wrap-around, so the optimization does not introduce 
correctness issues — the original query would already fail before the rewrite.
   
   The scenario of a column value near BIGINT/LARGEINT boundary multiplied by a 
literal is extremely rare in practice. Many optimizer rules have the same 
trade-off of silently avoiding an evaluation that would have thrown an error. 
For example, `days_add(t, literal) >= xx` is rewritten to `t >= days_sub(xx, 
literal)` without checking whether the left-hand side would exceed the maximum 
date range of 9999-12-31.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SimplifyAggGroupBy.java:
##########
@@ -81,7 +81,56 @@ protected static boolean 
isBinaryArithmeticSlot(TreeNode<Expression> expr) {
         if (!supportedFunctions.contains(expr.getClass())) {
             return false;
         }
-        return ExpressionUtils.isSlotOrCastOnSlot(expr.child(0)).isPresent() 
&& expr.child(1) instanceof Literal
-                || 
ExpressionUtils.isSlotOrCastOnSlot(expr.child(1)).isPresent() && expr.child(0) 
instanceof Literal;
+
+        // Float/double arithmetic: precision loss for all operations
+        if (expr.child(0).getDataType().isFloatLikeType()
+                || expr.child(1).getDataType().isFloatLikeType()) {
+            return false;
+        }
+
+        Expression slotExpr;
+        Literal literal;
+        if (expr.child(0) instanceof Literal) {
+            literal = (Literal) expr.child(0);
+            slotExpr = expr.child(1);
+        } else if (expr.child(1) instanceof Literal) {
+            literal = (Literal) expr.child(1);
+            slotExpr = expr.child(0);
+        } else {
+            return false;
+        }
+
+        if (!canExtractSlot(slotExpr)) {
+            return false;
+        }
+
+        return checkLiteral(expr, literal);
     }
+
+    @VisibleForTesting
+    protected static boolean checkLiteral(Expression expr, Literal literal) {
+        if (literal.isNullLiteral()) {
+            return false;
+        }
+        if (expr instanceof Multiply || expr instanceof Divide) {
+            if (literal.isZero()) {
+                return false;
+            }

Review Comment:
   Same reasoning as the multiply/divide case: integer arithmetic overflow in 
Doris throws ARITHMETIC_OVERFLOW rather than silently wrapping, so there is no 
correctness issue — only a difference in error behavior. The rule merely avoids 
evaluating a key that the original query would have computed. This trade-off is 
common across many optimizer rules (e.g., date/time arithmetic 
simplifications). Given the extreme rarity of DECIMAL(38,0) boundary values in 
practice, we believe the simplification is safe and useful.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to