gengliangwang opened a new pull request, #56253:
URL: https://github.com/apache/spark/pull/56253

   ### What changes were proposed in this pull request?
   
   `DivModLike` (`Divide` / `Remainder` / `IntegralDivide`) and `Pmod` always 
emit a divide-by-zero guard in their generated code:
   
   ```java
   if (divisor == 0) {
     throw QueryExecutionErrors.divideByZeroError(...); // ANSI; or: isNull = true;
   } else {
     ... result ...
   }
   ```
   
   When the divisor is a foldable, non-null, non-zero constant, that guard is 
dead code. This PR detects that case (`divisorIsNonZero`) and emits only the 
division/remainder body, dropping the check. The `errCtx` error-context value 
is also made `lazy` so its constant-pool reference is no longer registered when 
the (now-skipped) check is the only thing that would have used it.
   
   Example: for `col / 100.0` in ANSI mode, the generated code goes from
   
   ```java
   if (100.0D == 0) {
     throw QueryExecutionErrors.divideByZeroError(((SQLQueryContext) references[1] /* errCtx */));
   } else {
     project_value = (double)(col / 100.0D);
   }
   ```
   
   to
   
   ```java
   project_value = (double)(col / 100.0D);
   ```
   
   The dead-check elimination only fires when the divisor is statically 
non-zero; a zero literal divisor (which always errors in ANSI mode or returns 
null otherwise) and any variable divisor keep the existing check.
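
To make the decision concrete, here is a minimal, self-contained sketch of the idea in plain Java. It is a toy model, not Spark's actual codegen API: the class `DivideCodegenSketch`, the methods `genDivide` and `addReference`, and the bare `divideByZeroError(...)` call in the emitted text are all hypothetical names chosen for illustration. It assumes the caller has already folded the divisor and passes the constant only when it is non-null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalDouble;

/** Toy model of the guard elimination; NOT Spark's real codegen classes. */
class DivideCodegenSketch {
    /** Objects the generated code would reach through references[i]. */
    final List<Object> references = new ArrayList<>();

    /** Registers an object and returns the expression that reads it back. */
    String addReference(Object obj) {
        references.add(obj);
        return "references[" + (references.size() - 1) + "]";
    }

    /**
     * Emits Java source text for dividend / divisor.
     * constantDivisor is present only when the divisor folds to a non-null constant.
     */
    String genDivide(String dividend, String divisor, OptionalDouble constantDivisor) {
        String body = "value = (double)(" + dividend + " / " + divisor + ");";
        boolean divisorIsNonZero =
            constantDivisor.isPresent() && constantDivisor.getAsDouble() != 0.0;
        if (divisorIsNonZero) {
            // The guard would be dead code: emit only the body, register nothing.
            return body;
        }
        // Zero literal or variable divisor: keep the check. The error context is
        // registered only here, mirroring the effect of making errCtx lazy.
        String errCtx = addReference("errCtx");
        return "if (" + divisor + " == 0) {\n"
            + "  throw divideByZeroError(" + errCtx + ");\n"
            + "} else {\n"
            + "  " + body + "\n"
            + "}";
    }
}
```

In this sketch, calling `genDivide("col", "100.0D", OptionalDouble.of(100.0))` returns just the assignment and leaves `references` empty, while passing `OptionalDouble.empty()` for a variable divisor returns the guarded form and registers one reference.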
   
   ### Why are the changes needed?
   
   This is a sub-task of SPARK-56908 (reduce generated Java size in whole-stage 
codegen). Dumping the whole-stage codegen of the TPC-DS queries shows 56 
occurrences of the dead `if (100.0D == 0) throw 
QueryExecutionErrors.divideByZeroError(...)` check across 14 queries 
(percentage computations such as `x / 100.0`). Each occurrence adds unreachable 
source plus an unused `references[]` / constant-pool entry. Removing them shrinks 
the generated code and eases constant-pool pressure with no behavior change.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. The generated code for a non-zero constant divisor is smaller but 
computes the same result; behavior for zero and variable divisors is unchanged.
   
   ### How was this patch tested?
   
   - Added `ArithmeticExpressionSuite` test "SPARK-57198: skip the 
divide-by-zero check when the divisor is a non-zero literal", asserting that the 
divide-by-zero check is dropped for a non-zero literal divisor (Divide / 
Remainder / IntegralDivide / Pmod) and kept for a variable divisor. Existing 
tests (e.g. SPARK-33008) continue to cover the zero-literal-divisor error path.
   - Verified end-to-end by re-dumping the TPC-DS whole-stage codegen: the 56 
dead `if (100.0D == 0)` checks dropped to 0, and all generated subtrees still 
compile.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Opus 4.8)
   

