gengliangwang opened a new pull request, #56258:
URL: https://github.com/apache/spark/pull/56258
### What changes were proposed in this pull request?
`In`'s whole-stage codegen emits, for each list element `x`:
```java
if (x.isNull) {
  inTmpResult = -1; // HAS_NULL
} else if (value == x) {
  inTmpResult = 1; // MATCHED
  continue;
}
```
IN lists are usually constant literals, so `x.isNull` is the literal `false`
and the `HAS_NULL` branch is dead (`if (false) { inTmpResult = -1; } else if
...`). This PR emits only the equality check when `x.isNull == FalseLiteral`.
Symmetrically, when an element is statically null (`x.isNull == TrueLiteral`,
e.g. `IN (NULL, ...)`), only the `HAS_NULL` assignment is emitted and the dead
equality check is dropped.
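
As a rough illustration of the three cases described above, the per-element emission can be modeled as a small string-building function. This is only a sketch, not the actual Spark implementation: the class `InElementCodegenSketch`, the `Nullness` enum, and `emitElementCheck` are hypothetical names, and the plain `==` stands in for whatever equality expression the real codegen produces for the data type.

```java
final class InElementCodegenSketch {
    // Hypothetical stand-in for the three states of a generated isNull expression:
    // statically false (FalseLiteral), statically true (TrueLiteral), or a runtime variable.
    enum Nullness { ALWAYS_FALSE, ALWAYS_TRUE, DYNAMIC }

    // Returns the Java snippet emitted for one IN-list element.
    // isNull, value and elem are the generated expressions' source text.
    static String emitElementCheck(Nullness nullness, String isNull, String value, String elem) {
        String setHasNull = "inTmpResult = -1; // HAS_NULL";
        String matchBlock = "if (" + value + " == " + elem + ") {\n"
            + "  inTmpResult = 1; // MATCHED\n"
            + "  continue;\n"
            + "}";
        switch (nullness) {
            case ALWAYS_FALSE:
                // Element can never be null: emit only the equality check.
                return matchBlock;
            case ALWAYS_TRUE:
                // Element is always null: it can never match, emit only the assignment.
                return setHasNull;
            default:
                // Nullness known only at runtime: keep the full two-branch form.
                return "if (" + isNull + ") {\n  " + setHasNull + "\n} else " + matchBlock;
        }
    }
}
```

For a literal list such as `IN (1, 2, 3)`, every element would take the `ALWAYS_FALSE` path in this model, so no `if (false)` block appears in the output.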
### Why are the changes needed?
Sub-task of SPARK-56908 (reduce generated Java size in whole-stage codegen).
Dumping the TPC-DS whole-stage codegen shows ~348 dead `if (false) {
inTmpResult = -1; } else if (...)` blocks from `In` over literal lists.
Emitting only the live branch removes the dead conditional from the generated
code.
### Does this PR introduce _any_ user-facing change?
No. The generated code is smaller but evaluates IN with identical semantics,
including null handling: a statically non-null element can never set
`HAS_NULL`, and a statically null element can never match, so dropping those
dead branches is equivalent.
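
The equivalence argument can be checked with a simplified model of IN's three-valued result (`true`, `false`, or null). The sketch below is illustrative only and is not Spark code: `InSemanticsSketch`, `evalGeneral`, `specialize`, and `evalSpecialized` are hypothetical names, integers stand in for arbitrary values, and a Java `null` return models SQL NULL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntBinaryOperator;

final class InSemanticsSketch {
    static final int NOT_MATCHED = 0, MATCHED = 1, HAS_NULL = -1;

    // General form: each element's nullness is tested at runtime.
    static Boolean evalGeneral(Integer value, List<Integer> list) {
        if (value == null) return null;
        int state = NOT_MATCHED;
        for (Integer x : list) {
            if (x == null) {
                state = HAS_NULL;
            } else if (value.intValue() == x.intValue()) {
                state = MATCHED;
                break;
            }
        }
        return toResult(state);
    }

    // Specialized form: nullness is decided once, when the steps are built,
    // so each step contains only its live branch.
    static List<IntBinaryOperator> specialize(List<Integer> list) {
        List<IntBinaryOperator> steps = new ArrayList<>();
        for (Integer x : list) {
            if (x == null) {
                steps.add((value, state) -> HAS_NULL);  // no equality check
            } else {
                final int elem = x;
                steps.add((value, state) -> value == elem ? MATCHED : state);  // no null check
            }
        }
        return steps;
    }

    static Boolean evalSpecialized(Integer value, List<IntBinaryOperator> steps) {
        if (value == null) return null;
        int state = NOT_MATCHED;
        for (IntBinaryOperator step : steps) {
            state = step.applyAsInt(value, state);
            if (state == MATCHED) break;
        }
        return toResult(state);
    }

    // MATCHED -> true, HAS_NULL -> null (unknown), NOT_MATCHED -> false.
    private static Boolean toResult(int state) {
        if (state == MATCHED) return Boolean.TRUE;
        if (state == HAS_NULL) return null;
        return Boolean.FALSE;
    }
}
```

In this model both evaluators agree for every combination of value and list, because the specialized steps only omit branches that could never be taken for that element.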
### How was this patch tested?
This is a behavior-preserving change covered by `PredicateSuite` (70 tests,
including `In` with null list elements and a null left-hand value); all pass.
Additionally verified by re-dumping the TPC-DS whole-stage codegen: the ~348
dead `if (false)` blocks in `In` are gone and every generated subtree still
compiles.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.8)