Github user kiszk commented on a diff in the pull request:
https://github.com/apache/spark/pull/19901#discussion_r155189797
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/conditionalExpressions.scala
---
@@ -180,13 +180,15 @@ case class CaseWhen(
}
override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
- // This variable represents whether the first successful condition is met or not.
- // It is initialized to `false` and it is set to `true` when the first condition which
- // evaluates to `true` is met and therefore is not needed to go on anymore on the computation
- // of the following conditions.
- val conditionMet = ctx.freshName("caseWhenConditionMet")
- ctx.addMutableState(ctx.JAVA_BOOLEAN, ev.isNull)
- ctx.addMutableState(ctx.javaType(dataType), ev.value)
+ // This variable represents whether the evaluated result is null or not. It's a byte value
+ // instead of boolean because it carries an extra information about if the case-when condition
+ // is met or not. It is initialized to `-1`, which means the condition is not met yet and the
+ // result is unknown. When the first condition is met, it is set to `1` if result is null, or
+ // `0` if result is not null. We won't go on anymore on the computation if it's set to `1` or
+ // `0`.
+ val resultIsNull = ctx.freshName("caseWhenResultIsNull")
--- End diff --
Could you elaborate on how you would use `boolean` in the case of
`CaseWhen`?
IIUC, `coalesce` requires only two states: `not met with isNull = true` or
`met with isNull = false`. Thus, we can use a `boolean` value. `CaseWhen`
requires three states: `not met`, `met with isNull = false`, or `met with
isNull = true`. Thus, we introduce a `byte` value. If we used a `boolean`
value as in `coalesce`, we would lose the state `met with isNull = true`.
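
To make the state argument concrete, here is a minimal sketch in plain Java (the language the codegen emits). It is not the code generated by this PR: the class name, constants, and both methods are hypothetical, and a Java `null` stands in for a SQL NULL result. It only illustrates why a `coalesce`-style `boolean` cannot tell `not met` apart from `met with isNull = true`.

```java
// Hypothetical sketch; names are illustrative and do not come from Spark.
class CaseWhenStateSketch {
    // Encoding described in the diff comment above.
    static final byte NOT_MET = -1;      // no condition has matched yet
    static final byte MET_NOT_NULL = 0;  // a condition matched, result is not null
    static final byte MET_NULL = 1;      // a condition matched, result is null

    // Three-state version: stop at the first branch whose condition is true,
    // whether or not its value is null.
    static byte evaluateWithByte(boolean[] conditions, Integer[] values) {
        byte resultIsNull = NOT_MET;
        for (int i = 0; i < conditions.length && resultIsNull == NOT_MET; i++) {
            if (conditions[i]) {
                resultIsNull = (values[i] == null) ? MET_NULL : MET_NOT_NULL;
            }
        }
        return resultIsNull;
    }

    // Two-state version in the style of coalesce: `isNull == true` means both
    // "not met" and "met with a null result", so the loop cannot stop on a
    // matched branch that yields null and keeps evaluating later branches.
    static boolean evaluateWithBoolean(boolean[] conditions, Integer[] values) {
        boolean isNull = true;
        for (int i = 0; i < conditions.length && isNull; i++) {
            if (conditions[i]) {
                isNull = (values[i] == null);
            }
        }
        return isNull;
    }
}
```

With conditions `{true, true}` and values `{null, 5}`, the `byte` version stops at the first branch and reports `MET_NULL`, while the `boolean` version falls through to the second branch and reports a non-null result, which is not what `CASE WHEN` should return.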
---