kosiew opened a new pull request, #18191:
URL: https://github.com/apache/datafusion/pull/18191
## Which issue does this PR close?
* Closes #17983
## Rationale for this change
The current implementation of the `nvl2` function in DataFusion eagerly
evaluates all its arguments, which can lead to unnecessary computation and
incorrect behavior when handling expressions that should only be conditionally
evaluated. This PR introduces **lazy evaluation** for `nvl2`, aligning its
behavior with other conditional expressions like `coalesce` and improving both
performance and correctness.
This change also introduces a **simplification rule** that rewrites `nvl2`
expressions into equivalent `CASE` statements, allowing for better optimization
during query planning and execution.
## What changes are included in this PR?
* Refactored `nvl2` implementation in
`datafusion/functions/src/core/nvl2.rs`:
* Added support for **short-circuit (lazy) evaluation** using
`short_circuits()`.
* Implemented **simplify()** method to rewrite expressions into `CASE`
form.
* Introduced **return_field_from_args()** for correct nullability and type
inference.
* Replaced the previous eager `nvl2_func()` logic with an optimized, more
declarative approach.
* Added comprehensive **unit tests**:
* `test_nvl2_short_circuit` in `dataframe_functions.rs` verifies correct
short-circuit behavior.
* `test_create_physical_expr_nvl2` in `expr_api/mod.rs` validates physical
expression creation and output correctness.
## Are these changes tested?
✅ Yes, multiple new tests are included:
* **`test_nvl2_short_circuit`** ensures `nvl2` does not evaluate unnecessary
branches.
* **`test_create_physical_expr_nvl2`** checks the correctness of evaluation
and type coercion behavior.
All existing and new tests pass successfully.
## Are there any user-facing changes?
Yes, but they are **non-breaking** and **performance-enhancing**:
* `nvl2` now evaluates lazily, meaning only the required branch is computed
based on the nullity of the test expression.
* Expression simplification will yield more optimized query plans.
There are **no API-breaking changes**. However, users may observe improved
performance and reduced computation for expressions involving `nvl2`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]