Re: [PR] Add blog post regarding CASE work [datafusion-site]

via GitHub Mon, 22 Dec 2025 06:22:12 -0800


pepijnve commented on code in PR #122:
URL: https://github.com/apache/datafusion-site/pull/122#discussion_r2640038067



##########
content/blog/2025-11-11-datafusion_case.md:
##########
@@ -0,0 +1,377 @@
+---
+layout: post
+title: Optimizing CASE Expression Evaluation
+date: 2025-11-11
+author: Pepijn Van Eeckhoudt
+categories: [features]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+[TOC]
+
+<style>
+figure {
+  margin: 20px 0;
+}
+
+figure img {
+  display: block;
+  max-width: 80%;
+  margin: auto;
+}
+
+figcaption {
+  font-style: italic;
+  color: #555;
+  font-size: 0.9em;
+  max-width: 80%;
+  margin: auto;
+  text-align: center;
+}
+</style>
+
+# Optimizing CASE Expression Evaluation in DataFusion
+
+SQL's `CASE` expression is one of the few constructs the language provides to 
perform conditional logic.
+Its deceptively simple syntax hides significant implementation complexity.
+Over the past few weeks, we've landed a series of improvements to DataFusion's 
`CASE` expression evaluator that reduce both CPU time and memory allocations.
+This post walks through the original implementation, its performance 
bottlenecks, and how we addressed them step by step.
+Finally we'll also take a look at some future improvements to `CASE` that are 
in the works.
+
+## Background: CASE Expression Evaluation
+
+SQL supports two forms of CASE expressions:
+
+1. **Simple**: `CASE expr WHEN value1 THEN result1 WHEN value2 THEN result2 
... END`
+2. **Searched**: `CASE WHEN condition1 THEN result1 WHEN condition2 THEN 
result2 ... END`
+
+The simple form evaluates an expression once for each input row and then tests 
that value against the constants (or expressions) in each `WHEN` clause using 
equality comparisons.
+Think of it as a limited Rust `match` expression.
+
+Here's a simple example:
+
+```sql
+CASE status
+    WHEN 'pending' THEN 1
+    WHEN 'active' THEN 2
+    WHEN 'complete' THEN 3
+    ELSE 0
+END
+```
+
+In this `CASE` expression, `status` is evaluated once per row, and then its 
value is tested for equality with the values `'pending'`, `'active'`, and 
`'complete'` in that order.
+The `THEN` expression value for the first matching `WHEN` expression is 
returned per row.
+
+The searched `CASE` form is a more flexible variant.
+It evaluates completely independent boolean expressions for each branch.
+
+This allows you to test different columns with different operators per branch 
as can be seen in the following example:
+
+```sql
+CASE
+    WHEN age > 65 THEN 'senior'
+    WHEN childCount != 0 THEN 'parent'
+    WHEN age < 21 THEN 'minor'
+    ELSE 'adult'
+END
+```
+
+In both forms, branches are evaluated sequentially with short-circuit 
semantics: for each row, once a `WHEN` condition matches, the corresponding 
`THEN` expression is evaluated. Any further branches are not evaluated for that 
row.
+This lazy evaluation model is critical for correctness.
+It let's you safely write `CASE` expressions like
+
+```sql
+CASE
+    WHEN denominator == 0 THEN NULL
+    ELSE nominator / denominator
+END
+```
+
+that are guaranteed to not trigger divide-by-zero errors.
+
+## `CASE` Evaluation in DataFusion 50.0.0
+
+For the rest of this post we'll be looking at 'searched case' evaluation.
+'Simple case' uses a distinct, but very similar implementation.
+The same set of improvements has been applied to both.
+
+The baseline implementation in DataFusion 50.0.0 evaluated `CASE` using a 
straightforward approach:
+
+1. Start with an output array `out` with the same length as the input batch, 
filled with nulls. Additionally, create a bit vector `remainder` with the same 
length and each value set to `true`.
+2. For each `WHEN`/`THEN` branch:
+  - Evaluate the `WHEN` condition for remaining unmatched rows using 
`PhysicalExpr::evaluate_selection`, passing in the input batch and the 
`remainder` mask
+  - If any rows matched, evaluate the `THEN` expression for those rows using 
`PhysicalExpr::evaluate_selection`
+  - Merge the results into the `out` using the `zip` kernel
+  - Update the `remainder` mask to exclude matched rows
+3. If there's an `ELSE` clause, evaluate it for any remaining unmatched rows 
and merge using `zip`
+
+Here's a simplified version of the original loop:
+
+```rust
+let mut current_value = new_null_array(&return_type, batch.num_rows());
+let mut remainder = BooleanArray::from(vec![true; batch.num_rows()]);
+
+for (when_expr, then_expr) in &self.when_then_expr {
+    let when_value = when_expr.evaluate_selection(batch, &remainder)?
+        .into_array(batch.num_rows())?;
+    let when_value = and(&when_value, &remainder)?;
+
+    if when_value.true_count() == 0 {
+        continue;
+    }
+
+    let then_value = then_expr.evaluate_selection(batch, &when_value)?;
+    current_value = zip(&when_value, &then_value, &current_value)?;
+    remainder = and_not(&remainder, &when_value)?;
+}
+```
+
+While correct, this implementation had several performance issues mostly 
related to the usage of `evaluate_selection`.
+To understand why, we need to dig a little deeper into the implementation of 
that function.
+Here's a simplified version of it that captures the relevant parts:
+
+```rust
+pub trait PhysicalExpr {
+    fn evaluate_selection(
+        &self,
+        batch: &RecordBatch,
+        selection: &BooleanArray,
+    ) -> Result<ColumnarValue> {
+        let filtered_batch = filter_record_batch(batch, selection)?;
+        let filtered_result = self.evaluate(&filtered_batch)?;
+        scatter(selection, filtered_result)
+    }
+}
+```
+
+The `evaluate_selection` method first filters the input batch to only include 
rows that match the `selection` mask.
+It then calls the regular `evaluate` method using the filtered batch as input.
+Finally, to return a result array with the same number of rows as `batch`, the 
`scatter` function is called.
+This function produces a new array padded with `null` values for any rows that 
didn't match the `selection` mask.
+
+So how does the simple evaluation strategy and use of `evaluate_selection` 
cause performance overhead?
+
+### Problem 1: No Early Exit

Review Comment:
   You're absolutely right, the current wording is too negative. I'll revise 
this.



##########
content/blog/2025-11-11-datafusion_case.md:
##########
@@ -0,0 +1,377 @@
+---
+layout: post
+title: Optimizing CASE Expression Evaluation
+date: 2025-11-11
+author: Pepijn Van Eeckhoudt
+categories: [features]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+[TOC]
+
+<style>
+figure {
+  margin: 20px 0;
+}
+
+figure img {
+  display: block;
+  max-width: 80%;
+  margin: auto;
+}
+
+figcaption {
+  font-style: italic;
+  color: #555;
+  font-size: 0.9em;
+  max-width: 80%;
+  margin: auto;
+  text-align: center;
+}
+</style>
+
+# Optimizing CASE Expression Evaluation in DataFusion
+
+SQL's `CASE` expression is one of the few constructs the language provides to 
perform conditional logic.
+Its deceptively simple syntax hides significant implementation complexity.
+Over the past few weeks, we've landed a series of improvements to DataFusion's 
`CASE` expression evaluator that reduce both CPU time and memory allocations.
+This post walks through the original implementation, its performance 
bottlenecks, and how we addressed them step by step.
+Finally we'll also take a look at some future improvements to `CASE` that are 
in the works.
+
+## Background: CASE Expression Evaluation
+
+SQL supports two forms of CASE expressions:
+
+1. **Simple**: `CASE expr WHEN value1 THEN result1 WHEN value2 THEN result2 
... END`
+2. **Searched**: `CASE WHEN condition1 THEN result1 WHEN condition2 THEN 
result2 ... END`
+
+The simple form evaluates an expression once for each input row and then tests 
that value against the constants (or expressions) in each `WHEN` clause using 
equality comparisons.
+Think of it as a limited Rust `match` expression.
+
+Here's a simple example:
+
+```sql
+CASE status
+    WHEN 'pending' THEN 1
+    WHEN 'active' THEN 2
+    WHEN 'complete' THEN 3
+    ELSE 0
+END
+```
+
+In this `CASE` expression, `status` is evaluated once per row, and then its 
value is tested for equality with the values `'pending'`, `'active'`, and 
`'complete'` in that order.
+The `THEN` expression value for the first matching `WHEN` expression is 
returned per row.
+
+The searched `CASE` form is a more flexible variant.
+It evaluates completely independent boolean expressions for each branch.
+
+This allows you to test different columns with different operators per branch 
as can be seen in the following example:
+
+```sql
+CASE
+    WHEN age > 65 THEN 'senior'
+    WHEN childCount != 0 THEN 'parent'
+    WHEN age < 21 THEN 'minor'
+    ELSE 'adult'
+END
+```
+
+In both forms, branches are evaluated sequentially with short-circuit 
semantics: for each row, once a `WHEN` condition matches, the corresponding 
`THEN` expression is evaluated. Any further branches are not evaluated for that 
row.
+This lazy evaluation model is critical for correctness.
+It let's you safely write `CASE` expressions like
+
+```sql
+CASE
+    WHEN denominator == 0 THEN NULL
+    ELSE nominator / denominator
+END
+```
+
+that are guaranteed to not trigger divide-by-zero errors.
+
+## `CASE` Evaluation in DataFusion 50.0.0
+
+For the rest of this post we'll be looking at 'searched case' evaluation.
+'Simple case' uses a distinct, but very similar implementation.
+The same set of improvements has been applied to both.
+
+The baseline implementation in DataFusion 50.0.0 evaluated `CASE` using a 
straightforward approach:
+
+1. Start with an output array `out` with the same length as the input batch, 
filled with nulls. Additionally, create a bit vector `remainder` with the same 
length and each value set to `true`.
+2. For each `WHEN`/`THEN` branch:
+  - Evaluate the `WHEN` condition for remaining unmatched rows using 
`PhysicalExpr::evaluate_selection`, passing in the input batch and the 
`remainder` mask
+  - If any rows matched, evaluate the `THEN` expression for those rows using 
`PhysicalExpr::evaluate_selection`
+  - Merge the results into the `out` using the `zip` kernel
+  - Update the `remainder` mask to exclude matched rows
+3. If there's an `ELSE` clause, evaluate it for any remaining unmatched rows 
and merge using `zip`
+
+Here's a simplified version of the original loop:
+
+```rust
+let mut current_value = new_null_array(&return_type, batch.num_rows());
+let mut remainder = BooleanArray::from(vec![true; batch.num_rows()]);
+
+for (when_expr, then_expr) in &self.when_then_expr {
+    let when_value = when_expr.evaluate_selection(batch, &remainder)?
+        .into_array(batch.num_rows())?;
+    let when_value = and(&when_value, &remainder)?;
+
+    if when_value.true_count() == 0 {
+        continue;
+    }
+
+    let then_value = then_expr.evaluate_selection(batch, &when_value)?;
+    current_value = zip(&when_value, &then_value, &current_value)?;
+    remainder = and_not(&remainder, &when_value)?;
+}
+```
+
+While correct, this implementation had several performance issues mostly 
related to the usage of `evaluate_selection`.

Review Comment:
   Fixed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Add blog post regarding CASE work [datafusion-site]

Reply via email to