alamb commented on code in PR #10260:
URL: https://github.com/apache/datafusion/pull/10260#discussion_r1583694523
##########
datafusion/expr/src/expr_fn.rs:
##########
@@ -215,6 +215,18 @@ pub fn count(expr: Expr) -> Expr {
))
}
+/// Create an expression to represent the COUNT(IS NULL x) aggregate function
Review Comment:
I found this confusing at first because it isn't `COUNT(x IS NULL)` -- I
think a more accurate description is `COUNT(*) FILTER (WHERE x IS NULL)` or
something like that
```suggestion
/// Create an aggregate that returns the number of `NULL` values in a column
```
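To make the distinction concrete, here is a minimal plain-Rust sketch with no DataFusion dependency. It models a nullable column as a slice of `Option<i64>`. The helper names `count_of_is_null` and `count_nulls` are hypothetical and only mimic the SQL semantics: `COUNT(expr)` skips rows where `expr` is NULL, and `x IS NULL` itself is never NULL.

```rust
/// Mimics SQL `COUNT(x IS NULL)`. The predicate `x IS NULL` evaluates to
/// true or false, never NULL, so COUNT sees a non-NULL input on every row.
fn count_of_is_null(column: &[Option<i64>]) -> usize {
    column
        .iter()
        .map(|v| Some(v.is_none())) // `x IS NULL` is always a concrete bool
        .filter(|b| b.is_some()) // COUNT only skips NULL inputs
        .count()
}

/// Counts the rows whose value is NULL, which is what the doc comment
/// suggestion describes.
fn count_nulls(column: &[Option<i64>]) -> usize {
    column.iter().filter(|v| v.is_none()).count()
}

fn main() {
    let x = vec![Some(1), None, Some(3), None, None];
    // The first result equals the total row count, not the NULL count.
    println!("COUNT(x IS NULL) = {}", count_of_is_null(&x));
    println!("number of NULLs  = {}", count_nulls(&x));
}
```

In this model the first helper always returns the number of rows, which is why describing the new function as `COUNT(x IS NULL)` would be misleading.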
##########
datafusion/core/src/dataframe/mod.rs:
##########
@@ -534,7 +534,7 @@ impl DataFrame {
vec![],
original_schema_fields
.clone()
- .map(|f| count(is_null(col(f.name()))).alias(f.name()))
+ .map(|f| count_null(col(f.name())).alias(f.name()))
Review Comment:
Got it. Makes sense to me. Another classic trick is to use a CASE expression,
something like
```sql
SELECT SUM(CASE WHEN x IS NULL THEN 1 ELSE 0 END) as null_count ...
```
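As a rough illustration of what that query computes, the following plain-Rust sketch maps each value to 1 or 0 and sums the result. It does not use the DataFusion API; the function name `null_count_via_case` is hypothetical, and the column is modeled as a slice of `Option<i64>`. One difference to keep in mind: SQL `SUM` over zero rows yields NULL, whereas this sketch returns 0.

```rust
/// Mimics `SUM(CASE WHEN x IS NULL THEN 1 ELSE 0 END)` over a nullable column.
fn null_count_via_case(column: &[Option<i64>]) -> i64 {
    column
        .iter()
        .map(|v| if v.is_none() { 1 } else { 0 }) // the CASE expression
        .sum() // the SUM aggregate
}

fn main() {
    let x = vec![Some(10), None, None, Some(40)];
    println!("null_count = {}", null_count_via_case(&x));
}
```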
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]