jayzhan211 commented on code in PR #14289:
URL: https://github.com/apache/datafusion/pull/14289#discussion_r1936645403
##########
datafusion/physical-expr/src/scalar_function.rs:
##########
@@ -186,6 +187,15 @@ impl PhysicalExpr for ScalarFunctionExpr {
.map(|e| e.evaluate(batch))
.collect::<Result<Vec<_>>>()?;
+ if self.fun.signature().null_handling == NullHandling::Propagate
+ && args.iter().any(
+ |arg| matches!(arg, ColumnarValue::Scalar(scalar) if
scalar.is_null()),
Review Comment:
> Ok, I think I understand this now. If the function is called with a single
set of arguments then each arg will be a `ColumnarValue::Scalar`. However, if
the function is called on a batch of arguments, then each arg will be a
`ColumnarValue::Array` of all the arguments. So this does not work in the batch
case.
>
> What we'd really like is to identify all indexes, `i`, s.t. one of the
args at index `i` is `Null`. Then somehow skip all rows at the identified
indexes and immediately return `Null` for those. That seems a little tricky
because it looks like we pass the entire `ArrayRef` to the function
implementation.
I don't think we need to peek the null for column case, they should be
specific logic handled for each function. For scalar case, since most of the
scalar function returns null if any one of args is null, it is beneficial to
introduce another null handling method. It is just convenient method nice to
have but not the must have solution for null handling since they can be handled
in 'invoke' as well.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]