jayzhan211 commented on code in PR #14289: URL: https://github.com/apache/datafusion/pull/14289#discussion_r1936645403
########## datafusion/physical-expr/src/scalar_function.rs: ########## @@ -186,6 +187,15 @@ impl PhysicalExpr for ScalarFunctionExpr { .map(|e| e.evaluate(batch)) .collect::<Result<Vec<_>>>()?; + if self.fun.signature().null_handling == NullHandling::Propagate + && args.iter().any( + |arg| matches!(arg, ColumnarValue::Scalar(scalar) if scalar.is_null()), Review Comment: > Ok, I think I understand this now. If the function is called with a single set of arguments then each arg will be a `ColumnarValue::Scalar`. However, if the function is called on a batch of arguments, then each arg will be a `ColumnarValue::Array` of all the arguments. So this does not work in the batch case. > > What we'd really like is to identify all indexes, `i`, s.t. one of the args at index `i` is `Null`. Then somehow skip all rows at the identified indexes and immediately return `Null` for those. That seems a little tricky because it looks like we pass the entire `ArrayRef` to the function implementation. I don't think we need to peek the null for column case, they should be specific logic handled for each function. For scalar case, since most of the scalar function returns null if any one of args is null, it is beneficial to introduce another null handling method. It is just convenient method nice to have but not the must have solution for null handling since they can be handled in 'invoke' as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org