jorgecarleitao commented on a change in pull request #303:
URL: https://github.com/apache/arrow-datafusion/pull/303#discussion_r630285020
##########
File path: datafusion/src/physical_plan/functions.rs
##########
@@ -1373,20 +1370,26 @@ impl PhysicalExpr for ScalarFunctionExpr {
}
fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue> {
- // evaluate the arguments
- let inputs = self
- .args
- .iter()
- .map(|e| e.evaluate(batch))
- .collect::<Result<Vec<_>>>()?;
+ // evaluate the arguments, if there are no arguments we'll instead pass in a null array of
+ // batch size (as a convention)
+ let inputs = match self.args.len() {
+ 0 => vec![ColumnarValue::Array(Arc::new(NullArray::new(
Review comment:
Note that `NullArray` is composed of zero buffers, zero children, no
validity bitmap, and one datatype, so the cost of instantiating it is really
small. The advantage over a `ScalarValue` is that the semantics of getting a
length are preserved: use `array.len()` as with any other array.
I am not married to either option; I was just trying to think about this from
a documentation perspective:
> We support zero-argument UDFs. They MUST be declared as accepting zero
arguments and the function signature MUST be a single argument. DataFusion will
pass an `Array` to it, from which you can retrieve its length via
`Array::len()`. The function MUST return an array whose number of rows equals
the length of the array.
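To make the convention above concrete, here is a minimal, std-only sketch. The `Array` trait and `NullArray` struct below are hypothetical stand-ins written for illustration, not the real arrow types, and `constant_pi` is an invented example UDF. It only shows the shape of the contract: one placeholder array in, one row out per placeholder row.

```rust
// Hypothetical stand-in for arrow's `Array` trait (NOT the real API):
// the only thing a zero-argument UDF needs from its input is a length.
trait Array {
    fn len(&self) -> usize;
}

// Hypothetical stand-in for `NullArray`: it owns no data, only a row count,
// which is why building one per batch is cheap.
struct NullArray {
    len: usize,
}

impl Array for NullArray {
    fn len(&self) -> usize {
        self.len
    }
}

// An invented zero-argument UDF under the proposed convention: it is declared
// with zero arguments, but is invoked with exactly one placeholder array and
// must return as many rows as that array has.
fn constant_pi(args: &[&dyn Array]) -> Vec<f64> {
    assert_eq!(args.len(), 1, "expected exactly one placeholder array");
    vec![std::f64::consts::PI; args[0].len()]
}
```

Calling `constant_pi(&[&NullArray { len: 4 }])` would produce four rows, one per row of the batch the placeholder stands for.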
If we instead pass a scalar of some type and the evaluation is distributed, I
believe that we will have to serialize `Scalar -> Array` in Ballista.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]