alamb commented on code in PR #3765:
URL: https://github.com/apache/arrow-datafusion/pull/3765#discussion_r992268369
##########
datafusion/physical-expr/src/functions.rs:
##########
@@ -266,16 +288,20 @@ where
ColumnarValue::Array(a) => Some(a.len()),
});
- // to array
- let args = if let Some(len) = len {
- args.iter()
- .map(|arg| arg.clone().into_array(len))
- .collect::<Vec<ArrayRef>>()
- } else {
- args.iter()
- .map(|arg| arg.clone().into_array(1))
- .collect::<Vec<ArrayRef>>()
- };
+ let inferred_length = len.unwrap_or(1);
+ let args = args
Review Comment:
Not sure how functional you want, but you can probably use
`.zip(hints.iter())` here and make it more suscinct
You may have to extend hints too with `chain(std::iter::repeat(false))` or
something to ensure all args are processed
##########
datafusion/physical-expr/src/functions.rs:
##########
@@ -2871,4 +2897,146 @@ mod tests {
Ok(())
}
+
+ fn dummy_function(args: &[ArrayRef]) -> Result<ArrayRef> {
+ let result: UInt64Array =
+ args.iter().map(|array| Some(array.len() as u64)).collect();
+ Ok(Arc::new(result) as ArrayRef)
+ }
+
+ fn unpack_uint64_array(col: Result<ColumnarValue>) -> Result<Vec<u64>> {
+ match col? {
+ ColumnarValue::Array(array) => Ok(array
+ .as_any()
+ .downcast_ref::<UInt64Array>()
+ .unwrap()
+ .values()
+ .to_vec()),
+ ColumnarValue::Scalar(_) => Err(DataFusionError::Internal(
+ "Unexpected scalar created by a test function".to_string(),
+ )),
+ }
+ }
+
+ #[test]
+ fn test_make_scalar_function() -> Result<()> {
+ let adapter_func = make_scalar_function(dummy_function);
+
+ let scalar_arg = ColumnarValue::Scalar(ScalarValue::Int64(Some(1)));
+ let array_arg =
+
ColumnarValue::Array(ScalarValue::Int64(Some(1)).to_array_of_size(5));
+ let result = unpack_uint64_array(adapter_func(&[array_arg,
scalar_arg]))?;
+ assert_eq!(result.len(), 2);
+ assert_eq!(result[0], 5);
+ assert_eq!(result[1], 5);
Review Comment:
You can also write this like
```suggestion
assert_eq!(result, vec![5, 5]));
```
Which might make the intent clearer as well as make updating these tests
easier if that was ever required
##########
datafusion/physical-expr/src/functions.rs:
##########
@@ -253,6 +253,28 @@ macro_rules! invoke_if_unicode_expressions_feature_flag {
/// decorates a function to handle [`ScalarValue`]s by converting them to
arrays before calling the function
/// and vice-versa after evaluation.
pub fn make_scalar_function<F>(inner: F) -> ScalarFunctionImplementation
+where
+ F: Fn(&[ArrayRef]) -> Result<ArrayRef> + Sync + Send + 'static,
+{
+ make_scalar_function_with_hints(inner, vec![])
+}
+
+/// Just like [`make_scalar_function`], decorates the given function to handle
both [`ScalarValue`]s and arrays.
+/// Additionally can receive a `hints` vector which can be used to control the
output arrays when generating them
+/// from [`ScalarValue`]s.
+///
+/// Each element of the `hints` vector gets mapped to the corresponding
argument of the function. The number of hints
+/// can be less or greater than the number of arguments (for functions with
variable number of arguments). Each unmapped
+/// argument will assume the default hint.
+///
+/// Hints:
Review Comment:
I suggest changing this from `true` / `false` to some sort of enum to make
the code potentially easier to read (not having to remember what true / false
mean)
So for example, perhaps something like
```rust
enum Hint {
/// Indicates the argument needs to be padded if it is scalar
Pad,
/// indicates the argument can be converted to an array of length 1
ToArray
}
```
Or something like that
##########
datafusion/physical-expr/src/functions.rs:
##########
@@ -253,6 +253,28 @@ macro_rules! invoke_if_unicode_expressions_feature_flag {
/// decorates a function to handle [`ScalarValue`]s by converting them to
arrays before calling the function
/// and vice-versa after evaluation.
pub fn make_scalar_function<F>(inner: F) -> ScalarFunctionImplementation
+where
+ F: Fn(&[ArrayRef]) -> Result<ArrayRef> + Sync + Send + 'static,
+{
+ make_scalar_function_with_hints(inner, vec![])
+}
+
+/// Just like [`make_scalar_function`], decorates the given function to handle
both [`ScalarValue`]s and arrays.
+/// Additionally can receive a `hints` vector which can be used to control the
output arrays when generating them
+/// from [`ScalarValue`]s.
+///
+/// Each element of the `hints` vector gets mapped to the corresponding
argument of the function. The number of hints
+/// can be less or greater than the number of arguments (for functions with
variable number of arguments). Each unmapped
+/// argument will assume the default hint.
+///
+/// Hints:
+/// - (default) `false`: indicates the argument needs to be padded if it
is scalar
+/// - `true`: indicates the argument can be converted to an array of
length 1
+///
+pub(crate) fn make_scalar_function_with_hints<F>(
Review Comment:
The fact that this is `pub(crate)` is good given its specialized nature (and
potential possibility of change)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]