alamb commented on code in PR #9114:
URL: https://github.com/apache/arrow-datafusion/pull/9114#discussion_r1476239827


##########
datafusion-examples/examples/simple_udf.rs:
##########
@@ -71,13 +70,15 @@ async fn main() -> Result<()> {
         // this is guaranteed by DataFusion based on the function's signature.
         assert_eq!(args.len(), 2);
 
-        let args = columnar_values_to_array(args)?;
+        // Expand the arguments to arrays (this is simple, but inefficient for
+        // single constant values).
+        let args = ColumnarValue::values_to_arrays(args)?;

Review Comment:
   This is the basic change -- make the logic a function of `ColumnarValue`



##########
datafusion/expr/src/columnar_value.rs:
##########
@@ -75,4 +75,166 @@ impl ColumnarValue {
     pub fn create_null_array(num_rows: usize) -> Self {
         ColumnarValue::Array(Arc::new(NullArray::new(num_rows)))
     }
+
+    /// Converts  [`ColumnarValue`]s to [`ArrayRef`]s with the same length.
+    ///
+    /// # Performance Note
+    ///
+    /// This function expands any [`ScalarValue`] to an array. This expansion
+    /// permits using a single function in terms of arrays, but it can be
+    /// inefficient compared to handling the scalar value directly.
+    ///
+    /// Thus, It is recommended to provide specialized implementations for
+    /// scalar values if performance is a concern.
+    ///
+    /// # Errors
+    ///
+    /// If there are multiple array arguments that have different lengths
+    pub fn values_to_arrays(args: &[ColumnarValue]) -> Result<Vec<ArrayRef>> {

Review Comment:
   This is a different algorithm than `columnar_values_to_array` as  it also 
handles mixed `ScalarValue` and `ArrayRef`s



##########
datafusion/expr/src/columnar_value.rs:
##########
@@ -75,4 +75,166 @@ impl ColumnarValue {
     pub fn create_null_array(num_rows: usize) -> Self {
         ColumnarValue::Array(Arc::new(NullArray::new(num_rows)))
     }
+
+    /// Converts  [`ColumnarValue`]s to [`ArrayRef`]s with the same length.
+    ///
+    /// # Performance Note
+    ///
+    /// This function expands any [`ScalarValue`] to an array. This expansion
+    /// permits using a single function in terms of arrays, but it can be
+    /// inefficient compared to handling the scalar value directly.
+    ///
+    /// Thus, It is recommended to provide specialized implementations for
+    /// scalar values if performance is a concern.
+    ///
+    /// # Errors
+    ///
+    /// If there are multiple array arguments that have different lengths
+    pub fn values_to_arrays(args: &[ColumnarValue]) -> Result<Vec<ArrayRef>> {
+        if args.is_empty() {
+            return Ok(vec![]);
+        }
+
+        let mut array_len = None;
+        for arg in args {
+            array_len = match (arg, array_len) {
+                (ColumnarValue::Array(a), None) => Some(a.len()),
+                (ColumnarValue::Array(a), Some(array_len)) => {
+                    if array_len == a.len() {
+                        Some(array_len)
+                    } else {
+                        return internal_err!(
+                            "Arguments has mixed length. Expected length: 
{array_len}, found length: {}",a.len()
+                        );
+                    }
+                }
+                (ColumnarValue::Scalar(_), array_len) => array_len,
+            }
+        }
+
+        // If array_len is none, it means there are only scalars, so make a 1 
element array
+        let inferred_length = array_len.unwrap_or(1);
+
+        let args = args
+            .iter()
+            .map(|arg| arg.clone().into_array(inferred_length))
+            .collect::<Result<Vec<_>>>()?;
+
+        Ok(args)
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn values_to_arrays() {

Review Comment:
   new tests



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to