tlm365 commented on code in PR #16087:
URL: https://github.com/apache/datafusion/pull/16087#discussion_r2095176682


##########
datafusion/functions/src/string/ascii.rs:
##########
@@ -103,19 +106,29 @@ impl ScalarUDFImpl for AsciiFunc {
 
 fn calculate_ascii<'a, V>(array: V) -> Result<ArrayRef, ArrowError>
 where
-    V: ArrayAccessor<Item = &'a str>,
+    V: StringArrayType<'a, Item = &'a str>,
 {
-    let iter = ArrayIter::new(array);
-    let result = iter
-        .map(|string| {
-            string.map(|s| {
-                let mut chars = s.chars();
-                chars.next().map_or(0, |v| v as i32)
-            })
-        })
-        .collect::<Int32Array>();
-
-    Ok(Arc::new(result) as ArrayRef)
+    let mut values = Vec::with_capacity(array.len());

Review Comment:
   I updated the code and the benchmark. I also tested this version:
   ```rust
   fn calculate_ascii<'a, V>(array: V) -> Result<ArrayRef, ArrowError>
   where
       V: StringArrayType<'a, Item = &'a str>,
   {
       let nulls = array.nulls();
   
       let values: Vec<i32> = if nulls.map_or(false, |n| n.null_count() > 0) {
           // Nulls present: handle each element with null-check
           (0..array.len())
               .map(|i| {
                   if array.is_null(i) {
                       0
                   } else {
                       let s = array.value(i);
                       s.chars().next().map_or(0, |c| c as i32)
                   }
               })
               .collect()
       } else {
           // Fast path: no nulls, so skip the per-element null check
           (0..array.len())
               .map(|i| {
                   // SAFETY: `i` is in `0..array.len()`, so it is in bounds
                   let s = unsafe { array.value_unchecked(i) };
                   s.chars().next().map_or(0, |c| c as i32)
               })
               .collect()
       };
   
       let array = Int32Array::new(values.into(), nulls.cloned());
       Ok(Arc::new(array))
   }
   ```
   but the performance difference is not significant, so I chose the current
   version to keep the code simple.
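
   For reference, here is a std-only sketch of the per-element semantics in the
   snippet above. It does not use Arrow: `first_code_point` and `ascii_values`
   are hypothetical names, `None` stands in for a null slot, and the
   `Vec<bool>` stands in for the null buffer. It shows that a null slot and an
   empty string both store `0` in the values buffer, and only the validity
   mask tells them apart.

   ```rust
   /// Hypothetical helper (not part of DataFusion): code point of the first
   /// character of `s`, or 0 if `s` is empty.
   fn first_code_point(s: &str) -> i32 {
       s.chars().next().map_or(0, |c| c as i32)
   }

   /// Std-only stand-in for the Arrow version: returns the values buffer and a
   /// validity mask (`false` marks a null slot).
   fn ascii_values(input: &[Option<&str>]) -> (Vec<i32>, Vec<bool>) {
       let validity: Vec<bool> = input.iter().map(|v| v.is_some()).collect();
       // Null slots get a placeholder 0; the validity mask hides it from readers.
       let values: Vec<i32> = input
           .iter()
           .map(|v| v.map_or(0, first_code_point))
           .collect();
       (values, validity)
   }

   fn main() {
       let (values, validity) = ascii_values(&[Some("abc"), None, Some(""), Some("é")]);
       println!("{values:?} {validity:?}");
   }
   ```

   The placeholder under a null slot is arbitrary, which is why the snippet
   above can write `0` there and reuse the input's null buffer unchanged.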



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

