alamb commented on code in PR #10054:
URL: 
https://github.com/apache/arrow-datafusion/pull/10054#discussion_r1567433074


##########
datafusion/functions/src/string/overlay.rs:
##########
@@ -88,10 +105,13 @@ pub fn overlay<T: OffsetSizeTrait>(args: &[ArrayRef]) -> Result<ArrayRef> {
             let characters_array = as_generic_string_array::<T>(&args[1])?;
             let pos_num = as_int64_array(&args[2])?;
 
+            let characters_array_iter = adaptive_array_iter(characters_array.iter());

Review Comment:
   > Thank you @alamb for your suggestion. However, if we were to make changes 
according to your suggestion, many functions would require significant 
modifications, such as `lpad`, `rpad`, `strpos`, and so on. 
   
   Yes, I agree with this assessment.
   
   > Here, I am seeking a universal and elegant method: based on the current 
function implementations, with only minimal modifications, we can optimize for 
the Scalar case without degrading the performance for Arrays. 
   
   Thank you -- this makes sense.
   
   I think the only way to do this is to have compile time specialized 
implementations for the cases you want to optimize. 
   
   > (Maybe my thinking is not correct, and I need your help to provide 
suggestions for modification 🙏)
   
   I suspect the solution will look something like a scalar (Rust) function 
that actually implements the operation, and then a macro / generic that 
instantiates the function in different ways depending on the argument types.
   
   For example
   ```rust
   /// do the actual operation, calling `output` for each string produced
   fn overlay3<F: Fn(&str)>(output: F, input: &str, len: usize, pos: usize) {...}
   ```
   
   And then make a macro or other templated function that instantiates 
`overlay3` in different loops depending on the arguments (columns, scalars, 
etc).
   
   You should be able to avoid code duplication in the source code, though we 
will need a copy of each instantiation at runtime.
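   
   To make the idea concrete, here is a rough sketch under heavy assumptions: 
plain slices stand in for Arrow arrays, and `overlay_kernel`, `overlay_loop`, 
`StrArg` and this simplified `overlay` are hypothetical names, not existing 
DataFusion APIs. The kernel is written once; the generic loop is monomorphized 
per iterator type, so the scalar and column cases each get their own compiled 
loop.
   
   ```rust
   /// Hypothetical scalar kernel: replaces `len` characters of `input`,
   /// starting at the 1-based character position `pos`, with `replacement`,
   /// and passes the result to `output`.
   fn overlay_kernel<F: FnMut(&str)>(
       mut output: F,
       input: &str,
       replacement: &str,
       pos: usize,
       len: usize,
   ) {
       let chars: Vec<char> = input.chars().collect();
       // Clamp the replaced range to the bounds of the input.
       let start = pos.saturating_sub(1).min(chars.len());
       let end = start.saturating_add(len).min(chars.len());
       let mut result = String::with_capacity(input.len() + replacement.len());
       result.extend(chars[..start].iter());
       result.push_str(replacement);
       result.extend(chars[end..].iter());
       output(&result);
   }

   /// Hypothetical driver: generic over the iterator types, so the compiler
   /// emits one specialized loop per combination of argument kinds.
   fn overlay_loop<'a, I, R, P, L>(inputs: I, replacements: R, positions: P, lens: L) -> Vec<String>
   where
       I: Iterator<Item = &'a str>,
       R: Iterator<Item = &'a str>,
       P: Iterator<Item = usize>,
       L: Iterator<Item = usize>,
   {
       let mut out = Vec::new();
       for (((s, r), p), l) in inputs.zip(replacements).zip(positions).zip(lens) {
           overlay_kernel(|v| out.push(v.to_string()), s, r, p, l);
       }
       out
   }

   /// Stand-in for "array or scalar" (an assumption for this sketch only).
   enum StrArg<'a> {
       Column(&'a [&'a str]),
       Scalar(&'a str),
   }

   /// Picks the instantiation once, outside the hot loop.
   /// For brevity only the replacement argument is dispatched here;
   /// `pos` and `len` are always treated as scalars.
   fn overlay<'a>(inputs: &'a [&'a str], replacement: StrArg<'a>, pos: usize, len: usize) -> Vec<String> {
       let n = inputs.len();
       let strings = inputs.iter().copied();
       let positions = std::iter::repeat(pos).take(n);
       let lens = std::iter::repeat(len).take(n);
       match replacement {
           StrArg::Column(col) => overlay_loop(strings, col.iter().copied(), positions, lens),
           StrArg::Scalar(s) => overlay_loop(strings, std::iter::repeat(s), positions, lens),
       }
   }
   ```
   
   In this sketch, `overlay(&["Txxxxas"], StrArg::Scalar("hom"), 2, 4)` takes 
the scalar branch and never materializes a repeated array. A macro could 
generate the `match` arms if every argument needs both variants.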
   
   Hopefully that helps
   
   


