alamb commented on code in PR #10054:
URL:
https://github.com/apache/arrow-datafusion/pull/10054#discussion_r1567433074
##########
datafusion/functions/src/string/overlay.rs:
##########
@@ -88,10 +105,13 @@ pub fn overlay<T: OffsetSizeTrait>(args: &[ArrayRef]) ->
Result<ArrayRef> {
let characters_array = as_generic_string_array::<T>(&args[1])?;
let pos_num = as_int64_array(&args[2])?;
+ let characters_array_iter =
adaptive_array_iter(characters_array.iter());
Review Comment:
> Thank you @alamb for your suggestion. However, if we were to make changes
according to your suggestion, many functions would require significant
modifications, such as `lpad`, `rpad`, `strpos`, and so on.
Yes, I agree with this assessment.
> Here, I am seeking a universal and elegant method: based on the current
function implementations, with only minimal modifications, we can optimize for
the Scalar case without degrading the performance for Arrays.
Thank you -- this makes sense.
I think the only way to do this is to have compile time specialized
implementations for the cases you want to optimize.
> (Maybe my thinking is not correct, and I need your help to provide
suggestions for modification 🙏)
I suspect the solution will look something like a scalar (Rust) function
that actually implements the operation, and then a macro / generic that
instantiates the function in different ways depending on the argument types.
For example:
```rust
/// do the actual operation, calling `output` for each string produced
fn overlay3<F: Fn(&str)> (output: F, input: &str, len: usize, pos: usize)
{...}
```
And then make a macro or other templated function that instantiates
`overlay3` in different loops depending on the arguments (columns, scalars,
etc).
You should be able to avoid code duplication in the source code, though we
will need a copy of each specialized loop at runtime.
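To make the idea concrete, here is a minimal sketch using only the standard library, not the actual DataFusion code. The names `overlay_kernel`, `StrArg`, `Column`, `Scalar`, and `overlay_rows` are all hypothetical, nulls are ignored, and `pos` / `len` are kept as plain scalars for brevity. It also uses `FnMut` rather than `Fn` so the callback can push into an output buffer.

```rust
/// Hypothetical scalar kernel: replace `len` characters of `input`, starting at
/// the 1-based character position `pos`, with `replacement`, and hand the
/// resulting string to `output`.
fn overlay_kernel<F: FnMut(&str)>(
    mut output: F,
    input: &str,
    replacement: &str,
    pos: usize,
    len: usize,
) {
    let start = pos.saturating_sub(1);
    let mut result = String::with_capacity(input.len() + replacement.len());
    result.extend(input.chars().take(start));
    result.push_str(replacement);
    result.extend(input.chars().skip(start + len));
    output(&result);
}

/// Hypothetical argument source: a column yields one value per row,
/// a scalar yields the same value for every row.
trait StrArg {
    fn value(&self, row: usize) -> &str;
}

struct Column<'a>(&'a [&'a str]);
struct Scalar<'a>(&'a str);

impl StrArg for Column<'_> {
    fn value(&self, row: usize) -> &str {
        self.0[row]
    }
}

impl StrArg for Scalar<'_> {
    fn value(&self, _row: usize) -> &str {
        self.0
    }
}

/// One generic loop in the source. The compiler emits a separate, specialized
/// copy for each (A, B) combination that is actually used.
fn overlay_rows<A: StrArg, B: StrArg>(
    input: &A,
    replacement: &B,
    pos: usize,
    len: usize,
    rows: usize,
) -> Vec<String> {
    let mut out = Vec::with_capacity(rows);
    for row in 0..rows {
        overlay_kernel(
            |s| out.push(s.to_string()),
            input.value(row),
            replacement.value(row),
            pos,
            len,
        );
    }
    out
}
```

With this shape, a call such as `overlay_rows(&Column(&strings), &Scalar("hom"), 2, 4, strings.len())` gets its own monomorphized loop in which the scalar lookup is trivial, while the column / column case keeps its existing per-row access. Whether this matches the performance goals of the PR would need to be benchmarked.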
Hopefully that helps.