goldmedal commented on code in PR #12401:
URL: https://github.com/apache/datafusion/pull/12401#discussion_r1754942094
##########
datafusion/functions/src/unicode/strpos.rs:
##########
@@ -140,24 +138,43 @@ fn calculate_strpos<'a, V1, V2, T: ArrowPrimitiveType>(
substring_array: V2,
) -> Result<ArrayRef>
where
- V1: ArrayAccessor<Item = &'a str>,
- V2: ArrayAccessor<Item = &'a str>,
+ V1: StringArrayType<'a, Item = &'a str>,
+ V2: StringArrayType<'a, Item = &'a str>,
{
- let string_iter = ArrayIter::new(string_array);
- let substring_iter = ArrayIter::new(substring_array);
+ let ascii_only = string_array.is_ascii() && substring_array.is_ascii();
Review Comment:
Nice suggestion! I ran the benchmark again, and performance improved in nearly every case:
```
group                                      before                          after
-----                                      ------                          -----
strpos_StringArray_ascii_str_len_128       1.06  370.5±9.46ns   ? ?/sec    1.00  348.6±11.38ns  ? ?/sec
strpos_StringArray_ascii_str_len_32        1.07  372.5±10.23ns  ? ?/sec    1.00  346.9±9.45ns   ? ?/sec
strpos_StringArray_ascii_str_len_4096      1.08  378.4±12.05ns  ? ?/sec    1.00  349.4±16.83ns  ? ?/sec
strpos_StringArray_ascii_str_len_8         1.07  371.4±14.53ns  ? ?/sec    1.00  346.1±28.44ns  ? ?/sec
strpos_StringArray_utf8_str_len_128        1.06  377.2±18.04ns  ? ?/sec    1.00  356.4±21.92ns  ? ?/sec
strpos_StringArray_utf8_str_len_32         1.08  374.9±34.32ns  ? ?/sec    1.00  345.8±11.05ns  ? ?/sec
strpos_StringArray_utf8_str_len_4096       1.09  381.6±16.68ns  ? ?/sec    1.00  351.4±23.14ns  ? ?/sec
strpos_StringArray_utf8_str_len_8          1.09  372.9±20.83ns  ? ?/sec    1.00  343.3±11.78ns  ? ?/sec
strpos_StringViewArray_ascii_str_len_128   1.79  3.2±0.15ms     ? ?/sec    1.00  1763.8±44.16µs ? ?/sec
strpos_StringViewArray_ascii_str_len_32    1.03  648.7±18.27µs  ? ?/sec    1.00  628.1±21.69µs  ? ?/sec
strpos_StringViewArray_ascii_str_len_4096  1.24  62.8±7.27ms    ? ?/sec    1.00  50.5±2.05ms    ? ?/sec
strpos_StringViewArray_ascii_str_len_8     1.00  280.2±10.44µs  ? ?/sec    1.05  294.8±42.47µs  ? ?/sec
strpos_StringViewArray_utf8_str_len_128    1.03  5.3±0.13ms     ? ?/sec    1.00  5.1±0.12ms     ? ?/sec
strpos_StringViewArray_utf8_str_len_32     1.01  1961.9±57.34µs ? ?/sec    1.00  1944.1±73.19µs ? ?/sec
strpos_StringViewArray_utf8_str_len_4096   1.03  147.4±9.24ms   ? ?/sec    1.00  142.5±3.61ms   ? ?/sec
strpos_StringViewArray_utf8_str_len_8      1.01  874.5±26.02µs  ? ?/sec    1.00  863.4±40.68µs  ? ?/sec
```
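
For readers following along, here is a minimal sketch of the idea behind the `ascii_only` check in the hunk above. It is not the code in this PR: `strpos_sketch` is a hypothetical helper, and it checks `is_ascii()` per string pair rather than once per array. The point it illustrates is that when both inputs are ASCII, the byte offset returned by `str::find` is already the character offset, so the character count can be skipped.

```rust
/// Hypothetical helper (not part of DataFusion): returns the 1-based
/// character position of `substring` in `string`, or 0 if it is absent.
fn strpos_sketch(string: &str, substring: &str) -> usize {
    // Assumption: the check is done per pair here; the PR computes it once
    // for the whole arrays.
    let ascii_only = string.is_ascii() && substring.is_ascii();

    match string.find(substring) {
        // Substring not found.
        None => 0,
        Some(byte_idx) => {
            if ascii_only {
                // ASCII: one byte per character, so the byte offset is
                // also the character offset.
                byte_idx + 1
            } else {
                // Non-ASCII: count the characters that precede the match.
                string[..byte_idx].chars().count() + 1
            }
        }
    }
}
```

For example, `strpos_sketch("hello", "ll")` takes the fast path, while `strpos_sketch("héllo", "ll")` falls back to counting characters because `é` occupies two bytes.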
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]