kevinmingtarja commented on issue #9890:
URL: 
https://github.com/apache/arrow-datafusion/issues/9890#issuecomment-2038037965

   Hi, I was curious about this and decided to test it out myself.
   
   I first generated a flamegraph using 
[flamegraph-rs](https://github.com/flamegraph-rs/flamegraph) 
(`CARGO_PROFILE_BENCH_DEBUG=true cargo flamegraph --root --bench substr_index 
-- --bench`).
   
   It seems like quite a lot of time is spent on `malloc`, `free`, and 
`memmove`, as you noted in 
https://github.com/apache/arrow-datafusion/pull/9864#discussion_r1545353414.
   
   A snippet of 
[flamegraph.svg](https://github.com/apache/arrow-datafusion/assets/69668484/b7f24760-e4e0-4f5d-9996-03a10f1b7de2):
   ![Screen Shot 2024-04-05 at 03 11 
53](https://github.com/apache/arrow-datafusion/assets/69668484/cc244092-df26-4b4a-b87d-186e09b89438)
   
   I then tested out a quick change to avoid the `Box::new()` (at the expense 
of code duplication :D), and managed to get around **-28%** improvement!
   
   ```bash
   % cargo bench --bench substr_index -- --baseline main
      Compiling datafusion-functions v37.0.0 
(/Users/kevin/git/arrow-datafusion/datafusion/functions)
       Finished bench [optimized] target(s) in 56.31s
        Running benches/substr_index.rs 
(target/release/deps/substr_index-bcc13e06371405f5)
   substr_index_array_array_1000
                           time:   [87.912 µs 88.197 µs 88.515 µs]
                           change: [-29.549% -28.257% -27.292%] (p = 0.00 < 
0.05)
                           Performance has improved.
   ```
   
   diff:
   ```diff
   -                let splitted: Box<dyn Iterator<Item = _>> = if n > 0 {
   -                    Box::new(string.split(delimiter))
   -                } else {
   -                    Box::new(string.rsplit(delimiter))
   -                };
                    let occurrences = 
usize::try_from(n.unsigned_abs()).unwrap_or(usize::MAX);
   -                // The length of the substring covered by substr_index.
   -                let length = splitted
   -                    .take(occurrences) // at least 1 element, since n != 0
   -                    .map(|s| s.len() + delimiter.len())
   -                    .sum::<usize>()
   -                    - delimiter.len();
   +                let length;
   +                if n > 0 {
   +                    let splitted = string.split(delimiter);
   +                    length = splitted
   +                        .take(occurrences)
   +                        .map(|s| s.len() + delimiter.len())
   +                        .sum::<usize>()
   +                        - delimiter.len();
   +                } else {
   +                    let splitted = string.rsplit(delimiter);
   +                    length = splitted
   +                        .take(occurrences)
   +                        .map(|s| s.len() + delimiter.len())
   +                        .sum::<usize>()
   +                        - delimiter.len();
   +                }
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to