zhuqi-lucas commented on PR #7351:
URL: https://github.com/apache/arrow-rs/pull/7351#issuecomment-2763109595
Updated the result for the changed code:
### 📌 UTF-8 vs UTF-8 View Benchmark Comparison
| **Benchmark** | **StringArray (UTF-8)** | **StringViewArray (UTF-8 View)** | **Performance Change** |
|---|---|---|---|
| **eq long same prefix strings** | 294.95 µs | 514.73 µs | **StringArray is ~1.7x faster** |
| **neq long same prefix strings** | 296.12 µs | 517.70 µs | **StringArray is ~1.75x faster** |
| **lt long same prefix strings** | 331.46 µs | 493.17 µs | **StringArray is ~1.5x faster** |
| **long same prefix strings like_utf8 scalar equals** | 181.78 µs | 196.91 µs | **StringArray is ~8.3% faster** |
| **long same prefix strings like_utf8 scalar contains** | 1.7854 ms | 1.8405 ms | **StringArray is ~3% faster** |
| **long same prefix strings like_utf8 scalar ends with** | 583.86 µs | 594.73 µs | **StringArray is ~1.9% faster** |
| **long same prefix strings like_utf8 scalar starts with** | 664.69 µs | 672.16 µs | **StringArray is ~1.1% faster** |
| **long same prefix strings like_utf8 scalar complex** | 590.86 µs | 604.81 µs | **StringArray is ~2.3% faster** |
| **long same prefix strings like_utf8view scalar equals** | 181.78 µs | 196.91 µs | **StringArray is ~8.3% faster** |
| **long same prefix strings like_utf8view scalar contains** | 1.7854 ms | 1.8405 ms | **StringArray is ~3% faster** |
| **long same prefix strings like_utf8view scalar ends with** | 583.86 µs | 594.73 µs | **StringArray is ~1.9% faster** |
| **long same prefix strings like_utf8view scalar starts with** | 664.69 µs | 672.16 µs | **StringArray is ~1.1% faster** |
| **long same prefix strings like_utf8view scalar complex** | 590.86 µs | 604.81 µs | **StringArray is ~2.3% faster** |
---
For now we only need to add these benchmarks; they are enough for us to
improve the code and then benchmark again. The `compare_unchecked` function is
what we want to improve, since it is used by most of the comparison functions
for StringView. Other functions, such as the `like` and regex-related ones, can
be improved and benchmarked after that.
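
For reference, the "Performance Change" figures in the table can be reproduced from the two timings with a small calculation. This is only a minimal sketch: `speedup` and `percent_slower` are hypothetical helper names (not part of arrow-rs), and it assumes the percentages express the extra time StringViewArray takes relative to the StringArray timing.

```rust
/// Ratio of the slower timing to the faster one (both in the same unit).
/// A result of 1.75 reads as "the faster run is ~1.75x faster".
/// Hypothetical helper, for illustration only.
fn speedup(faster: f64, slower: f64) -> f64 {
    slower / faster
}

/// Extra time taken by the slower run, as a percentage of the faster run.
/// Hypothetical helper, for illustration only.
fn percent_slower(faster: f64, slower: f64) -> f64 {
    (slower - faster) / faster * 100.0
}
```

For example, `speedup(294.95, 514.73)` gives the ratio for the `eq` benchmark, and `percent_slower(181.78, 196.91)` gives the percentage for the `like_utf8 scalar equals` benchmark.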
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]