jorgecarleitao opened a new pull request #9044:
URL: https://github.com/apache/arrow/pull/9044
This PR addresses a performance issue in how we allocate and reallocate the
`MutableBuffer`.
# Problem
See #9032
# This PR
This PR changes `MutableBuffer::reserve` to call `std::alloc::alloc` instead
of `std::alloc::alloc_zeroed`, which improves performance when building buffers
with unknown sizes (such as strings and nested types).
This required changing some callers of `MutableBuffer` that assumed a
zero-initialized buffer even when `reserve` was used.
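To illustrate the caller-side consequence, here is a minimal sketch of a toy buffer whose `reserve` allocates with `std::alloc::alloc` / `realloc` and therefore leaves new capacity uninitialized. `ToyBuffer`, `extend_zeros` and `extend_from_slice` are hypothetical names for this example only; this is not the actual `MutableBuffer` code. The point is that any caller that needs zeros has to write them explicitly rather than rely on `reserve`.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, realloc, Layout};
use std::ptr;

/// Toy byte buffer (hypothetical; NOT arrow's `MutableBuffer`).
pub struct ToyBuffer {
    ptr: *mut u8,
    len: usize,
    capacity: usize,
}

impl ToyBuffer {
    pub fn new() -> Self {
        ToyBuffer { ptr: ptr::null_mut(), len: 0, capacity: 0 }
    }

    pub fn capacity(&self) -> usize {
        self.capacity
    }

    /// Ensures room for `additional` more bytes beyond `len`.
    /// Newly reserved memory is NOT zeroed.
    pub fn reserve(&mut self, additional: usize) {
        let required = self.len.checked_add(additional).expect("capacity overflow");
        if required <= self.capacity {
            return;
        }
        // Grow geometrically so repeated small reserves stay amortized.
        let new_capacity = required.max(self.capacity.saturating_mul(2));
        let new_layout = Layout::array::<u8>(new_capacity).expect("layout overflow");
        let new_ptr = unsafe {
            if self.capacity == 0 {
                // Uninitialized allocation: no memset, unlike `alloc_zeroed`.
                alloc(new_layout)
            } else {
                let old_layout = Layout::array::<u8>(self.capacity).unwrap();
                realloc(self.ptr, old_layout, new_capacity)
            }
        };
        if new_ptr.is_null() {
            handle_alloc_error(new_layout);
        }
        self.ptr = new_ptr;
        self.capacity = new_capacity;
    }

    /// Appends `count` zero bytes, writing the zeros explicitly
    /// instead of assuming `reserve` produced them.
    pub fn extend_zeros(&mut self, count: usize) {
        if count == 0 {
            return;
        }
        self.reserve(count);
        unsafe { ptr::write_bytes(self.ptr.add(self.len), 0, count) };
        self.len += count;
    }

    /// Appends a copy of `bytes`; every reserved byte that becomes part of
    /// `len` is overwritten, so no uninitialized memory is ever read.
    pub fn extend_from_slice(&mut self, bytes: &[u8]) {
        if bytes.is_empty() {
            return;
        }
        self.reserve(bytes.len());
        unsafe {
            ptr::copy_nonoverlapping(bytes.as_ptr(), self.ptr.add(self.len), bytes.len());
        }
        self.len += bytes.len();
    }

    pub fn as_slice(&self) -> &[u8] {
        if self.len == 0 {
            return &[];
        }
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}

impl Drop for ToyBuffer {
    fn drop(&mut self) {
        if self.capacity != 0 {
            unsafe { dealloc(self.ptr, Layout::array::<u8>(self.capacity).unwrap()) };
        }
    }
}
```

In this sketch only bytes below `len` are ever exposed, so skipping the zeroing in `reserve` is safe as long as every path that advances `len` initializes the bytes it adds.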
It also changes `reserve`'s signature to `reserve(additional)` instead of
`reserve(new_len)`, matching the convention used throughout Rust's standard library.
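For reference, the standard-library convention can be sketched with `Vec<u8>`, whose `reserve(additional)` guarantees capacity for at least `len() + additional` elements. The helper `reserve_new_len` below is a hypothetical adapter, not part of this PR; it only shows how a call site written against a "target total length" style could be translated to the `additional` style.

```rust
/// Hypothetical adapter: the caller passes the desired TOTAL length,
/// and we convert it to the `additional` amount that `Vec::reserve` expects.
fn reserve_new_len(buf: &mut Vec<u8>, new_len: usize) {
    // If the buffer is already at least `new_len` long, nothing is needed.
    let additional = new_len.saturating_sub(buf.len());
    buf.reserve(additional);
}

/// Std-style call: room for `additional` more bytes on top of `len()`.
fn reserve_additional(buf: &mut Vec<u8>, additional: usize) {
    buf.reserve(additional);
}
```

With a 3-byte buffer, `reserve_new_len(&mut buf, 10)` asks for a capacity of at least 10, while `reserve_additional(&mut buf, 10)` asks for at least 13; neither call changes `len()`.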
This is a draft because it is built on top of two other PRs.
```
critcmp master-simd-e1b38cdaa4f2a7d35e2e576463e12b38875f29f3 alloc2-simd-88fc0ae819c24239ac9363fa462f9c6e1ddfd9fc -t 10
```
```
group                                 alloc2-simd-88fc0ae819c24239ac9363fa462f9c6e1ddfd9fc  master-simd-e1b38cdaa4f2a7d35e2e576463e12b38875f29f3
-----                                 ----------------------------------------------------  ----------------------------------------------------
add 512                               1.00  549.1±17.26ns   ? B/sec                         1.14  624.4±118.72ns  ? B/sec
buffer_bit_ops or                     1.00  369.3±7.51ns    ? B/sec                         1.16  427.1±20.25ns   ? B/sec
cast float32 to int32 512             1.00  3.3±0.09µs      ? B/sec                         1.21  4.0±0.09µs      ? B/sec
cast float64 to float32 512           1.00  3.0±0.06µs      ? B/sec                         1.24  3.7±0.11µs      ? B/sec
cast float64 to uint64 512            1.00  3.6±0.33µs      ? B/sec                         1.22  4.4±0.29µs      ? B/sec
cast int32 to float32 512             1.00  2.9±0.10µs      ? B/sec                         1.15  3.4±0.09µs      ? B/sec
cast int32 to float64 512             1.00  2.9±0.06µs      ? B/sec                         1.14  3.3±0.06µs      ? B/sec
cast int32 to uint32 512              1.00  4.0±0.09µs      ? B/sec                         1.23  4.9±0.12µs      ? B/sec
concat str 1024                       1.00  8.4±0.26µs      ? B/sec                         1.14  9.6±0.22µs      ? B/sec
equal_nulls_512                       1.00  3.3±0.07µs      ? B/sec                         1.22  4.0±0.10µs      ? B/sec
filter context u8 high selectivity    1.00  3.7±0.09µs      ? B/sec                         1.27  4.6±0.13µs      ? B/sec
filter u8 high selectivity            1.00  10.8±0.47µs     ? B/sec                         1.10  11.9±0.49µs     ? B/sec
like_utf8 scalar equals               1.00  149.9±6.36µs    ? B/sec                         1.12  167.3±3.45µs    ? B/sec
like_utf8 scalar starts with          1.00  338.2±12.26µs   ? B/sec                         1.15  388.4±7.62µs    ? B/sec
min string 512                        1.13  6.0±0.17µs      ? B/sec                         1.00  5.3±0.09µs      ? B/sec
nlike_utf8 scalar starts with         1.00  367.8±39.16µs   ? B/sec                         1.16  425.4±9.38µs    ? B/sec
subtract 512                          1.00  567.9±10.77ns   ? B/sec                         1.21  686.0±209.80ns  ? B/sec
sum 512                               1.32  67.8±0.59ns     ? B/sec                         1.00  51.3±0.70ns     ? B/sec
take str 1024                         1.19  6.1±0.10µs      ? B/sec                         1.00  5.1±0.03µs      ? B/sec
take str 512                          1.12  3.9±0.09µs      ? B/sec                         1.00  3.4±0.07µs      ? B/sec
take str null indices 1024            1.18  6.1±0.16µs      ? B/sec                         1.00  5.2±0.11µs      ? B/sec
take str null indices 512             1.12  3.9±0.09µs      ? B/sec                         1.00  3.5±0.03µs      ? B/sec
take str null values 1024             1.19  6.1±0.51µs      ? B/sec                         1.00  5.2±0.11µs      ? B/sec
```