tustvold opened a new pull request #1082:
URL: https://github.com/apache/arrow-rs/pull/1082
**Extremely experimental**, builds on #1054
**Creating this PR now to provide context, this is not ready for review**
# Which issue does this PR close?
Adds an optimized ByteArrayReader as part of proving out the generics added
in #1041, and as a precursor to #171
# Rationale for this change
Depending on the benchmark, this can be anything from approximately the same
to significantly (2x) faster than the ArrowArrayReader implementation added in
#384. This is largely down to slightly more efficient null padding, and
avoiding dynamic dispatch. The dominating factor in the benchmarks is the
string value copy, which makes me optimistic for the returns #171 will yield.
_I didn't benchmark the results for `DELTA_BYTE_ARRAY` encoding, but the
returns are likely to be even more significant, as the layout is better suited
to decoding._
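
To illustrate what null padding means for an offset-based layout, here is a minimal sketch and not the code in this PR. It assumes the decoder has produced offsets for the non-null values only, plus a per-slot validity mask; `pad_null_offsets` is a hypothetical helper name. A null slot is represented as a zero-length entry by repeating the previous offset, so no value bytes need to move.

```rust
/// Hypothetical helper, for illustration only (not part of this PR).
///
/// `dense_offsets` holds `n + 1` offsets describing `n` non-null values.
/// `valid` has one entry per output slot, `true` where a value is present.
/// Returns `valid.len() + 1` offsets in which every null slot is zero-length.
///
/// Panics if `valid` contains more `true` entries than there are dense values.
fn pad_null_offsets(dense_offsets: &[i32], valid: &[bool]) -> Vec<i32> {
    let mut padded = Vec::with_capacity(valid.len() + 1);
    let mut dense_idx = 0;
    padded.push(dense_offsets[0]);
    for &is_valid in valid {
        if is_valid {
            // Consume the next dense value: its end offset closes this slot.
            dense_idx += 1;
        }
        // For a null slot the previous end offset is repeated.
        padded.push(dense_offsets[dense_idx]);
    }
    padded
}
```

For example, two values `"ab"` and `"cde"` have dense offsets `[0, 2, 5]`; with validity `[true, false, true, false]` the padded offsets become `[0, 2, 2, 5, 5]`.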
The major benefit over the ArrowArrayReader implementation, aside from the
speed bump, is the ability to share the existing ColumnReaderImpl and
RecordReader logic, and the ability to work with all types of variable length
strings and byte arrays. I also expect to be able to reuse some of the logic
for #171 - in particular `OffsetBuffer`.
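
As a rough picture of the kind of structure `OffsetBuffer` refers to, the sketch below shows a generic offsets-plus-values buffer for variable length byte arrays. It is an assumption-laden illustration, not the type added in this PR: the name `OffsetBufferSketch`, its fields, and its methods are all hypothetical, and the offset type is fixed to `i32` for brevity.

```rust
/// Hypothetical, simplified stand-in for an offset-based buffer
/// (not the `OffsetBuffer` from this PR).
#[derive(Debug)]
struct OffsetBufferSketch {
    /// `offsets[i]..offsets[i + 1]` is the byte range of value `i` in `values`.
    offsets: Vec<i32>,
    /// All value bytes stored back to back.
    values: Vec<u8>,
}

impl OffsetBufferSketch {
    fn new() -> Self {
        // The offsets vector always starts with a single 0.
        Self { offsets: vec![0], values: Vec::new() }
    }

    /// Appends one value, failing if the total length no longer fits in `i32`.
    fn try_push(&mut self, data: &[u8]) -> Result<(), String> {
        let end = self.values.len() + data.len();
        if end > i32::MAX as usize {
            return Err("offset overflow".to_string());
        }
        self.values.extend_from_slice(data);
        self.offsets.push(end as i32);
        Ok(())
    }

    /// Number of values stored.
    fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    /// Returns the bytes of value `i`, or `None` if `i` is out of range.
    fn get(&self, i: usize) -> Option<&[u8]> {
        let start = *self.offsets.get(i)? as usize;
        let end = *self.offsets.get(i + 1)? as usize;
        Some(&self.values[start..end])
    }
}
```

Storing values contiguously with a separate offsets vector matches the layout Arrow uses for variable length binary and string arrays, which is why a buffer of this shape can be handed over without copying each value individually.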
# What changes are included in this PR?
Adds a new `ByteArrayReader` that implements `ArrayReader` for variable
length byte arrays
# Are there any user-facing changes?
No