lidavidm commented on a change in pull request #10806:
URL: https://github.com/apache/arrow/pull/10806#discussion_r692331628
##########
File path: cpp/src/arrow/array/builder_binary.h
##########
@@ -274,6 +274,23 @@ class BaseBinaryBuilder : public ArrayBuilder {
     return Status::OK();
   }
+  Status AppendArraySliceUnchecked(const ArrayData& array, int64_t offset,
+                                   int64_t length) override {
+    auto bitmap = array.GetValues<uint8_t>(0, 0);
+    auto offsets = array.GetValues<offset_type>(1);
+    auto data = array.GetValues<uint8_t>(2, 0);
+    for (int64_t i = 0; i < length; i++) {
+      if (!bitmap || BitUtil::GetBit(bitmap, array.offset + offset + i)) {
Review comment:
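In benchmarks, the current per-bit approach is faster than BitRunReader (about 2.97 ms vs. 4.78 ms per iteration).
Current approach: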
```
------------------------------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------------
CaseWhenBenchStringContiguous/65536/0    2968201 ns      2968080 ns          233 bytes_per_second=2.68754G/s items_per_second=22.0803M/s
```
With BitRunReader:
```
------------------------------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------------
CaseWhenBenchStringContiguous/65536/0    4775811 ns      4775849 ns          150 bytes_per_second=1.67024G/s items_per_second=13.7224M/s
```
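For readers unfamiliar with the two strategies being compared, here is a minimal, self-contained sketch of per-bit validity checks versus run-based iteration. It does not use Arrow: `Builder`, `GetBit`, `AppendPerBit`, and `AppendByRuns` are hypothetical stand-ins written for illustration, and the run scan is a naive loop rather than Arrow's `BitRunReader`. It shows only that both strategies produce the same output; it makes no attempt to reproduce or explain the timings above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a validity-bitmap lookup (LSB-first bit order).
inline bool GetBit(const uint8_t* bits, int64_t i) {
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}

// Hypothetical, simplified binary builder: validity flags, offsets, value bytes.
struct Builder {
  std::vector<bool> valid;
  std::vector<int32_t> offsets{0};
  std::string data;

  void Append(const char* p, int32_t n) {
    data.append(p, static_cast<size_t>(n));
    offsets.push_back(static_cast<int32_t>(data.size()));
    valid.push_back(true);
  }
  void AppendNull() {
    offsets.push_back(static_cast<int32_t>(data.size()));
    valid.push_back(false);
  }
};

// Strategy 1: test each slot's validity bit individually.
// A null bitmap pointer means "all slots valid", as in the diff above.
void AppendPerBit(const uint8_t* bitmap, const int32_t* offsets, const char* data,
                  int64_t offset, int64_t length, Builder* out) {
  assert(length >= 0);
  for (int64_t i = offset; i < offset + length; ++i) {
    if (!bitmap || GetBit(bitmap, i)) {
      out->Append(data + offsets[i], offsets[i + 1] - offsets[i]);
    } else {
      out->AppendNull();
    }
  }
}

// Strategy 2: find maximal runs of equal validity, then handle each run
// without re-testing the bit inside the run.
void AppendByRuns(const uint8_t* bitmap, const int32_t* offsets, const char* data,
                  int64_t offset, int64_t length, Builder* out) {
  assert(length >= 0);
  const int64_t end = offset + length;
  int64_t i = offset;
  while (i < end) {
    const bool set = !bitmap || GetBit(bitmap, i);
    int64_t run_end = i + 1;
    while (run_end < end && (!bitmap || GetBit(bitmap, run_end)) == set) {
      ++run_end;
    }
    for (; i < run_end; ++i) {
      if (set) {
        out->Append(data + offsets[i], offsets[i + 1] - offsets[i]);
      } else {
        out->AppendNull();
      }
    }
  }
}
```

Note that in this sketch each value is still appended one at a time inside a run, so the run-based version saves only the per-slot bit test.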