lidavidm commented on a change in pull request #10806:
URL: https://github.com/apache/arrow/pull/10806#discussion_r692331628



##########
File path: cpp/src/arrow/array/builder_binary.h
##########
@@ -274,6 +274,23 @@ class BaseBinaryBuilder : public ArrayBuilder {
     return Status::OK();
   }
 
+  Status AppendArraySliceUnchecked(const ArrayData& array, int64_t offset,
+                                   int64_t length) override {
+    auto bitmap = array.GetValues<uint8_t>(0, 0);
+    auto offsets = array.GetValues<offset_type>(1);
+    auto data = array.GetValues<uint8_t>(2, 0);
+    for (int64_t i = 0; i < length; i++) {
+      if (!bitmap || BitUtil::GetBit(bitmap, array.offset + offset + i)) {

Review comment:
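       In benchmarks, the current per-bit check is faster than using BitRunReader (about 2.97 ms vs. 4.78 ms per iteration):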
   
   Current approach:
   
   ```
   
   ------------------------------------------------------------------------------------------------
   Benchmark                                      Time             CPU   Iterations UserCounters...
   ------------------------------------------------------------------------------------------------
   CaseWhenBenchStringContiguous/65536/0    2968201 ns      2968080 ns          233 bytes_per_second=2.68754G/s items_per_second=22.0803M/s
   ```
   
   With BitRunReader:
   
   ```
   
   ------------------------------------------------------------------------------------------------
   Benchmark                                      Time             CPU   Iterations UserCounters...
   ------------------------------------------------------------------------------------------------
   CaseWhenBenchStringContiguous/65536/0    4775811 ns      4775849 ns          150 bytes_per_second=1.67024G/s items_per_second=13.7224M/s
   ```
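
For readers without the Arrow sources at hand, here is a rough, standalone sketch of the two strategies being compared. It is not Arrow's code: `GetBit`, `ValidIndicesPerBit`, `Run`, and `ValidRuns` are hypothetical names written for illustration, and `ValidRuns` only imitates the general idea of a run-based reader, not the actual `BitRunReader` implementation. It assumes an LSB-first validity bitmap, where a null bitmap pointer means every slot is valid.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bit test on an LSB-first bitmap.
inline bool GetBit(const uint8_t* bitmap, int64_t i) {
  return ((bitmap[i / 8] >> (i % 8)) & 1) != 0;
}

// Per-bit strategy: test each validity bit individually, one branch per slot.
// Returns the slice-relative indices of the valid slots.
std::vector<int64_t> ValidIndicesPerBit(const uint8_t* bitmap, int64_t offset,
                                        int64_t length) {
  std::vector<int64_t> valid;
  for (int64_t i = 0; i < length; ++i) {
    if (!bitmap || GetBit(bitmap, offset + i)) valid.push_back(i);
  }
  return valid;
}

// A maximal run of consecutive valid slots, relative to the slice start.
struct Run {
  int64_t start;
  int64_t length;
};

// Run-based strategy (illustrative only): collect maximal runs of set bits,
// so a caller could handle each run in bulk.
std::vector<Run> ValidRuns(const uint8_t* bitmap, int64_t offset, int64_t length) {
  std::vector<Run> runs;
  if (!bitmap) {
    // No bitmap: the whole slice is a single valid run.
    if (length > 0) runs.push_back({0, length});
    return runs;
  }
  int64_t i = 0;
  while (i < length) {
    // Skip null slots.
    while (i < length && !GetBit(bitmap, offset + i)) ++i;
    const int64_t start = i;
    // Extend the run over consecutive valid slots.
    while (i < length && GetBit(bitmap, offset + i)) ++i;
    if (i > start) runs.push_back({start, i - start});
  }
  return runs;
}
```

Both functions describe the same set of valid slots. Which one is faster depends on the data and on how much per-run work can be batched, which is why the numbers above come from a benchmark.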




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

