andishgar commented on code in PR #46229:
URL: https://github.com/apache/arrow/pull/46229#discussion_r2208271622
##########
cpp/src/arrow/array/array_binary.cc:
##########
@@ -105,6 +111,392 @@
BinaryViewArray::BinaryViewArray(std::shared_ptr<DataType> type, int64_t length,
ArrayData::Make(std::move(type), length, std::move(buffers), null_count,
offset));
}
+namespace {
+
+// TODO Should We move this to bitmap_ops.h and Remove from compute/kernels/util.s
+Result<std::shared_ptr<Buffer>> GetOrCopyNullBitmapBuffer(const ArrayData& in_array,
+ MemoryPool* pool) {
+ if (in_array.buffers[0]->data() == nullptr) {
+ return nullptr;
+ } else if (in_array.offset == 0) {
+ return in_array.buffers[0];
+ } else if (in_array.offset % 8 == 0) {
+ return SliceBuffer(in_array.buffers[0], /*offset=*/in_array.offset / 8);
+ } else {
+ // If a non-zero offset, we need to shift the bitmap
+ return internal::CopyBitmap(pool, in_array.buffers[0]->data(), in_array.offset,
+ in_array.length);
+ }
+}
+
+struct Interval {
+ int64_t start;
+ int64_t end;
+ int32_t offset = -1;
Review Comment:
Actually, my initial implementation used a hash table.
The key was the start of an interval, and the value was that interval's
new offset in the compacted buffer.
I populated the hash table in `CalculateOffsetAndTotalSize`.
Then, in `GetRelativeOffset`, after determining which interval the view
offset falls in, I used that interval's start to look up the new offset
in the hash table.
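
For reference, a minimal sketch of that earlier approach. It is not the code
in this PR: the simplified `Interval` (half-open `[start, end)`, no `offset`
member), the `OffsetMap` alias, and both function signatures are assumptions
made for illustration. Only the names `CalculateOffsetAndTotalSize` and
`GetRelativeOffset` come from the implementation. The sketch also assumes the
intervals are sorted by `start` and do not overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical simplified interval: a half-open byte range [start, end) of a
// data buffer that is still referenced by at least one view.
struct Interval {
  int64_t start;
  int64_t end;
};

// Hypothetical alias: interval start -> offset in the compacted buffer.
using OffsetMap = std::unordered_map<int64_t, int32_t>;

// Records, for each interval, where it begins in the compacted buffer, and
// returns the total compacted size in bytes.
// Assumes `intervals` is sorted by start and non-overlapping.
int64_t CalculateOffsetAndTotalSize(const std::vector<Interval>& intervals,
                                    OffsetMap* new_offsets) {
  int64_t total = 0;
  for (const Interval& interval : intervals) {
    (*new_offsets)[interval.start] = static_cast<int32_t>(total);
    total += interval.end - interval.start;
  }
  return total;
}

// Translates an offset in the original buffer into the compacted buffer.
int32_t GetRelativeOffset(const std::vector<Interval>& intervals,
                          const OffsetMap& new_offsets, int64_t view_offset) {
  // Find the first interval starting strictly after view_offset; the interval
  // that contains view_offset is the one immediately before it.
  auto it = std::upper_bound(intervals.begin(), intervals.end(), view_offset,
                             [](int64_t offset, const Interval& interval) {
                               return offset < interval.start;
                             });
  assert(it != intervals.begin());
  --it;
  assert(view_offset < it->end);
  // Hash lookup keyed by the interval start, plus the position inside it.
  return new_offsets.at(it->start) +
         static_cast<int32_t>(view_offset - it->start);
}
```

With intervals `{0, 10}`, `{20, 25}`, `{40, 48}`, the compacted size is 23
bytes, and a view offset of 22 maps to 12 (the second interval starts at 10 in
the compacted buffer, plus 2 bytes into it).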