felipecrv commented on code in PR #41477:
URL: https://github.com/apache/arrow/pull/41477#discussion_r1596160781
##########
cpp/src/arrow/c/bridge.cc:
##########
@@ -1876,16 +1876,14 @@ struct ArrayImporter {
return ImportBuffer(buffer_id, buffer_size);
}
- // we only need the value of the last offset so let's just copy that
- // one value from device to host.
- auto single_value_buf =
- SliceBuffer(data_->buffers[offsets_buffer_id],
- c_struct_->length * sizeof(OffsetType),
sizeof(OffsetType));
- ARROW_ASSIGN_OR_RAISE(
- auto cpubuf, Buffer::ViewOrCopy(single_value_buf,
default_cpu_memory_manager()));
- auto offsets = cpubuf->data_as<OffsetType>();
+ OffsetType last_offset = 0;
+ RETURN_NOT_OK(
+
data_->buffers[offsets_buffer_id]->memory_manager()->CopyBufferSliceFrom(
+ data_->buffers[offsets_buffer_id], &last_offset,
+ c_struct_->length * sizeof(OffsetType), sizeof(OffsetType)));
Review Comment:
You shouldn't need the `if (device_type_ == DeviceAllocationType::kCPU) {`
case in the beginning of this function if `CopyBufferSliceFrom` is as efficient
as a `memcpy`.
##########
cpp/src/arrow/device.cc:
##########
@@ -100,6 +100,20 @@ Result<std::unique_ptr<Buffer>>
MemoryManager::CopyNonOwned(
" to ", to->device()->ToString(), " not
supported");
}
+Status MemoryManager::CopyBufferSliceFrom(const std::shared_ptr<Buffer>& buf,
void* to,
+ const int64_t offset, const int64_t
length) {
+ auto maybe_buffer = CopyBuffer(buf, default_cpu_memory_manager());
+
+ if (COPY_BUFFER_SUCCESS(maybe_buffer)) {
+ memcpy(to, (*maybe_buffer)->data() + offset, length);
+ return Status::OK();
+ }
+
+ return Status::NotImplemented("Copying buffer slice from ",
+ buf->memory_manager()->device()->ToString(),
+ " to memory not supported");
+}
+
Review Comment:
This class is complicated, so I had to try implementing this to understand
the code.
One of the things I've learned: the `-From` suffix means your function takes
`from` `MemoryManager` parameter that matches `buf->memory_manager()`. That
doesn't apply here.
I also realized that the `static` function is enough, no need to have the
`virtual` version — the fact that we are always copying to CPU memory means we
don't have to virtual-dispatch another call and we only have to
virtual-dispatch to `buf->memory_manager()`.
So here is what I drafted (I didn't test it):
```cpp
Status MemoryManager::CopyBufferSlice(const std::shared_ptr<Buffer>& buf,
int64_t offset,
int64_t length, uint8_t* out_data) {
if (ARROW_PREDICT_TRUE(buf->is_cpu())) {
memcpy(out_data, buf->data() + offset, static_cast<size_t>(length));
return Status::OK();
}
auto& from = buf->memory_manager();
auto cpu_mm = default_cpu_memory_manager();
// Try a view first
auto maybe_buffer_result = from->ViewBufferTo(buf, cpu_mm);
if (!COPY_BUFFER_SUCCESS(maybe_buffer_result)) {
// View failed, try a copy instead
maybe_buffer_result = from->CopyBufferTo(buf, cpu_mm);
}
ARROW_ASSIGN_OR_RAISE(auto maybe_buffer, std::move(maybe_buffer_result));
if (maybe_buffer != nullptr) {
memcpy(out_data, maybe_buffer->data() + offset,
static_cast<size_t>(length));
return Status::OK();
}
return Status::NotImplemented("Copying buffer slice from ",
from->device()->ToString(),
" to CPU not supported");
}
```
Please define and declare it after `ViewBuffer` to not split that related
group of functions.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]