[ 
https://issues.apache.org/jira/browse/ARROW-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761692#comment-16761692
 ] 

Antoine Pitrou commented on ARROW-2447:
---------------------------------------

Thanks [~pearu]. Regarding your proposals:

1) sounds fine to me, though we might keep `cpu_data` and `on_cpu` as shortcuts 
for `accessible_data(CPUDevice())` and `is_accessible(CPUDevice())`, 
respectively. Expecting CPU-accessible memory will still be the dominant case 
in code using this API (since, after all, this is code running on the CPU).

(One concern here: querying the CPU address of a buffer should ideally not 
go through a virtual function call.)

2) sounds fine to me (but we might also keep shortcuts, see above).

3) sounds fine as well, but we would then need another way to query 
device-specific buffer properties (such as `cuda_buffer->context()`).

4) I have no particular opinion about this.
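
To make points 1) and 3) more concrete, here is a minimal sketch, not actual 
Arrow code. Every name in it (`Device`, `CPUDevice`, `CUDADevice`, 
`is_accessible`, `accessible_data`, the cached `cpu_address_` member) is an 
assumption made for illustration. It shows one way the CPU shortcuts could 
avoid a virtual call by reading a cached pointer, and one possible "other way" 
to reach device-specific properties, namely through the device object.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical device hierarchy; the names are illustrative only.
class Device {
 public:
  enum class Kind { CPU, CUDA };
  virtual ~Device() = default;
  virtual Kind kind() const = 0;
};

class CPUDevice : public Device {
 public:
  Kind kind() const override { return Kind::CPU; }
};

class CUDADevice : public Device {
 public:
  explicit CUDADevice(int device_num) : device_num_(device_num) {}
  Kind kind() const override { return Kind::CUDA; }
  // Device-specific property, queried on the device rather than the buffer.
  int device_num() const { return device_num_; }

 private:
  int device_num_;
};

class Buffer {
 public:
  // cpu_address is nullptr when the memory is not reachable from the CPU.
  Buffer(std::shared_ptr<Device> device, const uint8_t* cpu_address)
      : device_(std::move(device)), cpu_address_(cpu_address) {
    assert(device_ != nullptr);
  }
  virtual ~Buffer() = default;

  // General queries: virtual, so subclasses could handle other devices.
  virtual bool is_accessible(const Device& from) const {
    return from.kind() == Device::Kind::CPU && cpu_address_ != nullptr;
  }
  virtual const uint8_t* accessible_data(const Device& from) const {
    return is_accessible(from) ? cpu_address_ : nullptr;
  }

  // CPU shortcuts: non-virtual, they only read the cached member.
  bool on_cpu() const { return cpu_address_ != nullptr; }
  const uint8_t* cpu_data() const { return cpu_address_; }

  const std::shared_ptr<Device>& device() const { return device_; }

 private:
  std::shared_ptr<Device> device_;
  const uint8_t* cpu_address_;
};
```

With this layout, a caller that only cares about CPU memory would write 
`if (buf.on_cpu()) use(buf.cpu_data());` and never pay for a virtual dispatch, 
while `buf.device()` remains available for device-aware code.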

In case (iii), the memory still resides on a particular device; it just happens 
to be readable from another device as well. For example, the memory sits on a 
GPU but is CPU-accessible (at a higher cost than normal), which suggests that 
`accessible_data(CPUDevice())` would return the appropriate CPU memory pointer.
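
The behaviour described for case (iii) could look roughly like the sketch 
below. It is only an illustration under stated assumptions: `Device` is 
reduced to a plain value type, `CPUDevice()` and `CUDADevice(n)` are 
hypothetical helper functions, and the two pointers stand in for whatever 
addresses a real mapped or pinned allocation would expose.

```cpp
#include <cstdint>

enum class DeviceKind { CPU, CUDA };

// Hypothetical value type identifying a device.
struct Device {
  DeviceKind kind;
  int device_num;
};

inline Device CPUDevice() { return Device{DeviceKind::CPU, 0}; }
inline Device CUDADevice(int n) { return Device{DeviceKind::CUDA, n}; }

class Buffer {
 public:
  // home:       the device the memory resides on
  // device_ptr: address as seen from the home device
  // cpu_ptr:    address as seen from the CPU, or nullptr if not CPU-visible
  Buffer(Device home, const uint8_t* device_ptr, const uint8_t* cpu_ptr)
      : home_(home), device_ptr_(device_ptr), cpu_ptr_(cpu_ptr) {}

  // Where the memory resides, regardless of who else can read it.
  const Device& device() const { return home_; }

  bool is_accessible(const Device& from) const {
    if (from.kind == DeviceKind::CPU) return cpu_ptr_ != nullptr;
    return from.kind == home_.kind && from.device_num == home_.device_num;
  }

  // Returns the pointer valid on `from`, or nullptr if `from` cannot read it.
  const uint8_t* accessible_data(const Device& from) const {
    if (!is_accessible(from)) return nullptr;
    return from.kind == DeviceKind::CPU ? cpu_ptr_ : device_ptr_;
  }

 private:
  Device home_;
  const uint8_t* device_ptr_;
  const uint8_t* cpu_ptr_;
};
```

Here `device()` keeps reporting the GPU as the place where the memory lives, 
while `accessible_data(CPUDevice())` hands out the CPU-side pointer when one 
exists and `nullptr` otherwise.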


> [C++] Create a device abstraction
> ---------------------------------
>
>                 Key: ARROW-2447
>                 URL: https://issues.apache.org/jira/browse/ARROW-2447
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, GPU
>    Affects Versions: 0.9.0
>            Reporter: Antoine Pitrou
>            Priority: Major
>             Fix For: 0.14.0
>
>
> Right now, a plain Buffer doesn't carry information about where it actually 
> lies. That information also cannot be passed around, so you get APIs like 
> {{PlasmaClient}} which take or return device number integers, and have 
> implementations which hardcode operations on CUDA buffers. Also, unsuspecting 
> receivers of a {{Buffer}} pointer may try to act on the underlying memory 
> without knowing whether it's CPU-reachable or not.
> Here is a sketch for a proposed Device abstraction:
> {code}
> class Device {
>     enum DeviceKind { KIND_CPU, KIND_CUDA };
>     virtual DeviceKind kind() const;
>     //MemoryPool* default_memory_pool() const;
>     //std::shared_ptr<Buffer> Allocate(...);
> };
> class CpuDevice : public Device {};
> class CudaDevice : public Device {
>     int device_num() const;
> };
> class Buffer {
>     virtual DeviceKind device_kind() const;
>     virtual std::shared_ptr<Device> device() const;
>     virtual bool on_cpu() const {
>         return true;
>     }
>     const uint8_t* cpu_data() const {
>         return on_cpu() ? data() : nullptr;
>     }
>     uint8_t* cpu_mutable_data() {
>         return on_cpu() ? mutable_data() : nullptr;
>     }
>     virtual CopyToCpu(std::shared_ptr<Buffer> dest) const;
>     virtual CopyFromCpu(std::shared_ptr<Buffer> src);
> };
> class CudaBuffer : public Buffer {
>     virtual bool on_cpu() const {
>         return false;
>     }
> };
> CopyBuffer(std::shared_ptr<Buffer> dest, const std::shared_ptr<Buffer> src);
> {code}



