[
https://issues.apache.org/jira/browse/ARROW-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16788883#comment-16788883
]
Pearu Peterson commented on ARROW-2447:
---------------------------------------
Also, the MemoryPool plays an important role in this issue: the MemoryPool
allocates the memory and hence is the first source of information that
determines the pointer's accessibility properties. In the Device class
proposal, the MemoryPool is suggested to become part of the Device class, but
see below.
In the CUDA case, a pointer's accessibility is defined by the method used to
allocate the memory. The device number is just one parameter and is not, by
itself, sufficient for determining the pointer's accessibility. From that I
would conclude that one should attach the allocation-method information (which
includes the device number) to the Buffer, rather than a Device instance,
which does not provide all of the required information.
So perhaps we should attach a MemoryPool to the Buffer (instead of introducing
a Device class), so that the accessibility of the memory pointer is determined
by the MemoryPool instance, which also carries the process information. Recall
that a CUDA context is essentially a device process.
Currently, CPU-based Arrow uses the DefaultMemoryPool. For CUDA support,
several new memory pools would be defined:
CudaManagedMemoryPool
CudaMemoryPool
CudaHostMemoryPool
CudaRegisteredMemoryPool
More generally, one should be able to define one's own MemoryPool instance
that manages the memory of any device, or that uses a custom memory manager
such as [RMM|https://github.com/rapidsai/rmm] to allocate memory for Arrow
buffers.
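To make the "define your own MemoryPool" idea concrete, here is a minimal
stand-alone sketch of the shape such a pool could take. It is modeled loosely
on arrow::MemoryPool's Allocate/Free/bytes_allocated interface but simplified
(no Status type); the class name `CustomMemoryPool` and the malloc/free
backend are stand-ins for illustration only — an RMM-backed pool would
delegate to the RMM allocator instead:
{code}
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Simplified stand-in for the abstract pool interface (hypothetical;
// the real arrow::MemoryPool returns Status and has more methods).
class MemoryPool {
 public:
  virtual ~MemoryPool() = default;
  virtual int Allocate(int64_t size, uint8_t** out) = 0;  // 0 on success
  virtual void Free(uint8_t* buffer, int64_t size) = 0;
  virtual int64_t bytes_allocated() const = 0;
};

// A user-defined pool: here malloc/free stands in for a custom memory
// manager; an RMM-backed pool would call into RMM here instead.
class CustomMemoryPool : public MemoryPool {
 public:
  int Allocate(int64_t size, uint8_t** out) override {
    *out = static_cast<uint8_t*>(std::malloc(static_cast<size_t>(size)));
    if (*out == nullptr) return -1;
    bytes_allocated_ += size;
    return 0;
  }
  void Free(uint8_t* buffer, int64_t size) override {
    std::free(buffer);
    bytes_allocated_ -= size;
  }
  int64_t bytes_allocated() const override { return bytes_allocated_; }

 private:
  int64_t bytes_allocated_ = 0;
};

int main() {
  CustomMemoryPool pool;
  uint8_t* data = nullptr;
  assert(pool.Allocate(64, &data) == 0);
  assert(pool.bytes_allocated() == 64);
  pool.Free(data, 64);
  assert(pool.bytes_allocated() == 0);
  return 0;
}
{code}
The point is only that the allocation backend is swappable behind one
interface, so Arrow buffers need not care where their memory came from.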
`Buffer.is_accessible` would take a MemoryPool instance as an argument;
pairing this with the MemoryPool instance attached to the Buffer determines
the Buffer pointer's accessibility properties. For that, MemoryPool classes
would implement a method, say `is_compatible(<other MemoryPool instance>)`,
that returns `true` if the pointer can be accessed from the process that the
other MemoryPool represents. In addition, MemoryPool instances would implement
CopyTo and CopyFrom methods.
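A stand-alone sketch of this pairing, under the stated assumptions (none of
these classes or methods exist in Arrow yet; the PoolKind enum and the
specific visibility rules, e.g. treating pinned host memory as mapped, are
illustrative guesses):
{code}
#include <cassert>
#include <memory>
#include <utility>

enum class PoolKind { Cpu, CudaDevice, CudaHost, CudaManaged };

// Hypothetical pool: kind + device number together encode the allocation
// method, which is what actually decides pointer accessibility.
class MemoryPool {
 public:
  explicit MemoryPool(PoolKind kind, int device = -1)
      : kind_(kind), device_(device) {}

  // True if memory allocated from this pool can be accessed from the
  // process/context that `other` represents.
  bool is_compatible(const MemoryPool& other) const {
    bool other_is_host =
        other.kind_ == PoolKind::Cpu || other.kind_ == PoolKind::CudaHost;
    switch (kind_) {
      case PoolKind::Cpu:
        return other_is_host;  // plain host memory: host-only
      case PoolKind::CudaHost:     // pinned host memory (assume mapped)
      case PoolKind::CudaManaged:  // unified memory
        return true;               // reachable from host and device
      case PoolKind::CudaDevice:   // device memory: same device context only
        return other.kind_ == PoolKind::CudaDevice &&
               other.device_ == device_;
    }
    return false;
  }

 private:
  PoolKind kind_;
  int device_;
};

// Buffer carries its allocating pool; accessibility is decided by pairing
// that pool with the pool representing the would-be accessor.
class Buffer {
 public:
  explicit Buffer(std::shared_ptr<MemoryPool> pool) : pool_(std::move(pool)) {}
  bool is_accessible(const MemoryPool& from) const {
    return pool_->is_compatible(from);
  }

 private:
  std::shared_ptr<MemoryPool> pool_;
};

int main() {
  auto cpu = std::make_shared<MemoryPool>(PoolKind::Cpu);
  auto gpu0 = std::make_shared<MemoryPool>(PoolKind::CudaDevice, 0);
  auto gpu1 = std::make_shared<MemoryPool>(PoolKind::CudaDevice, 1);
  auto managed = std::make_shared<MemoryPool>(PoolKind::CudaManaged, 0);

  Buffer device_buf(gpu0);
  Buffer managed_buf(managed);

  assert(!device_buf.is_accessible(*cpu));   // device memory, CPU can't touch
  assert(device_buf.is_accessible(*gpu0));   // same device context
  assert(!device_buf.is_accessible(*gpu1));  // different device context
  assert(managed_buf.is_accessible(*cpu));   // unified memory, host-visible
  return 0;
}
{code}
Note that with this scheme the device number is just one field of the pool's
identity, as argued above, rather than the whole story.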
[~wesmckinn], [~pitrou], and others, what do you think?
> [C++] Create a device abstraction
> ---------------------------------
>
> Key: ARROW-2447
> URL: https://issues.apache.org/jira/browse/ARROW-2447
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, GPU
> Affects Versions: 0.9.0
> Reporter: Antoine Pitrou
> Assignee: Pearu Peterson
> Priority: Major
> Fix For: 0.14.0
>
>
> Right now, a plain Buffer doesn't carry information about where it actually
> lies. That information also cannot be passed around, so you get APIs like
> {{PlasmaClient}} which take or return device number integers, and have
> implementations which hardcode operations on CUDA buffers. Also, unsuspecting
> receivers of a {{Buffer}} pointer may try to act on the underlying memory
> without knowing whether it's CPU-reachable or not.
> Here is a sketch for a proposed Device abstraction:
> {code}
> class Device {
>   enum DeviceKind { KIND_CPU, KIND_CUDA };
>
>   virtual DeviceKind kind() const;
>   //MemoryPool* default_memory_pool() const;
>   //std::shared_ptr<Buffer> Allocate(...);
> };
>
> class CpuDevice : public Device {};
>
> class CudaDevice : public Device {
>   int device_num() const;
> };
>
> class Buffer {
>   virtual DeviceKind device_kind() const;
>   virtual std::shared_ptr<Device> device() const;
>
>   virtual bool on_cpu() const {
>     return true;
>   }
>
>   const uint8_t* cpu_data() const {
>     return on_cpu() ? data() : nullptr;
>   }
>
>   uint8_t* cpu_mutable_data() {
>     return on_cpu() ? mutable_data() : nullptr;
>   }
>
>   virtual Status CopyToCpu(std::shared_ptr<Buffer> dest) const;
>   virtual Status CopyFromCpu(std::shared_ptr<Buffer> src);
> };
>
> class CudaBuffer : public Buffer {
>   virtual bool on_cpu() const {
>     return false;
>   }
> };
>
> Status CopyBuffer(std::shared_ptr<Buffer> dest, const std::shared_ptr<Buffer>& src);
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)