[
https://issues.apache.org/jira/browse/ARROW-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16788883#comment-16788883
]
Pearu Peterson commented on ARROW-2447:
---------------------------------------
Also, the MemoryPool plays an important role in this issue: the MemoryPool
allocates the memory and hence is the first source of information that
determines the pointer's accessibility properties. In the Device class
proposal, the MemoryPool is suggested to become part of the Device class, but
see below.
In the CUDA case, a pointer's accessibility is defined by the method used to
allocate the memory. The device number is just one parameter and is not, by
itself, sufficient for determining the pointer's accessibility. From that I
would conclude that one should attach the allocation-method information (which
includes the device number) to the Buffer, rather than a Device instance,
which does not provide all of the required information.
So perhaps we should attach a MemoryPool to the Buffer (instead of introducing
a Device class), so that the accessibility of the memory pointer is determined
by the MemoryPool instance, which also carries the process information. Recall
that a CUDA context is essentially a device process.
Currently, CPU-based Arrow uses the DefaultMemoryPool. For CUDA support,
several new memory pools would be defined:
CudaManagedMemoryPool
CudaMemoryPool
CudaHostMemoryPool
CudaRegisteredMemoryPool
More generally, one should be able to define one's own MemoryPool instance
that manages the memory of any device, or that uses a custom memory manager
such as [RMM|https://github.com/rapidsai/rmm] to allocate memory for Arrow
buffers.
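To make the "define your own MemoryPool" idea concrete, here is a minimal
stand-alone sketch of the shape such a pool could take. It is modeled loosely
on arrow::MemoryPool's Allocate/Free/bytes_allocated interface but simplified
(no Status type); the class name `CustomMemoryPool` and the malloc/free
backend are stand-ins for illustration only — an RMM-backed pool would
delegate to the RMM allocator instead:
{code}
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Simplified stand-in for the abstract pool interface (hypothetical;
// the real arrow::MemoryPool returns Status and has more methods).
class MemoryPool {
 public:
  virtual ~MemoryPool() = default;
  virtual int Allocate(int64_t size, uint8_t** out) = 0;  // 0 on success
  virtual void Free(uint8_t* buffer, int64_t size) = 0;
  virtual int64_t bytes_allocated() const = 0;
};

// A user-defined pool: here malloc/free stands in for a custom memory
// manager; an RMM-backed pool would call into RMM here instead.
class CustomMemoryPool : public MemoryPool {
 public:
  int Allocate(int64_t size, uint8_t** out) override {
    *out = static_cast<uint8_t*>(std::malloc(static_cast<size_t>(size)));
    if (*out == nullptr) return -1;
    bytes_allocated_ += size;
    return 0;
  }
  void Free(uint8_t* buffer, int64_t size) override {
    std::free(buffer);
    bytes_allocated_ -= size;
  }
  int64_t bytes_allocated() const override { return bytes_allocated_; }

 private:
  int64_t bytes_allocated_ = 0;
};

int main() {
  CustomMemoryPool pool;
  uint8_t* data = nullptr;
  assert(pool.Allocate(64, &data) == 0);
  assert(pool.bytes_allocated() == 64);
  pool.Free(data, 64);
  assert(pool.bytes_allocated() == 0);
  return 0;
}
{code}
The point is only that the allocation backend is swappable behind one
interface, so Arrow buffers need not care where their memory came from.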
`Buffer.is_accessible` would take a MemoryPool instance as an argument;
pairing this with the MemoryPool instance attached to the Buffer determines
the Buffer pointer's accessibility properties. For that, MemoryPool classes
would implement a method, say `is_compatible(<other MemoryPool instance>)`,
that returns `true` if the pointer can be accessed from the process that the
other MemoryPool represents. In addition, MemoryPool instances would implement
CopyTo and CopyFrom methods.
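A stand-alone sketch of this pairing, under the stated assumptions (none of
these classes or methods exist in Arrow yet; the PoolKind enum and the
specific visibility rules, e.g. treating pinned host memory as mapped, are
illustrative guesses):
{code}
#include <cassert>
#include <memory>
#include <utility>

enum class PoolKind { Cpu, CudaDevice, CudaHost, CudaManaged };

// Hypothetical pool: kind + device number together encode the allocation
// method, which is what actually decides pointer accessibility.
class MemoryPool {
 public:
  explicit MemoryPool(PoolKind kind, int device = -1)
      : kind_(kind), device_(device) {}

  // True if memory allocated from this pool can be accessed from the
  // process/context that `other` represents.
  bool is_compatible(const MemoryPool& other) const {
    bool other_is_host =
        other.kind_ == PoolKind::Cpu || other.kind_ == PoolKind::CudaHost;
    switch (kind_) {
      case PoolKind::Cpu:
        return other_is_host;  // plain host memory: host-only
      case PoolKind::CudaHost:     // pinned host memory (assume mapped)
      case PoolKind::CudaManaged:  // unified memory
        return true;               // reachable from host and device
      case PoolKind::CudaDevice:   // device memory: same device context only
        return other.kind_ == PoolKind::CudaDevice &&
               other.device_ == device_;
    }
    return false;
  }

 private:
  PoolKind kind_;
  int device_;
};

// Buffer carries its allocating pool; accessibility is decided by pairing
// that pool with the pool representing the would-be accessor.
class Buffer {
 public:
  explicit Buffer(std::shared_ptr<MemoryPool> pool) : pool_(std::move(pool)) {}
  bool is_accessible(const MemoryPool& from) const {
    return pool_->is_compatible(from);
  }

 private:
  std::shared_ptr<MemoryPool> pool_;
};

int main() {
  auto cpu = std::make_shared<MemoryPool>(PoolKind::Cpu);
  auto gpu0 = std::make_shared<MemoryPool>(PoolKind::CudaDevice, 0);
  auto gpu1 = std::make_shared<MemoryPool>(PoolKind::CudaDevice, 1);
  auto managed = std::make_shared<MemoryPool>(PoolKind::CudaManaged, 0);

  Buffer device_buf(gpu0);
  Buffer managed_buf(managed);

  assert(!device_buf.is_accessible(*cpu));   // device memory, CPU can't touch
  assert(device_buf.is_accessible(*gpu0));   // same device context
  assert(!device_buf.is_accessible(*gpu1));  // different device context
  assert(managed_buf.is_accessible(*cpu));   // unified memory, host-visible
  return 0;
}
{code}
Note that with this scheme the device number is just one field of the pool's
identity, as argued above, rather than the whole story.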
[~wesmckinn], [~pitrou], and others, what do you think?
> [C++] Create a device abstraction
> ---------------------------------
>
> Key: ARROW-2447
> URL: https://issues.apache.org/jira/browse/ARROW-2447
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++, GPU
> Affects Versions: 0.9.0
> Reporter: Antoine Pitrou
> Assignee: Pearu Peterson
> Priority: Major
> Fix For: 0.14.0
>
>
> Right now, a plain Buffer doesn't carry information about where it actually
> lies. That information also cannot be passed around, so you get APIs like
> {{PlasmaClient}} which take or return device number integers, and have
> implementations which hardcode operations on CUDA buffers. Also, unsuspecting
> receivers of a {{Buffer}} pointer may try to act on the underlying memory
> without knowing whether it's CPU-reachable or not.
> Here is a sketch for a proposed Device abstraction:
> {code}
> class Device {
>   enum DeviceKind { KIND_CPU, KIND_CUDA };
>
>   virtual DeviceKind kind() const;
>   //MemoryPool* default_memory_pool() const;
>   //std::shared_ptr<Buffer> Allocate(...);
> };
>
> class CpuDevice : public Device {};
>
> class CudaDevice : public Device {
>   int device_num() const;
> };
>
> class Buffer {
>   virtual DeviceKind device_kind() const;
>   virtual std::shared_ptr<Device> device() const;
>
>   virtual bool on_cpu() const {
>     return true;
>   }
>
>   const uint8_t* cpu_data() const {
>     return on_cpu() ? data() : nullptr;
>   }
>
>   uint8_t* cpu_mutable_data() {
>     return on_cpu() ? mutable_data() : nullptr;
>   }
>
>   virtual Status CopyToCpu(std::shared_ptr<Buffer> dest) const;
>   virtual Status CopyFromCpu(std::shared_ptr<Buffer> src);
> };
>
> class CudaBuffer : public Buffer {
>   virtual bool on_cpu() const {
>     return false;
>   }
> };
>
> Status CopyBuffer(std::shared_ptr<Buffer> dest, const std::shared_ptr<Buffer>& src);
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)