Hi Matt,

I've posted comments on the PR. Besides:

> * The ArrowDeviceArray contains a pointer to an ArrowArray alongside
> the device information related to allocation. The reason for using a
> pointer is so that future modifications of the ArrowArray struct do not
> cause the size of this struct to change (as it would still just be a
> pointer to the ArrowArray struct).

The ArrowArray struct is not allowed to change, as it would break the ABI:
https://arrow.apache.org/docs/format/CDataInterface.html#updating-this-specification
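
To make the size argument concrete, here is a minimal sketch in C. The ArrowArray definition is the one from the C Data Interface spec; the two device structs and their field names are assumptions for illustration only, not the definitions from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ArrowArray as specified by the Arrow C Data Interface. */
struct ArrowArray {
  int64_t length;
  int64_t null_count;
  int64_t offset;
  int64_t n_buffers;
  int64_t n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};

/* Hypothetical variant: the device struct points at the ArrowArray. */
struct DeviceArrayByPointer {
  struct ArrowArray* array;
  int64_t device_id;    /* assumed field name */
  int32_t device_type;  /* assumed field name */
};

/* Hypothetical variant: the device struct embeds the ArrowArray. */
struct DeviceArrayByValue {
  struct ArrowArray array;
  int64_t device_id;    /* assumed field name */
  int32_t device_type;  /* assumed field name */
};

/* Only the by-value layout depends on sizeof(struct ArrowArray). */
static_assert(sizeof(struct DeviceArrayByValue) >
                  sizeof(struct DeviceArrayByPointer),
              "embedding ArrowArray makes the device struct larger");
static_assert(offsetof(struct DeviceArrayByValue, device_id) >=
                  sizeof(struct ArrowArray),
              "fields after the embedded struct shift with its size");
```

Since the ArrowArray layout is frozen by the ABI rules linked above, the second layout would be just as stable as the first in practice.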

> Remaining Concerns that I can think of:
>      * Alignment and padding of allocations can have a larger impact when
> dealing with non-cpu devices than with CPUs, and this design provides no
> way to communicate potential extra padding on a per-buffer basis. We could
> attempt to codify a convention that allocations should have a specific
> alignment and a particular padding, but that doesn't actually enforce
> anything nor allow communicating if for some reason those conventions
> weren't followed. Should we add some way of passing this info or punt this
> for a future modification?

How exactly would this be communicated, and is the information actually useful? My impression is that the CUDA programming model lets you access exactly the amount of data you need, regardless of hardware parallelism.
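
For reference, a padding convention of the kind described in the quoted text would amount to something like the helper below. This is only a sketch: the function name is hypothetical and any particular alignment value (64 bytes, say) is an assumption for illustration, not something the proposal specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Round nbytes up to the next multiple of alignment.
 * alignment must be a positive power of two (hypothetical helper). */
static inline int64_t padded_size(int64_t nbytes, int64_t alignment) {
  assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
  return (nbytes + alignment - 1) & ~(alignment - 1);
}
```

With a 64-byte alignment, a 65-byte buffer would be allocated as 128 bytes. A consumer has no way to verify that the producer did this, which is the limitation raised in the quoted text.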

> This is part of a wider effort I'm attempting to address to
> improve the non-cpu memory support in the Arrow libraries, such as enhanced
> Buffer types in the C++ library that will have the device_id and
> device_type information in addition to the `is_cpu` flag that currently
> exists.

The C++ Device class already exists for this. You can get a Buffer's device pretty easily (by going through the MemoryManager, IIRC).

Regards

Antoine.
