Hi all,

Last year, we defined a protocol exposing the C Data Interface
(schema, array and stream) in Python through PyCapsule objects and
dunder methods `__arrow_c_schema/array/stream__`
(https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html).

A bit earlier last year, we also expanded the C Data Interface with
device capabilities:
https://arrow.apache.org/docs/dev/format/CDeviceDataInterface.html.

Combining those two, we are now proposing to expand the PyCapsule
protocol to also support the C Device Data Interface.
- Issue where this is being discussed:
https://github.com/apache/arrow/issues/38325
- PR with the changes to the PyCapsule specification:
https://github.com/apache/arrow/pull/40708
- PR with an implementation for PyArrow:
https://github.com/apache/arrow/pull/40717

We would welcome your feedback in the issue or PR.

While the mechanics of the addition are quite straightforward (for
example, for `__arrow_c_array__` we add an equivalent
`__arrow_c_device_array__` which works exactly the same, except that
the returned data capsule holds an ArrowDeviceArray struct instead of
an ArrowArray struct), there is some discussion about the semantics,
mostly around expectations on cross-device copies (disallow them,
allow implicit copies, or provide a way to request data on a specific
device?).
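
To make the mechanics concrete, here is a rough sketch of how a consumer
might dispatch between the two methods. It is only an illustration: the
classes and the `export_array` helper are hypothetical, plain strings
stand in for the real PyCapsule objects, the `requested_schema=None`
signature is assumed to mirror the existing `__arrow_c_array__`, and
preferring the device method is one possible consumer choice, not
something the proposal mandates.

```python
class CpuOnlyArray:
    """Hypothetical producer that only implements the existing protocol."""

    def __arrow_c_array__(self, requested_schema=None):
        # A real producer returns two PyCapsules (schema, array).
        # Strings are used here as stand-ins for those capsules.
        return ("arrow_schema", "arrow_array")


class DeviceAwareArray(CpuOnlyArray):
    """Hypothetical producer that also implements the proposed method."""

    def __arrow_c_device_array__(self, requested_schema=None):
        # Same shape of return value, but the data capsule would hold an
        # ArrowDeviceArray struct instead of an ArrowArray struct.
        return ("arrow_schema", "arrow_device_array")


def export_array(obj, requested_schema=None):
    """Hypothetical consumer helper: try the device method first,
    then fall back to the CPU-only method."""
    if hasattr(obj, "__arrow_c_device_array__"):
        return obj.__arrow_c_device_array__(requested_schema)
    if hasattr(obj, "__arrow_c_array__"):
        return obj.__arrow_c_array__(requested_schema)
    raise TypeError(f"{type(obj).__name__} does not export Arrow data")


cpu_result = export_array(CpuOnlyArray())
device_result = export_array(DeviceAwareArray())
```

Note that this sketch sidesteps the open question above: it says nothing
about what a producer should do when its data lives on a device the
consumer cannot access.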

Best,
Joris
