Re: [I] [Python] Expose the device interface through the Arrow PyCapsule protocol [arrow]

via GitHub Thu, 21 Mar 2024 08:08:43 -0700


pitrou commented on issue #38325:
URL: https://github.com/apache/arrow/issues/38325#issuecomment-2012549275


   >     * For a CPU-only library, it is encouraged to implement both the 
standard and device version of the protocol methods (i.e. both 
`__arrow_c_array__` and  `__arrow_c_device_array__`, and/or both 
`__arrow_c_stream__` and `__arrow_c_device_stream__`)
   
   +0. I'm not sure it makes sense to ask producers for more effort in this 
regard.
   
   >     * The presence of only the standard version (e.g. only 
`__arrow_c_array__` and not `__arrow_c_device_array__`) means that this is a 
CPU-only data object.
   
   +1
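
As a rough illustration of how a consumer might apply this rule, the sketch below classifies an object by which array-level protocol methods it exposes. The helper `protocol_support` and the two demo classes are hypothetical names made up for this example; only the dunder method names come from the discussion.

```python
def protocol_support(obj):
    """Classify an object by the Arrow PyCapsule array methods it exposes.

    Hypothetical helper for illustration; it only inspects attribute
    presence and never calls the protocol methods.
    """
    has_cpu = hasattr(obj, "__arrow_c_array__")
    has_device = hasattr(obj, "__arrow_c_device_array__")
    if has_cpu and has_device:
        return "both"
    if has_cpu:
        # Only the standard method: treat as a CPU-only data object.
        return "cpu-only"
    if has_device:
        # Only the device method: data may live in non-CPU memory.
        return "device-only"
    return "unsupported"


class CpuOnlyArray:
    """Hypothetical producer that implements only the standard method."""

    def __arrow_c_array__(self, requested_schema=None):
        raise NotImplementedError("export omitted in this sketch")


class DeviceOnlyArray:
    """Hypothetical producer that implements only the device method."""

    def __arrow_c_device_array__(self, requested_schema=None, **kwargs):
        raise NotImplementedError("export omitted in this sketch")


print(protocol_support(CpuOnlyArray()))
print(protocol_support(DeviceOnlyArray()))
```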
   
   >     * For a device-aware library, and for data structures that can only 
reside in non-CPU memory, you should _only_ implement the device version of the 
protocol (e.g. only add `__arrow_c_device_array__`, and never add a 
`__arrow_c_array__`)
   
   +1
   
   >       * Libraries can of course have data structures that can live on either 
CPU or non-CPU memory, and for those it is fine that they implement both versions 
(and raise an error in the non-device version if the data is not on the CPU)?
   
   +1
   
   >         EDIT: this _has_ to be fine of course, given that pyarrow is in 
this situation, and we want to define both methods. But should we error in 
`__arrow_c_array__` for non-CPU data? (right now we don't actually check the 
device here, but silently return an ArrowArray struct with null buffer pointers)
   
   Yes, we should. The expectation of the (regular) C Data Interface is that 
data lives on the CPU.
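
A minimal sketch of that behavior, under stated assumptions: the class `DualProtocolArray`, its string device labels, the choice of `ValueError`, and the `**kwargs` on the device method are all hypothetical and not taken from pyarrow. The export itself is replaced by placeholder strings, since a real producer would return a tuple of PyCapsules wrapping the C structs.

```python
CPU = "cpu"  # hypothetical device label; real code would use ArrowDeviceType values


class DualProtocolArray:
    """Hypothetical container whose data may live on CPU or on another device."""

    def __init__(self, values, device=CPU):
        self._values = values
        self._device = device

    def __arrow_c_array__(self, requested_schema=None):
        # The regular C Data Interface assumes CPU memory, so refuse to
        # export instead of handing out a struct with unusable buffers.
        if self._device != CPU:
            raise ValueError(
                f"data lives on device {self._device!r}; "
                "use __arrow_c_device_array__ instead"
            )
        return self._export()

    def __arrow_c_device_array__(self, requested_schema=None, **kwargs):
        # The device variant can describe data on any device, CPU included.
        return self._export()

    def _export(self):
        # Placeholder: stands in for (schema_capsule, array_capsule).
        return ("<schema capsule placeholder>", "<array capsule placeholder>")


gpu_array = DualProtocolArray([1, 2, 3], device="cuda:0")
try:
    gpu_array.__arrow_c_array__()
except ValueError as exc:
    print("refused:", exc)
```

Raising here means a CPU-only consumer fails loudly rather than dereferencing null buffer pointers later.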
   
   >     * Do we want to say something about expectations that no cross-device 
copies happen?
   
   In the producer or in the consumer? IMHO the consumer is free to do whatever 
suits them. On the producer side the question is a bit more delicate. Perhaps 
we need to pass some options to `__arrow_c_device_array__`.
   


