jorisvandenbossche commented on issue #38325:
URL: https://github.com/apache/arrow/issues/38325#issuecomment-2029775202

   Practical experimentation will help inform the decisions we have to make
here about controlling cross-device copies (e.g. would a `requested_device`
keyword be useful?). Therefore I would like to suggest that we start with a
minimal addition (just the new methods as currently described in
https://github.com/apache/arrow/pull/40708, without further keywords), and
get the implementation for pyarrow merged for 16.0. The guidelines /
recommendations section can be updated later as we gain experience with the
first implementations.
   
   Based on the above discussion, I would propose adding the following to the PR:
   
   - The `device` protocol methods should return data as-is on the device it is
currently on (i.e. the expectation is that no cross-device copy happens in
these methods).
     (sidenote: of course, if someone implemented a tabular object that could
use different devices for different columns, this "no device copy" guarantee
could not be made, given that the resulting structure's data should live on a
single device. But that seems a corner case not worth complicating the spec
with?)
   - A device-aware producer _can_ implement `__arrow_c_array__` /
`__arrow_c_stream__` such that they do an implicit device-to-CPU copy when
called.
     (this means that a consumer supporting multiple devices (like pyarrow)
should always check the device protocol methods first, before the CPU-only
versions. Checking my PR implementing this for pyarrow
(https://github.com/apache/arrow/pull/40717), I see I need to update it
accordingly.)
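
   To make the consumer-side ordering concrete, here is a rough sketch (not
the actual pyarrow code). It assumes the device methods are named
`__arrow_c_device_array__` / `__arrow_c_device_stream__`; the helper
`select_export_method` and the two toy producer classes are hypothetical and
only there for illustration:

```python
def select_export_method(obj, kind="array"):
    """Return the name of the protocol method a multi-device consumer
    should call on `obj`, preferring the device-aware variant.

    Hypothetical helper, for illustration only.
    """
    if kind not in ("array", "stream"):
        raise ValueError(f"kind must be 'array' or 'stream', got {kind!r}")

    device_name = f"__arrow_c_device_{kind}__"  # assumed device method name
    cpu_name = f"__arrow_c_{kind}__"            # CPU-only method name

    # Check the device protocol first: it is expected to hand over the data
    # as-is on its current device, without a cross-device copy.
    if hasattr(obj, device_name):
        return device_name
    # Fall back to the CPU-only protocol, which a device-aware producer
    # may implement with an implicit device-to-CPU copy.
    if hasattr(obj, cpu_name):
        return cpu_name
    raise TypeError(
        f"{type(obj).__name__} implements neither {device_name} nor {cpu_name}"
    )


class DeviceAwareProducer:
    """Toy producer exposing both protocols (bodies intentionally omitted)."""

    def __arrow_c_device_array__(self, requested_schema=None, **kwargs):
        raise NotImplementedError("would return capsules for on-device data")

    def __arrow_c_array__(self, requested_schema=None):
        raise NotImplementedError("would copy to CPU, then return capsules")


class CpuOnlyProducer:
    """Toy producer exposing only the CPU protocol."""

    def __arrow_c_array__(self, requested_schema=None):
        raise NotImplementedError("would return capsules for CPU data")
```

   With this ordering, a producer that implements both protocols is never
asked for the (possibly copying) CPU version unless the consumer has no
device support at all.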


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
