Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-05-17 Thread Matt Topol
Reviving this discussion a bit now that things have stabilized on the PR[1].

> I do wonder if the stream interface is a little CUDA-specific...my first reaction was wondering if it shouldn't live in a CUDA header (or connector library including a CUDA header) since it contains direct references

Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-04-10 Thread Dewey Dunnington
I left some comments on the PR as well...I think this is an important addition and I'm excited to see this discussion! If there is further information that needs to be passed along in the future, schema metadata could be used. Even with schema metadata, the device type and ID will always need to

Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-04-10 Thread Weston Pace
Sorry, I meant: I am *now* a solid +1

On Mon, Apr 10, 2023 at 1:26 PM Weston Pace wrote:
> I am not a solid +1 and I can see the usefulness. Matt and I spoke on
> this externally and I think Matt has written a great summary. There were a
> few more points that came up in the discussion that

Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-04-10 Thread Weston Pace
I am not a solid +1 and I can see the usefulness. Matt and I spoke on this externally and I think Matt has written a great summary. There were a few more points that came up in the discussion that I think are particularly compelling.

* Avoiding device location is generally fatal

In other cases

Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-04-10 Thread Matt Topol
> There's nothing in the spec today that prevents users from creating `ArrowDeviceArray` and `ArrowDeviceArrayStream` themselves

True, but third-party applications aren't going to be the only downstream users of this API. We also want to build on this within Arrow itself to enable easier usage of

Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-04-10 Thread Weston Pace
I suppose I'm a weak +1 in "I don't entirely believe this will be useful but it doesn't seem harmful". There's nothing in the spec today that prevents users from creating `ArrowDeviceArray` and `ArrowDeviceArrayStream` themselves and I'm not familiar enough with the systems producing / consuming

Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-04-10 Thread Matt Topol
> The ArrowArray struct is not allowed to change, as it would break the ABI:
> https://arrow.apache.org/docs/format/CDataInterface.html#updating-this-specification

I was referring more to the future case where we might need to introduce an `ArrowArrayV2` or something similar precisely because the

Re: [DISCUSSION] C-Data API for Non-CPU Use Cases

2023-04-08 Thread Antoine Pitrou
Hi Matt,

I've posted comments on the PR. Besides:

* The ArrowDeviceArray contains a pointer to an ArrowArray alongside the device information related to allocation. The reason for using a pointer is so that future modifications of the ArrowArray struct do not cause the size of this

[DISCUSSION] C-Data API for Non-CPU Use Cases

2023-04-07 Thread Matt Topol
Hey all,

In order to facilitate the propagation of use cases that want to pass data allocated on non-CPU devices around between environments (like between Python and C++), we should enhance the C-Data API to account for passing memory and device information alongside the arrays themselves. In this