[GitHub] [arrow] zeroshade commented on a diff in pull request #34972: GH-34971: [Format] Enhance C-Data API to support non-cpu cases

via GitHub Sun, 09 Apr 2023 08:22:04 -0700


zeroshade commented on code in PR #34972:
URL: https://github.com/apache/arrow/pull/34972#discussion_r1161302114



##########
cpp/src/arrow/c/abi.h:
##########
@@ -106,6 +169,77 @@ struct ArrowArrayStream {
 
 #endif  // ARROW_C_STREAM_INTERFACE
 
+#ifndef ARROW_C_DEVICE_STREAM_INTERFACE
+#define ARROW_C_DEVICE_STREAM_INTERFACE
+
+struct ArrowDeviceArrayStream {
+  // The device that this stream produces data on.
+  // All ArrowDeviceArrays that are produced by this
+  // stream should have the same device_type as set
+  // here. The device_type needs to be provided here
+  // so that consumers can provide the correct type
+  // of stream_ptr when calling get_next.
+  ArrowDeviceType device_type;
+
+  // Callback to get the stream schema
+  // (will be the same for all arrays in the stream).
+  //
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  //
+  // If successful, the ArrowSchema must be released independently from the stream.
+  int (*get_schema)(struct ArrowDeviceArrayStream*, struct ArrowSchema* out);
+
+  // Callback to get the device id for the next array.
+  // This is necessary so that the proper/correct stream pointer can be provided
+  // to get_next. The parameter provided must not be null.
+  //
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  //
+  // The next call to `get_next` should provide an ArrowDeviceArray whose
+  // device_id matches what is provided here, and whose device_type is the
+  // same as the device_type member of this stream.
+  int (*get_next_device_id)(struct ArrowDeviceArrayStream*, int* out_device_id);
+
+  // Callback to get the next array
+  // (if no error and the array is released, the stream has ended)
+  //
+  // the provided stream_ptr should be the appropriate stream, or
+  // equivalent object, for the device that the data is allocated on
+  // to indicate where the consumer wants the data to be accessible.

Review Comment:
   Stream lifetime is hard to manage on GPUs because everything is asynchronous. After discussing this with @kkraus14, we concluded that the safest API is for the consumer to specify the stream on which it wants to access the data, and for the producer to handle whatever event and stream synchronization is needed to make that possible. This puts stream lifetime management in the consumer (where it is easier to manage) rather than in the producer.
   
   @kkraus14, you can probably explain this better than I just did.
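
   To make the ownership split concrete, here is a rough sketch of the idea using mock types only. Nothing below is part of the proposed ABI or of any real GPU runtime: `MockEvent`, `MockStream`, `MockProducer` and `mock_get_next` are hypothetical names invented for illustration. The assumption is that the producer records a "data ready" event on its own internal stream and then makes the consumer-supplied stream wait on that event, similar in spirit to an event record followed by a stream wait on a real device.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device event and a device stream.
 * These are illustration-only types, not Arrow or CUDA types. */
typedef struct MockEvent { int recorded; } MockEvent;
typedef struct MockStream { const MockEvent* waits_on; } MockStream;

/* Hypothetical producer: owns its internal stream and the event that
 * marks when the next batch of data is ready. */
typedef struct MockProducer {
  MockStream internal_stream;
  MockEvent data_ready;
  int remaining; /* number of batches left to hand out */
} MockProducer;

/* Sketch of a get_next-style callback. The consumer passes the stream it
 * wants to read the data from; the producer does the synchronization.
 * Writes -1 to *out_value when the stream has ended. */
static int mock_get_next(MockProducer* p, void* stream_ptr, int* out_value) {
  if (p == NULL || stream_ptr == NULL || out_value == NULL) return EINVAL;

  if (p->remaining == 0) {
    *out_value = -1; /* end of stream, nothing to synchronize */
    return 0;
  }

  /* Producer side: mark the data as ready on its own stream. */
  p->data_ready.recorded = 1;

  /* Make the consumer's stream wait on the producer's event. The producer
   * never stores stream_ptr, so the consumer keeps full control over that
   * stream's lifetime. */
  ((MockStream*)stream_ptr)->waits_on = &p->data_ready;

  *out_value = p->remaining--;
  return 0;
}
```

   In this sketch the consumer creates and destroys its own `MockStream`; the producer only touches it for the duration of the call, which is the point of having the consumer own stream lifetime.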


