zeroshade commented on issue #811:
URL: https://github.com/apache/arrow-adbc/issues/811#issuecomment-2123404768
@CurtHagenlocher @lidavidm What do you two think about the following idea:
```c++
struct AsyncArrowStream {
  int (*on_schema)(struct AsyncArrowStream* self, struct ArrowSchema* out,
                   AdbcStatusCode status, struct AdbcError* error);
  int (*on_next)(struct AsyncArrowStream* self, struct ArrowDeviceArray* out,
                 AdbcStatusCode status, struct AdbcError* error);
  void (*release)(struct AsyncArrowStream* self);
  void* private_data;
};
```
Which would be used like:
```c++
AdbcStatusCode AdbcStatementExecuteQuery(struct AdbcStatement* statement,
                                         struct AsyncArrowStream* stream_handler,
                                         int64_t* rows_affected,
                                         struct AdbcError* sync_error);
```
The caller would populate the `AsyncArrowStream`'s callbacks. `on_schema`
would be called as soon as the schema is available, followed by a call to
`on_next` as each record batch becomes available. Semantically:
* `private_data` should be populated by the caller with any contextual
information that is needed by the async callbacks.
* `sync_error` is populated if a synchronous error happens before any
asynchronous operations have begun.
* If an error is encountered asynchronously while trying to get the schema,
`on_schema` is called with the status code and error populated and a `nullptr`
for the `ArrowSchema`. `on_next` will not be called in this scenario.
* `rows_affected` should be populated if available before the call to
`on_schema`.
* If an error is encountered retrieving data, `on_next` is called with the
error and status code and `nullptr` for the `ArrowDeviceArray`.
* To signal the end of the stream, `on_next` is called with `ADBC_STATUS_OK`
and a `nullptr` for the `ArrowDeviceArray`.
* The async callbacks return `int` rather than `void` so that a callback can
indicate that it encountered an error on its end and that the producer
should cancel and stop calling callback methods.
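
To make the semantics above concrete, here is a rough sketch of what the caller side could look like. Everything other than `AsyncArrowStream` itself is made up for illustration: `BatchCounter`, the `Counter*` callbacks, and `MakeCounterStream` are hypothetical names, the Arrow/ADBC types are simplified stand-ins so the snippet compiles without `adbc.h`, and the nonzero return from `on_schema` on failure is just one possible choice.

```c++
#include <cassert>
#include <cstdint>
#include <cstdio>

// Simplified stand-ins so the sketch compiles alone. Real code would use the
// definitions from adbc.h and the Arrow C data/device interface headers.
typedef uint8_t AdbcStatusCode;
#define ADBC_STATUS_OK 0
struct AdbcError { const char* message; };
struct ArrowSchema { int placeholder; };
struct ArrowDeviceArray { int placeholder; };

struct AsyncArrowStream {
  int (*on_schema)(struct AsyncArrowStream* self, struct ArrowSchema* out,
                   AdbcStatusCode status, struct AdbcError* error);
  int (*on_next)(struct AsyncArrowStream* self, struct ArrowDeviceArray* out,
                 AdbcStatusCode status, struct AdbcError* error);
  void (*release)(struct AsyncArrowStream* self);
  void* private_data;
};

// Hypothetical caller-side state, reached through private_data.
struct BatchCounter {
  int schemas_seen = 0;
  int batches_seen = 0;
  bool finished = false;  // end of stream was signalled
  bool failed = false;    // an asynchronous error was reported
  bool released = false;  // the producer called release
};

static int CounterOnSchema(struct AsyncArrowStream* self, struct ArrowSchema* out,
                           AdbcStatusCode status, struct AdbcError* error) {
  auto* state = static_cast<BatchCounter*>(self->private_data);
  assert(state != nullptr);
  if (out == nullptr) {
    // Schema retrieval failed; per the proposal, on_next will not follow.
    if (error != nullptr && error->message != nullptr) {
      std::fprintf(stderr, "schema error %d: %s\n", status, error->message);
    }
    state->failed = true;
    return 1;
  }
  state->schemas_seen++;
  return 0;  // 0 means "keep going"
}

static int CounterOnNext(struct AsyncArrowStream* self, struct ArrowDeviceArray* out,
                         AdbcStatusCode status, struct AdbcError* /*error*/) {
  auto* state = static_cast<BatchCounter*>(self->private_data);
  if (out == nullptr) {
    // nullptr + ADBC_STATUS_OK is end of stream; nullptr + anything else is an error.
    if (status == ADBC_STATUS_OK) {
      state->finished = true;
    } else {
      state->failed = true;
    }
    return 0;
  }
  state->batches_seen++;
  return 0;
}

static void CounterRelease(struct AsyncArrowStream* self) {
  // The state is owned by the caller here, so release only records the call.
  static_cast<BatchCounter*>(self->private_data)->released = true;
}

// Fills in the callbacks and private_data, as the caller is expected to do.
static AsyncArrowStream MakeCounterStream(BatchCounter* state) {
  return AsyncArrowStream{CounterOnSchema, CounterOnNext, CounterRelease, state};
}
```

The resulting `AsyncArrowStream` would then be passed as `stream_handler` to `AdbcStatementExecuteQuery`. If `private_data` were heap-allocated instead, `release` would be the natural place to free it.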
The following rules should be observed by drivers:
* The async callback for one call should complete before the next async
callback is called, avoiding potential race conditions on a single result
stream.
* Once the last callback in a stream (`on_schema` or `on_next`) completes,
the producer should then call `release`.
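
For the driver side, the two rules could be satisfied by something as simple as the loop below. This is only a sketch under assumptions: `DeliverResults`, `Recorder`, and `TraceFor` are hypothetical helpers, the Arrow/ADBC types are simplified stand-ins, and results are already in memory so everything runs on one thread (a real driver would do this from a background thread or event loop). It also assumes `release` is still called when the consumer cancels by returning nonzero, which the rules above do not spell out.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-ins; real code would include adbc.h and the Arrow headers.
typedef uint8_t AdbcStatusCode;
#define ADBC_STATUS_OK 0
struct AdbcError { const char* message; };
struct ArrowSchema { int placeholder; };
struct ArrowDeviceArray { int placeholder; };

struct AsyncArrowStream {
  int (*on_schema)(struct AsyncArrowStream* self, struct ArrowSchema* out,
                   AdbcStatusCode status, struct AdbcError* error);
  int (*on_next)(struct AsyncArrowStream* self, struct ArrowDeviceArray* out,
                 AdbcStatusCode status, struct AdbcError* error);
  void (*release)(struct AsyncArrowStream* self);
  void* private_data;
};

// Hypothetical driver-side helper. Callbacks are invoked strictly one after
// another, so no two callbacks for the same stream ever overlap.
static void DeliverResults(AsyncArrowStream* handler, ArrowSchema* schema,
                           ArrowDeviceArray* batches, size_t num_batches) {
  assert(handler != nullptr);
  // A nonzero return from any callback means the consumer wants us to stop.
  bool keep_going =
      handler->on_schema(handler, schema, ADBC_STATUS_OK, nullptr) == 0;
  for (size_t i = 0; keep_going && i < num_batches; ++i) {
    keep_going =
        handler->on_next(handler, &batches[i], ADBC_STATUS_OK, nullptr) == 0;
  }
  if (keep_going) {
    // End of stream: ADBC_STATUS_OK with a nullptr array.
    handler->on_next(handler, nullptr, ADBC_STATUS_OK, nullptr);
  }
  // Assumption: release is called after the last callback, cancelled or not.
  handler->release(handler);
}

// Test consumer that records the order of callbacks as a string:
// S = on_schema, N = on_next with data, E = end of stream, R = release.
struct Recorder {
  std::string trace;
  int stop_after = -1;  // cancel after this many batches; -1 means never
  int batches = 0;
};

static int RecOnSchema(AsyncArrowStream* self, ArrowSchema*, AdbcStatusCode,
                       AdbcError*) {
  static_cast<Recorder*>(self->private_data)->trace += 'S';
  return 0;
}

static int RecOnNext(AsyncArrowStream* self, ArrowDeviceArray* out,
                     AdbcStatusCode, AdbcError*) {
  auto* rec = static_cast<Recorder*>(self->private_data);
  if (out == nullptr) {
    rec->trace += 'E';
    return 0;
  }
  rec->trace += 'N';
  ++rec->batches;
  return (rec->stop_after >= 0 && rec->batches >= rec->stop_after) ? 1 : 0;
}

static void RecRelease(AsyncArrowStream* self) {
  static_cast<Recorder*>(self->private_data)->trace += 'R';
}

// Runs DeliverResults against a Recorder and returns the callback trace.
static std::string TraceFor(size_t num_batches, int stop_after) {
  Recorder rec;
  rec.stop_after = stop_after;
  AsyncArrowStream handler{RecOnSchema, RecOnNext, RecRelease, &rec};
  ArrowSchema schema{0};
  std::vector<ArrowDeviceArray> batches(num_batches);
  DeliverResults(&handler, &schema, batches.data(), num_batches);
  return rec.trace;
}
```

With two batches and no cancellation the trace is schema, two batches, end of stream, release. If the consumer cancels after the first batch, the producer skips the end-of-stream call and goes straight to `release`.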
This would work for every case that currently works with an
`ArrowArrayStream`. For the other scenarios (`ExecuteUpdate`, `GetTableSchema`,
etc.), might something closer to what @CurtHagenlocher was suggesting with an
`ArrowAsyncInfo`-style object be more useful?