Re: [PR] feat: Spec multi-result-set API [arrow-adbc]

via GitHub Mon, 12 Jan 2026 08:57:09 -0800


abonander commented on code in PR #3871:
URL: https://github.com/apache/arrow-adbc/pull/3871#discussion_r2683112181



##########
c/include/arrow-adbc/adbc.h:
##########
@@ -2013,6 +2070,57 @@ AdbcStatusCode AdbcStatementExecuteQuery(struct 
AdbcStatement* statement,
                                          struct ArrowArrayStream* out,
                                          int64_t* rows_affected, struct 
AdbcError* error);
 
+/// \brief Retrieve the next result set from a multi-result-set execution.
+///
+/// This AdbcStatement must outlive the returned ArrowArrayStream.
+///
+/// \since ADBC API revision 1.2.0
+///
+/// For an execution that returns multiple result sets, this can be called 
after
+/// iterating the first result set to get the next and subsequent result sets. 
 A
+/// driver MAY support calling AdbcStatementNextResultSet while the previous 
result is
+/// still being consumed (i.e. before the previous ArrowArrayStream is 
released),
+/// but this is not required.  If the driver does not support this, it should 
return
+/// ADBC_STATUS_INVALID_STATE if the previous result set is still active.  
Otherwise
+/// a driver should return ADBC_STATUS_OK to indicate successful execution of 
this
+/// function regardless of whether or not an additional result set is 
available.
+///
+/// If the original execution was via AdbcStatementExecuteSchema, then the
+/// ArrowArrayStream populated by this function should only contain the schema 
of the
+/// result set and not any data.
+///
+/// Either partitions or out must be NULL to indicate which style of output is 
desired
+/// by the caller.  Supplying non-NULL values to both must result in
+/// ADBC_STATUS_INVALID_ARGUMENT. If the original execution was via
+/// AdbcStatementExecuteQuery and the call to AdbcStatementNextResultSet has a 
non-NULL
+/// partitions, or the original was via AdbcStatementExecutePartitions and 
this call has a
+/// non-NULL out, then the driver may choose to return the data in a different 
style than
+/// the original result set.  If it does not (or cannot) then it should return
+/// ADBC_STATUS_INVALID_ARGUMENT.
+///
+/// The driver indicates that no additional result sets are available by 
setting the
+/// release callback on out (or partitions, whichever was not-NULL) to NULL 
and returning
+/// ADBC_STATUS_OK.
+///
+/// \param[in] statement The statement to fetch a subsequent result for
+/// \param[out] out The result stream to populate, or NULL if using partitions
+/// \param[out] partitions The partitions to populate, or NULL if using out
+/// \param[out] rows_affected The number of rows affected if known, else -1. 
Pass NULL if
+///   the client does not want this information.
+/// \param[out] error An optional location to return an error message if 
necessary.
+///
+/// \return ADBC_STATUS_NOT_IMPLEMENTED if the driver does not support 
multi-result set
+///   execution, ADBC_STATUS_INVALID_STATE if called at an inappropriate time 
but the
+///   driver does support multi-result set execution, 
ADBC_STATUS_INVALID_ARGUMENT if both
+///   out and partitions are non-NULL, or if the output style is incompatible 
with the
+///   original execution, and ADBC_STATUS_OK (or an appropriate error code) 
otherwise.
+ADBC_EXPORT
+AdbcStatusCode AdbcStatementNextResultSet(struct AdbcStatement* statement,
+                                          struct ArrowArrayStream* out,
+                                          struct AdbcPartitions* partitions,
+                                          int64_t* rows_affected,
+                                          struct AdbcError* error);

Review Comment:
   > Maybe we can specify that the query must be `NULL` in those cases?
   
   Does that mean the user has to explicitly set it back to `NULL`? Or that the 
statement should have internally set it to `NULL` when it was executed? That's 
kind of annoying if you're trying to re-use the same `Statement` instance to 
execute the same query multiple times if you have to re-set the query to 
execute it again. Some users might expect that they need to hold on to the 
`Statement` to execute it without paying the cost of preparing it again. They 
might even try to cache `Statement`s that they've already prepared alongside 
the parent connection.
   
   ---
   
   Including the partitions API makes this method weirdly modal, I would say if 
you wanted to lean into the partitions API then you should commit to it and 
just return a partition from here. Or just split it into two separate methods.
   
   But it's already hard to understand from the API design how you're supposed 
to use partitions, much less how you're supposed to implement them in the 
driver. Having to go _back_ to the connection to read the partition doesn't 
make a ton of sense to me, unless it's a separate query or protocol command to 
read from it, like fetching from a named cursor.
   
   ---
   
   I suggested privately that there should be a new `ExecuteMulti` method that 
explicitly returns a multi-result-set stream. That encodes the user's 
expectation up-front that there will be multiple result sets from the query, 
and doesn't require `Statement` itself to become any more stateful than it is 
now. 
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] feat: Spec multi-result-set API [arrow-adbc]

Reply via email to