This is an automated email from the ASF dual-hosted git repository.
lidavidm pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new 1b634e7d27 GH-37061: [Docs][Format] Clarify semantics of GetSchema in FSQL (#38549)
1b634e7d27 is described below
commit 1b634e7d274cf42089d1ab237905a550de36c260
Author: David Li <[email protected]>
AuthorDate: Thu Dec 7 08:35:34 2023 -0500
GH-37061: [Docs][Format] Clarify semantics of GetSchema in FSQL (#38549)
### Rationale for this change
Schemas of result sets and bind parameters are ambiguous in a few cases
when they interact.
### What changes are included in this PR?
Add documentation clarifying the expected behavior.
### Are these changes tested?
N/A
### Are there any user-facing changes?
No
* Closes: #37061
Authored-by: David Li <[email protected]>
Signed-off-by: David Li <[email protected]>
---
docs/source/format/FlightSql.rst | 21 +++++++++++++++++++++
format/FlightSql.proto | 10 ++++++++--
2 files changed, 29 insertions(+), 2 deletions(-)
diff --git a/docs/source/format/FlightSql.rst b/docs/source/format/FlightSql.rst
index f7521c3876..add044c2d3 100644
--- a/docs/source/format/FlightSql.rst
+++ b/docs/source/format/FlightSql.rst
@@ -120,6 +120,23 @@ the ``type`` should be ``ClosePreparedStatement``).
``ActionCreatePreparedStatementRequest``
Create a new prepared statement for a SQL query.
+ The response will contain an opaque handle used to identify the
+ prepared statement. It may also contain two optional schemas: the
+ Arrow schema of the result set, and the Arrow schema of the bind
+ parameters (if any). Because the schema of the result set may
+ depend on the bind parameters, the schemas may not necessarily be
+ provided here as a result, or if provided, they may not be accurate.
+ Clients should not assume the schema provided here will be the
+ schema of any data actually returned by executing the prepared
+ statement.
+
+ Some statements may have bind parameters without any specific type.
+ (As a trivial example for SQL, consider ``SELECT ?``.) It is
+ not currently specified how this should be handled in the bind
+ parameter schema above. We suggest either using a union type to
+ enumerate the possible types, or using the NA (null) type as a
+ wildcard/placeholder.
+
``CommandPreparedStatementQuery``
Execute a previously created prepared statement and get the results.
@@ -128,6 +145,10 @@ the ``type`` should be ``ClosePreparedStatement``).
When used with GetFlightInfo: execute the prepared statement. The
prepared statement can be reused after fetching results.
+ When used with GetSchema: get the expected Arrow schema of the
+ result set. If the client has bound parameter values with DoPut
+ previously, the server should take those values into account.
+
``CommandPreparedStatementUpdate``
Execute a previously created prepared statement that does not
return results.
diff --git a/format/FlightSql.proto b/format/FlightSql.proto
index 9b5968e530..581cf1f76d 100644
--- a/format/FlightSql.proto
+++ b/format/FlightSql.proto
@@ -1537,11 +1537,14 @@ message ActionCreatePreparedStatementResult {
bytes prepared_statement_handle = 1;
// If a result set generating query was provided, dataset_schema contains the
- // schema of the dataset as described in Schema.fbs::Schema, it is serialized as an IPC message.
+ // schema of the result set. It should be an IPC-encapsulated Schema, as described in Schema.fbs.
+ // For some queries, the schema of the results may depend on the schema of the parameters. The server
+ // should provide its best guess as to the schema at this point. Clients must not assume that this
+ // schema, if provided, will be accurate.
bytes dataset_schema = 2;
// If the query provided contained parameters, parameter_schema contains the
- // schema of the expected parameters as described in Schema.fbs::Schema, it is serialized as an IPC message.
+ // schema of the expected parameters. It should be an IPC-encapsulated Schema, as described in Schema.fbs.
bytes parameter_schema = 3;
}
@@ -1743,6 +1746,9 @@ message TicketStatementQuery {
* - ARROW:FLIGHT:SQL:IS_CASE_SENSITIVE - "1" indicates if the column is case-sensitive, "0" otherwise.
* - ARROW:FLIGHT:SQL:IS_READ_ONLY - "1" indicates if the column is read only, "0" otherwise.
* - ARROW:FLIGHT:SQL:IS_SEARCHABLE - "1" indicates if the column is searchable via WHERE clause, "0" otherwise.
+ *
+ * If the schema is retrieved after parameter values have been bound with DoPut, then the server should account
+ * for the parameters when determining the schema.
* - DoPut: bind parameter values. All of the bound parameter sets will be executed as a single atomic execution.
* - GetFlightInfo: execute the prepared statement instance.
*/
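
Illustrative note (not part of the commit): the behavior documented above can be sketched with a toy, in-memory model of a prepared ``SELECT ?``. Nothing here is a Flight SQL or Arrow API. The class ToyPreparedStatement, its methods, the handle value, and the string type names are all hypothetical stand-ins, and the model assumes the server chooses the NA (null) placeholder rather than a union type. It only shows why a client should treat the schema returned at creation time as a best guess and ask again with GetSchema after binding parameters with DoPut.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical stand-in for Arrow's NA (null) type used as a placeholder.
NULL = "null"

Schema = List[Tuple[str, str]]  # toy schema: (column name, type name) pairs


@dataclass
class ToyPreparedStatement:
    """Toy model of a server-side prepared `SELECT ?` (not a real API)."""

    bound_types: Optional[List[str]] = None  # filled in by do_put()

    def create_result(self) -> dict:
        # Nothing is bound yet, so both schemas are only a best guess.
        return {
            "handle": b"stmt-1",  # opaque handle (made-up value)
            "dataset_schema": [("col0", NULL)],
            "parameter_schema": [("p0", NULL)],
        }

    def do_put(self, values: list) -> None:
        # Bind parameter values and remember their (toy) type names.
        self.bound_types = [type(v).__name__ for v in values]

    def get_schema(self) -> Schema:
        # Take previously bound parameter values into account, if any.
        if self.bound_types is None:
            return [("col0", NULL)]
        return [(f"col{i}", t) for i, t in enumerate(self.bound_types)]


stmt = ToyPreparedStatement()
created = stmt.create_result()        # best-guess schemas only
stmt.do_put([42])                     # bind one integer parameter
schema_after_bind = stmt.get_schema() # refined using the bound value
```

In this model the creation-time result set schema reports the placeholder type, while the schema fetched after binding reflects the bound value's type, which is the distinction the documentation change asks clients not to overlook.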