This is an automated email from the ASF dual-hosted git repository.

lidavidm pushed a commit to branch spec-1.1.0
in repository https://gitbox.apache.org/repos/asf/arrow-adbc.git

commit 581c2a1db002b0089d9cb4c380e7ea03796178f3
Author: David Li <[email protected]>
AuthorDate: Mon Jul 24 15:44:05 2023 -0400

    docs: update prose for 1.1.0 and clarify cancellation (#932)
    
    Fixes #928.
    Fixes #929.
---
 adbc.h                               |  12 ++--
 docs/source/format/specification.rst | 110 +++++++++++++++++++++++++++++++++++
 docs/source/format/versioning.rst    |  28 +++++----
 go/adbc/drivermgr/adbc.h             |  12 ++--
 4 files changed, 143 insertions(+), 19 deletions(-)

diff --git a/adbc.h b/adbc.h
index badfc7d6..f7ea69af 100644
--- a/adbc.h
+++ b/adbc.h
@@ -1292,9 +1292,11 @@ AdbcStatusCode AdbcConnectionRelease(struct AdbcConnection* connection,
 /// or while consuming an ArrowArrayStream returned from such.
 /// Calling this function should make the other functions return
 /// ADBC_STATUS_CANCELLED (from ADBC functions) or ECANCELED (from
-/// methods of ArrowArrayStream).
+/// methods of ArrowArrayStream).  (It is not guaranteed to; for
+/// instance, the result set may be buffered in memory already.)
 ///
-/// This must always be thread-safe (other operations are not).
+/// This must always be thread-safe (other operations are not).  It is
+/// not necessarily signal-safe.
 ///
 /// \since ADBC API revision 1.1.0
 /// \addtogroup adbc-1.1.0
@@ -1947,9 +1949,11 @@ AdbcStatusCode AdbcStatementBindStream(struct AdbcStatement* statement,
 /// or while consuming an ArrowArrayStream returned from such.
 /// Calling this function should make the other functions return
 /// ADBC_STATUS_CANCELLED (from ADBC functions) or ECANCELED (from
-/// methods of ArrowArrayStream).
+/// methods of ArrowArrayStream).  (It is not guaranteed to; for
+/// instance, the result set may be buffered in memory already.)
 ///
-/// This must always be thread-safe (other operations are not).
+/// This must always be thread-safe (other operations are not).  It is
+/// not necessarily signal-safe.
 ///
 /// \since ADBC API revision 1.1.0
 /// \addtogroup adbc-1.1.0
diff --git a/docs/source/format/specification.rst b/docs/source/format/specification.rst
index e7a44a41..88515e41 100644
--- a/docs/source/format/specification.rst
+++ b/docs/source/format/specification.rst
@@ -57,6 +57,26 @@ implementations will support this.
 - Go: ``OptionKeyAutoCommit``
 - Java: ``org.apache.arrow.adbc.core.AdbcConnection#setAutoCommit(boolean)``
 
+Metadata
+--------
+
+ADBC exposes a variety of metadata about the database, such as what catalogs,
+schemas, and tables exist, the Arrow schema of tables, and so on.
+
+.. _specification-statistics:
+
+Statistics
+----------
+
+.. note:: Since API revision 1.1.0
+
+ADBC exposes table/column statistics, such as the (unique) row count, min/max
+values, and so on.  The goal here is to make ADBC work better in federation
+scenarios, where one query engine wants to read Arrow data from another
+database.  Having statistics available lets the "outer" query planner make
+better choices about things like join order, or even decide to skip reading
+data entirely.
+
 Statements
 ==========
 
@@ -84,6 +104,16 @@ frees the user from knowing the right SQL syntax for their database.
 - Go: ``OptionKeyIngestTargetTable``
 - Java: ``org.apache.arrow.adbc.core.AdbcConnection#bulkIngest(String, org.apache.arrow.adbc.core.BulkIngestMode)``
 
+.. _specification-cancellation:
+
+Cancellation
+------------
+
+.. note:: Since API revision 1.1.0
+
+Queries (and operations that implicitly represent queries, like fetching
+:ref:`specification-statistics`) can be cancelled.
+
 Partitioned Result Sets
 -----------------------
 
@@ -97,6 +127,16 @@ machines.
 - Go: ``Statement.ExecutePartitions``
 - Java: ``org.apache.arrow.adbc.core.AdbcStatement#executePartitioned()``
 
+.. _specification-incremental-execution:
+
+In principle, a vendor could return the results of partitioned execution as
+they are available, instead of all at once.  Incremental execution allows
+drivers to expose this.  When enabled, each call to ``ExecutePartitions`` will
+return available endpoints to read instead of blocking to retrieve all
+endpoints.
+
+.. note:: Since API revision 1.1.0
+
 Lifecycle & Usage
 -----------------
 
@@ -135,3 +175,73 @@ Partitioned Execution
 .. mermaid:: AdbcStatementPartitioned.mmd
    :caption: This is similar to fetching data in Arrow Flight RPC (by
              design). See :doc:`"Downloading Data" <arrow:format/Flight>`.
+
+Error Handling
+==============
+
+The error handling strategy varies by language.
+
+In C, most methods take a :cpp:class:`AdbcError`.  In Go, most methods return
+an error that can be cast to an ``AdbcError``.  In Java, most methods raise an
+``AdbcException``.
+
+In all cases, an error contains:
+
+- A status code,
+- An error message,
+- An optional vendor code (a vendor-specific status code),
+- An optional 5-character "SQLSTATE" code (a SQL-like vendor-specific code).
+
+.. _specification-rich-error-metadata:
+
+Rich Error Metadata
+-------------------
+
+.. note:: Since API revision 1.1.0
+
+Drivers can expose additional rich error metadata.  This can be used to return
+structured error information.  For example, a driver could use something like
+the `Googleapis ErrorDetails`_.
+
+In C, special option values can be read after receiving an error to get error
+metadata.  In Go and Java, ``AdbcError`` and ``AdbcException`` respectively
+expose a list of additional metadata.
+
+.. _Googleapis ErrorDetails: https://github.com/googleapis/googleapis/blob/master/google/rpc/error_details.proto
+
+Changelog
+=========
+
+Version 1.1.0
+-------------
+
+The info key ADBC_INFO_DRIVER_ADBC_VERSION can be used to retrieve the
+driver's supported ADBC version.
+
+The canonical options "uri", "username", and "password" were added to make
+configuration consistent between drivers.
+
+:ref:`specification-cancellation` and the ability to both get and set options
+of different types were added.  (Previously, you could set string options but
+could not get option values or get/set values of other types.)  This can be
+used to get and set the current active catalog and/or schema through a pair of
+new canonical options.
+
+:ref:`specification-bulk-ingestion` supports two additional modes:
+
+- "adbc.ingest.mode.replace" will drop existing data, then behave like
+  "create".
+- "adbc.ingest.mode.create_append" will behave like "create", except if the
+  table already exists, it will not error.
+
+:ref:`specification-rich-error-metadata` has been added, allowing clients to
+get additional error metadata.
+
+The ability to retrieve table/column :ref:`statistics
+<specification-statistics>` was added.  The goal here is to make ADBC work
+better in federation scenarios, where one query engine wants to read Arrow
+data from another database.
+
+:ref:`Incremental execution <specification-incremental-execution>` allows
+streaming partitions of a result set as they are available instead of blocking
+and waiting for query execution to finish before reading results.
diff --git a/docs/source/format/versioning.rst b/docs/source/format/versioning.rst
index 3205b792..b255aeeb 100644
--- a/docs/source/format/versioning.rst
+++ b/docs/source/format/versioning.rst
@@ -29,14 +29,19 @@ choices were made:
 Of course, we can never add/remove/change struct members, and we can
 never change the signatures of existing functions.
 
-The main point of concern is compatibility of :cpp:class:`AdbcDriver`.
+In ADBC 1.1.0, it was decided this would only apply to the "public"
+API, and not the driver-internal API (:cpp:class:`AdbcDriver`).  New
+members were added to this struct in the 1.1.0 revision.
+Compatibility is handled as follows:
 
 The driver entrypoint, :cpp:type:`AdbcDriverInitFunc`, is given a
-version and a pointer to a table of function pointers to initialize.
-The type of the table will depend on the version; when a new version
-of ADBC is accepted, then a new table of function pointers will be
-added.  That way, the driver knows the type of the table.  If/when we
-add a new ADBC version, the following scenarios are possible:
+version and a pointer to a table of function pointers to initialize
+(the :cpp:class:`AdbcDriver`).  The size of the table will depend on
+the version; when a new version of ADBC is accepted, the table of
+function pointers may be expanded.  For each version, the driver
+knows the expected size of the table, and must not read/write fields
+beyond that size.  If/when we add a new ADBC version, the following
+scenarios are possible:
 
 - An updated client application uses an old driver library.  The
   client will pass a `version` field greater than what the driver
@@ -46,7 +51,8 @@ add a new ADBC version, the following scenarios are possible:
 - An old client application uses an updated driver library.  The
   client will pass a ``version`` lower than what the driver
   recognizes, so the driver can either error, or if it can still
-  implement the old API contract, initialize the older table.
+  implement the old API contract, initialize the subset of the table
+  corresponding to the older version.
 
 This approach does not let us change the signatures of existing
 functions, but we can add new functions and remove existing ones.
@@ -64,7 +70,7 @@ backwards-incompatible versions such as 2.0.0, but which still
 implement the API standard version 1.0.0.
 
 Similarly, this documentation describes the ADBC API standard version
-1.0.0.  If/when a compatible revision is made (e.g. new standard
-options are defined), the next version would be 1.1.0.  If
-incompatible changes are made (e.g. new API functions), the next
-version would be 2.0.0.
+1.1.0.  If/when a compatible revision is made (e.g. new standard
+options or API functions are defined), the next version would be
+1.2.0.  If incompatible changes are made (e.g. changing the signature
+or semantics of a function), the next version would be 2.0.0.
diff --git a/go/adbc/drivermgr/adbc.h b/go/adbc/drivermgr/adbc.h
index badfc7d6..f7ea69af 100644
--- a/go/adbc/drivermgr/adbc.h
+++ b/go/adbc/drivermgr/adbc.h
@@ -1292,9 +1292,11 @@ AdbcStatusCode AdbcConnectionRelease(struct AdbcConnection* connection,
 /// or while consuming an ArrowArrayStream returned from such.
 /// Calling this function should make the other functions return
 /// ADBC_STATUS_CANCELLED (from ADBC functions) or ECANCELED (from
-/// methods of ArrowArrayStream).
+/// methods of ArrowArrayStream).  (It is not guaranteed to; for
+/// instance, the result set may be buffered in memory already.)
 ///
-/// This must always be thread-safe (other operations are not).
+/// This must always be thread-safe (other operations are not).  It is
+/// not necessarily signal-safe.
 ///
 /// \since ADBC API revision 1.1.0
 /// \addtogroup adbc-1.1.0
@@ -1947,9 +1949,11 @@ AdbcStatusCode AdbcStatementBindStream(struct AdbcStatement* statement,
 /// or while consuming an ArrowArrayStream returned from such.
 /// Calling this function should make the other functions return
 /// ADBC_STATUS_CANCELLED (from ADBC functions) or ECANCELED (from
-/// methods of ArrowArrayStream).
+/// methods of ArrowArrayStream).  (It is not guaranteed to; for
+/// instance, the result set may be buffered in memory already.)
 ///
-/// This must always be thread-safe (other operations are not).
+/// This must always be thread-safe (other operations are not).  It is
+/// not necessarily signal-safe.
 ///
 /// \since ADBC API revision 1.1.0
 /// \addtogroup adbc-1.1.0
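
Illustration (not part of the commit): the cancellation comment in adbc.h
says that cancelling should make later calls return ECANCELED, but that this
is not guaranteed because results may already be buffered.  A minimal sketch
of that behavior follows.  MockStream, MockStreamCancel, and
MockStreamGetNext are hypothetical names invented for this sketch; they are
not ADBC or Arrow APIs, and a real driver may behave differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's result stream (not an ADBC type). */
struct MockStream {
  atomic_bool cancelled; /* set by MockStreamCancel, possibly from another thread */
  int buffered;          /* number of batches already held in memory */
};

/* Request cancellation.  It only stores to an atomic flag, so it is safe to
 * call while another thread is inside MockStreamGetNext. */
void MockStreamCancel(struct MockStream* stream) {
  assert(stream != NULL);
  atomic_store(&stream->cancelled, true);
}

/* Returns 0 on success, or ECANCELED once a cancelled stream would have to
 * do more work than handing out an already-buffered batch. */
int MockStreamGetNext(struct MockStream* stream) {
  assert(stream != NULL);
  if (stream->buffered > 0) {
    stream->buffered--; /* served from memory; the cancel is not observed here */
    return 0;
  }
  if (atomic_load(&stream->cancelled)) {
    return ECANCELED;
  }
  return 0; /* a real driver would fetch the next batch from the database here */
}
```

With one buffered batch and a cancel already requested, the first
MockStreamGetNext call still succeeds and only the second returns ECANCELED,
which is the "not guaranteed" case the comment describes.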

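Illustration (not part of the commit): the versioning.rst change says the
driver knows the expected table size for each version and must not read or
write fields beyond it.  The sketch below shows one way an init function
could honor that rule.  Every name here (MockDriver, MockDriverInit, the
MOCK_* constants) is hypothetical and chosen for this sketch; the real
definitions live in adbc.h and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical version and status constants for this sketch only. */
#define MOCK_VERSION_1_0_0 1000000
#define MOCK_VERSION_1_1_0 1001000
#define MOCK_STATUS_OK 0
#define MOCK_STATUS_NOT_IMPLEMENTED 2

/* Hypothetical driver table: the 1.1.0 member is appended after the 1.0.0
 * members, so an old caller's smaller allocation is a prefix of the new one. */
struct MockDriver {
  int (*Execute)(void); /* present since 1.0.0 */
  int (*Cancel)(void);  /* added in 1.1.0 */
};

/* Size of the table as seen by a caller of each version. */
#define MOCK_DRIVER_1_0_0_SIZE (offsetof(struct MockDriver, Cancel))
#define MOCK_DRIVER_1_1_0_SIZE (sizeof(struct MockDriver))

static int MockExecute(void) { return MOCK_STATUS_OK; }
static int MockCancel(void) { return MOCK_STATUS_OK; }

/* Initialize only the part of the table that the requested version covers. */
int MockDriverInit(int version, void* raw_driver) {
  assert(raw_driver != NULL);
  if (version != MOCK_VERSION_1_0_0 && version != MOCK_VERSION_1_1_0) {
    return MOCK_STATUS_NOT_IMPLEMENTED; /* unknown version: refuse */
  }
  struct MockDriver* driver = (struct MockDriver*)raw_driver;
  /* The 1.0.0 prefix exists for every supported version. */
  memset(driver, 0, MOCK_DRIVER_1_0_0_SIZE);
  driver->Execute = MockExecute;
  if (version >= MOCK_VERSION_1_1_0) {
    /* Only a 1.1.0 caller allocated room for this field. */
    driver->Cancel = MockCancel;
  }
  return MOCK_STATUS_OK;
}
```

An old client passing MOCK_VERSION_1_0_0 gets only the prefix filled in and
the Cancel slot is never touched; a client passing a version the driver does
not recognize gets an error instead of a partially initialized table.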