birschick-bq opened a new issue, #1906:
URL: https://github.com/apache/arrow-adbc/issues/1906

   ### What feature or improvement would you like to see?
   
   As we develop drivers for various data sources, we find that consumers of 
the driver need not only a reliable API but also reliable metadata results, so 
that consuming different drivers is less data-source-specific.
   
   While consumers can use `GetTableSchema`, it may not provide enough 
information about the data source's unique column properties. 
   
   So consumers will use `GetObjects` to get more information about the native 
metadata of the data source. However, a large amount of flexibility is 
afforded to the values in the [COLUMN_SCHEMA 
structure](https://arrow.apache.org/docs/format/ADBC/C.html).
   
   ```
   /// 3. Optional value.  Should be null if not supported by the driver.
   ///    xdbc_ values are meant to provide JDBC/ODBC-compatible metadata
   ///    in an agnostic manner.
   ```
   
   I'd like to propose a more restrictive, or at least more suggestive, 
description of the field contents so that consuming this information can be 
more portable. I believe the intention of "agnostic manner" is to use 
JDBC/ODBC values where possible or, reading into this more, values that can be 
reliably understood by the consumer of the call. The other possibility is to 
add new fields to the structure which would follow more restrictive 
specifications.
   
   Examples:
   `xdbc_type_name` should contain string values taken from either or both the 
JDBC [JDBCType 
enumeration](https://docs.oracle.com/en/java/javase/21/docs/api/java.sql/java/sql/JDBCType.html)
 or [ODBC 
identifiers](https://learn.microsoft.com/en-us/sql/odbc/reference/appendixes/sql-data-types?view=sql-server-ver16),
 collected in an ADBC-defined list of acceptable values.
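   
   To make the `xdbc_type_name` idea concrete, here is a minimal sketch of an 
allow-list check. The names `JDBC_TYPE_NAMES` and `is_portable_type_name` are 
hypothetical, ADBC defines no such list today, and the set below is only a 
subset of the JDBCType constants chosen for illustration.

```python
# Hypothetical allow-list: a subset of java.sql.JDBCType constant names.
# ADBC does not currently define such a list; this is an assumption.
JDBC_TYPE_NAMES = frozenset({
    "BIGINT", "BINARY", "BIT", "BLOB", "BOOLEAN", "CHAR", "CLOB", "DATE",
    "DECIMAL", "DOUBLE", "FLOAT", "INTEGER", "LONGVARBINARY", "LONGVARCHAR",
    "NCHAR", "NULL", "NUMERIC", "NVARCHAR", "OTHER", "REAL", "SMALLINT",
    "TIME", "TIMESTAMP", "TINYINT", "VARBINARY", "VARCHAR",
})


def is_portable_type_name(value):
    """Return True if value is null (unsupported) or an allow-listed name."""
    if value is None:
        # null means the driver does not support the field.
        return True
    # Exact, case-sensitive match against the allow-list.
    return value in JDBC_TYPE_NAMES
```

   With a check like this, a native type name such as `varchar2` would be 
flagged, while `VARCHAR` or a null value would pass.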
   
   `xdbc_data_type` and `xdbc_sql_data_type` are not clearly defined nor is 
their difference (if any). It could be that `xdbc_data_type` is defined by the 
JDBC values and `xdbc_sql_data_type` could be defined by the ODBC values.
   
   Still, carrying around these legacy values is not ideal, and we should 
likely associate an ADBC-defined value with one or both of these two fields.
   
   `xdbc_nullable` (int16) should be explicitly defined as `0` (not 
nullable), `1` (nullable), `2` (unknown), or `null` (unsupported by data 
source).
   
   `xdbc_is_nullable` should be explicitly defined as "NO", "YES", "" 
(unknown), or `null` (unsupported by data source).
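   
   As a rough sketch of how the two nullability fields could be validated 
under the definitions proposed above: the function name `check_nullability` 
is hypothetical, and the cross-field consistency rule (`0` with "NO", `1` with 
"YES", `2` with "") is an extra assumption that is not part of the current 
specification.

```python
# Proposed value sets from this issue; null (None) is always allowed and
# means "unsupported by data source".
XDBC_NULLABLE_VALUES = {0, 1, 2}            # not nullable, nullable, unknown
XDBC_IS_NULLABLE_VALUES = {"NO", "YES", ""}  # "" means unknown

# Assumed pairing between the two fields (not in the current spec).
EXPECTED_PAIR = {0: "NO", 1: "YES", 2: ""}


def check_nullability(xdbc_nullable, xdbc_is_nullable):
    """Return a list of human-readable problems; empty if values are in scope."""
    problems = []

    nullable_ok = xdbc_nullable is None or (
        isinstance(xdbc_nullable, int)
        and not isinstance(xdbc_nullable, bool)
        and xdbc_nullable in XDBC_NULLABLE_VALUES
    )
    if not nullable_ok:
        problems.append(f"xdbc_nullable out of scope: {xdbc_nullable!r}")

    is_nullable_ok = (
        xdbc_is_nullable is None or xdbc_is_nullable in XDBC_IS_NULLABLE_VALUES
    )
    if not is_nullable_ok:
        problems.append(f"xdbc_is_nullable out of scope: {xdbc_is_nullable!r}")

    # Only compare the two fields when both are present and individually valid.
    if (
        nullable_ok
        and is_nullable_ok
        and xdbc_nullable is not None
        and xdbc_is_nullable is not None
        and EXPECTED_PAIR[xdbc_nullable] != xdbc_is_nullable
    ):
        problems.append("xdbc_nullable and xdbc_is_nullable disagree")

    return problems
```

   Returning a list of problems rather than raising keeps the check usable 
both in driver tests and in consumers that only want to log a warning.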
   
   The result of this discussion should be:
   1. Improved documentation on allowable values.
   2. Tests in each driver to confirm values are in scope.
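   
   A per-driver test along these lines might walk the nested `GetObjects` 
result and report every column value that falls outside the documented set. 
The sketch below assumes the result has already been converted to plain 
nested dictionaries (for instance what `to_pylist()` on a pyarrow table would 
yield); `iter_columns`, `out_of_scope`, `ALLOWED`, and the sample data are all 
hypothetical.

```python
def iter_columns(catalogs):
    """Yield (dotted_path, column_dict) for every column in a GetObjects result."""
    for catalog in catalogs:
        for schema in catalog.get("catalog_db_schemas") or []:
            for table in schema.get("db_schema_tables") or []:
                for column in table.get("table_columns") or []:
                    path = ".".join(
                        str(part)
                        for part in (
                            catalog.get("catalog_name"),
                            schema.get("db_schema_name"),
                            table.get("table_name"),
                            column.get("column_name"),
                        )
                    )
                    yield path, column


def out_of_scope(catalogs, allowed):
    """Return (path, field, value) for each non-null value not in allowed[field]."""
    failures = []
    for path, column in iter_columns(catalogs):
        for field, values in allowed.items():
            value = column.get(field)
            # null is always accepted: it means the field is unsupported.
            if value is not None and value not in values:
                failures.append((path, field, value))
    return failures


# Proposed value sets from this issue (an assumption, not current spec).
ALLOWED = {
    "xdbc_nullable": {0, 1, 2},
    "xdbc_is_nullable": {"NO", "YES", ""},
}

# Made-up sample shaped like a GetObjects result at column depth.
SAMPLE = [{
    "catalog_name": "main",
    "catalog_db_schemas": [{
        "db_schema_name": "public",
        "db_schema_tables": [{
            "table_name": "t",
            "table_type": "TABLE",
            "table_columns": [
                {"column_name": "a", "xdbc_nullable": 1, "xdbc_is_nullable": "YES"},
                {"column_name": "b", "xdbc_nullable": None, "xdbc_is_nullable": "maybe"},
            ],
        }],
    }],
}]

failures = out_of_scope(SAMPLE, ALLOWED)
```

   A driver test could then simply assert that `failures` is empty for the 
metadata returned by a known test database.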


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
