(Referring to the CommandGetXdbcTypeInfo message):
The intent of the SQL_DATA_TYPE field was to hold source-specific data type 
codes rather than the usual external-facing SQL types reported in ODBC and JDBC 
API calls.

A developer writing code for a specific database but using a general API might 
be able to take advantage of these (for example, to identify types that are 
reported through ODBC as LONGVARCHAR but are internally XML, JSON, etc.).

The data_type field is the equivalent of ODBC’s SQL type and JDBC’s java.sql.Types.

Application types (i.e., C types in ODBC) aren’t really modelled in metadata, 
since data always comes back as Arrow (unlike ODBC and JDBC, where you can get 
a cell as whatever the ODBC/JDBC conversion tables allow).
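
To illustrate the distinction between the two fields, here is a minimal Python 
sketch of how a client might read them together. The generic codes are a subset 
of the standard java.sql.Types / ODBC SQL type values. The vendor codes (9001, 
9002) and the helper name describe_type are hypothetical, invented only for 
this example; a real source would define its own codes. Neither field is an 
Arrow type ID.

```python
# Subset of the standard java.sql.Types / ODBC SQL type codes
# (what data_type is described as carrying above).
SQL_TYPE_NAMES = {
    -7: "BIT", -6: "TINYINT", -5: "BIGINT",
    -4: "LONGVARBINARY", -3: "VARBINARY", -2: "BINARY", -1: "LONGVARCHAR",
    1: "CHAR", 2: "NUMERIC", 3: "DECIMAL", 4: "INTEGER", 5: "SMALLINT",
    6: "FLOAT", 7: "REAL", 8: "DOUBLE", 12: "VARCHAR",
    91: "DATE", 92: "TIME", 93: "TIMESTAMP",
}

# HYPOTHETICAL source-specific codes for one imaginary database
# (what sql_data_type was intended to carry). These values are made up.
VENDOR_TYPE_NAMES = {9001: "XML", 9002: "JSON"}


def describe_type(data_type, sql_data_type=None):
    """Return a readable name for a type-info row.

    data_type is looked up as a generic SQL type code; sql_data_type, if it
    matches a known vendor code, refines the description.
    """
    generic = SQL_TYPE_NAMES.get(data_type, f"UNKNOWN({data_type})")
    vendor = VENDOR_TYPE_NAMES.get(sql_data_type)
    if vendor is not None:
        return f"{generic} (internally {vendor})"
    return generic


if __name__ == "__main__":
    # A column reported generically as LONGVARCHAR that is XML internally.
    print(describe_type(-1, 9001))
    # A plain VARCHAR column with no source-specific refinement.
    print(describe_type(12))
```

A generic client can ignore the second argument entirely; only code written 
for one specific database would consult the source-specific code.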

From: Curt Hagenlocher <c...@hagenlocher.org>
Date: Thursday, January 11, 2024 at 11:04 AM
To: dev@arrow.apache.org <dev@arrow.apache.org>
Cc: James Duong <james.du...@improving.com>
Subject: Re: ADBC: xdbc_data_type and xdbc_sql_data_type

Interestingly, the description of sql_data_type in FlightSql.proto includes 
"The value of the SQL DATA TYPE which has the same values as data_type value."



On Thu, Jan 11, 2024 at 10:06 AM David Li <lidav...@apache.org> wrote:
Those values are inherited from Flight SQL [1] which effectively borrowed types 
from JDBC/ODBC.

xdbc_sql_data_type [2] is defined by an enum [3]. This is the database's type 
in its SQL dialect, not the Arrow type. Arrow types are always represented in 
Arrow schemas. (This field is a little contradictory to JDBC, which specifies 
sql_data_type is unused/reserved.)

xdbc_data_type [4] is ill-defined, I think. James Duong, do you have a 
clarification about Dremio's original intent here? In JDBC this is a 
java.sql.Types value, but it is not explained in Flight SQL. In fact, it seems 
the proto interchanged the definitions of the two fields, since the enum above 
is java.sql.Types.


[1]: https://github.com/apache/arrow-adbc/blob/6b73e529ced2f057aa463e7599c6e1227104b025/adbc.h#L1520-L1522
[2]: https://github.com/apache/arrow/blob/2b4a70320232647f730b19d2fea5746c3baec752/format/FlightSql.proto#L1098-L1102
[3]: https://github.com/apache/arrow/blob/2b4a70320232647f730b19d2fea5746c3baec752/format/FlightSql.proto#L944-L973
[4]: https://github.com/apache/arrow/blob/2b4a70320232647f730b19d2fea5746c3baec752/format/FlightSql.proto#L1067

On Thu, Jan 11, 2024, at 12:37, David Coe wrote:
> I recently raised apache/arrow issue #39568,
> "csharp/src/Apache.Arrow/Types/ArrowType: There are different type IDs
> for values after 21, including Decimal128 and Decimal256, than for
> Python" <https://github.com/apache/arrow/issues/39568>,
> because I have a downstream system that is interpreting the
> XDBC_DATA_TYPE<https://github.com/apache/arrow-adbc/blob/6b73e529ced2f057aa463e7599c6e1227104b025/adbc.h#L1501>
> as the ArrowTypeId and those are different values in different
> languages.
>
> For ADBC, what is the intended distinction between xdbc_data_type and
> xdbc_sql_data_type? Is the xdbc_data_type intended to mimic the C types
> in ODBC? Or is there a different interpretation? And if there are docs
> I don't seem to be finding, please refer me to those.
>
> Thanks,
>
> - David
