Andy Grove created ARROW-7744:
---------------------------------
Summary: [Java] Implement Flight JDBC Driver
Key: ARROW-7744
URL: https://issues.apache.org/jira/browse/ARROW-7744
Project: Apache Arrow
Issue Type: New Feature
Components: Java
Reporter: Andy Grove
Assignee: Andy Grove
Fix For: 1.0.0
As a Java developer, I would like to be able to use JDBC to interact with
Flight servers. For example, the Arrow repo now includes an example Flight
server that wraps DataFusion and supports executing SQL against CSV and
Parquet files. I would like to be able to call this from Java.
An Arrow Flight JDBC driver would also simplify developing integrations
with other Apache projects, such as building a Spark V2 Data Source or a Drill
storage plugin. It would also be directly usable from many BI tools.
I propose that the class name of the driver should be
"org.apache.arrow.jdbc.Driver" and the connection string should be
"jdbc:arrow://host:port?[properties]". I'm purposely leaving "flight" out of
both. Now that we have Flight, I don't think it makes sense to support
multiple protocols, and "arrow" is easier for users to remember than the name
of the protocol. This is easy to change if there are objections.
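To make the proposed connection string concrete, here is a minimal sketch of
how a driver might recognize and parse it. The class name ArrowJdbcUrl, its
fields, and the "key=value&key=value" reading of "[properties]" are
assumptions made for illustration; none of them is defined by this issue or
by Arrow.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for the proposed "jdbc:arrow://host:port?[properties]" form. */
final class ArrowJdbcUrl {
    static final String PREFIX = "jdbc:arrow://";

    final String host;
    final int port;
    final Map<String, String> properties;

    private ArrowJdbcUrl(String host, int port, Map<String, String> properties) {
        this.host = host;
        this.port = port;
        this.properties = properties;
    }

    /** Mirrors the java.sql.Driver.acceptsURL check: true only for URLs with our prefix. */
    static boolean accepts(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    static ArrowJdbcUrl parse(String url) {
        if (!accepts(url)) {
            throw new IllegalArgumentException("Not an Arrow JDBC URL: " + url);
        }
        // Drop the leading "jdbc:" so java.net.URI sees "arrow://host:port?k=v".
        URI uri = URI.create(url.substring("jdbc:".length()));
        if (uri.getHost() == null || uri.getPort() < 0) {
            throw new IllegalArgumentException("Expected host:port in: " + url);
        }
        // Assumption: properties are '&'-separated key=value pairs.
        // Values are kept as written; no percent-decoding is applied.
        Map<String, String> props = new LinkedHashMap<>();
        String query = uri.getRawQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                if (pair.isEmpty()) {
                    continue;
                }
                int eq = pair.indexOf('=');
                String key = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                props.put(key, value);
            }
        }
        return new ArrowJdbcUrl(uri.getHost(), uri.getPort(), props);
    }
}
```

With this sketch, ArrowJdbcUrl.parse("jdbc:arrow://localhost:50051?user=andy")
yields host "localhost", port 50051, and a single property "user". A real
driver would perform the prefix check in java.sql.Driver.acceptsURL and return
null from connect() for URLs it does not recognize.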
JDBC is designed around sending queries as strings and then receiving results.
These strings could be SQL queries, JSON-encoded query plans, or something
else. The JDBC driver will not make any assumptions about the format or dialect
of these strings. Queries would be executed using the "DoGet" method.
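As a rough sketch of what "no assumptions" could mean in practice, the driver
might carry the query string unchanged as the bytes of the DoGet ticket. The
UTF-8 encoding and the QueryTicket helper below are assumptions for
illustration; this issue does not specify how the string is placed in the
ticket.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: passes the query through without inspecting it. */
final class QueryTicket {
    /** Encode the query exactly as given: no parsing, trimming, or dialect checks. */
    static byte[] encode(String query) {
        return query.getBytes(StandardCharsets.UTF_8);
    }

    /** What a Flight server could do to recover the string from the ticket bytes. */
    static String decode(byte[] ticketBytes) {
        return new String(ticketBytes, StandardCharsets.UTF_8);
    }
}
```

In the Java Flight client, bytes like these would typically be wrapped in an
org.apache.arrow.flight.Ticket and passed to FlightClient.getStream, which
issues the DoGet call. Because the helper never looks inside the string, SQL
text and a JSON-encoded plan are handled identically.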
The JDBC metadata functionality for reading schema information could possibly
use ListFlights, but I haven't looked into this part yet.
I expect this JDBC driver to serve as a base that can be extended with
functionality specific to individual Flight servers, rather than attempting
to support them all itself.