[
https://issues.apache.org/jira/browse/ARROW-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123484#comment-17123484
]
Uwe Korn commented on ARROW-4144:
---------------------------------
Yes, the use case would be writing large {{pandas.DataFrame}} objects to a
database layer that only has performant JDBC drivers. Personally, all my JDBC
sources are read-only, so I didn't write a WriteToJDBC function, but other
people will use these technologies with broader access rights. I have used the
"pyarrow -> Arrow Java -> JDBC" path successfully with Apache Drill and Denodo.
I have also heard that some people use it with Amazon Athena, where a
performant INSERT might be interesting
([https://docs.aws.amazon.com/athena/latest/ug/insert-into.html]), as the JDBC
driver currently seems to be the most performant option there.
> [Java] Arrow-to-JDBC
> --------------------
>
> Key: ARROW-4144
> URL: https://issues.apache.org/jira/browse/ARROW-4144
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Java
> Reporter: Michael Pigott
> Assignee: Chen
> Priority: Major
>
> ARROW-1780 reads a query from a JDBC data source and converts the ResultSet
> to an Arrow VectorSchemaRoot. However, there is no built-in adapter for
> writing an Arrow VectorSchemaRoot back to the database.
> ARROW-3966 adds JDBC field metadata:
> * The Catalog Name
> * The Table Name
> * The Field Name
> * The Field Type
> We can use this information to ask for the field information from the
> database via the
> [DatabaseMetaData|https://docs.oracle.com/javase/7/docs/api/java/sql/DatabaseMetaData.html]
> object. We can then create INSERT or UPDATE statements based on the [list
> of primary
> keys|https://docs.oracle.com/javase/7/docs/api/java/sql/DatabaseMetaData.html#getPrimaryKeys(java.lang.String,%20java.lang.String,%20java.lang.String)]
> in the table:
> * If the value in the VectorSchemaRoot corresponding to the primary key is
> NULL, insert that record into the database.
> * If the value in the VectorSchemaRoot corresponding to the primary key is
> not NULL, update the existing record in the database.
> We can also perform the same data conversion in reverse based on the field
> types queried from the database.
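
To make the INSERT-or-UPDATE rule from the quoted description concrete, here is a minimal sketch of how the statement text could be chosen per record. It is not part of Arrow or of ARROW-4144: the class name UpsertSqlSketch, its method names, and the use of a plain Map in place of a VectorSchemaRoot row are all assumptions made for illustration. The only real API used is DatabaseMetaData.getPrimaryKeys, whose result set exposes a COLUMN_NAME column.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not an Arrow API: every name here is illustrative.
final class UpsertSqlSketch {

    // Reads the primary-key column names through the standard JDBC metadata call.
    static List<String> primaryKeys(DatabaseMetaData meta, String catalog,
                                    String schema, String table) throws SQLException {
        List<String> keys = new ArrayList<>();
        try (ResultSet rs = meta.getPrimaryKeys(catalog, schema, table)) {
            while (rs.next()) {
                keys.add(rs.getString("COLUMN_NAME"));
            }
        }
        return keys;
    }

    // Parameterized INSERT covering every column, in the given order.
    static String insertSql(String table, List<String> columns) {
        String names = String.join(", ", columns);
        String marks = columns.stream().map(c -> "?").collect(Collectors.joining(", "));
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + marks + ")";
    }

    // Parameterized UPDATE: non-key columns go in SET, key columns in WHERE.
    static String updateSql(String table, List<String> columns, List<String> keys) {
        String set = columns.stream().filter(c -> !keys.contains(c))
                .map(c -> c + " = ?").collect(Collectors.joining(", "));
        String where = keys.stream().map(k -> k + " = ?")
                .collect(Collectors.joining(" AND "));
        return "UPDATE " + table + " SET " + set + " WHERE " + where;
    }

    // Rule from the description: a NULL primary key means INSERT, otherwise UPDATE.
    // Assumption: a table without primary keys always gets an INSERT.
    static String statementFor(String table, Map<String, Object> row, List<String> keys) {
        List<String> columns = new ArrayList<>(row.keySet());
        boolean insert = keys.isEmpty() || keys.stream().anyMatch(k -> row.get(k) == null);
        return insert ? insertSql(table, columns) : updateSql(table, columns, keys);
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", null);      // NULL key: the record is new
        row.put("name", "arrow");
        System.out.println(statementFor("t", row, Arrays.asList("id")));
        row.put("id", 7);         // non-NULL key: the record already exists
        System.out.println(statementFor("t", row, Arrays.asList("id")));
    }
}
```

In a real adapter the statements would presumably be prepared once per batch and the values bound from the Arrow vectors. The sketch only covers the decision and the SQL text, and it does not quote identifiers or handle dialect differences.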