[ 
https://issues.apache.org/jira/browse/DRILL-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Bünger updated DRILL-5216:
---------------------------------
    Summary: Make use of FetchSize in JDBC storage plugin  (was: Set FetchSize 
to Speed up Metadata retrieval for JDBC storage plugin over high latency 
connections)

> Make use of FetchSize in JDBC storage plugin
> --------------------------------------------
>
>                 Key: DRILL-5216
>                 URL: https://issues.apache.org/jira/browse/DRILL-5216
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - JDBC
>    Affects Versions: 1.9.0
>         Environment: drill-embedded on ubuntu client - connected to a remote 
> Oracle
>            Reporter: Thomas Bünger
>            Priority: Minor
>
> The metadata retrieval uses the default fetchsize for the underlying JDBC 
> driver, which in case of Oracle is only 10.
> In larger scenarios - as in mine - the Oracle cluster hosts thousands of 
> schemas and the small fetchsize results in hundres of individual roundtrips.
> In the end every Drill query against this storage takes at least a minute 
> (server is remote)
> So far, Drill is using the JDBC metadata API 
> {{java.sql.DatabaseMetaData.getSchemas()}} inside JdbcStoragePlugin.java and 
> could set an appropriate fetchsize before iterating the result set.
> I've tested this locally and improved latency a lot, but am note sure how 
> this affects other non-oracle JDBC drivers.
> The other (potentially long) query is the table enumeration.
> From what I've seen is Drill not calling the JDBC driver directly, but goes 
> through apache.calcite calling {{getTableNames()}} which under the hood calls 
> {{java.sql.DatabaseMetaData.getTables()}} and also contributes to slow 
> metadata retrieval due to small default fetch size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to