[ https://issues.apache.org/jira/browse/HIVE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Minder reassigned HIVE-19996:
-----------------------------------
Assignee: Kevin Minder
> Beeline performance poor with drivers having slow
> DatabaseMetaData.getPrimaryKeys impl
> --------------------------------------------------------------------------------------
>
> Key: HIVE-19996
> URL: https://issues.apache.org/jira/browse/HIVE-19996
> Project: Hive
> Issue Type: Improvement
> Components: Beeline
> Affects Versions: 1.2.1
> Environment: Issue detected using Beeline with HBase Phoenix thin
> driver and a result set with many columns.
> Reporter: Kevin Minder
> Assignee: Kevin Minder
> Priority: Major
> Attachments: HIVE-19996.1.patch
>
>
> Beeline performance is poor with the table output format when two
> conditions hold for the same result set:
> # The result set has a large number of columns.
> # The driver being used has a slow implementation of
> DatabaseMetaData.getPrimaryKeys.
> For example, testing has shown that for a query returning ~100 columns with
> the HBase Phoenix thin driver, the execution time can be cut from ~30 seconds
> to ~2 seconds by using CSV output format instead of table output format.
> The query used was:
> {{select * from system.catalog;}}
> This is due to how primary keys are detected. Currently the Rows
> implementation makes a metadata call for every column to determine whether
> it is a primary key for display purposes. I propose optimizing this so that
> a metadata call is made only once for each unique table among the result
> set's columns.
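
The proposal above amounts to caching primary-key lookups per table. The following is only a minimal sketch of that idea, not the code in HIVE-19996.1.patch: the class PrimaryKeyCache, its methods, and the use of a single string as the table key are all hypothetical names chosen for illustration. The only real API used is the standard JDBC call DatabaseMetaData.getPrimaryKeys(catalog, schema, table), whose result rows carry a COLUMN_NAME column.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper for illustration; not taken from the Beeline source.
final class PrimaryKeyCache {
    // Fetches the primary-key column names for one table (the slow call).
    private final Function<String, Set<String>> loader;
    // Primary-key column names already fetched, keyed by table name.
    private final Map<String, Set<String>> keysByTable = new HashMap<>();
    // Number of times the loader was invoked, kept to show the saving.
    private int lookups = 0;

    PrimaryKeyCache(Function<String, Set<String>> loader) {
        this.loader = loader;
    }

    // Invokes the loader at most once per distinct table, however many
    // columns of that table appear in the result set.
    boolean isPrimaryKey(String table, String column) {
        Set<String> keys = keysByTable.get(table);
        if (keys == null) {
            lookups++;
            keys = loader.apply(table);
            keysByTable.put(table, keys);
        }
        return keys.contains(column);
    }

    int lookupCount() {
        return lookups;
    }

    // One possible loader body, using the standard JDBC metadata call.
    // A caller would wrap this in a lambda and handle SQLException there.
    static Set<String> loadFromJdbc(DatabaseMetaData meta, String catalog,
                                    String schema, String table) throws SQLException {
        Set<String> keys = new HashSet<>();
        try (ResultSet rs = meta.getPrimaryKeys(catalog, schema, table)) {
            while (rs.next()) {
                keys.add(rs.getString("COLUMN_NAME"));
            }
        }
        return keys;
    }
}
```

With a cache of this shape, a result set of ~100 columns drawn from one table would trigger one metadata call rather than ~100. A real implementation would likely key the cache on catalog, schema, and table together rather than on the table name alone.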