[jira] [Updated] (DRILL-5972) Slow performance for query on INFORMATION_SCHEMA.TABLE

Padma Penumarthy (JIRA) Wed, 15 Nov 2017 22:06:17 -0800

     [ 
https://issues.apache.org/jira/browse/DRILL-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Padma Penumarthy updated DRILL-5972:
------------------------------------
    Description: 
A query like the following on INFORMATION_SCHEMA takes a long time to execute. 

select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
INFORMATION_SCHEMA.`TABLES` WHERE TABLE_NAME LIKE '%' AND ( TABLE_SCHEMA = 
'hive.default' ) ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME; 

Reason being we fetch table information for all schemas instead of just 
'hive.default' schema.

If we  change the predicate like this, it executes very fast.

select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
INFORMATION_SCHEMA.`TABLES` WHERE  ( TABLE_SCHEMA = 'hive.default' ) AND 
TABLE_NAME LIKE '%'  ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, 
TABLE_NAME; 

The difference is in the order in which we evaluate the expressions in the 
predicate.
In the first case,  we first evaluate TABLE_NAME LIKE '%' and decide that it is 
inconclusive (since we do not know the schema). So, we go get all tables for 
all the schemas.

In the second case, we first evaluate  TABLE_SCHEMA = 'hive.default' and decide 
that we need to fetch only tables for that schema.



  was:
A query like the following on INFORMATION_SCHEMA takes a long time to execute. 

select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
INFORMATION_SCHEMA.`TABLES` WHERE TABLE_NAME LIKE '%' AND ( TABLE_SCHEMA = 
'hive.default' ) ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME; 

Reason being we fetch table information for all schemas instead of just 
'hive.default' schema.

If we  change the predicate like this, it executes very fast.

select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
INFORMATION_SCHEMA.`TABLES` WHERE  ( TABLE_SCHEMA = 'hive.default' ) AND 
TABLE_NAME LIKE '%'  ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, 
TABLE_NAME; 

The difference is in the order in which we evaluate the expressions in the 
predicate.
In the first case,  we first evaluate TABLE_NAME LIKE '%' and decide that it is 
inconclusive (since we do not know the schema). So, we go get all tables for 
all the schemas.

In the second case, we first evaluate  TABLE_SCHEMA = 'hive.default' and decide 
that we need to fetch only tables for that schema.



> Slow performance for query on INFORMATION_SCHEMA.TABLE
> ------------------------------------------------------
>
>                 Key: DRILL-5972
>                 URL: https://issues.apache.org/jira/browse/DRILL-5972
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Information Schema
>    Affects Versions: 1.11.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>             Fix For: 1.12.0
>
>
> A query like the following on INFORMATION_SCHEMA takes a long time to 
> execute. 
> select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
> INFORMATION_SCHEMA.`TABLES` WHERE TABLE_NAME LIKE '%' AND ( TABLE_SCHEMA = 
> 'hive.default' ) ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, 
> TABLE_NAME; 
> Reason being we fetch table information for all schemas instead of just 
> 'hive.default' schema.
> If we  change the predicate like this, it executes very fast.
> select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
> INFORMATION_SCHEMA.`TABLES` WHERE  ( TABLE_SCHEMA = 'hive.default' ) AND 
> TABLE_NAME LIKE '%'  ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, 
> TABLE_NAME; 
> The difference is in the order in which we evaluate the expressions in the 
> predicate.
> In the first case,  we first evaluate TABLE_NAME LIKE '%' and decide that it 
> is inconclusive (since we do not know the schema). So, we go get all tables 
> for all the schemas.
> In the second case, we first evaluate  TABLE_SCHEMA = 'hive.default' and 
> decide that we need to fetch only tables for that schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Updated] (DRILL-5972) Slow performance for query on INFORMATION_SCHEMA.TABLE

Reply via email to