[jira] [Commented] (DRILL-5972) Slow performance for query on INFORMATION_SCHEMA.TABLE

ASF GitHub Bot (JIRA) Tue, 28 Nov 2017 16:14:28 -0800

    [ https://issues.apache.org/jira/browse/DRILL-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269783#comment-16269783 ]


ASF GitHub Bot commented on DRILL-5972:
---------------------------------------

Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1038#discussion_r153661820
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/ischema/InfoSchemaFilter.java ---
    @@ -202,14 +202,19 @@ private Result evaluateHelperFunction(Map<String, String> recordValues, Function
             // If at least one arg returns FALSE, then the AND function value is FALSE
             // If at least one arg returns INCONCLUSIVE, then the AND function value is INCONCLUSIVE
             // If all args return TRUE, then the AND function value is TRUE
    +        Result result = Result.TRUE;
    +
             for(ExprNode arg : exprNode.args) {
               Result exprResult = evaluateHelper(recordValues, arg);
    -          if (exprResult != Result.TRUE) {
    +          if (exprResult == Result.FALSE) {
                 return exprResult;
               }
    +          if (exprResult == Result.INCONCLUSIVE) {
    --- End diff --
    
    @parthchandra yes, that is correct. 


> Slow performance for query on INFORMATION_SCHEMA.TABLE
> ------------------------------------------------------
>
>                 Key: DRILL-5972
>                 URL: https://issues.apache.org/jira/browse/DRILL-5972
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Information Schema
>    Affects Versions: 1.11.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>             Fix For: 1.13.0
>
>
> A query like the following on INFORMATION_SCHEMA takes a long time to
> execute:
> select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
> INFORMATION_SCHEMA.`TABLES` WHERE TABLE_NAME LIKE '%' AND ( TABLE_SCHEMA = 
> 'hive.default' ) ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, 
> TABLE_NAME; 
> The reason is that we fetch table information for all schemas instead of
> just the 'hive.default' schema.
> If we change the predicate as follows, the query executes very fast:
> select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
> INFORMATION_SCHEMA.`TABLES` WHERE  ( TABLE_SCHEMA = 'hive.default' ) AND 
> TABLE_NAME LIKE '%'  ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, 
> TABLE_NAME; 
> The difference is the order in which we evaluate the expressions in the
> predicate.
> In the first case, we evaluate TABLE_NAME LIKE '%' first and decide that it
> is inconclusive (since we do not know the schema). So we fetch all tables
> for all the schemas.
> In the second case, we evaluate TABLE_SCHEMA = 'hive.default' first and
> decide that we need to fetch tables only for that schema.
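
The order dependence described above can be illustrated with a small,
self-contained sketch of a three-valued AND. This is not the actual
InfoSchemaFilter code: the class ThreeValuedAnd, the Predicate interface, the
helpers columnEquals and matchesAnything, and the schema name 'dfs.tmp' are
all hypothetical and exist only for illustration. The sketch assumes that a
predicate on a column whose value is not yet known evaluates to INCONCLUSIVE,
and it follows the rule from the diff comments: any FALSE makes the AND FALSE,
otherwise any INCONCLUSIVE makes it INCONCLUSIVE, otherwise it is TRUE.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the actual InfoSchemaFilter implementation.
final class ThreeValuedAnd {
  enum Result { TRUE, FALSE, INCONCLUSIVE }

  // Hypothetical predicate type: evaluated against the column values known so far.
  interface Predicate {
    Result eval(Map<String, String> recordValues);
  }

  // AND whose result does not depend on argument order:
  // any FALSE wins immediately, INCONCLUSIVE is remembered, otherwise TRUE.
  static Result and(List<Predicate> args, Map<String, String> recordValues) {
    Result result = Result.TRUE;
    for (Predicate arg : args) {
      Result r = arg.eval(recordValues);
      if (r == Result.FALSE) {
        return Result.FALSE;
      }
      if (r == Result.INCONCLUSIVE) {
        // Keep scanning: a later argument may still evaluate to FALSE.
        result = Result.INCONCLUSIVE;
      }
    }
    return result;
  }

  // Hypothetical helper: equality test, INCONCLUSIVE while the column is unknown.
  static Predicate columnEquals(String column, String expected) {
    return values -> {
      String actual = values.get(column);
      if (actual == null) {
        return Result.INCONCLUSIVE;
      }
      return actual.equals(expected) ? Result.TRUE : Result.FALSE;
    };
  }

  // Hypothetical stand-in for TABLE_NAME LIKE '%': TRUE once the column is known.
  static Predicate matchesAnything(String column) {
    return values -> values.containsKey(column) ? Result.TRUE : Result.INCONCLUSIVE;
  }

  public static void main(String[] unused) {
    Predicate name = matchesAnything("TABLE_NAME");
    Predicate schema = columnEquals("TABLE_SCHEMA", "hive.default");
    // Only the schema is known at this point; 'dfs.tmp' is a made-up schema name.
    Map<String, String> otherSchema = Collections.singletonMap("TABLE_SCHEMA", "dfs.tmp");
    System.out.println(and(Arrays.asList(name, schema), otherSchema)); // FALSE
    System.out.println(and(Arrays.asList(schema, name), otherSchema)); // FALSE
  }
}
```

With an early return on the first non-TRUE argument, the first ordering would
report INCONCLUSIVE for 'dfs.tmp' and the schema could not be skipped. Under
the assumptions above, scanning all arguments for a FALSE before settling on
INCONCLUSIVE gives the same answer for both orderings.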



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
