[jira] [Closed] (DRILL-4127) HiveSchema.getSubSchema() should use lazy loading of all the table names

Dechang Gu (JIRA) Thu, 21 Jul 2016 17:21:43 -0700

     [ https://issues.apache.org/jira/browse/DRILL-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Dechang Gu closed DRILL-4127.
-----------------------------

Verified with the perf test framework.
Without the patch (commit id: 539cbba):

91_539cbba_HIVE_20160720_113024/HIVE_limit1_02/HIVE_limit1_02.log:[STAT] TOTAL TIME : 126599 msec
91_539cbba_HIVE_20160720_113024/HIVE_limit1_02/HIVE_limit1_02.log:[STAT] TOTAL TIME : 165969 msec
91_539cbba_HIVE_20160720_113024/HIVE_limit1_02/HIVE_limit1_02.log:[STAT] TOTAL TIME : 163977 msec


With the patch (Apache Drill 1.5.0 GA, commit id: 3f228d3), the same query:
95_3f228d3_HIVE_20160721_130712/HIVE_limit1_02/HIVE_limit1_02.log:[STAT] TOTAL TIME : 1664 msec
95_3f228d3_HIVE_20160721_130712/HIVE_limit1_02/HIVE_limit1_02.log:[STAT] TOTAL TIME : 157 msec
95_3f228d3_HIVE_20160721_130712/HIVE_limit1_02/HIVE_limit1_02.log:[STAT] TOTAL TIME : 167 msec


So, LGTM.

> HiveSchema.getSubSchema() should use lazy loading of all the table names
> ------------------------------------------------------------------------
>
>                 Key: DRILL-4127
>                 URL: https://issues.apache.org/jira/browse/DRILL-4127
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>             Fix For: 1.5.0
>
>
> Currently, HiveSchema.getSubSchema() pre-loads all the table names when
> it constructs the subschema, even if those table names are never
> requested. This can cause considerable performance overhead, especially
> when the Hive schema contains a large number of objects (thousands of
> tables/views are not uncommon in some use cases).
> Instead, we should load table names on demand: only when there is a
> request to get all table names do we load them into the Hive schema.
> This should help "show schemas", since it requires only the schema name,
> not the table names in the schema.
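
The on-demand loading described in the issue can be sketched roughly as follows. This is a minimal illustration only, not the actual DRILL-4127 patch: LazySubSchema, LazySubSchemaDemo, and slowLoad are hypothetical names, and the Supplier stands in for whatever metastore call would fetch the table names.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical stand-in for a subschema; not Drill's actual HiveSchema code. */
class LazySubSchema {
    private final String name;
    // Assumed to wrap the expensive lookup (e.g. a metastore round trip).
    private final Supplier<Set<String>> tableNameLoader;
    // Stays null until table names are first requested.
    private Set<String> tableNames;

    LazySubSchema(String name, Supplier<Set<String>> tableNameLoader) {
        // Construction is cheap: only the name and the loader are stored.
        this.name = name;
        this.tableNameLoader = tableNameLoader;
    }

    /** Needs only the schema name, so it never triggers a load. */
    String getName() {
        return name;
    }

    /** Loads the table names on first call and reuses the result afterwards. */
    synchronized Set<String> getTableNames() {
        if (tableNames == null) {
            tableNames = tableNameLoader.get();
        }
        return tableNames;
    }
}

class LazySubSchemaDemo {
    // Counts how many times the (simulated) expensive load actually runs.
    static int loadCount = 0;

    /** Simulated expensive lookup; the table names here are made up. */
    static Set<String> slowLoad() {
        loadCount++;
        return new LinkedHashSet<>(Arrays.asList("orders", "customers"));
    }

    public static void main(String[] args) {
        LazySubSchema schema = new LazySubSchema("default", LazySubSchemaDemo::slowLoad);
        // Asking for the name alone leaves the table names unloaded.
        System.out.println(schema.getName() + " loads=" + loadCount);
        // The first request for table names triggers exactly one load.
        System.out.println(schema.getTableNames() + " loads=" + loadCount);
        // A second request reuses the cached set.
        schema.getTableNames();
        System.out.println("loads=" + loadCount);
    }
}
```

In this sketch a request that needs only schema names, such as "show schemas", would go through getName() and never pay for the table-name lookup.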



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
