[ 
https://issues.apache.org/jira/browse/DRILL-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809268#comment-16809268
 ] 

ASF GitHub Bot commented on DRILL-7115:
---------------------------------------

vdiravka commented on pull request #1706: DRILL-7115: Improve Hive schema show tables performance
URL: https://github.com/apache/drill/pull/1706#discussion_r271919375
 
 

 ##########
 File path: contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/client/TableEntryCacheLoader.java
 ##########
 @@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.hive.client;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import org.apache.drill.common.AutoCloseables;
+import org.apache.drill.exec.store.hive.ColumnListsCache;
+import org.apache.drill.exec.store.hive.HiveReadEntry;
+import org.apache.drill.exec.store.hive.HiveTableWithColumnCache;
+import org.apache.drill.exec.store.hive.HiveTableWrapper;
+import org.apache.drill.exec.store.hive.HiveUtilities;
+import org.apache.drill.shaded.guava.com.google.common.cache.CacheLoader;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.api.NoSuchObjectException;
+import org.apache.hadoop.hive.metastore.api.Partition;
+import org.apache.hadoop.hive.metastore.api.Table;
+import org.apache.hadoop.hive.metastore.api.UnknownTableException;
+import org.apache.thrift.TException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * CacheLoader that synchronized on client and tries to reconnect when
+ * client fails. Used by {@link HiveMetadataCache}.
+ */
+final class TableEntryCacheLoader extends CacheLoader<TableName, HiveReadEntry> {
+
+  private static final Logger logger = LoggerFactory.getLogger(TableNameLoader.class);
+
+  private final DrillHiveMetaStoreClient client;
+
+  TableEntryCacheLoader(DrillHiveMetaStoreClient client) {
+    this.client = client;
+  }
+
+
+  @Override
+  @SuppressWarnings("NullableProblems")
+  public HiveReadEntry load(TableName key) throws Exception {
+    Table table;
+    List<Partition> partitions;
+    synchronized (client) {
+      table = getTable(key);
+      partitions = getPartitions(key);
+    }
+    HiveTableWithColumnCache hiveTable = new HiveTableWithColumnCache(table, new ColumnListsCache(table));
+    List<HiveTableWrapper.HivePartitionWrapper> partitionWrappers = partitions.isEmpty()
+        ? null
 
 Review comment:
   You are right. I am not sure the current logic, which relies on null 
checks instead of list-size checks, is a good one, but changing it shouldn't 
be part of this PR. 
   Thanks for looking into it.
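
   To make the null-check versus list-size-check distinction concrete, here is a minimal sketch. It is not Drill code: the class and method names are hypothetical, and plain strings stand in for `HiveTableWrapper.HivePartitionWrapper`. It only assumes what the diff above shows, namely that an empty partition list is turned into `null`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Drill.
public class PartitionListConvention {

  // Mirrors the convention in the diff: an empty partition list becomes null,
  // so callers must detect "no partitions" with a null check.
  static List<String> wrapOrNull(List<String> partitions) {
    return partitions.isEmpty() ? null : new ArrayList<>(partitions);
  }

  // Alternative convention: always return a list, possibly empty,
  // so callers detect "no partitions" with a size check.
  static List<String> wrapOrEmpty(List<String> partitions) {
    return new ArrayList<>(partitions);
  }

  // Caller written against the null-based convention.
  static boolean hasPartitionsByNullCheck(List<String> wrappers) {
    return wrappers != null;
  }

  // Caller written against the size-based convention.
  static boolean hasPartitionsBySizeCheck(List<String> wrappers) {
    return !wrappers.isEmpty();
  }

  public static void main(String[] args) {
    List<String> none = Collections.emptyList();
    List<String> some = Arrays.asList("year=2018", "year=2019");

    System.out.println(hasPartitionsByNullCheck(wrapOrNull(none)));
    System.out.println(hasPartitionsByNullCheck(wrapOrNull(some)));
    System.out.println(hasPartitionsBySizeCheck(wrapOrEmpty(none)));
    System.out.println(hasPartitionsBySizeCheck(wrapOrEmpty(some)));
  }
}
```

   The two conventions cannot be mixed safely: a size check applied to the null-based result would throw a `NullPointerException`, which is one reason switching conventions touches every caller and fits better in a separate change.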
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve Hive schema show tables performance
> -------------------------------------------
>
>                 Key: DRILL-7115
>                 URL: https://issues.apache.org/jira/browse/DRILL-7115
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive, Storage - Information Schema
>    Affects Versions: 1.15.0
>            Reporter: Igor Guzenko
>            Assignee: Igor Guzenko
>            Priority: Major
>             Fix For: 1.16.0
>
>
> In Sqlline (Drill), "show tables" on a Hive schema takes roughly 15 to 20 
> minutes. The schema has about 8000 tables.
> The same command in beeline (Hive) returns the result in a split second 
> (~0.2 secs).
> I reproduced this in my test cluster by creating 6000 empty tables in Hive 
> and then running "show tables" in Drill. It took more than 2 minutes (~140 secs).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
