[jira] [Commented] (HIVE-18705) Improve HiveMetaStoreClient.dropDatabase

Peter Vary (JIRA) Wed, 02 May 2018 06:16:14 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-18705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461004#comment-16461004
 ]


Peter Vary commented on HIVE-18705:
-----------------------------------

{quote}+So here's a question+: should I get rid of the batched scenario as all 
the tables are queried and are accessible at a time already, and there's little 
reason for me to query them in batches later (for memory reasons) instead of 
all of them at once. This way I could have the non-batched (send one dropDB 
only) scenario only which doesn't suffer from all the slowing effects I 
described above, and is generally 4-5 times faster than the current 
implementation.
{quote}
As we discussed offline, I think we should keep the batched scenario. There are 
constant memory problems, and we should strive to remove the places in the code 
where we load every table/partition into memory, not introduce new ones :).
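
To illustrate the memory argument, here is a minimal sketch of the batched approach, not the actual patch. It assumes the table names are already known and that full table objects are fetched one batch at a time, so at most one batch of them is held in memory. The class {{BatchedDropSketch}}, the {{fetchTables}} function and the {{hook}} callback are hypothetical stand-ins and not part of the HiveMetaStoreClient API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration only: none of these names exist in Hive.
class BatchedDropSketch {

    /** Splits the names into consecutive batches of at most batchSize elements. */
    public static List<List<String>> partition(List<String> names, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < names.size(); start += batchSize) {
            int end = Math.min(start + batchSize, names.size());
            batches.add(new ArrayList<>(names.subList(start, end)));
        }
        return batches;
    }

    /**
     * Fetches full table objects one batch at a time and passes each one to the
     * hook (standing in for a client-side drop hook). Returns the largest number
     * of table objects that were held at once.
     */
    public static <T> int processInBatches(List<String> tableNames, int batchSize,
            Function<List<String>, List<T>> fetchTables, Consumer<T> hook) {
        int maxHeld = 0;
        for (List<String> batch : partition(tableNames, batchSize)) {
            // Only this batch of table objects is alive during the iteration.
            List<T> tables = fetchTables.apply(batch);
            maxHeld = Math.max(maxHeld, tables.size());
            for (T table : tables) {
                hook.accept(table);
            }
        }
        return maxHeld;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("t1", "t2", "t3", "t4", "t5");
        // The identity function stands in for a metastore call returning table objects.
        int maxHeld = BatchedDropSketch.<String>processInBatches(
                names, 2, batch -> batch, name -> System.out.println("hook for " + name));
        System.out.println("max table objects held at once: " + maxHeld);
    }
}
```

In this sketch the peak number of table objects in memory is bounded by the batch size rather than by the number of tables in the database, which is the property the batched scenario is meant to preserve.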

Also, it would be a good idea to check whether the time it takes to close the 
DFSClient can be shortened.

> Improve HiveMetaStoreClient.dropDatabase
> ----------------------------------------
>
>                 Key: HIVE-18705
>                 URL: https://issues.apache.org/jira/browse/HIVE-18705
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Adam Szita
>            Assignee: Adam Szita
>            Priority: Major
>         Attachments: HIVE-18705.0.patch, HIVE-18705.1.patch, 
> HIVE-18705.2.patch, HIVE-18705.4.patch
>
>
> {{HiveMetaStoreClient.dropDatabase}} has a strange implementation to ensure 
> that client-side hooks are handled (for non-native tables, e.g. HBase). Currently 
> it starts by retrieving all the tables from HMS, and then sends {{dropTable}} 
> calls to HMS table by table. At the end it sends a {{dropDatabase}} call, just 
> to be sure :) I believe this could be refactored to speed up dropDB in 
> situations where the average table count per DB is very high.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
