[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

Daniel Voros (JIRA) Thu, 13 Dec 2018 01:17:49 -0800


    [ https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16719958#comment-16719958 ]


Daniel Voros commented on HIVE-21034:
-------------------------------------

This new option could be used when (or while) dropping the HMS. When you drop 
the HMS, you might also want to remove any data associated with it, since 
you're losing the metadata that would help you read that data anyway.

By "returning the datastore", do you mean dropping the S3 bucket, for example? 
If so, I was thinking of the use case where you reuse the datastore but want 
to free up space to save cost.

I believe an inverse of -initSchema could be useful too, both for the security 
concern you've described and simply to clean up the RDBMS behind HMS.

All in all, I think we need to define the process(es) for uninstalling Hive, 
keeping cloud workloads in mind. This new schematool option could be the first 
step in that direction.

> Add option to schematool to drop Hive databases
> -----------------------------------------------
>
>                 Key: HIVE-21034
>                 URL: https://issues.apache.org/jira/browse/HIVE-21034
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Daniel Voros
>            Assignee: Daniel Voros
>            Priority: Major
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.
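
To make the proposal above concrete, here is a rough sketch of the HiveQL such 
a flag might end up issuing. It is not schematool's implementation: the helper 
name drop_statements, the name check, and the choice to skip the default 
database are all assumptions made for illustration.

```python
import re

# Hypothetical helper, not part of schematool: builds the statements that a
# -dropAllDatabases style cleanup might run against Hive.
# Assumption: only plain alphanumeric/underscore names are accepted, so the
# name can be placed in the statement without quoting.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def drop_statements(databases, skip=("default",)):
    """Return one DROP DATABASE ... CASCADE statement per database name."""
    statements = []
    for name in databases:
        if name in skip:
            # Assumption: the default database is left in place.
            continue
        if not _SAFE_NAME.fullmatch(name):
            raise ValueError(f"unexpected database name: {name!r}")
        # CASCADE also drops the tables contained in the database.
        statements.append(f"DROP DATABASE IF EXISTS {name} CASCADE;")
    return statements


if __name__ == "__main__":
    for stmt in drop_statements(["default", "sales", "tmp_etl"]):
        print(stmt)
```

The list of database names would have to come from somewhere, for instance 
the output of SHOW DATABASES. How the real option obtains it is left open 
here.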



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
