[jira] [Commented] (HIVE-6980) Drop table by using direct sql

Peter Vary (JIRA) Thu, 26 Apr 2018 05:07:51 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453924#comment-16453924
 ]


Peter Vary commented on HIVE-6980:
----------------------------------

[~sershe]: Thanks for the review!

HIVE-2758 turned off level 2 caching for DataNucleus, so even if the change is 
made directly in the database, DataNucleus will still return the new data. We 
even have a unit test verifying this: 
{{TestHiveMetaStore.testConcurrentMetastores}}

Since the changes are made in a single transaction and the transaction 
isolation level is set to read committed, I think we are safe with the default 
configuration.

Do you think we need extra tests to cover this case as well?

Thanks,
Peter

> Drop table by using direct sql
> ------------------------------
>
>                 Key: HIVE-6980
>                 URL: https://issues.apache.org/jira/browse/HIVE-6980
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: 0.12.0
>            Reporter: Selina Zhang
>            Assignee: Peter Vary
>            Priority: Major
>         Attachments: HIVE-6980.patch
>
>
> Dropping a table that has lots of partitions is slow. Even after applying the 
> patch from HIVE-6265, the drop table still takes hours (100K+ partitions). 
> The fix comes in two parts:
> 1. Use directSQL to query the partitions' protect mode.
> The current implementation needs to transfer each Partition object to the 
> client and check the protect mode for each partition. I'd like to move this 
> part of the logic to the metastore. The check will be done by direct SQL (if 
> direct SQL is disabled, the same logic is executed in the ObjectStore).
> 2. Use directSQL to drop the partitions of the table.
> There may be two solutions here:
> 1. Add "DELETE CASCADE" to the schema. This way we only need to delete 
> entries from the partitions table using direct SQL. May need to change 
> datanucleus.deletionPolicy = DataNucleus. 
> 2. Clean up the dependent tables by issuing DELETE statements. This also 
> needs datanucleus.query.sql.allowAll to be turned on.
> Both of the above solutions should be able to fix the problem. DELETE CASCADE 
> requires changing the schemas and preparing upgrade scripts. The second 
> solution adds maintenance cost if new tables are added in future releases.
> Please advise. 
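
For illustration only, the two options for dropping partitions described in 
the issue can be sketched against an in-memory SQLite database. This is not 
the patch and not the real metastore schema: the two tables below are a 
simplified, assumed stand-in (one parent table, one dependent table), and the 
function names are hypothetical.

```python
import sqlite3


def make_db(cascade: bool) -> sqlite3.Connection:
    """Build a tiny, assumed two-table schema with three partitions."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    on_delete = " ON DELETE CASCADE" if cascade else ""
    conn.executescript(f"""
        CREATE TABLE PARTITIONS (
            PART_ID INTEGER PRIMARY KEY,
            TBL_ID  INTEGER NOT NULL
        );
        CREATE TABLE PARTITION_PARAMS (
            PART_ID     INTEGER NOT NULL
                        REFERENCES PARTITIONS(PART_ID){on_delete},
            PARAM_KEY   TEXT NOT NULL,
            PARAM_VALUE TEXT
        );
    """)
    conn.executemany("INSERT INTO PARTITIONS VALUES (?, ?)",
                     [(1, 10), (2, 10), (3, 20)])
    conn.executemany("INSERT INTO PARTITION_PARAMS VALUES (?, ?, ?)",
                     [(1, "k", "v"), (2, "k", "v"), (3, "k", "v")])
    conn.commit()
    return conn


def drop_with_cascade(conn: sqlite3.Connection, tbl_id: int) -> None:
    """Option 1: delete only the parent rows; the schema cascades."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM PARTITIONS WHERE TBL_ID = ?", (tbl_id,))


def drop_with_explicit_deletes(conn: sqlite3.Connection, tbl_id: int) -> None:
    """Option 2: delete dependent rows first, then the parent rows."""
    with conn:  # both statements run in the same transaction
        conn.execute(
            "DELETE FROM PARTITION_PARAMS WHERE PART_ID IN "
            "(SELECT PART_ID FROM PARTITIONS WHERE TBL_ID = ?)",
            (tbl_id,),
        )
        conn.execute("DELETE FROM PARTITIONS WHERE TBL_ID = ?", (tbl_id,))


def counts(conn: sqlite3.Connection) -> tuple:
    """Return (partition rows, dependent rows) left in the database."""
    parts = conn.execute("SELECT COUNT(*) FROM PARTITIONS").fetchone()[0]
    params = conn.execute("SELECT COUNT(*) FROM PARTITION_PARAMS").fetchone()[0]
    return parts, params
```

The sketch also shows the trade-off mentioned above: the cascade variant needs 
a schema change, while the explicit variant needs one more DELETE statement 
for every dependent table that is added later.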



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
