Selina Zhang created HIVE-6980:
----------------------------------

             Summary: Drop table by using direct sql
                 Key: HIVE-6980
                 URL: https://issues.apache.org/jira/browse/HIVE-6980
             Project: Hive
          Issue Type: Improvement
          Components: Metastore
    Affects Versions: 0.12.0
            Reporter: Selina Zhang
            Assignee: Selina Zhang


Dropping a table that has many partitions is slow. Even after applying the 
patch from HIVE-6265, dropping a table with 100K+ partitions still takes hours. 

The fix comes in two parts:
1. Use direct SQL to query the partitions' protect mode.
The current implementation transfers every Partition object to the client and 
checks the protect mode of each partition there. I'd like to move this logic 
into the metastore. The check will be done with direct SQL (if direct SQL is 
disabled, the same logic runs in the ObjectStore).
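
A minimal sketch of what the metastore-side check could look like. It assumes 
the protect mode is stored as a comma-separated value under the key 
PROTECT_MODE in PARTITION_PARAMS, and that NO_DROP and NO_DROP_CASCADE are the 
flags that should block the drop. The class and method names are hypothetical, 
and identifier quoting is left out.

```java
public class ProtectModeCheck {
    // Assumption: partition protect mode lives in PARTITION_PARAMS under this key.
    static final String PROTECT_MODE_KEY = "PROTECT_MODE";

    /**
     * Query returning only the partitions of one table that carry a
     * protect-mode entry. Bind parameters, in order: database name,
     * table name, PROTECT_MODE_KEY.
     */
    static String protectedPartitionsSql() {
        return "SELECT p.PART_NAME, pp.PARAM_VALUE"
             + " FROM PARTITIONS p"
             + " JOIN TBLS t ON p.TBL_ID = t.TBL_ID"
             + " JOIN DBS d ON t.DB_ID = d.DB_ID"
             + " JOIN PARTITION_PARAMS pp ON pp.PART_ID = p.PART_ID"
             + " WHERE d.NAME = ? AND t.TBL_NAME = ? AND pp.PARAM_KEY = ?";
    }

    /** True if the stored protect-mode value contains a flag that forbids dropping. */
    static boolean blocksDrop(String protectMode) {
        if (protectMode == null) {
            return false;
        }
        for (String flag : protectMode.split(",")) {
            String f = flag.trim();
            // Assumption: these two flags are the ones that prevent a drop.
            if (f.equals("NO_DROP") || f.equals("NO_DROP_CASCADE")) {
                return true;
            }
        }
        return false;
    }
}
```

With this shape, only the rows for protected partitions cross the wire, and the 
ObjectStore fallback can reuse blocksDrop on the values it loads through JDO.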

2. Use direct SQL to drop the table's partitions.
There may be two solutions here:
   1. Add "DELETE CASCADE" to the schema. That way we only need to delete 
   entries from the partitions table using direct SQL. We may need to set 
   datanucleus.deletionPolicy = DataNucleus. 
   2. Clean up the dependent tables by issuing DELETE statements. This also 
   requires turning on datanucleus.query.sql.allowAll.
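
As a rough illustration of the second option (cleaning up dependent tables with 
DELETE statements), the sketch below builds the statements in child-first 
order. The list of dependent tables is an assumption based on the metastore 
tables that carry a PART_ID column; it is not complete (storage descriptors, 
serdes and column descriptors are not handled), and the class and method names 
are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DropPartitionsSql {
    // Assumption: these tables reference PARTITIONS through a PART_ID column.
    static final String[] DEPENDENT_TABLES = {
        "PART_COL_STATS",
        "PART_COL_PRIVS",
        "PART_PRIVS",
        "PARTITION_KEY_VALS",
        "PARTITION_PARAMS"
    };

    /**
     * DELETE statements for all partitions of one table, children first.
     * Every statement takes a single bind parameter: the TBL_ID.
     */
    static List<String> deleteStatements() {
        String partIds = "SELECT PART_ID FROM PARTITIONS WHERE TBL_ID = ?";
        List<String> statements = new ArrayList<>();
        for (String table : DEPENDENT_TABLES) {
            statements.add("DELETE FROM " + table
                + " WHERE PART_ID IN (" + partIds + ")");
        }
        // The parent rows go last so no child row is left dangling.
        statements.add("DELETE FROM PARTITIONS WHERE TBL_ID = ?");
        return statements;
    }
}
```

The maintenance cost shows up in DEPENDENT_TABLES: any new table that 
references partitions has to be added there by hand.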

Either solution should fix the problem. DELETE CASCADE requires schema 
changes and upgrade scripts. The second solution adds maintenance cost if 
new tables are added in future releases.

Please advise. 
