Rajkumar Singh commented on HIVE-22081:

 {quote}Is this for cases where the automatic compaction was turned off for a 
while, and then someone turns that on later?{quote} Yes, that's right. In 
addition, starting with Hive 3, managed tables are ACID by default, so users 
who upgrade to Hive 3 will see many more managed ACID tables.
Currently org.apache.hadoop.hive.ql.txn.compactor.Initiator#checkForCompaction 
performs many blocking HDFS operations, which is time-consuming. Per your 
suggestion, I will review which objects/results can be cached to make it more 
efficient. I will upload a new patch that addresses the checkstyle warnings 
and test failures. Thanks.

> Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there 
> are too many Table/partitions are eligible for compaction 
> --------------------------------------------------------------------------------------------------------------------------------------
>                 Key: HIVE-22081
>                 URL: https://issues.apache.org/jira/browse/HIVE-22081
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>    Affects Versions: 3.1.1
>            Reporter: Rajkumar Singh
>            Assignee: Rajkumar Singh
>            Priority: Major
>         Attachments: HIVE-22081.patch
> If automatic compaction is turned on, the Initiator thread looks for 
> tables/partitions that are potentially eligible for compaction and runs 
> some checks in a for loop before requesting compaction for the eligible 
> ones. Although the Initiator thread is configured to run at a 5-minute 
> interval by default, with many objects it keeps running, because these 
> checks are IO-intensive and hog CPU.
> In the proposed changes, I am planning to:
> 1. Pass fewer objects to the for loop by filtering them up front on the 
> conditions that are currently checked within the loop.
> 2. Make asynchronous calls using futures to determine the compaction type 
> (this is where we make FileSystem calls).
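
The two proposed changes can be illustrated with a minimal Java sketch. This is not the attached patch or Hive's actual Initiator code: the `Candidate` class, the `alreadyQueued` flag, the delta-count thresholds, and all method names are hypothetical stand-ins. It only shows the shape of the idea, which is to filter cheaply first and then run the expensive per-object check on a thread pool via futures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InitiatorSketch {
    enum CompactionType { NONE, MINOR, MAJOR }

    // Hypothetical stand-in for a table/partition that may need compaction.
    static final class Candidate {
        final String name;
        final boolean alreadyQueued; // assumed cheap, in-memory condition
        final int deltaCount;        // assumed result of an expensive lookup
        Candidate(String name, boolean alreadyQueued, int deltaCount) {
            this.name = name;
            this.alreadyQueued = alreadyQueued;
            this.deltaCount = deltaCount;
        }
    }

    // Step 1: drop objects that fail cheap checks before entering the loop.
    static List<Candidate> prefilter(List<Candidate> all) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate c : all) {
            if (!c.alreadyQueued) {
                kept.add(c);
            }
        }
        return kept;
    }

    // Stand-in for the blocking FileSystem work; thresholds are invented.
    static CompactionType determineType(Candidate c) {
        if (c.deltaCount >= 10) {
            return CompactionType.MAJOR;
        }
        return c.deltaCount > 0 ? CompactionType.MINOR : CompactionType.NONE;
    }

    // Step 2: submit one future per remaining candidate, then collect results.
    static Map<String, CompactionType> checkAll(List<Candidate> all, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, CompletableFuture<CompactionType>> futures = new LinkedHashMap<>();
            for (Candidate c : prefilter(all)) {
                futures.put(c.name,
                        CompletableFuture.supplyAsync(() -> determineType(c), pool));
            }
            Map<String, CompactionType> result = new LinkedHashMap<>();
            // join() blocks until each check finishes; checks run concurrently.
            futures.forEach((name, f) -> result.put(name, f.join()));
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

A bounded pool is used here so that a large number of candidates does not translate into an unbounded number of concurrent filesystem calls; the right pool size would depend on the deployment.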

This message was sent by Atlassian JIRA