[ 
https://issues.apache.org/jira/browse/HIVE-29467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahesh Raju Somalaraju updated HIVE-29467:
------------------------------------------
    Summary: Separate Config to Limit Aborted Compactions  (was: Separate 
Config to Limit Aborted Compaction)

> Separate Config to Limit Aborted Compactions
> --------------------------------------------
>
>                 Key: HIVE-29467
>                 URL: https://issues.apache.org/jira/browse/HIVE-29467
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Mahesh Raju Somalaraju
>            Assignee: Mahesh Raju Somalaraju
>            Priority: Major
>
> Currently, both regular and aborted compaction candidates are governed by the 
> same configuration parameter, metastore.compactor.fetch.size, which controls 
> how many potential compactions the HMS Initiator can pick in a single cycle. 
> In environments with a large backlog of aborted compactions, this can lead to 
> excessive initiator workload and performance pressure on the Hive Metastore.
> Introduce a separate configuration parameter (for example, 
> metastore.aborted.compactor.fetch.size) to control the rate at which aborted 
> compactions are picked by the HMS Cleaner, independent of regular compactions.
> The relevant code is the constructor in ReadyToCleanAbortHandler.java:
> {code:java}
> public ReadyToCleanAbortHandler(SQLGenerator sqlGenerator, Configuration conf,
>     long abortedTimeThreshold, int abortedThreshold) {
>   this.sqlGenerator = sqlGenerator;
>   this.abortedTimeThreshold = abortedTimeThreshold;
>   this.abortedThreshold = abortedThreshold;
>   // Suggestion: read a new, Cleaner-specific config here instead of
>   // ConfVars.COMPACTOR_FETCH_SIZE.
>   this.fetchSize = MetastoreConf.getIntVar(conf, ConfVars.COMPACTOR_FETCH_SIZE);
> }
> {code}
> As a result, the Initiator batch size for regular compactions and the Cleaner 
> batch size for aborted transaction cleanup are both governed by 
> metastore.compactor.fetch.size.
> Consider a deployment that has a very large historical backlog of aborted 
> transactions and wants to keep metastore.compactor.fetch.size high to 
> maintain good throughput for normal compactions, while also preventing the 
> Cleaner from picking too many aborted transactions in a single cycle. With a 
> single shared setting this is not possible. Picking too many aborted 
> transactions at once can trigger scanning of a large number of directories, 
> cause long Cleaner runtimes and performance pressure, and impact overall 
> HMS/Cleaner stability.
> Introduce a Cleaner-specific configuration to independently limit aborted 
> transaction cleanup batch size, for example: 
> metastore.aborted.compactor.fetch.size.
> This would allow independent throttling of aborted transaction cleanup, safer 
> recovery from large aborted transaction backlogs, better operational tuning 
> without impacting regular compaction throughput, and improved HMS/Cleaner 
> stability in high-churn, real-world environments.
> Even though aborted cleanup is handled exclusively by the Cleaner, the 
> batch-size control remains coupled to metastore.compactor.fetch.size.
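
A minimal sketch of the decoupling proposed above, not the actual Hive change. It uses a plain Map in place of Hive's Configuration and MetastoreConf so that it runs on its own. The key metastore.aborted.compactor.fetch.size is the example name from the issue and is not an existing Hive setting. Falling back to metastore.compactor.fetch.size when the new key is unset, the helper names, and the default passed in by the caller are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class AbortedFetchSizeSketch {
    // Existing key named in the issue.
    static final String COMPACTOR_FETCH_SIZE = "metastore.compactor.fetch.size";
    // Proposed key from the issue (hypothetical, not an existing Hive setting).
    static final String ABORTED_FETCH_SIZE = "metastore.aborted.compactor.fetch.size";

    // Returns the positive integer stored under key, or fallback when the
    // value is missing, not a number, or not positive.
    static int getPositiveInt(Map<String, String> conf, String key, int fallback) {
        String raw = conf.get(key);
        if (raw == null) {
            return fallback;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Batch size for regular compaction candidates (Initiator side).
    static int initiatorFetchSize(Map<String, String> conf, int defaultSize) {
        return getPositiveInt(conf, COMPACTOR_FETCH_SIZE, defaultSize);
    }

    // Batch size for aborted transaction cleanup (Cleaner side).
    // Assumption: when the new key is unset, keep today's behaviour by
    // falling back to the shared fetch size.
    static int cleanerAbortFetchSize(Map<String, String> conf, int defaultSize) {
        return getPositiveInt(conf, ABORTED_FETCH_SIZE, initiatorFetchSize(conf, defaultSize));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(COMPACTOR_FETCH_SIZE, "5000"); // high throughput for regular compactions
        conf.put(ABORTED_FETCH_SIZE, "200");    // throttle aborted cleanup independently

        // 100 is an arbitrary placeholder default, not Hive's real default.
        System.out.println("initiator batch: " + initiatorFetchSize(conf, 100));
        System.out.println("cleaner abort batch: " + cleanerAbortFetchSize(conf, 100));
    }
}
```

With a fallback like this, existing deployments would see no change until the new key is set, and operators could then lower only the aborted cleanup batch size while leaving metastore.compactor.fetch.size high.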



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
