[
https://issues.apache.org/jira/browse/HIVE-29467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mahesh Raju Somalaraju updated HIVE-29467:
------------------------------------------
Summary: Separate Config to Limit Aborted Compactions (was: Separate
Config to Limit Aborted Compaction)
> Separate Config to Limit Aborted Compactions
> --------------------------------------------
>
> Key: HIVE-29467
> URL: https://issues.apache.org/jira/browse/HIVE-29467
> Project: Hive
> Issue Type: Improvement
> Reporter: Mahesh Raju Somalaraju
> Assignee: Mahesh Raju Somalaraju
> Priority: Major
>
> Currently, both regular and aborted compaction candidates are governed by the
> same configuration parameter, metastore.compactor.fetch.size, which controls
> how many potential compactions the HMS Initiator can pick in a single cycle.
> In environments with a large backlog of aborted compactions, this can lead to
> excessive initiator workload and performance pressure on the Hive Metastore.
> Introduce a separate configuration parameter to control the rate at which
> aborted compactions are picked by the HMS cleaner, independent of regular
> compactions (for example, metastore.aborted.compactor.fetch.size).
> The relevant code is the constructor in ReadyToCleanAbortHandler.java:
> {code:java}
> public ReadyToCleanAbortHandler(SQLGenerator sqlGenerator, Configuration conf,
>     long abortedTimeThreshold, int abortedThreshold) {
>   this.sqlGenerator = sqlGenerator;
>   this.abortedTimeThreshold = abortedTimeThreshold;
>   this.abortedThreshold = abortedThreshold;
>   // Suggestion: read a new, Cleaner-specific config here instead of
>   // ConfVars.COMPACTOR_FETCH_SIZE.
>   this.fetchSize = MetastoreConf.getIntVar(conf, ConfVars.COMPACTOR_FETCH_SIZE);
> }
> {code}
> As a result, the Initiator batch size for regular compactions and the Cleaner
> batch size for aborted transaction cleanup are both governed by
> metastore.compactor.fetch.size.
> Consider a deployment that has a very large historical backlog of aborted
> transactions and wants to keep metastore.compactor.fetch.size high to maintain
> good throughput for normal compactions, while also preventing the Cleaner from
> picking too many aborted transactions in a single cycle. With a single shared
> setting this is not possible: a high value can trigger scanning of a large
> number of directories, cause long Cleaner runtimes and performance pressure,
> and impact overall HMS/Cleaner stability.
> Introduce a Cleaner-specific configuration to independently limit aborted
> transaction cleanup batch size, for example:
> metastore.aborted.compactor.fetch.size.
> This would allow independent throttling of aborted transaction cleanup, safer
> recovery from large aborted transaction backlogs, better operational tuning
> without impacting regular compaction throughput, and improved HMS/Cleaner
> stability in high-churn, real-world environments.
> Even though aborted cleanup is handled exclusively by the Cleaner, the
> batch-size control remains coupled to metastore.compactor.fetch.size.
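
The decoupling described above can be sketched roughly as follows. This is a
minimal, self-contained illustration and not Hive code: the configuration is
modelled as a plain Map rather than MetastoreConf, the key
metastore.aborted.compactor.fetch.size is only the example name proposed in
this issue, the default value is a placeholder, and falling back to
metastore.compactor.fetch.size when the new key is unset is an assumption made
here for backward compatibility, not something the issue specifies.

```java
import java.util.HashMap;
import java.util.Map;

public class AbortedFetchSizeSketch {
    // Existing key named in the issue.
    static final String COMPACTOR_FETCH_SIZE = "metastore.compactor.fetch.size";
    // Example key proposed in the issue; hypothetical, it may be named differently.
    static final String ABORTED_FETCH_SIZE = "metastore.aborted.compactor.fetch.size";
    // Placeholder default for this sketch only; not taken from Hive.
    static final int ASSUMED_DEFAULT = 1000;

    // Reads an int setting, returning the fallback when the key is absent.
    static int getInt(Map<String, String> conf, String key, int fallback) {
        String raw = conf.get(key);
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }

    // Batch size for regular compaction candidates.
    static int initiatorFetchSize(Map<String, String> conf) {
        return getInt(conf, COMPACTOR_FETCH_SIZE, ASSUMED_DEFAULT);
    }

    // Batch size for aborted-transaction cleanup. Assumption: when the new key
    // is unset, keep today's behaviour by reusing the shared fetch size.
    static int abortedCleanupFetchSize(Map<String, String> conf) {
        return getInt(conf, ABORTED_FETCH_SIZE, initiatorFetchSize(conf));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(COMPACTOR_FETCH_SIZE, "5000");
        // New key unset: aborted cleanup follows the shared setting.
        System.out.println(abortedCleanupFetchSize(conf));

        conf.put(ABORTED_FETCH_SIZE, "200");
        // New key set: the two batch sizes are now independent.
        System.out.println(initiatorFetchSize(conf));
        System.out.println(abortedCleanupFetchSize(conf));
    }
}
```

With a fallback of this kind, existing deployments would see no change until
the new key is set explicitly, while an operator facing a large aborted
backlog could lower only the cleanup batch size.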
--
This message was sent by Atlassian Jira
(v8.20.10#820010)