Mahesh Raju Somalaraju created HIVE-29467:
---------------------------------------------
Summary: Separate Config to Limit Aborted Compaction
Key: HIVE-29467
URL: https://issues.apache.org/jira/browse/HIVE-29467
Project: Hive
Issue Type: Improvement
Reporter: Mahesh Raju Somalaraju
Assignee: Mahesh Raju Somalaraju
Currently, both regular and aborted compaction candidates are governed by the
same configuration parameter, metastore.compactor.fetch.size, which controls
how many potential compactions the HMS Initiator can pick in a single cycle. In
environments with a large backlog of aborted compactions, this can lead to
excessive initiator workload and performance pressure on the Hive Metastore.
Introduce a separate configuration parameter to control the rate at which
aborted compactions are picked by the HMS cleaner, independent of regular
compactions (for example, metastore.aborted.compactor.fetch.size).
The relevant constructor in {{ReadyToCleanAbortHandler.java}}:
{code:java}
public ReadyToCleanAbortHandler(SQLGenerator sqlGenerator, Configuration conf,
    long abortedTimeThreshold, int abortedThreshold) {
  this.sqlGenerator = sqlGenerator;
  this.abortedTimeThreshold = abortedTimeThreshold;
  this.abortedThreshold = abortedThreshold;
  // Suggestion: read a new config here instead of ConfVars.COMPACTOR_FETCH_SIZE.
  this.fetchSize = MetastoreConf.getIntVar(conf, ConfVars.COMPACTOR_FETCH_SIZE);
}
{code}
As a result, the Initiator batch size for regular compactions and the Cleaner
batch size for aborted transaction cleanup are both governed by
metastore.compactor.fetch.size.
Consider a deployment that has a very large historical backlog of aborted
transactions and wants to keep metastore.compactor.fetch.size high to maintain
good throughput for normal compactions, while also preventing the Cleaner from
picking too many aborted transactions in a single cycle. With a single shared
setting this is not possible: a high value also lets the Cleaner pick a large
batch of aborted transactions, which can trigger scanning of a large number of
directories, cause long Cleaner runtimes and performance pressure, and impact
overall HMS/Cleaner stability.
Introduce a Cleaner-specific configuration to independently limit aborted
transaction cleanup batch size, for example:
metastore.aborted.compactor.fetch.size.
This would allow independent throttling of aborted transaction cleanup, safer
recovery from large aborted transaction backlogs, better operational tuning
without impacting regular compaction throughput, and improved HMS/Cleaner
stability in high-churn, real-world environments.
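As a rough illustration of the proposal, the sketch below resolves the two batch sizes independently. It is not Hive code: it models the configuration with java.util.Properties rather than MetastoreConf, the class and method names are hypothetical, the default value is a placeholder, and falling back to metastore.compactor.fetch.size when the new key is unset is an assumption made here to preserve current behaviour, not something this issue specifies.

```java
import java.util.Properties;

/** Hypothetical sketch of decoupled fetch sizes; not part of Hive. */
class FetchSizeSketch {
    // Existing key named in this issue.
    static final String COMPACTOR_FETCH_SIZE = "metastore.compactor.fetch.size";
    // Proposed key; the name is only the example given in this issue.
    static final String ABORTED_FETCH_SIZE = "metastore.aborted.compactor.fetch.size";
    // Placeholder default for illustration only; not taken from Hive.
    static final int ASSUMED_DEFAULT = 500;

    /** Batch size for regular compaction candidates. */
    public static int compactorFetchSize(Properties conf) {
        String value = conf.getProperty(COMPACTOR_FETCH_SIZE);
        return value == null ? ASSUMED_DEFAULT : Integer.parseInt(value.trim());
    }

    /**
     * Batch size for aborted transaction cleanup.
     * Assumption: when the new key is unset, fall back to the shared key
     * so that existing deployments behave as they do today.
     */
    public static int abortedFetchSize(Properties conf) {
        String value = conf.getProperty(ABORTED_FETCH_SIZE);
        return value == null ? compactorFetchSize(conf) : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Keep regular compaction throughput high...
        conf.setProperty(COMPACTOR_FETCH_SIZE, "2000");
        // ...while throttling aborted cleanup separately.
        conf.setProperty(ABORTED_FETCH_SIZE, "100");

        System.out.println("regular batch size: " + compactorFetchSize(conf));
        System.out.println("aborted batch size: " + abortedFetchSize(conf));
    }
}
```

With a fallback like this, only operators who set the new key would see a change; whether the real implementation should fall back or carry its own default is open for discussion.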
Even though aborted cleanup is handled exclusively by the Cleaner, the
batch-size control remains coupled to metastore.compactor.fetch.size.