[ https://issues.apache.org/jira/browse/HIVE-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956006#comment-15956006 ]
Eugene Koifman commented on HIVE-11444:
---------------------------------------
More generally, raise an alert:
1. if there are too many open txns
2. if there are too many aborted txns - most likely a misconfigured streaming
ingest client. Need to include client info in the alert.
3. if there are a lot of entries in TXN_COMPONENTS - this means the compactor
is not keeping up
In extreme cases both can cause enough metadata to accumulate that metastore
operations (TxnHandler/CompactionTxnHandler) slow down and very large amounts
of RAM are used (ValidTxnList).
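
A minimal sketch of the three checks above, assuming the counts have already
been read from the metastore. The class name, method name, and threshold
values are hypothetical and are not part of Hive; real limits would need
tuning per deployment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not part of Hive. */
final class TxnAlertSketch {
    // Assumed thresholds, chosen only for illustration.
    static final long MAX_OPEN_TXNS = 100_000;
    static final long MAX_ABORTED_TXNS = 10_000;
    static final long MAX_TXN_COMPONENTS = 1_000_000;

    /**
     * Returns one alert message per exceeded threshold, in the order
     * open txns, aborted txns, TXN_COMPONENTS entries.
     */
    static List<String> check(long openTxns, long abortedTxns,
                              long txnComponents, String abortingClient) {
        List<String> alerts = new ArrayList<>();
        if (openTxns > MAX_OPEN_TXNS) {
            alerts.add("Too many open txns: " + openTxns);
        }
        if (abortedTxns > MAX_ABORTED_TXNS) {
            // Client info is included so a misconfigured ingest client can be found.
            alerts.add("Too many aborted txns: " + abortedTxns
                + " (client: " + abortingClient + ")");
        }
        if (txnComponents > MAX_TXN_COMPONENTS) {
            alerts.add("TXN_COMPONENTS has " + txnComponents
                + " entries; compactor may not be keeping up");
        }
        return alerts;
    }
}
```

The caller would log each returned message at warn/error.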
> ACID Compactor should generate stats/alerts
> -------------------------------------------
>
> Key: HIVE-11444
> URL: https://issues.apache.org/jira/browse/HIVE-11444
> Project: Hive
> Issue Type: Improvement
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
>
> Compaction should generate stats about number of files it reads, min/max/avg
> size etc. It should also generate alerts if it looks like the system is not
> configured correctly.
> For example, if there are lots of very small delta files, it's a good sign
> that the Streaming API is configured with batches that are too small.
> Simplest idea is to add another periodic task to AcidHouseKeeperService to
> //periodically do select count(*), min(txnid), max(txnid), type from txns group by type.
> //1. dump that to log file at info
> //2. could also keep counts for last 10min, hour, 6 hours, 24 hours, etc
> //2.2 if a large increase is detected - issue alert (at least to the log for now) at warn/error
> Should also alert if there is ACID activity but no compactions are running.
> One way to do this is to add logic to TxnHandler to periodically check the
> contents of the COMPACTION_QUEUE table and keep a simple histogram of
> compactions over the last few hours.
> Similarly, it can run a periodic check of transactions started (or
> committed/aborted) and keep a simple histogram. Then the two can be used to
> detect that there is ACID write activity but no compaction activity.
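
The two-histogram idea in the description could be sketched roughly as
follows. This assumes some periodic task supplies, for each sampling
interval, the number of transactions started and the number of compactions
seen. The class and method names are hypothetical and are not part of Hive.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch; not part of Hive. */
final class ActivityHistogramSketch {
    private final int maxBuckets;
    // Each entry holds {txnCount, compactionCount} for one sampling interval.
    private final Deque<long[]> buckets = new ArrayDeque<>();

    ActivityHistogramSketch(int maxBuckets) {
        if (maxBuckets <= 0) {
            throw new IllegalArgumentException("maxBuckets must be positive");
        }
        this.maxBuckets = maxBuckets;
    }

    /** Records one interval, dropping the oldest once the window is full. */
    void record(long txns, long compactions) {
        if (buckets.size() == maxBuckets) {
            buckets.removeFirst();
        }
        buckets.addLast(new long[] {txns, compactions});
    }

    /** True when the window is full, saw write activity, and saw no compactions. */
    boolean writesWithoutCompaction() {
        if (buckets.size() < maxBuckets) {
            return false; // not enough history yet to judge
        }
        long txns = 0;
        long compactions = 0;
        for (long[] b : buckets) {
            txns += b[0];
            compactions += b[1];
        }
        return txns > 0 && compactions == 0;
    }
}
```

With, say, 10-minute intervals and 18 buckets, writesWithoutCompaction()
would flag three hours of writes with no compaction; both numbers are
illustrative only.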