[ https://issues.apache.org/jira/browse/HBASE-29263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani reassigned HBASE-29263:
------------------------------------

    Assignee: Prathyusha

> Metrics for long running procedures
> -----------------------------------
>
>                 Key: HBASE-29263
>                 URL: https://issues.apache.org/jira/browse/HBASE-29263
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0-beta-1, 2.5.11, 2.6.2
>            Reporter: Viraj Jasani
>            Assignee: Prathyusha
>            Priority: Major
>
> As of today, the procedure metrics we have include:
>  * SubmittedCount: Counter
>  * Time: Histogram
>  * FailedCount: Counter
> SubmittedCount is updated when a procedure is submitted for execution, 
> whereas the Time histogram and FailedCount are updated only when the 
> procedure terminates.
> With recent incidents like HBASE-29251, we have realized that we have no 
> metrics that indicate long-running or stuck procedures and on which we can 
> create alerts.
> The purpose of this Jira is to introduce metrics for long-running procedures. 
> One possible way to introduce such a metric is a chore that periodically 
> checks how many procedures are currently executing and have exceeded a 
> configurable time threshold.
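
The chore idea described above can be sketched roughly as follows. This is a minimal, standalone illustration in plain Java, not the HBase implementation: the class LongRunningProcedureMonitor, its methods, and the in-memory start-time map are all hypothetical, and a plain ScheduledExecutorService stands in for whatever chore mechanism the actual patch uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: tracks start times of running procedures and
// periodically publishes how many have exceeded a configurable threshold.
class LongRunningProcedureMonitor {
  // procId -> start time in milliseconds (assumed bookkeeping, not an HBase structure)
  private final Map<Long, Long> startTimesMs = new ConcurrentHashMap<>();
  // Gauge-like value that an alerting system could scrape
  private final AtomicLong longRunningCount = new AtomicLong();
  private final long thresholdMs;

  LongRunningProcedureMonitor(long thresholdMs) {
    this.thresholdMs = thresholdMs;
  }

  void procedureStarted(long procId, long startMs) {
    startTimesMs.put(procId, startMs);
  }

  void procedureFinished(long procId) {
    startTimesMs.remove(procId);
  }

  // Counts procedures whose elapsed time is strictly greater than the threshold.
  static long countLongRunning(Map<Long, Long> startTimesMs, long nowMs, long thresholdMs) {
    return startTimesMs.values().stream()
        .filter(startMs -> nowMs - startMs > thresholdMs)
        .count();
  }

  // One pass of the periodic check; takes "now" as a parameter so it is testable.
  void refresh(long nowMs) {
    longRunningCount.set(countLongRunning(startTimesMs, nowMs, thresholdMs));
  }

  long getLongRunningCount() {
    return longRunningCount.get();
  }

  // Runs refresh() every periodMs milliseconds; the caller shuts the executor down.
  ScheduledExecutorService schedule(long periodMs) {
    ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
    exec.scheduleAtFixedRate(
        () -> refresh(System.currentTimeMillis()), periodMs, periodMs, TimeUnit.MILLISECONDS);
    return exec;
  }
}
```

Because the value is recomputed on every pass rather than incremented, it drops back once a stuck procedure finishes, which is the behavior an alert on "currently long-running" procedures would need.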



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
