[
https://issues.apache.org/jira/browse/HBASE-24543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sandeep Pal reassigned HBASE-24543:
-----------------------------------
Assignee: Sandeep Pal
> ScheduledChore logging is too chatty, replace with metrics
> ----------------------------------------------------------
>
> Key: HBASE-24543
> URL: https://issues.apache.org/jira/browse/HBASE-24543
> Project: HBase
> Issue Type: Improvement
> Components: metrics, Operability
> Reporter: Andrew Kyle Purtell
> Assignee: Sandeep Pal
> Priority: Minor
>
> ScheduledChore logs at DEBUG level the execution time of each chore.
> We used to log an average execution time across all chores every five
> minutes, which by consensus was judged to not be useful. Derived metrics like
> averages or histograms should be calculated per chore. So we modified the
> logging to dump the chore execution time each time it runs, to facilitate
> such calculations with the log aggregation and searching tool of choice. Per
> chore execution logging is more useful, in that sense, but may be too chatty.
> This is not unexpected but let me provide my observations so we can revisit
> this.
> On the master, for example, this is logged every second:
> {noformat}
> 2020-06-11 16:35:28,263 DEBUG
> [master/apurtell-ltm:8100.splitLogManager..Chore.1] hbase.ScheduledChore:
> SplitLogManager Timeout Monitor execution time: 0 ms.
> {noformat}
> Does the value of these lines outweigh the cost of 86,400 log lines per day
> per master instance? (At least.)
> On the regionserver it is somewhat better, these are logged every 10 seconds:
> {noformat}
> 2020-06-11 16:37:57,203 DEBUG [regionserver/apurtell-ltm:8120.Chore.1]
> hbase.ScheduledChore: CompactionChecker execution time: 0 ms.
> 2020-06-11 16:37:57,203 DEBUG [regionserver/apurtell-ltm:8120.Chore.1]
> hbase.ScheduledChore: MemstoreFlusherChore execution time: 0 ms.
> {noformat}
> So that will be 17,280 log lines per day per regionserver. (At least.)
> These could be moved to TRACE level, perhaps.
> I propose we replace this logging with histogram metrics. There should be a
> separate metric for each distinct chore classname, allocated as needed.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)