[
https://issues.apache.org/jira/browse/HBASE-27966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bryan Beaudreault resolved HBASE-27966.
---------------------------------------
Resolution: Fixed
Pushed to branch-3. The test looks good there. I quickly checked the other
branches and it looks like they were properly backported.
> HBase Master/RS JVM metrics populated incorrectly
> -------------------------------------------------
>
> Key: HBASE-27966
> URL: https://issues.apache.org/jira/browse/HBASE-27966
> Project: HBase
> Issue Type: Bug
> Components: metrics
> Affects Versions: 2.0.0-alpha-4
> Reporter: Nihal Jain
> Assignee: Nihal Jain
> Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1, 2.5.6
>
> Attachments: test_patch.txt
>
>
> HBase Master/RS JVM metrics are populated incorrectly due to a regression,
> which prevents the Ambari Metrics System from capturing them.
> Based on my analysis, the issue affects all releases after 2.0.0-alpha-4
> and seems to be caused by HBASE-18846.
> I have been able to compare the JVM metrics across 3 versions of HBase and
> am attaching the results below:
> HBase: 1.1.2
> {code:java}
> {
> "name" : "Hadoop:service=HBase,name=JvmMetrics",
> "modelerType" : "JvmMetrics",
> "tag.Context" : "jvm",
> "tag.ProcessName" : "RegionServer",
> "tag.SessionId" : "",
> "tag.Hostname" : "HOSTNAME",
> "MemNonHeapUsedM" : 196.05664,
> "MemNonHeapCommittedM" : 347.60547,
> "MemNonHeapMaxM" : 4336.0,
> "MemHeapUsedM" : 7207.315,
> "MemHeapCommittedM" : 66080.0,
> "MemHeapMaxM" : 66080.0,
> "MemMaxM" : 66080.0,
> "GcCount" : 3953,
> "GcTimeMillis" : 662520,
> "ThreadsNew" : 0,
> "ThreadsRunnable" : 214,
> "ThreadsBlocked" : 0,
> "ThreadsWaiting" : 626,
> "ThreadsTimedWaiting" : 78,
> "ThreadsTerminated" : 0,
> "LogFatal" : 0,
> "LogError" : 0,
> "LogWarn" : 0,
> "LogInfo" : 0
> },
> {code}
> HBase: 2.0.2
> {code:java}
> {
> "name" : "Hadoop:service=HBase,name=JvmMetrics",
> "modelerType" : "JvmMetrics",
> "tag.Context" : "jvm",
> "tag.ProcessName" : "IO",
> "tag.SessionId" : "",
> "tag.Hostname" : "HOSTNAME",
> "MemNonHeapUsedM" : 203.86688,
> "MemNonHeapCommittedM" : 740.6953,
> "MemNonHeapMaxM" : -1.0,
> "MemHeapUsedM" : 14879.477,
> "MemHeapCommittedM" : 31744.0,
> "MemHeapMaxM" : 31744.0,
> "MemMaxM" : 31744.0,
> "GcCount" : 75922,
> "GcTimeMillis" : 5134691,
> "ThreadsNew" : 0,
> "ThreadsRunnable" : 90,
> "ThreadsBlocked" : 3,
> "ThreadsWaiting" : 158,
> "ThreadsTimedWaiting" : 36,
> "ThreadsTerminated" : 0,
> "LogFatal" : 0,
> "LogError" : 0,
> "LogWarn" : 0,
> "LogInfo" : 0
> },
> {code}
> HBase: 2.5.2
> {code:java}
> {
> "name": "Hadoop:service=HBase,name=JvmMetrics",
> "modelerType": "JvmMetrics",
> "tag.Context": "jvm",
> "tag.ProcessName": "IO",
> "tag.SessionId": "",
> "tag.Hostname": "HOSTNAME",
> "MemNonHeapUsedM": 192.9798,
> "MemNonHeapCommittedM": 198.4375,
> "MemNonHeapMaxM": -1.0,
> "MemHeapUsedM": 773.23584,
> "MemHeapCommittedM": 1004.0,
> "MemHeapMaxM": 1024.0,
> "MemMaxM": 1024.0,
> "GcCount": 2048,
> "GcTimeMillis": 25440,
> "ThreadsNew": 0,
> "ThreadsRunnable": 22,
> "ThreadsBlocked": 0,
> "ThreadsWaiting": 121,
> "ThreadsTimedWaiting": 49,
> "ThreadsTerminated": 0,
> "LogFatal": 0,
> "LogError": 0,
> "LogWarn": 0,
> "LogInfo": 0
> },
> {code}
> It can be observed that from 2.0.x onwards the field "tag.ProcessName" is
> populated as "IO" instead of the expected "RegionServer" or "Master".
> Ambari relies on this process name field to create metrics such as
> 'jvm.RegionServer.JvmMetrics.GcTimeMillis'. See the
> [code|https://github.com/apache/ambari/blob/2ec4b055d99ec84c902da16dd57df91d571b48d6/ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/timeline/AMSPropertyProvider.java#L722].
> From 2.0.x onwards, however, the field is populated as 'IO', so a metric
> named 'jvm.JvmMetrics.GcTimeMillis' is created instead of the expected
> 'jvm.RegionServer.JvmMetrics.GcTimeMillis'. This mixes the metric up with
> the metrics coming from other processes (RS, Master, Spark executors, etc.)
> running on the same host.
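To illustrate the naming collision described above, here is a minimal sketch
in Python. It is not Ambari's actual code (that lives in the linked
AMSPropertyProvider.java). The function build_metric_name and the set
KNOWN_PROCESS_NAMES are hypothetical, and the rule "include the process name
only when it is a recognized one" is an assumption chosen to reproduce the two
metric names quoted above.

```python
# Hypothetical model of the naming behaviour described in the issue.
# This is NOT Ambari's implementation; the names and the rule are assumptions.
KNOWN_PROCESS_NAMES = {"RegionServer", "Master"}


def build_metric_name(bean: dict, metric: str) -> str:
    """Build '<context>.<process>.<modelerType>.<metric>'.

    The process segment is dropped when tag.ProcessName is not a
    recognized process name (assumed behaviour).
    """
    parts = [bean["tag.Context"]]
    process = bean.get("tag.ProcessName", "")
    if process in KNOWN_PROCESS_NAMES:
        parts.append(process)
    parts.append(bean["modelerType"])
    parts.append(metric)
    return ".".join(parts)


# Trimmed-down versions of the beans reported in the issue.
bean_1_1_2 = {
    "modelerType": "JvmMetrics",
    "tag.Context": "jvm",
    "tag.ProcessName": "RegionServer",
}
bean_2_5_2 = {
    "modelerType": "JvmMetrics",
    "tag.Context": "jvm",
    "tag.ProcessName": "IO",
}

print(build_metric_name(bean_1_1_2, "GcTimeMillis"))
print(build_metric_name(bean_2_5_2, "GcTimeMillis"))
```

Under this assumed rule, every process that reports "IO" ends up with the same
metric name, so values from different processes on one host can no longer be
told apart.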
> *Expected*
> Field "tag.ProcessName" should be populated as "RegionServer" or "Master"
> instead of "IO".
> *Actual*
> Field "tag.ProcessName" is populated as "IO" instead of the expected
> "RegionServer" or "Master", causing Ambari to publish an incorrectly named
> metric. This mixes up the metrics and raises various alerts around JVM
> metrics.
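The expected/actual condition above can be checked mechanically against
captured JMX output. The following is a rough sketch, not part of the fix
itself: the helper unexpected_process_names and the sample payload are
hypothetical, and it assumes you have already extracted the list of bean
objects from the JSON returned by the process.

```python
import json

# Values the issue describes as correct for tag.ProcessName.
EXPECTED_PROCESS_NAMES = {"RegionServer", "Master"}
# Bean name taken from the JSON samples in the issue.
JVM_BEAN_NAME = "Hadoop:service=HBase,name=JvmMetrics"


def unexpected_process_names(beans):
    """Return the tag.ProcessName of each JvmMetrics bean whose value
    is not one of the expected process names (hypothetical helper)."""
    return [
        bean.get("tag.ProcessName")
        for bean in beans
        if bean.get("name") == JVM_BEAN_NAME
        and bean.get("tag.ProcessName") not in EXPECTED_PROCESS_NAMES
    ]


# Made-up sample: one affected bean and one healthy bean.
sample = json.loads("""
[
  {"name": "Hadoop:service=HBase,name=JvmMetrics",
   "modelerType": "JvmMetrics", "tag.ProcessName": "IO"},
  {"name": "Hadoop:service=HBase,name=JvmMetrics",
   "modelerType": "JvmMetrics", "tag.ProcessName": "RegionServer"}
]
""")

print(unexpected_process_names(sample))  # prints ['IO']
```

An empty result would indicate that every JvmMetrics bean in the input carries
one of the expected process names.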
--
This message was sent by Atlassian Jira
(v8.20.10#820010)