[ https://issues.apache.org/jira/browse/HBASE-27966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Beaudreault reopened HBASE-27966:
---------------------------------------

Re-opening because I just realized that this was not included in branch-3. Perhaps it was committed around the time we created that branch. We need to cherry-pick it to branch-3, which I will do shortly.

> HBase Master/RS JVM metrics populated incorrectly
> -------------------------------------------------
>
>                 Key: HBASE-27966
>                 URL: https://issues.apache.org/jira/browse/HBASE-27966
>             Project: HBase
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 2.0.0-alpha-4
>            Reporter: Nihal Jain
>            Assignee: Nihal Jain
>            Priority: Major
>             Fix For: 2.6.0, 2.5.6, 3.0.0-beta-1
>
>         Attachments: test_patch.txt
>
>
> HBase Master/RS JVM metrics are populated incorrectly due to a regression, which prevents the Ambari metrics system from capturing them.
> Based on my analysis, the issue affects all releases after 2.0.0-alpha-4 and seems to be caused by HBASE-18846.
> I compared the JVM metrics across 3 versions of HBase; the results are below:
> HBase: 1.1.2
> {code:java}
> {
>     "name" : "Hadoop:service=HBase,name=JvmMetrics",
>     "modelerType" : "JvmMetrics",
>     "tag.Context" : "jvm",
>     "tag.ProcessName" : "RegionServer",
>     "tag.SessionId" : "",
>     "tag.Hostname" : "HOSTNAME",
>     "MemNonHeapUsedM" : 196.05664,
>     "MemNonHeapCommittedM" : 347.60547,
>     "MemNonHeapMaxM" : 4336.0,
>     "MemHeapUsedM" : 7207.315,
>     "MemHeapCommittedM" : 66080.0,
>     "MemHeapMaxM" : 66080.0,
>     "MemMaxM" : 66080.0,
>     "GcCount" : 3953,
>     "GcTimeMillis" : 662520,
>     "ThreadsNew" : 0,
>     "ThreadsRunnable" : 214,
>     "ThreadsBlocked" : 0,
>     "ThreadsWaiting" : 626,
>     "ThreadsTimedWaiting" : 78,
>     "ThreadsTerminated" : 0,
>     "LogFatal" : 0,
>     "LogError" : 0,
>     "LogWarn" : 0,
>     "LogInfo" : 0
> },
> {code}
> HBase: 2.0.2
> {code:java}
> {
>     "name" : "Hadoop:service=HBase,name=JvmMetrics",
>     "modelerType" : "JvmMetrics",
>     "tag.Context" : "jvm",
>     "tag.ProcessName" : "IO",
>     "tag.SessionId" : "",
>     "tag.Hostname" : "HOSTNAME",
>     "MemNonHeapUsedM" : 203.86688,
>     "MemNonHeapCommittedM" : 740.6953,
>     "MemNonHeapMaxM" : -1.0,
>     "MemHeapUsedM" : 14879.477,
>     "MemHeapCommittedM" : 31744.0,
>     "MemHeapMaxM" : 31744.0,
>     "MemMaxM" : 31744.0,
>     "GcCount" : 75922,
>     "GcTimeMillis" : 5134691,
>     "ThreadsNew" : 0,
>     "ThreadsRunnable" : 90,
>     "ThreadsBlocked" : 3,
>     "ThreadsWaiting" : 158,
>     "ThreadsTimedWaiting" : 36,
>     "ThreadsTerminated" : 0,
>     "LogFatal" : 0,
>     "LogError" : 0,
>     "LogWarn" : 0,
>     "LogInfo" : 0
> },
> {code}
> HBase: 2.5.2
> {code:java}
> {
>     "name": "Hadoop:service=HBase,name=JvmMetrics",
>     "modelerType": "JvmMetrics",
>     "tag.Context": "jvm",
>     "tag.ProcessName": "IO",
>     "tag.SessionId": "",
>     "tag.Hostname": "HOSTNAME",
>     "MemNonHeapUsedM": 192.9798,
>     "MemNonHeapCommittedM": 198.4375,
>     "MemNonHeapMaxM": -1.0,
>     "MemHeapUsedM": 773.23584,
>     "MemHeapCommittedM": 1004.0,
>     "MemHeapMaxM": 1024.0,
>     "MemMaxM": 1024.0,
>     "GcCount": 2048,
>     "GcTimeMillis": 25440,
>     "ThreadsNew": 0,
>     "ThreadsRunnable": 22,
>     "ThreadsBlocked": 0,
>     "ThreadsWaiting": 121,
>     "ThreadsTimedWaiting": 49,
>     "ThreadsTerminated": 0,
>     "LogFatal": 0,
>     "LogError": 0,
>     "LogWarn": 0,
>     "LogInfo": 0
> },
> {code}
> From 2.0.x onwards, the field "tag.ProcessName" is populated as "IO" instead of the expected "RegionServer" or "Master".
> Ambari relies on this process name field to create metrics such as 'jvm.RegionServer.JvmMetrics.GcTimeMillis'. See [code.|https://github.com/apache/ambari/blob/2ec4b055d99ec84c902da16dd57df91d571b48d6/ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/timeline/AMSPropertyProvider.java#L722]
> After 2.0.x the field is populated as 'IO', so a metric named 'jvm.JvmMetrics.GcTimeMillis' is created instead of the expected 'jvm.RegionServer.JvmMetrics.GcTimeMillis'. As a result, the metric is mixed up with metrics from other processes (RS, Master, Spark executor, etc.) running on the same host.
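
The naming collision described above can be illustrated with a small sketch. This is not Ambari's actual code: the class `MetricNameSketch`, the method `metricName`, and the set of recognized process names are all hypothetical, chosen only so that the output matches the two metric names quoted in the issue.

```java
import java.util.Set;

final class MetricNameSketch {
    // Assumption: a collector only namespaces metrics by process names it recognizes.
    private static final Set<String> KNOWN_PROCESSES = Set.of("RegionServer", "Master");

    // Builds "<context>[.<processName>].<record>.<metric>"; the process name
    // segment is dropped when the tag value is not a recognized process.
    static String metricName(String context, String processName, String record, String metric) {
        StringBuilder name = new StringBuilder(context);
        if (KNOWN_PROCESSES.contains(processName)) {
            name.append('.').append(processName);
        }
        return name.append('.').append(record).append('.').append(metric).toString();
    }

    public static void main(String[] args) {
        // tag.ProcessName as reported by HBase 1.1.2
        System.out.println(metricName("jvm", "RegionServer", "JvmMetrics", "GcTimeMillis"));
        // tag.ProcessName as reported by HBase 2.0.2 and 2.5.2
        System.out.println(metricName("jvm", "IO", "JvmMetrics", "GcTimeMillis"));
    }
}
```

Under these assumptions, every process on a host that reports "IO" ends up with the same un-namespaced name, which is how values from different JVMs become indistinguishable.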
> *Expected*
> Field "tag.ProcessName" should be populated as "RegionServer" or "Master" instead of "IO".
> *Actual*
> Field "tag.ProcessName" is populated as "IO" instead of the expected "RegionServer" or "Master". This causes Ambari to publish incorrectly named metrics, which mixes up metrics from different processes and raises various alerts around JVM metrics.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)