[ https://issues.apache.org/jira/browse/HBASE-24402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111712#comment-17111712 ]
Jeongdae Kim edited comment on HBASE-24402 at 5/20/20, 5:35 AM: ---------------------------------------------------------------- I think the reason why this warning occurs is that HBASE-9774 was ported to branch-1 HBASE-18060, and HBASE-18060 only fit for hadoop version < 2.6.0 by using reflections to remove some internal data. unfortunately, since hadoop version 2.6.0 HADOOP-11105 , some more internal data is added, which HBase 1.4.13 can't remove, so it makes this warning every 5 minutes and the metric source for coprocessor will be never re-registered. More specifically, It happens in the following scenario. * At first, The warning started to appear after the meta region was moved out from a host and then moved back to the host again some minutes after. {noformat} 2020-04-21 15:08:07,749 DEBUG [RS_CLOSE_META-XXX:11471-0] handler.CloseRegionHandler: Closed hbase:meta,,1.1588230740 ... 2020-04-21 15:08:19,169 INFO [RpcServer.priority.FPBQ.Fifo.handler=18,queue=0,port=11471] regionserver.RSRpcServices: Open hbase:meta,,1.1588230740 {noformat} * since HBASE-12016 ported to branch-1, the coprocessor 'MultiRowMutationEndpoint' has added to the meta table. {noformat} 2020-04-21 15:08:19,172 DEBUG [RS_OPEN_META-XXX:11471-0] coprocessor.CoprocessorHost: Loading coprocessor class org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint with path null and priority 536870911 2020-04-21 15:08:19,173 DEBUG [RS_OPEN_META-XXX:11471-0] regionserver.HRegion: Registered coprocessor service: region=hbase:meta,,1 service=MultiRowMut ationService 2020-04-21 15:08:19,174 INFO [RS_OPEN_META-XXX:11471-0] regionserver.RegionCoprocessorHost: Loaded coprocessor org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint from HTD of hbase:meta successfully. {noformat} * Because Coprocessor metrics introduced HBASE-18060, whenever meta region is closed or opened, a new metric source is added to hadoop metrics2 system or removed from it. {noformat} // (1) registered a metric source after a region server restarted 2020-04-20 14:23:04,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering adapter for the MetricRegistry: RegionServer,sub=Coprocessor.R egion.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint 2020-04-20 14:23:04,580 INFO [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers // (2) removed it because of moving meta out from the region server 2020-04-21 15:08:14,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Removing adapter for the MetricRegistry: RegionServer,sub=Coprocessor.Regi on.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint // (3) then trying to re-register it after meta region moved into the same region server 2020-04-21 15:08:24,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering adapter for the MetricRegistry: RegionServer,sub=Coprocessor.R egion.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint 2020-04-21 15:08:24,579 INFO [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers // and, the log appears every 5 minutes {noformat} [https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L151] [https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L170-L176] * Since under 2.6.0 hadoop doesn't provide any interface to unregister existing metric sources (introduced since https://issues.apache.org/jira/browse/HADOOP-10839), branch-1 ported version uses reflections to remove a metric source from metric system. [https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L197-L226] * after HADOOP-11105 introduced (2.6.0), a new field 'namedCallbacks' was added to re-register the existing sources when metric system restarted. Because current HBase implementation couldn't touch that field by reflection, a callback to re-register a metric source for the coprocessor still remains in the map, even after meta region moved out. * a scheduled task restart hadoop metric system every 5 minutes. So, a new metric source for MultiRowMutationEndpoint coprocessor is registered every restart, even after meta region moved out. [https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java#L59-L63] [https://github.com/apache/hbase/blob/rel/1.4.13/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java#L103-L110] * When meta region was moved back at the time (3) in above logs, HBase tried to register a new metric source, and that try failed because the source with the same name already exists. However, even though the registration failed, that try replaces existing metric source (re-registered by restart) with new one which have 'mbeanName' == null due to registration failure. [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java#L267] [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/MBeans.java#L124] [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java#L222] {noformat} 2020-04-21 15:08:24,579 INFO [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers 2020-04-21 15:08:24,579 WARN [HBase-Metrics2-1] util.MBeans: Error creating MBean object name: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint org.apache.hadoop.metrics2.MetricsException: org.apache.hadoop.metrics2.MetricsException: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint already exists! {noformat} * when metric system is stopped by a scheduled task, de-registration process by stop for a metric source should unregister MBean and remove its name from DefaultMetricsSystem("mBeanNames" field), but it couldn't, because existing metric source have null for 'mbeanName' by replacement. [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/MBeans.java#L105-L114] * So, when metric system starts again, registering a metric source for the coprocessor is failed, because its name still exists in DefaultMetricsSystem. and finally we can show the exception in log whenever metric system restart every 5 minutes. was (Author: jeongdae kim): I think the reason why this warning occurs is that HBASE-9774 was ported to branch-1 HBASE-18060, and HBASE-18060 only fit for hadoop version < 2.6.0 by using reflections to remove some internal data. unfortunately, since hadoop version 2.6.0 HADOOP-11105 , some more internal data is added, which HBase 1.4.13 can't remove, so it makes this warning every 5 minutes and the metric source for coprocessor will be never re-registered. More specifically, It happens in the following scenario. * At first, The warning started to appear after the meta region was moved out from a host and then moved back to the host again some minutes after. {noformat} 2020-04-21 15:08:07,749 DEBUG [RS_CLOSE_META-XXX:11471-0] handler.CloseRegionHandler: Closed hbase:meta,,1.1588230740 ... 2020-04-21 15:08:19,169 INFO [RpcServer.priority.FPBQ.Fifo.handler=18,queue=0,port=11471] regionserver.RSRpcServices: Open hbase:meta,,1.1588230740 {noformat} * since HBASE-12016 ported to branch-1, the coprocessor 'MultiRowMutationEndpoint' has added to the meta table. {noformat} 2020-04-21 15:08:19,172 DEBUG [RS_OPEN_META-XXX:11471-0] coprocessor.CoprocessorHost: Loading coprocessor class org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint with path null and priority 536870911 2020-04-21 15:08:19,173 DEBUG [RS_OPEN_META-XXX:11471-0] regionserver.HRegion: Registered coprocessor service: region=hbase:meta,,1 service=MultiRowMut ationService 2020-04-21 15:08:19,174 INFO [RS_OPEN_META-XXX:11471-0] regionserver.RegionCoprocessorHost: Loaded coprocessor org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint from HTD of hbase:meta successfully. {noformat} * Because Coprocessor metrics introduced HBASE-18060, whenever meta region is closed or opened, a new metric source is added to hadoop metrics2 system or removed from it. {noformat} // (1) registered a metric source after a region server restarted 2020-04-20 14:23:04,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering adapter for the MetricRegistry: RegionServer,sub=Coprocessor.R egion.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint 2020-04-20 14:23:04,580 INFO [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers // (2) removed it because of moving meta out from the region server 2020-04-21 15:08:14,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Removing adapter for the MetricRegistry: RegionServer,sub=Coprocessor.Regi on.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint // (3) then trying to re-register it after meta region moved into the same region server 2020-04-21 15:08:24,579 DEBUG [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering adapter for the MetricRegistry: RegionServer,sub=Coprocessor.R egion.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint 2020-04-21 15:08:24,579 INFO [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers // and, the log appears every 5 minutes {noformat} [https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L151] [https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L170-L176] * Since under 2.6.0 hadoop doesn't provide any interface to unregister existing metric sources (introduced since https://issues.apache.org/jira/browse/HADOOP-10839), branch-1 ported version uses reflections to remove a metric source from metric system. [https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/impl/GlobalMetricRegistriesAdapter.java#L197-L226] * after HADOOP-11105 introduced (2.6.0), a new field 'namedCallbacks' was added to re-register the existing sources when metric system restarted. Because current HBase implementation couldn't touch that field by reflection, a callback to re-register a metric source for the coprocessor still remains in the map, even after meta region moved out. * a scheduled task restart hadoop metric system every 5 minutes. So, a new metric source for MultiRowMutationEndpoint coprocessor is registered every restart, even after meta region moved out. [https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionAggregateSourceImpl.java#L59-L63] [https://git.linecorp.com/LINE-Server/line-hbase/blob/1.4.13-line/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/impl/JmxCacheBuster.java#L103-L110] * When meta region was moved back at the time (3) in above logs, HBase tried to register a new metric source, and that try failed because the source with the same name already exists. However, even though the registration failed, that try replaces existing metric source (re-registered by restart) with new one which have 'mbeanName' == null due to registration failure. [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java#L267] [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/MBeans.java#L124] [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java#L222] {noformat} 2020-04-21 15:08:24,579 INFO [HBase-Metrics2-1] impl.GlobalMetricRegistriesAdapter: Registering RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint Metrics about HBase RegionObservers 2020-04-21 15:08:24,579 WARN [HBase-Metrics2-1] util.MBeans: Error creating MBean object name: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint org.apache.hadoop.metrics2.MetricsException: org.apache.hadoop.metrics2.MetricsException: Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint already exists! {noformat} * when metric system is stopped by a scheduled task, de-registration process by stop for a metric source should unregister MBean and remove its name from DefaultMetricsSystem("mBeanNames" field), but it couldn't, because existing metric source have null for 'mbeanName' by replacement. [https://github.com/apache/hadoop/blob/branch-2.8.5/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/MBeans.java#L105-L114] * So, when metric system starts again, registering a metric source for the coprocessor is failed, because its name still exists in DefaultMetricsSystem. and finally we can show the exception in log whenever metric system restart every 5 minutes. > Moving the meta region causes MetricsException when using above 2.6.0 hadoop > version > ------------------------------------------------------------------------------------ > > Key: HBASE-24402 > URL: https://issues.apache.org/jira/browse/HBASE-24402 > Project: HBase > Issue Type: Bug > Affects Versions: 1.4.13 > Reporter: Jeongdae Kim > Assignee: Jeongdae Kim > Priority: Minor > > When the meta region moved out from a region server then moved in to the one > again, we got MetricsException below and the metric source for coprocessor > couldn't be re-registered. it happened with hadoop 2.8.5 > {noformat} > 2020-04-21 15:13:02,956 WARN [HBase-Metrics2-1] util.MBeans: Error creating > MBean object name: > Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.C > P_org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint > org.apache.hadoop.metrics2.MetricsException: > org.apache.hadoop.metrics2.MetricsException: > Hadoop:service=HBase,name=RegionServer,sub=Coprocessor.Region.CP_org. > apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint already exists! > at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newObjectName(DefaultMetricsSystem.java:127) > at > org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newMBeanName(DefaultMetricsSystem.java:102) > at org.apache.hadoop.metrics2.util.MBeans.getMBeanName(MBeans.java:121) > at org.apache.hadoop.metrics2.util.MBeans.register(MBeans.java:64) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.startMBeans(MetricsSourceAdapter.java:222) > at > org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.start(MetricsSourceAdapter.java:100) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.registerSource(MetricsSystemImpl.java:268) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl$1.postStart(MetricsSystemImpl.java:239) > at jdk.internal.reflect.GeneratedMethodAccessor82.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:320) > at com.sun.proxy.$Proxy10.postStart(Unknown Source) > at > org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:193) > at > org.apache.hadoop.metrics2.impl.JmxCacheBuster$JmxCacheBusterRunnable.run(JmxCacheBuster.java:109) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)