[
https://issues.apache.org/jira/browse/HBASE-19409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275700#comment-16275700
]
Haibo Chen commented on HBASE-19409:
------------------------------------
NM registers JvmMetrics through JvmMetrics.create(MetricsSystem), which neither
shares nor instantiates the JvmMetrics singleton instance, and NM happens to
pass in the static DefaultMetricsSystem.instance(). In effect, NM creates an
instance of JvmMetrics that is not JvmMetrics.Singleton.INSTANCE and registers
a JvmMetrics source with the default metrics system.
When the HBase client starts up, it references JvmMetrics.Singleton.INSTANCE
instead, sees that the variable `impl` is null, and tries to register
JvmMetrics with DefaultMetricsSystem.instance() again. Although the two
JvmMetrics instances are different, the metrics system identifies sources by
name, so as far as the metrics system is concerned there is a duplicate.
I will try to change NM so that it calls JvmMetrics.initSingleton() instead,
which should avoid the crash here, and will post updates.
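To make the sequence concrete, here is a minimal, self-contained sketch of the behaviour described above. It is a toy model and not the Hadoop code: ToyMetricsSystem, ToyJvmMetrics, resetForDemo and DuplicateSourceDemo are hypothetical names, and the only properties it assumes are the ones stated in this comment (sources are identified by name, create() does not touch the singleton's `impl`, initSingleton() registers only while `impl` is null).

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the metrics system: sources are keyed by name only.
class ToyMetricsSystem {
    private final Map<String, Object> sources = new HashMap<>();

    <T> T register(String name, T source) {
        if (sources.containsKey(name)) {
            throw new IllegalStateException("Metrics source " + name + " already exists!");
        }
        sources.put(name, source);
        return source;
    }
}

// Toy stand-in for JvmMetrics with its two entry points.
class ToyJvmMetrics {
    static final String NAME = "JvmMetrics";

    // Plays the role of the `impl` variable held by the singleton.
    private static ToyJvmMetrics impl;

    // Path NM uses today: registers a fresh instance and never sets `impl`.
    static ToyJvmMetrics create(ToyMetricsSystem ms) {
        return ms.register(NAME, new ToyJvmMetrics());
    }

    // Path the HBase client uses: registers only while `impl` is still null.
    static synchronized ToyJvmMetrics initSingleton(ToyMetricsSystem ms) {
        if (impl == null) {
            impl = create(ms);
        }
        return impl;
    }

    // Demo-only helper so each scenario starts from a clean state.
    static synchronized void resetForDemo() {
        impl = null;
    }
}

class DuplicateSourceDemo {
    // Runs "NM first, HBase client second" against one shared metrics system.
    // Returns "ok" or the message of the duplicate-registration error.
    static String nmThenHBase(boolean nmUsesSingleton) {
        ToyJvmMetrics.resetForDemo();
        ToyMetricsSystem shared = new ToyMetricsSystem();
        if (nmUsesSingleton) {
            ToyJvmMetrics.initSingleton(shared); // proposed NM change
        } else {
            ToyJvmMetrics.create(shared);        // current NM behaviour
        }
        try {
            ToyJvmMetrics.initSingleton(shared); // HBase client start-up
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(nmThenHBase(false));
        System.out.println(nmThenHBase(true));
    }
}
```

In this model the first scenario fails because the second registration reuses the name "JvmMetrics" while `impl` is still null. The second scenario succeeds because both callers go through the same singleton path.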
In the meantime, `DefaultMetricsSystem.initialize(HBASE_METRICS_SYSTEM_NAME);`
does seem problematic, because it will override the prefix that NM set up
previously. All NM metrics will take "HBASE" as the prefix after that.
> HBase client brings down YARN node manager when it tries to register
> JvmMetrics in the hadoop metrics system
> ------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-19409
> URL: https://issues.apache.org/jira/browse/HBASE-19409
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 2.0.0-alpha-4
> Reporter: Haibo Chen
> Priority: Critical
>
> YARN ATSv2 leverages HBase as its data store. When ATSv2 is enabled,
> YARN NMs act as HBase clients to write data into the HBase cluster.
> Because the YARN NM JVMs already register JvmMetrics in the metrics system
> and no duplicates are allowed, NM crashes with the following exception when
> the HBase client tries to register JvmMetrics again.
> {code}
> ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
> at org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.serviceInit(TimelineCollectorManager.java:62)
> at org.apache.hadoop.yarn.server.timelineservice.collector.NodeTimelineCollectorManager.serviceInit(NodeTimelineCollectorManager.java:112)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService.serviceInit(PerNodeTimelineCollectorsAuxService.java:87)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:167)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:315)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:440)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:833)
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:894)
> Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:221)
> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:114)
> at org.apache.hadoop.yarn.server.timelineservice.storage.HBaseTimelineWriterImpl.serviceInit(HBaseTimelineWriterImpl.java:123)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> ... 15 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:219)
> ... 18 more
> Caused by: java.lang.RuntimeException: Could not create interface org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSource Is the hadoop compatibility jar on the classpath?
> at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:75)
> at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeper.<init>(MetricsZooKeeper.java:38)
> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:130)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:137)
> at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:134)
> at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:108)
> at org.apache.hadoop.hbase.client.ZooKeeperKeepAliveConnection.<init>(ZooKeeperKeepAliveConnection.java:43)
> at org.apache.hadoop.hbase.client.ConnectionImplementation.getKeepAliveZooKeeperWatcher(ConnectionImplementation.java:1231)
> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:101)
> at org.apache.hadoop.hbase.client.ConnectionImplementation.retrieveClusterId(ConnectionImplementation.java:526)
> at org.apache.hadoop.hbase.client.ConnectionImplementation.<init>(ConnectionImplementation.java:288)
> ... 23 more
> Caused by: java.util.ServiceConfigurationError: org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSource: Provider org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSourceImpl could not be instantiated
> at java.util.ServiceLoader.fail(ServiceLoader.java:232)
> at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
> at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
> at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
> at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:59)
> ... 33 more
> Caused by: org.apache.hadoop.metrics2.MetricsException: Metrics source JvmMetrics already exists!
> at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:152)
> at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:125)
> at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:229)
> at org.apache.hadoop.metrics2.source.JvmMetrics.create(JvmMetrics.java:111)
> at org.apache.hadoop.metrics2.source.JvmMetrics$Singleton.init(JvmMetrics.java:61)
> at org.apache.hadoop.metrics2.source.JvmMetrics.initSingleton(JvmMetrics.java:120)
> at org.apache.hadoop.hbase.metrics.BaseSourceImpl$DefaultMetricsSystemInitializer.init(BaseSourceImpl.java:52)
> at org.apache.hadoop.hbase.metrics.BaseSourceImpl.<init>(BaseSourceImpl.java:112)
> at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSourceImpl.<init>(MetricsZooKeeperSourceImpl.java:56)
> at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSourceImpl.<init>(MetricsZooKeeperSourceImpl.java:51)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at java.lang.Class.newInstance(Class.java:442)
> at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
> ... 36 more
> {code}