[
https://issues.apache.org/jira/browse/HBASE-6873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871319#comment-13871319
]
Gary Helmling commented on HBASE-6873:
--------------------------------------
I understand the argument about data consistency in the case of unloaded
coprocessors, so I see the need to change the default value of
hbase.coprocessor.abortonerror. But does this mean that a misconfigured
table-level coprocessor can bring down an entire cluster because of a missing
dependency class? If so, that really doesn't help our stability or
multi-tenant story. A better approach would be to offline the table when a
coprocessor fails to load (say, with a prominent message on the master UI)
but keep the servers up so that the problem can be fixed. For system-level
coprocessors, an abort might still be appropriate.
This may be out of scope for this change, but what do you think about it as a
follow-on issue?
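To make the proposal concrete, here is a minimal sketch of the decision it
describes. Only the property name hbase.coprocessor.abortonerror comes from
this issue; the Scope and Action enums, the decide() method, the assumed
default of "true", and the "skip the coprocessor" branch are hypothetical
names and assumptions for illustration, not HBase APIs.

```java
import java.util.HashMap;
import java.util.Map;

public class CoprocessorFailurePolicy {
    /** Where the failing coprocessor was configured (hypothetical classification). */
    enum Scope { SYSTEM, TABLE }

    /** Possible reactions to a load failure (hypothetical action names). */
    enum Action { ABORT_SERVER, OFFLINE_TABLE, SKIP_COPROCESSOR }

    static final String ABORT_ON_ERROR = "hbase.coprocessor.abortonerror";

    /**
     * Decide how to react when a coprocessor fails to load.
     * Assumption: a missing property is treated as "true".
     */
    static Action decide(Scope scope, Map<String, String> conf) {
        boolean abortOnError =
            Boolean.parseBoolean(conf.getOrDefault(ABORT_ON_ERROR, "true"));
        if (!abortOnError) {
            // Assumption: with the flag off, the server carries on without the coprocessor.
            return Action.SKIP_COPROCESSOR;
        }
        // Proposal from the comment: only system-level failures take the server down;
        // a table-level failure offlines just the affected table.
        return scope == Scope.SYSTEM ? Action.ABORT_SERVER : Action.OFFLINE_TABLE;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ABORT_ON_ERROR, "true");
        System.out.println("table-level failure:  " + decide(Scope.TABLE, conf));
        System.out.println("system-level failure: " + decide(Scope.SYSTEM, conf));
    }
}
```

The point of separating the two scopes is that a mistake in one tenant's
table descriptor stays contained to that table.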
> Clean up Coprocessor loading failure handling
> ---------------------------------------------
>
> Key: HBASE-6873
> URL: https://issues.apache.org/jira/browse/HBASE-6873
> Project: HBase
> Issue Type: Sub-task
> Components: Coprocessors, regionserver
> Affects Versions: 0.98.0
> Reporter: David Arthur
> Assignee: Andrew Purtell
> Priority: Blocker
> Fix For: 0.98.0, 0.99.0
>
> Attachments: 6873.patch, 6873.patch, 6873.patch, 6873.patch,
> 6873.patch, 6873.patch
>
>
> When registering a coprocessor with a missing dependency, the regionserver
> gets stuck in an infinite fail loop. Restarting the regionserver and/or
> master has no effect.
> For example, load a coprocessor from my-coproc.jar that uses an external
> dependency (kafka) that is not included with HBase:
> {code}
> 12/09/24 13:13:15 INFO handler.OpenRegionHandler: Opening of region {NAME =>
> 'documents,,1348505987177.6d1e1b7bb93486f096173bd401e8ef6b.', STARTKEY => '',
> ENDKEY => '', ENCODED => 6d1e1b7bb93486f096173bd401e8ef6b,} failed, marking
> as FAILED_OPEN in ZK
> 12/09/24 13:13:15 DEBUG zookeeper.ZKAssign:
> regionserver:60020-0x139f43af2a70043 Attempting to transition node
> 6d1e1b7bb93486f096173bd401e8ef6b from RS_ZK_REGION_OPENING to
> RS_ZK_REGION_FAILED_OPEN
> 12/09/24 13:13:15 DEBUG zookeeper.ZKAssign:
> regionserver:60020-0x139f43af2a70043 Successfully transitioned node
> 6d1e1b7bb93486f096173bd401e8ef6b from RS_ZK_REGION_OPENING to
> RS_ZK_REGION_FAILED_OPEN
> 12/09/24 13:13:15 INFO regionserver.HRegionServer: Received request to open
> region: documents,,1348505987177.6d1e1b7bb93486f096173bd401e8ef6b.
> 12/09/24 13:13:15 DEBUG zookeeper.ZKAssign:
> regionserver:60020-0x139f43af2a70043 Attempting to transition node
> 6d1e1b7bb93486f096173bd401e8ef6b from M_ZK_REGION_OFFLINE to
> RS_ZK_REGION_OPENING
> 12/09/24 13:13:15 DEBUG zookeeper.ZKAssign:
> regionserver:60020-0x139f43af2a70043 Successfully transitioned node
> 6d1e1b7bb93486f096173bd401e8ef6b from M_ZK_REGION_OFFLINE to
> RS_ZK_REGION_OPENING
> 12/09/24 13:13:15 DEBUG regionserver.HRegion: Opening region: {NAME =>
> 'documents,,1348505987177.6d1e1b7bb93486f096173bd401e8ef6b.', STARTKEY => '',
> ENDKEY => '', ENCODED => 6d1e1b7bb93486f096173bd401e8ef6b,}
> 12/09/24 13:13:15 INFO regionserver.HRegion: Setting up tabledescriptor
> config now ...
> 12/09/24 13:13:15 INFO coprocessor.CoprocessorHost: Class
> com.mycompany.hbase.documents.DocumentObserverCoprocessor needs to be loaded
> from a file - file:/path/to/my-coproc.jar.
> 12/09/24 13:13:16 ERROR handler.OpenRegionHandler: Failed open of
> region=documents,,1348505987177.6d1e1b7bb93486f096173bd401e8ef6b., starting
> to roll back the global memstore size.
> java.lang.IllegalStateException: Could not instantiate a region instance.
> at
> org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3595)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3733)
> at
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
> at
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
> at
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.GeneratedConstructorAccessor15.newInstance(Unknown
> Source)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3592)
> ... 7 more
> Caused by: java.lang.NoClassDefFoundError:
> kafka/common/NoBrokersForPartitionException
> at java.lang.Class.getDeclaredConstructors0(Native Method)
> at java.lang.Class.privateGetDeclaredConstructors(Class.java:2389)
> at java.lang.Class.getConstructor0(Class.java:2699)
> at java.lang.Class.newInstance0(Class.java:326)
> at java.lang.Class.newInstance(Class.java:308)
> at
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadInstance(CoprocessorHost.java:254)
> at
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.load(CoprocessorHost.java:227)
> at
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:162)
> at
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.<init>(RegionCoprocessorHost.java:126)
> at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:417)
> ... 11 more
> Caused by: java.lang.ClassNotFoundException:
> kafka.common.NoBrokersForPartitionException
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> ... 21 more
> 12/09/24 13:13:16 INFO handler.OpenRegionHandler: Opening of region {NAME =>
> 'documents,,1348505987177.6d1e1b7bb93486f096173bd401e8ef6b.', STARTKEY => '',
> ENDKEY => '', ENCODED => 6d1e1b7bb93486f096173bd401e8ef6b,} failed, marking
> as FAILED_OPEN in ZK
> 12/09/24 13:13:16 DEBUG zookeeper.ZKAssign:
> regionserver:60020-0x139f43af2a70043 Attempting to transition node
> 6d1e1b7bb93486f096173bd401e8ef6b from RS_ZK_REGION_OPENING to
> RS_ZK_REGION_FAILED_OPEN
> 12/09/24 13:13:16 DEBUG zookeeper.ZKAssign:
> regionserver:60020-0x139f43af2a70043 Successfully transitioned node
> 6d1e1b7bb93486f096173bd401e8ef6b from RS_ZK_REGION_OPENING to
> RS_ZK_REGION_FAILED_OPEN
> 12/09/24 13:13:16 INFO regionserver.HRegionServer: Received request to open
> region: documents,,1348505987177.6d1e1b7bb93486f096173bd401e8ef6b.
> 12/09/24 13:13:16 DEBUG zookeeper.ZKAssign:
> regionserver:60020-0x139f43af2a70043 Attempting to transition node
> 6d1e1b7bb93486f096173bd401e8ef6b from M_ZK_REGION_OFFLINE to
> RS_ZK_REGION_OPENING
> 12/09/24 13:13:16 DEBUG zookeeper.ZKAssign:
> regionserver:60020-0x139f43af2a70043 Successfully transitioned node
> 6d1e1b7bb93486f096173bd401e8ef6b from M_ZK_REGION_OFFLINE to
> RS_ZK_REGION_OPENING
> 12/09/24 13:13:16 DEBUG regionserver.HRegion: Opening region: {NAME =>
> 'documents,,1348505987177.6d1e1b7bb93486f096173bd401e8ef6b.', STARTKEY => '',
> ENDKEY => '', ENCODED => 6d1e1b7bb93486f096173bd401e8ef6b,}
> 12/09/24 13:13:16 INFO regionserver.HRegion: Setting up tabledescriptor
> config now ...
> 12/09/24 13:13:16 INFO coprocessor.CoprocessorHost: Class
> com.mycompany.hbase.documents.DocumentObserverCoprocessor needs to be loaded
> from a file - file:/path/to/my-coproc.jar.
> 12/09/24 13:13:17 ERROR handler.OpenRegionHandler: Failed open of
> region=documents,,1348505987177.6d1e1b7bb93486f096173bd401e8ef6b., starting
> to roll back the global memstore size.
> java.lang.IllegalStateException: Could not instantiate a region instance.
> at
> org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3595)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3733)
> at
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
> at
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
> at
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.GeneratedConstructorAccessor15.newInstance(Unknown
> Source)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3592)
> ... 7 more
> Caused by: java.lang.NoClassDefFoundError:
> kafka/common/NoBrokersForPartitionException
> at java.lang.Class.getDeclaredConstructors0(Native Method)
> at java.lang.Class.privateGetDeclaredConstructors(Class.java:2389)
> at java.lang.Class.getConstructor0(Class.java:2699)
> at java.lang.Class.newInstance0(Class.java:326)
> at java.lang.Class.newInstance(Class.java:308)
> at
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.loadInstance(CoprocessorHost.java:254)
> at
> org.apache.hadoop.hbase.coprocessor.CoprocessorHost.load(CoprocessorHost.java:227)
> at
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.loadTableCoprocessors(RegionCoprocessorHost.java:162)
> at
> org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.<init>(RegionCoprocessorHost.java:126)
> at org.apache.hadoop.hbase.regionserver.HRegion.<init>(HRegion.java:417)
> ... 11 more
> Caused by: java.lang.ClassNotFoundException:
> kafka.common.NoBrokersForPartitionException
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> ... 21 more
> {code}
> Ad infinitum.
> It seems that upon failing to open a region after adding a coprocessor, that
> coprocessor should be unregistered or at least disabled/blacklisted.
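One way to surface this kind of failure once, instead of on every region open
attempt, is to try instantiating the coprocessor class up front and treat any
linkage problem as a reason to reject it. The sketch below is only an
illustration in plain Java: CoprocessorPreflight and check() are hypothetical
names, not part of HBase, and it assumes the coprocessor has a no-argument
constructor.

```java
public class CoprocessorPreflight {
    /**
     * Try to load and instantiate the named class with the given loader.
     * Returns null on success, or a description of the failure.
     * Hypothetical helper; assumes a no-argument constructor.
     */
    static String check(String className, ClassLoader loader) {
        try {
            Class<?> clazz = Class.forName(className, true, loader);
            // Looking up the constructor resolves the types in constructor signatures,
            // which is where the NoClassDefFoundError in the trace above was raised.
            clazz.getDeclaredConstructor().newInstance();
            return null;
        } catch (ReflectiveOperationException | LinkageError e) {
            // ClassNotFoundException is a ReflectiveOperationException;
            // NoClassDefFoundError is a LinkageError.
            return e.toString();
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = CoprocessorPreflight.class.getClassLoader();
        // A class that is always available loads cleanly.
        System.out.println("ArrayList: " + check("java.util.ArrayList", loader));
        // A class that is not on the classpath is reported instead of retried.
        System.out.println("missing:   " + check("com.example.DoesNotExist", loader));
    }
}
```

A caller could use a non-null result to mark the coprocessor as disabled
rather than re-attempting the open.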