Josh Elser created HBASE-22701:
----------------------------------

             Summary: Better handle invalid local directory for DynamicClassLoader
                 Key: HBASE-22701
                 URL: https://issues.apache.org/jira/browse/HBASE-22701
             Project: HBase
          Issue Type: Improvement
          Components: regionserver
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 2.3.0, 2.2.1, 2.1.6

If you give HBase an {{hbase.local.dir}} (usually, "{{hbase.tmp.dir}}/local") that HBase cannot write to, you will get some weird errors on the scan path. I just saw this (again?) with Phoenix.

Specifically, the first attempt to reference DynamicClassLoader (via ProtobufUtil) will result in an ExceptionInInitializerError, because the unchecked exception coming out of DynamicClassLoader's constructor aborts the static initialization of {{ProtobufUtil$ClassLoaderHolder}}, the class that creates the DynamicClassLoader.

{noformat}
2019-07-14 06:25:34,284 ERROR [RpcServer.Metadata.Fifo.handler=12,queue=0,port=16020] coprocessor.MetaDataEndpointImpl: dropTable failed
org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.ExceptionInInitializerError
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.translateException(RpcRetryingCallerImpl.java:221)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:194)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:387)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:361)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
    at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ExceptionInInitializerError
    at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toFilter(ProtobufUtil.java:1598)
    at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:1152)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:2967)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3301)
    at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
    at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:242)
    at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:58)
    at org.apache.hadoop.hbase.client.RegionServerCallable.call(RegionServerCallable.java:127)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
    ... 7 more
Caused by: java.lang.RuntimeException: Failed to create local dir /hadoopfs/fs1/hbase/local/jars, DynamicClassLoader failed to init
    at org.apache.hadoop.hbase.util.DynamicClassLoader.initTempDir(DynamicClassLoader.java:110)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:98)
    at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil$ClassLoaderHolder.lambda$static$0(ProtobufUtil.java:261)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil$ClassLoaderHolder.<clinit>(ProtobufUtil.java:260)
    ... 16 more
{noformat}

Every subsequent call will result in a NoClassDefFoundError, because the JVM already tried to initialize that holder class once and failed, and it does not retry.

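The two-stage failure described above is ordinary JVM class-initialization behavior and can be reproduced outside HBase. The following is a minimal sketch, not HBase code: {{InitFailureDemo}}, {{Holder}} and {{touch()}} are hypothetical names, with {{Holder}} standing in for a class whose static initializer throws an unchecked exception.

```java
// Minimal sketch (hypothetical names, not HBase code): a static initializer
// that throws makes the first access fail with ExceptionInInitializerError
// and every later access fail with NoClassDefFoundError.
class InitFailureDemo {

    // Stand-in for a holder class whose static field is built by code
    // that can throw an unchecked exception.
    static class Holder {
        static final Object LOADER = create();

        private static Object create() {
            throw new RuntimeException("Failed to create local dir, loader failed to init");
        }
    }

    // Touches the holder and reports which error type, if any, came back.
    static String touch() {
        try {
            Object loader = Holder.LOADER; // triggers (or re-checks) class initialization
            return "ok";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(touch()); // prints "ExceptionInInitializerError"
        System.out.println(touch()); // prints "NoClassDefFoundError"
    }
}
```

Only the first failure carries the original RuntimeException as its cause, so later log entries no longer mention the unwritable directory. The second RegionServer log excerpt shows this pattern: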
{noformat}
2019-07-14 06:25:34,380 ERROR [RpcServer.Metadata.Fifo.handler=2,queue=2,port=16020] coprocessor.MetaDataEndpointImpl: dropTable failed
org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil$ClassLoaderHolder
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.translateException(RpcRetryingCallerImpl.java:221)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:194)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:387)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:361)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
    at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil$ClassLoaderHolder
    at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toFilter(ProtobufUtil.java:1598)
    at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toScan(ProtobufUtil.java:1152)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:2967)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3301)
    at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:332)
    at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:242)
    at org.apache.hadoop.hbase.client.ScannerCallable.rpcCall(ScannerCallable.java:58)
    at org.apache.hadoop.hbase.client.RegionServerCallable.call(RegionServerCallable.java:127)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
    ... 7 more
{noformat}

The client gets an error about this, and would presumably know that something is amiss, but an operator would not necessarily see this on their own.

I see two options:

# We abort the RegionServer when the DynamicClassLoader fails to initialize
# We catch the exception and treat the DynamicClassLoader as disabled (same action as if you had set {{hbase.use.dynamic.jars=false}}).

I want to do #1 so that we don't propagate bogus configuration, but it feels a bit "harsh" to do that. I think #2 is the right solution, with a big-fat-warning.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
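
For illustration, option #2 from the description could look roughly like the sketch below. This is not the actual DynamicClassLoader code or any committed patch: {{LocalDirFallbackSketch}}, {{Loader}}, {{initTempDir}} and {{isDynamicJarsEnabled}} are hypothetical names, and the boolean constructor argument only stands in for reading {{hbase.use.dynamic.jars}} from the configuration.

```java
import java.io.File;

// Sketch of option #2 under stated assumptions: all names are hypothetical
// and this is not HBase's DynamicClassLoader.
class LocalDirFallbackSketch {

    static final class Loader {
        private final boolean useDynamicJars;
        private final File localDir; // null when dynamic jars are disabled

        // enabledInConfig stands in for the value of hbase.use.dynamic.jars.
        Loader(boolean enabledInConfig, String localDirPath) {
            boolean enabled = enabledInConfig;
            File dir = null;
            if (enabled) {
                try {
                    dir = initTempDir(localDirPath);
                } catch (RuntimeException e) {
                    // The "big-fat-warning": log loudly, then behave as if disabled.
                    System.err.println("WARN: disabling dynamic jar loading, "
                        + "local directory is not usable: " + e.getMessage());
                    enabled = false;
                }
            }
            this.useDynamicJars = enabled;
            this.localDir = dir;
        }

        boolean isDynamicJarsEnabled() {
            return useDynamicJars;
        }

        // Creates <localDirPath>/jars, throwing if it cannot be created.
        static File initTempDir(String localDirPath) {
            File dir = new File(localDirPath, "jars");
            if (!dir.mkdirs() && !dir.isDirectory()) {
                throw new RuntimeException("Failed to create local dir " + dir.getPath());
            }
            return dir;
        }
    }

    public static void main(String[] args) throws Exception {
        // A regular file cannot act as a parent directory, so creation fails.
        File notADirectory = File.createTempFile("hbase-local", ".tmp");
        notADirectory.deleteOnExit();
        Loader loader = new Loader(true, notADirectory.getPath());
        System.out.println("dynamic jars enabled: " + loader.isDynamicJarsEnabled());
    }
}
```

Because the constructor no longer throws, a static initializer that builds such a loader would complete normally, and the misconfiguration would surface once as a warning in the server log instead of as repeated NoClassDefFoundErrors on the scan path.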