[jira] [Commented] (HBASE-20035) .TestQuotaStatusRPCs.testQuotaStatusFromMaster failed with NPEs and RuntimeExceptions

stack (JIRA) Wed, 21 Feb 2018 08:12:22 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-20035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371596#comment-16371596
 ]


stack commented on HBASE-20035:
-------------------------------

bq. The test is ultimately waiting 30s for the presence of the hbase:quota 
table and some data which should be written in there.

What is interesting about the second exception is that we've just decided to 
make the hbase:quota table...  and are making progress but then do NPE. I was 
thinking this renders a body blow to startup. I've not spent much time at it.

For the first exception, it looks like I pasted the wrong-one. There is another 
RuntimeException making its way out which seems to be related to unexpected 
Scan accounting math brought on by quota.

Should if you don't have time boss and I'll spend more time on it. Thanks 
[~elserj]


> .TestQuotaStatusRPCs.testQuotaStatusFromMaster failed with NPEs and 
> RuntimeExceptions
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-20035
>                 URL: https://issues.apache.org/jira/browse/HBASE-20035
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: stack
>            Priority: Major
>
> It failed the nightly.
> Says this...
> Error Message
> Waiting timed out after [30,000] msec
> Stacktrace
> java.lang.AssertionError: Waiting timed out after [30,000] msec
>       at 
> org.apache.hadoop.hbase.quotas.TestQuotaStatusRPCs.testQuotaStatusFromMaster(TestQuotaStatusRPCs.java:267)
> ... but looking in log I see following:
> Odd thing is the test is run three times and it failed all three times for 
> same reason.
> [ERROR] Failures: 
> [ERROR] 
> org.apache.hadoop.hbase.quotas.TestQuotaStatusRPCs.testQuotaStatusFromMaster(org.apache.hadoop.hbase.quotas.TestQuotaStatusRPCs)
> [ERROR]   Run 1: TestQuotaStatusRPCs.testQuotaStatusFromMaster:267 Waiting 
> timed out after [30,000] msec
> [ERROR]   Run 2: TestQuotaStatusRPCs.testQuotaStatusFromMaster:267 Waiting 
> timed out after [30,000] msec
> [ERROR]   Run 3: TestQuotaStatusRPCs.testQuotaStatusFromMaster:267 Waiting 
> timed out after [30,000] msec
> If you go to build artifacts you can download full -output.txt log. I see 
> stuff like this which might be ok....
> {code}
> 2018-02-21 01:29:59,546 INFO  
> [StoreCloserThread-testQuotaStatusFromMaster4,0,1519176558800.1dbd00f38915cd276410065f85140b26.-1]
>  regionserver.HStore(930): Closed f1
> 2018-02-21 01:29:59,551 ERROR [master/ad51e354307e:0.Chore.2] 
> hbase.ScheduledChore(189): Caught error
> java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: 
> Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@79ec2ef9
>  rejected from java.util.concurrent.ThreadPoolExecutor@5198a326[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 142]
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:200)
>   at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:269)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:437)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:312)
>   at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:597)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaRetriever.next(QuotaRetriever.java:106)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaRetriever$Iter.<init>(QuotaRetriever.java:125)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaRetriever.iterator(QuotaRetriever.java:117)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaObserverChore.fetchAllTablesWithQuotasDefined(QuotaObserverChore.java:458)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaObserverChore._chore(QuotaObserverChore.java:148)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaObserverChore.chore(QuotaObserverChore.java:136)
>   at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@79ec2ef9
>  rejected from java.util.concurrent.ThreadPoolExecutor@5198a326[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 142]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:171)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:320)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:182)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
>   ... 19 more
> {code}
> .. but then this which looks less ok:
> {code}
> 2018-02-21 01:30:24,044 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=39067] 
> ipc.CallRunner(141): callId: 16 service: RegionServerStatusService 
> methodName: ReportRegionSpaceUse size: 109 connection: 172.17.0.2:57185 
> deadline: 1519176634044
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2741)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2135)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> 2018-02-21 01:30:24,046 ERROR [regionserver/ad51e354307e:0.Chore.1] 
> hbase.ScheduledChore(189): Caught error
> java.io.UncheckedIOException: org.apache.hadoop.hbase.TableNotFoundException: 
> hbase:quota
>   at 
> org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:55)
>   at 
> org.apache.hadoop.hbase.quotas.SpaceQuotaRefresherChore.fetchSnapshotsFromQuotaTable(SpaceQuotaRefresherChore.java:151)
>   at 
> org.apache.hadoop.hbase.quotas.SpaceQuotaRefresherChore.chore(SpaceQuotaRefresherChore.java:84)
>   at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.TableNotFoundException: hbase:quota
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:842)
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:733)
>   at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:325)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
>   at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:269)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:437)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:312)
>   at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:597)
>   at 
> org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:53)
>   ... 11 more
> 2018-02-21 01:30:24,179 DEBUG [Time-limited test-EventThread] 
> zookeeper.ZKWatcher(443): master:39067-0x161b5fb62d10000, 
> quorum=localhost:49877, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeChildrenChanged, state=SyncConnected, path=/hbase/namespace
> 2018-02-21 01:30:24,604 INFO  [PEWorker-13] 
> procedure2.ProcedureExecutor(1247): Finished pid=6, state=SUCCESS; 
> CreateNamespaceProcedure, namespace=hbase in 1.0980sec
> 2018-02-21 01:30:24,772 DEBUG [Time-limited test-EventThread] 
> zookeeper.ZKWatcher(443): master:39067-0x161b5fb62d10000, 
> quorum=localhost:49877, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeDataChanged, state=SyncConnected, path=/hbase/namespace/default
> 2018-02-21 01:30:24,830 DEBUG [Time-limited test-EventThread] 
> zookeeper.ZKWatcher(443): master:39067-0x161b5fb62d10000, 
> quorum=localhost:49877, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeDataChanged, state=SyncConnected, path=/hbase/namespace/hbase
> 2018-02-21 01:30:24,831 INFO  [M:0;ad51e354307e:39067] master.HMaster(927): 
> Master has completed initialization 17.276sec
> 2018-02-21 01:30:24,832 INFO  [M:0;ad51e354307e:39067] 
> quotas.MasterQuotaManager(87): Quota table not found. Creating...
> 2018-02-21 01:30:24,833 INFO  [M:0;ad51e354307e:39067] master.HMaster(1783): 
> Client=null/null create 'hbase:quota', {NAME => 'q', VERSIONS => '1', 
> EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', 
> KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', 
> DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', 
> REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 
> 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', 
> PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 
> 'true', BLOCKSIZE => '65536'}, {NAME => 'u', VERSIONS => '1', 
> EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', 
> KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', 
> DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', 
> REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 
> 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', 
> PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 
> 'true', BLOCKSIZE => '65536'}
> 2018-02-21 01:30:24,979 ERROR 
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=39067] 
> ipc.RpcServer(464): Unexpected throwable object
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2142)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> 2018-02-21 01:30:24,980 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=39067] 
> ipc.CallRunner(141): callId: 18 service: RegionServerStatusService 
> methodName: ReportRegionSpaceUse size: 109 connection: 172.17.0.2:57185 
> deadline: 1519176634979
> java.io.IOException
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:465)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2142)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   ... 3 more
> 2018-02-21 01:30:24,982 DEBUG [regionserver/ad51e354307e:0.Chore.1] 
> regionserver.HRegionServer(1245): Failed to report region sizes to Master. 
> This will be retried.
> java.io.IOException: java.io.IOException
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:465)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2142)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   ... 3 more
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:358)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:335)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.reportRegionSizesForQuotas(HRegionServer.java:1224)
>   at 
> org.apache.hadoop.hbase.quotas.FileSystemUtilizationChore.reportRegionSizesToMaster(FileSystemUtilizationChore.java:187)
>   at 
> org.apache.hadoop.hbase.quotas.FileSystemUtilizationChore.chore(FileSystemUtilizationChore.java:136)
>   at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): 
> java.io.IOException
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:465)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2142)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   ... 3 more
> {code}
> You might have some input mighty [~elserj] ? Thanks sir. It failed only this 
> once. I could up the timeouts from 30 seconds because that allowed but my 
> guess it he NPE does us in. Thanks sir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-20035) .TestQuotaStatusRPCs.testQuotaStatusFromMaster failed with NPEs and RuntimeExceptions

Reply via email to