[jira] [Commented] (HBASE-20035) .TestQuotaStatusRPCs.testQuotaStatusFromMaster failed with NPEs and RuntimeExceptions

Josh Elser (JIRA) Wed, 21 Feb 2018 13:00:35 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-20035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372002#comment-16372002
 ]


Josh Elser commented on HBASE-20035:
------------------------------------

Looks like the test itself is just "wrong" now due to size of serialized data 
on disk.

This is the perceived amount of data ingestion for this test from the first run
{noformat}
2018-02-21 07:16:00,077 INFO  [Time-limited test] 
quotas.TestQuotaStatusRPCs$4(225): Table snapshot after initial ingest: 
SpaceQuotaSnapshot[policy=SpaceQuotaStatus[policy=NO_INSERTS, 
inViolation=true], use=47.69 KB/10 KB]
2018-02-21 07:16:00,081 DEBUG [Time-limited test] 
quotas.TestQuotaStatusRPCs$5(239): Namespace snapshot after initial ingest: 
SpaceQuotaSnapshot[policy=SpaceQuotaStatus[policy=null, inViolation=false], 
use=207.46 KB/8.00 EB]
{noformat}

Then, on a second ingest, we didn't actually increase the size of the table 
since it was already in violation. So, the condition could never be true 
because we didn't actually get to ingest more data.

{code}
snapshot.getUsage() > nsUsage.get()
{code}

Let me try to tweak.

> .TestQuotaStatusRPCs.testQuotaStatusFromMaster failed with NPEs and 
> RuntimeExceptions
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-20035
>                 URL: https://issues.apache.org/jira/browse/HBASE-20035
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: stack
>            Priority: Major
>
> It failed the nightly.
> Says this...
> Error Message
> Waiting timed out after [30,000] msec
> Stacktrace
> java.lang.AssertionError: Waiting timed out after [30,000] msec
>       at 
> org.apache.hadoop.hbase.quotas.TestQuotaStatusRPCs.testQuotaStatusFromMaster(TestQuotaStatusRPCs.java:267)
> ... but looking in log I see following:
> Odd thing is the test is run three times and it failed all three times for 
> same reason.
> [ERROR] Failures: 
> [ERROR] 
> org.apache.hadoop.hbase.quotas.TestQuotaStatusRPCs.testQuotaStatusFromMaster(org.apache.hadoop.hbase.quotas.TestQuotaStatusRPCs)
> [ERROR]   Run 1: TestQuotaStatusRPCs.testQuotaStatusFromMaster:267 Waiting 
> timed out after [30,000] msec
> [ERROR]   Run 2: TestQuotaStatusRPCs.testQuotaStatusFromMaster:267 Waiting 
> timed out after [30,000] msec
> [ERROR]   Run 3: TestQuotaStatusRPCs.testQuotaStatusFromMaster:267 Waiting 
> timed out after [30,000] msec
> If you go to build artifacts you can download full -output.txt log. I see 
> stuff like this which might be ok....
> {code}
> 2018-02-21 01:29:59,546 INFO  
> [StoreCloserThread-testQuotaStatusFromMaster4,0,1519176558800.1dbd00f38915cd276410065f85140b26.-1]
>  regionserver.HStore(930): Closed f1
> 2018-02-21 01:29:59,551 ERROR [master/ad51e354307e:0.Chore.2] 
> hbase.ScheduledChore(189): Caught error
> java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: 
> Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@79ec2ef9
>  rejected from java.util.concurrent.ThreadPoolExecutor@5198a326[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 142]
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:200)
>   at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:269)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:437)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:312)
>   at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:597)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaRetriever.next(QuotaRetriever.java:106)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaRetriever$Iter.<init>(QuotaRetriever.java:125)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaRetriever.iterator(QuotaRetriever.java:117)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaObserverChore.fetchAllTablesWithQuotasDefined(QuotaObserverChore.java:458)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaObserverChore._chore(QuotaObserverChore.java:148)
>   at 
> org.apache.hadoop.hbase.quotas.QuotaObserverChore.chore(QuotaObserverChore.java:136)
>   at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@79ec2ef9
>  rejected from java.util.concurrent.ThreadPoolExecutor@5198a326[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 142]
>   at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
>   at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
>   at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379)
>   at 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:171)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:320)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:182)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
>   ... 19 more
> {code}
> .. but then this which looks less ok:
> {code}
> 2018-02-21 01:30:24,044 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=39067] 
> ipc.CallRunner(141): callId: 16 service: RegionServerStatusService 
> methodName: ReportRegionSpaceUse size: 109 connection: 172.17.0.2:57185 
> deadline: 1519176634044
> org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
>   at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2741)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2135)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> 2018-02-21 01:30:24,046 ERROR [regionserver/ad51e354307e:0.Chore.1] 
> hbase.ScheduledChore(189): Caught error
> java.io.UncheckedIOException: org.apache.hadoop.hbase.TableNotFoundException: 
> hbase:quota
>   at 
> org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:55)
>   at 
> org.apache.hadoop.hbase.quotas.SpaceQuotaRefresherChore.fetchSnapshotsFromQuotaTable(SpaceQuotaRefresherChore.java:151)
>   at 
> org.apache.hadoop.hbase.quotas.SpaceQuotaRefresherChore.chore(SpaceQuotaRefresherChore.java:84)
>   at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hbase.TableNotFoundException: hbase:quota
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:842)
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:733)
>   at 
> org.apache.hadoop.hbase.client.ConnectionUtils$ShortCircuitingClusterConnection.locateRegion(ConnectionUtils.java:131)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:325)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
>   at 
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
>   at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:269)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:437)
>   at 
> org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:312)
>   at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:597)
>   at 
> org.apache.hadoop.hbase.client.ResultScanner$1.hasNext(ResultScanner.java:53)
>   ... 11 more
> 2018-02-21 01:30:24,179 DEBUG [Time-limited test-EventThread] 
> zookeeper.ZKWatcher(443): master:39067-0x161b5fb62d10000, 
> quorum=localhost:49877, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeChildrenChanged, state=SyncConnected, path=/hbase/namespace
> 2018-02-21 01:30:24,604 INFO  [PEWorker-13] 
> procedure2.ProcedureExecutor(1247): Finished pid=6, state=SUCCESS; 
> CreateNamespaceProcedure, namespace=hbase in 1.0980sec
> 2018-02-21 01:30:24,772 DEBUG [Time-limited test-EventThread] 
> zookeeper.ZKWatcher(443): master:39067-0x161b5fb62d10000, 
> quorum=localhost:49877, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeDataChanged, state=SyncConnected, path=/hbase/namespace/default
> 2018-02-21 01:30:24,830 DEBUG [Time-limited test-EventThread] 
> zookeeper.ZKWatcher(443): master:39067-0x161b5fb62d10000, 
> quorum=localhost:49877, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeDataChanged, state=SyncConnected, path=/hbase/namespace/hbase
> 2018-02-21 01:30:24,831 INFO  [M:0;ad51e354307e:39067] master.HMaster(927): 
> Master has completed initialization 17.276sec
> 2018-02-21 01:30:24,832 INFO  [M:0;ad51e354307e:39067] 
> quotas.MasterQuotaManager(87): Quota table not found. Creating...
> 2018-02-21 01:30:24,833 INFO  [M:0;ad51e354307e:39067] master.HMaster(1783): 
> Client=null/null create 'hbase:quota', {NAME => 'q', VERSIONS => '1', 
> EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', 
> KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', 
> DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', 
> REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 
> 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', 
> PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 
> 'true', BLOCKSIZE => '65536'}, {NAME => 'u', VERSIONS => '1', 
> EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', 
> KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', 
> DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', 
> REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 
> 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', 
> PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 
> 'true', BLOCKSIZE => '65536'}
> 2018-02-21 01:30:24,979 ERROR 
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=39067] 
> ipc.RpcServer(464): Unexpected throwable object
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2142)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> 2018-02-21 01:30:24,980 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=39067] 
> ipc.CallRunner(141): callId: 18 service: RegionServerStatusService 
> methodName: ReportRegionSpaceUse size: 109 connection: 172.17.0.2:57185 
> deadline: 1519176634979
> java.io.IOException
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:465)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2142)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   ... 3 more
> 2018-02-21 01:30:24,982 DEBUG [regionserver/ad51e354307e:0.Chore.1] 
> regionserver.HRegionServer(1245): Failed to report region sizes to Master. 
> This will be retried.
> java.io.IOException: java.io.IOException
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:465)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2142)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   ... 3 more
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.instantiateException(RemoteWithExtrasException.java:100)
>   at 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:90)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:358)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:335)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.reportRegionSizesForQuotas(HRegionServer.java:1224)
>   at 
> org.apache.hadoop.hbase.quotas.FileSystemUtilizationChore.reportRegionSizesToMaster(FileSystemUtilizationChore.java:187)
>   at 
> org.apache.hadoop.hbase.quotas.FileSystemUtilizationChore.chore(FileSystemUtilizationChore.java:136)
>   at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:186)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>   at 
> org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:111)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: 
> org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): 
> java.io.IOException
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:465)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionSpaceUse(MasterRpcServices.java:2142)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:11053)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>   ... 3 more
> {code}
> You might have some input mighty [~elserj] ? Thanks sir. It failed only this 
> once. I could up the timeouts from 30 seconds because that allowed but my 
> guess it he NPE does us in. Thanks sir.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-20035) .TestQuotaStatusRPCs.testQuotaStatusFromMaster failed with NPEs and RuntimeExceptions

Reply via email to