[ 
https://issues.apache.org/jira/browse/HBASE-26754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kaushik mandal updated HBASE-26754:
-----------------------------------
    Component/s: master

> hbase master crash after running couple of days with error STUCK 
> Region-In-Transition rit=FAILED_OPEN, location=null, table=hbase:meta, 
> region=xxxxxxxxxx
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-26754
>                 URL: https://issues.apache.org/jira/browse/HBASE-26754
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 2.4.8
>            Reporter: kaushik mandal
>            Priority: Major
>
> The HBase master stops responding after running for a couple of days, and the 
> region servers keep restarting.
> We are seeing the warnings below in the master and region server logs:
>  
>  
> WARN [ProcExecTimeout] assignment.AssignmentManager: STUCK Region-In-Transition rit=FAILED_OPEN, location=null, table=hbase:meta, region=xxxxxxxxxxxxx
> [master/xxxx-infra-xxxxx-hbase-master-0:16000.Chore.3] master.HMaster: Not running balancer because processing dead regionserver(s):
> 2022-02-07 19:54:11,512 INFO [ReadOnlyZKClient-xxxxxx-zookeeper:2181@0x2fcc92d9] zookeeper.ZooKeeper: Initiating client connection, connectString=xxxx-zookeeper:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$158/0x000000010057b440@48d2e00b
>  
>  
> WARN [ProcExecTimeout] assignment.AssignmentManager: STUCK Region-In-Transition rit=FAILED_OPEN, location=null, table=hbase:meta, region=1588230740
> 2022-02-07 19:54:15,643 INFO [hconnection-0x31420403-shared-pool7-t9731] client.RpcRetryingCallerImpl: 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3223)
>  at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1414)
>  at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:2947)
>  at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3272)
>  at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42002)
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> , details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=xxx-infra-xxxx-hbase-regionserver-0.xxx-infra-xxxx-hbase-regionserver.default.svc.cluster.local,16020,1644089730940, seqNum=-1
>  
> From the region server logs:
> 2022-02-05 19:39:16,722 WARN [RpcServer.default.FPBQ.Fifo.handler=109,queue=5,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0
> 2022-02-05 19:39:16,722 WARN [RpcServer.default.FPBQ.Fifo.handler=25,queue=12,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0
> 2022-02-05 19:39:16,721 WARN [RpcServer.default.FPBQ.Fifo.handler=24,queue=11,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0
> 2022-02-05 19:39:16,721 WARN [RpcServer.default.FPBQ.Fifo.handler=112,queue=8,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0
> 2022-02-05 19:39:16,721 WARN [RpcServer.default.FPBQ.Fifo.handler=40,queue=1,port=16020] regionserver.RSRpcServices: Client tried to access missing scanner 0
> ==> /opt/hbase-2.0.1/logs/SecurityAuth.audit <==
> 2022-02-05 19:39:17,882 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Auth successful for hdfs (auth:)
> 2022-02-05 19:39:17,882 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 10.42.0.124 port: 44876 with unknown version info
> 2022-02-05 19:40:18,307 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Auth successful for hdfs (auth:)
> 2022-02-05 19:40:18,307 INFO SecurityLogger.org.apache.hadoop.hbase.Server: Connection from 10.42.0.124 port: 51098 with unknown version info
> ==> /opt/hbase-2.0.1/logs/hbase--regionserver-xxxx-infra-xxxxx-hbase-regionserver-0.log <==
> 2022-02-05 19:40:32,848 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=300.98 KB, freeSize=399.71 MB, max=400 MB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0,evictions=29, evicted=0, evictedPerRun=0.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
