[ 
https://issues.apache.org/jira/browse/HBASE-24896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17179974#comment-17179974
 ] 

Bharath Vissapragada commented on HBASE-24896:
----------------------------------------------

This happened right after startup?

I skimmed through the jstacks and the code, and I wonder whether we are running 
into a circular static-initializer dependency that is causing a deadlock. A 
simpler example is 
[here|http://ternarysearch.blogspot.com/2013/07/static-initialization-deadlock.html].
In this case, the dependency seems to be RegionInfo -> RegionInfoBuilder 
-> MutableRegionInfo (c'tor) -> RegionInfo. Maybe we need to unnest those? The 
jstacks in that blog post are also stuck in Object.wait(), hence my strong 
suspicion about these dependencies.
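
To make the suspected failure mode concrete, here is a minimal sketch of a static-initialization deadlock. It is not HBase code: the classes A and B, the latches, and the method name bothThreadsStuck are all hypothetical, and the latches exist only to force the unlucky interleaving every time. Each class's static initializer needs the other class to be initialized, so two threads that start the two initializations concurrently each end up waiting for the other.

```java
import java.util.concurrent.CountDownLatch;

public class StaticInitDeadlockDemo {
    // Hypothetical latches, used only to make the race deterministic.
    static final CountDownLatch aStarted = new CountDownLatch(1);
    static final CountDownLatch bStarted = new CountDownLatch(1);

    static class A {
        static final int VALUE;
        static {
            aStarted.countDown();
            await(bStarted);       // wait until B's initializer is running elsewhere
            VALUE = B.VALUE + 1;   // needs B initialized: blocks on the other thread
        }
    }

    static class B {
        static final int VALUE;
        static {
            bStarted.countDown();
            await(aStarted);       // wait until A's initializer is running elsewhere
            VALUE = A.VALUE + 1;   // needs A initialized: blocks on the other thread
        }
    }

    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts one thread per class and reports whether both are still alive
    // after the timeout, i.e. whether neither initialization ever completed.
    static boolean bothThreadsStuck(long timeoutMillis) {
        Thread t1 = new Thread(() -> System.out.println(A.VALUE), "init-A");
        Thread t2 = new Thread(() -> System.out.println(B.VALUE), "init-B");
        // Daemon threads, so the JVM can still exit while they are stuck.
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(timeoutMillis);
            t2.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return t1.isAlive() && t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads stuck: " + bothThreadsStuck(2000));
    }
}
```

Taking a jstack of this program while the two threads are stuck should give something to compare against the attached dumps. If the shapes match, breaking the cycle (for example, by not constructing one class's constants from inside the other class's static initializer) would be the direction to look at.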

 

> 'Stuck' creating RegionInfo instance
> ------------------------------------
>
>                 Key: HBASE-24896
>                 URL: https://issues.apache.org/jira/browse/HBASE-24896
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.3.1
>            Reporter: Michael Stack
>            Priority: Major
>         Attachments: hbasedn192-jstack-0.webarchive, 
> hbasedn192-jstack-1.webarchive, hbasedn192-jstack-2.webarchive
>
>
> We ran into the following deadlocked server in testing. The priority handlers 
> seem stuck across multiple thread dumps. Seven of the ten total priority 
> threads have this state:
> {code:java}
> "RpcServer.priority.RWQ.Fifo.read.handler=5,queue=1,port=16020" #82 daemon 
> prio=5 os_prio=0 cpu=0.70ms elapsed=315627.86s allocated=3744B 
> defined_classes=0 tid=0x00007f3da0983040 nid=0x62d9 in Object.wait()  
> [0x00007f3d9bc8c000]
>    java.lang.Thread.State: RUNNABLE
>       at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3327)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1491)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3143)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3478)
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:44858)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318) 
> {code}
> The anomalous three are as follows:
> h3. #1
> {code:java}
> "RpcServer.priority.RWQ.Fifo.write.handler=0,queue=0,port=16020" #77 daemon 
> prio=5 os_prio=0 cpu=175.98ms elapsed=315627.86s allocated=2153K 
> defined_classes=14 tid=0x00007f3da0ae6ec0 nid=0x62d4 in Object.wait()  
> [0x00007f3d9c190000]
>    java.lang.Thread.State: RUNNABLE
>       at 
> org.apache.hadoop.hbase.client.RegionInfo.<clinit>(RegionInfo.java:72)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3327)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1491)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2912)
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:44856)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318){code}
> ...which is the creation of the UNDEFINED in RegionInfo here:
> @InterfaceAudience.Public
> public interface RegionInfo extends Comparable<RegionInfo> {
>  RegionInfo UNDEFINED = 
> RegionInfoBuilder.newBuilder(TableName.valueOf("__UNDEFINED__")).build();
>  
> h3. #2
> {code:java}
> "RpcServer.priority.RWQ.Fifo.read.handler=4,queue=1,port=16020" #81 daemon 
> prio=5 os_prio=0 cpu=53.85ms elapsed=315627.86s allocated=81984B 
> defined_classes=3 tid=0x00007f3da0981590 nid=0x62d8 in Object.wait()  
> [0x00007f3d9bd8c000]
>    java.lang.Thread.State: RUNNABLE
>       at 
> org.apache.hadoop.hbase.client.RegionInfoBuilder.<clinit>(RegionInfoBuilder.java:49)
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toRegionInfo(ProtobufUtil.java:3231)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.executeOpenRegionProcedures(RSRpcServices.java:3755)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.lambda$executeProcedures$2(RSRpcServices.java:3827)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices$$Lambda$173/0x00000017c0e40040.accept(Unknown
>  Source)
>       at java.util.ArrayList.forEach([email protected]/ArrayList.java:1540)
>       at 
> java.util.Collections$UnmodifiableCollection.forEach([email protected]/Collections.java:1085)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.executeProcedures(RSRpcServices.java:3827)
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:34896)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318) 
> {code}
> ...which is here, creating the first meta RegionInfo:
>  
> public static final RegionInfo FIRST_META_REGIONINFO =
>  new MutableRegionInfo(1L, TableName.META_TABLE_NAME, 
> RegionInfo.DEFAULT_REPLICA_ID);
>  
> h3. #3
> {code:java}
> "RpcServer.priority.RWQ.Fifo.read.handler=8,queue=1,port=16020" #85 daemon 
> prio=5 os_prio=0 cpu=0.50ms elapsed=315627.85s allocated=1960B 
> defined_classes=0 tid=0x00007f3da0d851d0 nid=0x62dc in Object.wait()  
> [0x00007f3d9b989000]
>    java.lang.Thread.State: RUNNABLE
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toRegionInfo(ProtobufUtil.java:3231)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.executeOpenRegionProcedures(RSRpcServices.java:3755)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.lambda$executeProcedures$2(RSRpcServices.java:3827)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices$$Lambda$173/0x00000017c0e40040.accept(Unknown
>  Source)
>       at java.util.ArrayList.forEach([email protected]/ArrayList.java:1540)
>       at 
> java.util.Collections$UnmodifiableCollection.forEach([email protected]/Collections.java:1085)
>       at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.executeProcedures(RSRpcServices.java:3827)
>       at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:34896)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:393)
>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>       at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
>  {code}
> ... which is here in code
> if (tableName.equals(TableName.META_TABLE_NAME) && 
> replicaId == defaultReplicaId) {
>  return RegionInfoBuilder.FIRST_META_REGIONINFO;
>  }
>  
> The thread dump does not seem to recognize the above as a deadlock.
>  
> ...at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3327)
>  is doing the below:
> return this.onlineRegions.get(encodedRegionName);
> ... where onlineRegions is a concurrent Map of String to HRegion.
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
