[ https://issues.apache.org/jira/browse/HBASE-12901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286853#comment-14286853 ]
Rajeshbabu Chintaguntla commented on HBASE-12901:
-------------------------------------------------
No place other than this one, [~enis].
> Possible deadlock while onlining a region and get region plan for other
> region run parallel
> -------------------------------------------------------------------------------------------
>
> Key: HBASE-12901
> URL: https://issues.apache.org/jira/browse/HBASE-12901
> Project: HBase
> Issue Type: Bug
> Reporter: Rajeshbabu Chintaguntla
> Assignee: Rajeshbabu Chintaguntla
> Priority: Critical
> Fix For: 1.0.0, 1.1.0
>
> Attachments: HBASE-12901.patch
>
>
> A deadlock can occur when a region's state is updated (regionOnline) after
> assignment completes while, in parallel, a region plan is fetched for another
> region. Before onlining, we synchronize on regionStates and, inside that
> block, synchronize on regionPlans to clear the region plan. At the same time,
> the get-plan path may synchronize on regionPlans first and then on
> regionStates while getting the assignments of a server. This started
> occurring after the HBASE-12686 fix. The issue is present in branch-1 and
> branch-1.1 only.
> {code}
> "AM.-pool1-t33":
> at org.apache.hadoop.hbase.master.AssignmentManager.clearRegionPlan(AssignmentManager.java:2917)
> - waiting to lock <0x00000000d0147f70> (a java.util.TreeMap)
> at org.apache.hadoop.hbase.master.AssignmentManager.regionOffline(AssignmentManager.java:3617)
> at org.apache.hadoop.hbase.master.AssignmentManager.regionOffline(AssignmentManager.java:1402)
> at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1734)
> at org.apache.hadoop.hbase.master.AssignmentManager.forceRegionStateToOffline(AssignmentManager.java:1821)
> at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1456)
> at org.apache.hadoop.hbase.master.AssignCallable.call(AssignCallable.java:45)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "AM.-pool1-t29":
> at org.apache.hadoop.hbase.master.RegionStates.getRegionAssignments(RegionStates.java:155)
> - waiting to lock <0x00000000d010b250> (a org.apache.hadoop.hbase.master.RegionStates)
> at org.apache.hadoop.hbase.master.AssignmentManager.getSnapShotOfAssignment(AssignmentManager.java:3629)
> at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.getRegionAssignmentsByServer(BaseLoadBalancer.java:1146)
> at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:959)
> at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.randomAssignment(BaseLoadBalancer.java:1010)
> at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:2228)
> - locked <0x00000000d0147f70> (a java.util.TreeMap)
> at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:2185)
> at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1905)
> at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1464)
> at org.apache.hadoop.hbase.master.AssignCallable.call(AssignCallable.java:45)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> "AM.ZK.Worker-pool2-t41":
> at org.apache.hadoop.hbase.master.AssignmentManager.clearRegionPlan(AssignmentManager.java:2917)
> - waiting to lock <0x00000000d0147f70> (a java.util.TreeMap)
> at org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:1305)
> at org.apache.hadoop.hbase.master.AssignmentManager$4.run(AssignmentManager.java:1196)
> - locked <0x00000000d010b250> (a org.apache.hadoop.hbase.master.RegionStates)
> at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1142)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
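
The dump shows a classic lock-order inversion: "AM.-pool1-t29" holds the regionPlans TreeMap and waits for RegionStates, while "AM.ZK.Worker-pool2-t41" holds RegionStates and waits for the TreeMap. As a rough illustration only, the sketch below shows one generic remedy: both paths take the two monitors in the same order. The class LockOrderSketch, its fields, and its method bodies are hypothetical stand-ins, not HBase code, and this is not necessarily what the attached patch does.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the two monitors seen in the thread dump.
class LockOrderSketch {
    private final Object regionStates = new Object();
    private final Map<String, String> regionPlans = new TreeMap<>();

    // Online path: regionStates first, then regionPlans (same order as the dump).
    public void regionOnline(String region) {
        synchronized (regionStates) {
            synchronized (regionPlans) {
                regionPlans.remove(region); // clear the plan for the onlined region
            }
        }
    }

    // Plan path, reordered for this sketch: it also takes regionStates before
    // regionPlans, so the two paths can no longer wait on each other in a cycle.
    public String getRegionPlan(String region) {
        synchronized (regionStates) {
            synchronized (regionPlans) {
                return regionPlans.computeIfAbsent(region, r -> "server-for-" + r);
            }
        }
    }

    // Runs both paths in parallel; returns true if both finish within the timeout.
    public static boolean runConcurrently(int iterations) {
        LockOrderSketch am = new LockOrderSketch();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> {
            for (int i = 0; i < iterations; i++) {
                am.regionOnline("r" + (i % 8));
            }
        });
        pool.submit(() -> {
            for (int i = 0; i < iterations; i++) {
                am.getRegionPlan("r" + (i % 8));
            }
        });
        pool.shutdown();
        try {
            return pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("finished without deadlock: " + runConcurrently(100_000));
    }
}
```

A fixed order is the simplest rule to reason about. An alternative would be to take the assignment snapshot before entering the regionPlans block, so the plan path never holds both monitors at once.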