[
https://issues.apache.org/jira/browse/HBASE-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649234#action_12649234
]
stack commented on HBASE-1009:
------------------------------
Digging in, I see the master sending an HRS the close message because it
judges the regionserver overloaded. The regionserver gets the close, but the
close then takes too long because of issues with HDFS (or a deadlock in
HDFS; looking at that next). Meantime the regionserver keeps sending over
heartbeats with its list of overloaded regions.
I could add a closing state; if a region is closing, we'd not send it again
in the heartbeat. Probably no harm. It wouldn't fix the root cause, though.
Eventually the regionserver will have no more regions for the master to
close, but the master will still be stuck in the loop if the regions can't
actually close.
Will add the closing state for this issue and then try to deal with the hang
in the next issue.
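
A rough sketch of the closing-state idea, for illustration only. This is not
the actual patch: ToyRegionServer, receiveClose, finishClose,
heartbeatCandidates and load are hypothetical names, not HBase classes or
methods. It assumes the regionserver tracks regions that are mid-close and
leaves them out of the list it reports in the heartbeat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these names come from the HBase code base. */
class ClosingStateSketch {

    /** Toy regionserver that remembers which of its regions are mid-close. */
    static class ToyRegionServer {
        private final Set<String> online = new LinkedHashSet<>();
        private final Set<String> closing = new LinkedHashSet<>();

        ToyRegionServer(List<String> regions) {
            online.addAll(regions);
        }

        /** Master asked for a close: mark the region as closing. The close itself may hang. */
        void receiveClose(String region) {
            if (online.contains(region)) {
                closing.add(region);
            }
        }

        /** The close actually completed, so the region is no longer carried here. */
        void finishClose(String region) {
            if (closing.remove(region)) {
                online.remove(region);
            }
        }

        /** Regions reported in the heartbeat: online regions minus those already closing. */
        List<String> heartbeatCandidates() {
            List<String> out = new ArrayList<>();
            for (String r : online) {
                if (!closing.contains(r)) {
                    out.add(r);
                }
            }
            return out;
        }

        /** Load as the master would see it: regions still carried, closing or not. */
        int load() {
            return online.size();
        }
    }

    public static void main(String[] args) {
        ToyRegionServer rs = new ToyRegionServer(Arrays.asList(
            "api,,1222913694225",
            "authentication,,1222913700431",
            "streams,'6,1226967394935"));

        // Master sends a close; assume it hangs, so finishClose is never called.
        rs.receiveClose("api,,1222913694225");

        // The closing region is no longer offered to the master,
        // but the load does not drop until the close completes.
        System.out.println("heartbeat candidates: " + rs.heartbeatCandidates());
        System.out.println("load: " + rs.load());
    }
}
```

Note that in this sketch the load only drops once finishClose runs. If closes
hang, the candidate list drains to empty while the server still looks
overloaded, which is the part the closing state does not fix.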
> Master stuck in loop wanting to assign but regions are closing
> --------------------------------------------------------------
>
> Key: HBASE-1009
> URL: https://issues.apache.org/jira/browse/HBASE-1009
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Blocker
> Fix For: 0.19.0
>
>
> From streamy logs.
> {code}
> 2008-11-19 10:36:58,933 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,�,1225411057556 because it is already closing.
> 2008-11-19 10:37:01,315 DEBUG org.apache.hadoop.hbase.master.ServerManager:
> Total Load: 138, Num Servers: 9, Avg Load: 16.0
> 2008-11-19 10:37:01,935 DEBUG org.apache.hadoop.hbase.master.RegionManager:
> Server XX.XX.XX.212:60020 is overloaded. Server load: 21 avg: 16.0, slop: 0.1
> 2008-11-19 10:37:01,935 DEBUG org.apache.hadoop.hbase.master.RegionManager:
> Choosing to reassign 5 regions. mostLoadedRegions has 10 regions in it.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streams,'6,1226967394935 because it is already closing.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,'�,1226078595896 because it is already closing.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,���,1225472287315 because it is already closing.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,X$�,1225411877996 because it is already closing.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,�},1225411050812 because it is already closing.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region api,,1222913694225 because it is already closing.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,0��,1226459423496 because it is already closing.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region items,R�,1223906859795 because it is already closing.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region authentication,,1222913700431 because it is already closing.
> 2008-11-19 10:37:01,935 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,�,1225411057556 because it is already closing.
> 2008-11-19 10:37:04,939 DEBUG org.apache.hadoop.hbase.master.RegionManager:
> Server XX.XX.XX.212:60020 is overloaded. Server load: 21 avg: 16.0, slop: 0.1
> 2008-11-19 10:37:04,939 DEBUG org.apache.hadoop.hbase.master.RegionManager:
> Choosing to reassign 5 regions. mostLoadedRegions has 10 regions in it.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streams,'6,1226967394935 because it is already closing.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,'�,1226078595896 because it is already closing.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,���,1225472287315 because it is already closing.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,X$�,1225411877996 because it is already closing.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,�},1225411050812 because it is already closing.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region api,,1222913694225 because it is already closing.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,0��,1226459423496 because it is already closing.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region items,R�,1223906859795 because it is already closing.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region authentication,,1222913700431 because it is already closing.
> 2008-11-19 10:37:04,939 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,�,1225411057556 because it is already closing.
> 2008-11-19 10:37:07,941 DEBUG org.apache.hadoop.hbase.master.RegionManager:
> Server XX.XX.XX.212:60020 is overloaded. Server load: 21 avg: 16.0, slop: 0.1
> 2008-11-19 10:37:07,941 DEBUG org.apache.hadoop.hbase.master.RegionManager:
> Choosing to reassign 5 regions. mostLoadedRegions has 10 regions in it.
> 2008-11-19 10:37:07,941 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streams,'6,1226967394935 because it is already closing.
> 2008-11-19 10:37:07,941 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,'�,1226078595896 because it is already closing.
> 2008-11-19 10:37:07,942 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,���,1225472287315 because it is already closing.
> 2008-11-19 10:37:07,942 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,X$�,1225411877996 because it is already closing.
> 2008-11-19 10:37:07,942 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,�},1225411050812 because it is already closing.
> 2008-11-19 10:37:07,942 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region api,,1222913694225 because it is already closing.
> 2008-11-19 10:37:07,942 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,0��,1226459423496 because it is already closing.
> 2008-11-19 10:37:07,942 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region items,R�,1223906859795 because it is already closing.
> 2008-11-19 10:37:07,942 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region authentication,,1222913700431 because it is already closing.
> 2008-11-19 10:37:07,942 INFO org.apache.hadoop.hbase.master.RegionManager:
> Skipping region streamitems,�,1225411057556 because it is already closing.
> {code}
--
This message is automatically generated by JIRA.