[
https://issues.apache.org/jira/browse/HBASE-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661000#action_12661000
]
Jim Kellerman commented on HBASE-1104:
--------------------------------------
Ok, I can see where this could be confusing. In HBASE-543, a newly discovered
region would be set to 'unassigned'.
- if a region is 'unassigned', it is a candidate to be opened by the next
region server that checks in.
- when the region is assigned to a region server, it is marked as 'assigned'.
- when the region server reports that it has opened the region, it is marked
as 'pending'.
Once ProcessRegionOpen runs and the HRS has been stored in the META table, the
region is removed from the regionsInTransition Map.
- When it is determined that a region should be closed, the region is marked
as 'closing'.
- When the master sends the close message to the HRS, the region's status is
set to closing + closed (and if the region is being offlined in the process,
the status is closing + closed + offlined).
Once the HRS reports that a region is closed, ProcessRegionClose is called. If
the region should be reassigned (i.e., offlined == false), then the region
status is set to 'unassigned' so that it will get picked up and assigned to the
first region server that reports in and is not overloaded.
If the region has been offlined, ProcessRegionClose will remove the region from
the regionsInTransition Map.
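
The decision made once the close is reported can be sketched roughly as
follows. This is an illustration only: the class CloseReportSketch, its method,
and the use of plain strings for region names and statuses are assumptions made
for the example, not the actual master code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a simplified stand-in for the master's bookkeeping.
// The class and method names are hypothetical, not HBase APIs.
final class CloseReportSketch {
    // region name -> status, standing in for the regionsInTransition Map
    final Map<String, String> regionsInTransition = new HashMap<>();

    // Called once the HRS has reported that the region is closed.
    void onRegionClosed(String regionName, boolean offlined) {
        if (offlined) {
            // Offlined regions are not reassigned, so drop them from the map.
            regionsInTransition.remove(regionName);
        } else {
            // Keep the region in the map as 'unassigned' so the next region
            // server that checks in and is not overloaded can pick it up.
            regionsInTransition.put(regionName, "unassigned");
        }
    }
}
```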
Ok, so what does this boil down to? There are three states for getting a
region served:
1) unassigned
2) assigned
3) pending
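
As a rough illustration, the three open-path states can be modeled as a small
state machine. The enum OpenState and its next() method are hypothetical names
chosen for this sketch; they are not taken from the HBase source.

```java
// Hypothetical model of the three open-path states; none of these names
// come from the HBase code base.
enum OpenState {
    UNASSIGNED, // candidate for the next region server that checks in
    ASSIGNED,   // the master has handed the region to a region server
    PENDING;    // the server reported the open; ProcessRegionOpen has not finished

    // Returns the next state on the open path, or null once the region
    // leaves the regionsInTransition Map (after ProcessRegionOpen runs).
    OpenState next() {
        switch (this) {
            case UNASSIGNED: return ASSIGNED;
            case ASSIGNED:   return PENDING;
            default:         return null;
        }
    }
}
```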
However, for regions being closed it is more complex:
- closing means the region is in the process of being closed
- closing + closed means that the master has told the HRS to close the region.
- closing + offlined means that the master wants to close the region and have
it offlined.
- closing + closed + offlined means that the master has told the HRS to close
the region, and that it will be offlined once the HRS reports that it has
closed the region.
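
One way to picture these combinations is as a set of flags on the region. The
sketch below assumes a hypothetical CloseFlag enum and a describe() helper
invented for the example; the real master may represent the status differently.

```java
import java.util.EnumSet;

// Hypothetical flags mirroring the closing / closed / offlined markers.
enum CloseFlag { CLOSING, CLOSED, OFFLINED }

// Illustrative helper, not an HBase class.
final class CloseStatusSketch {
    // Maps a flag combination to the meaning given in the list above.
    static String describe(EnumSet<CloseFlag> flags) {
        if (!flags.contains(CloseFlag.CLOSING)) {
            return "not closing";
        }
        boolean toldHrs = flags.contains(CloseFlag.CLOSED);
        boolean offline = flags.contains(CloseFlag.OFFLINED);
        if (toldHrs && offline) {
            return "HRS told to close; offline once the close is reported";
        }
        if (toldHrs) {
            return "HRS told to close";
        }
        if (offline) {
            return "to be closed and offlined";
        }
        return "in the process of being closed";
    }
}
```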
The reason for this approach was that if a region was closing, it could not be
marked as unassigned. Only ProcessRegionClose would know if the region should
be reassigned, and if not, it would remove the region from the
regionsInTransition Map. If the region was to be reassigned, it would stay in
the map and its status would be changed to "unassigned".
As opening a region requires three states (unassigned, assigned, pending),
closing a region should be similar:
- close -- the region server should be told that the region is to be closed
when the HRS reports in
- closing -- the HRS has been told to close the region
- closed -- the HRS reports that the region is closed
When a region has a status of closing, it also has a substatus of closing
and/or offlined.
If offlined, and the status == closed, then the master should remove the region
from the regionsInTransition Map. If not offlined, the region should have its
status set to unassigned.
So that is how it should work, but because starting up a region requires three
state transitions and closing one down currently only requires two, it is
confusing.
Changing region close to be symmetrical with region open should clarify (and
simplify) how regions get reassigned.
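
A minimal sketch of what a symmetric close path might look like, assuming a
hypothetical CloseState enum (the names and the afterClosed() helper are
invented for illustration and are not existing HBase code):

```java
// Hypothetical sketch of the proposed three-state close path; the names
// below are illustrative and do not come from the HBase source.
enum CloseState {
    CLOSE,   // the HRS should be told to close the region when it reports in
    CLOSING, // the HRS has been told to close the region
    CLOSED;  // the HRS has reported that the region is closed

    // Advances one step along the close path; CLOSED is terminal.
    CloseState next() {
        switch (this) {
            case CLOSE:   return CLOSING;
            case CLOSING: return CLOSED;
            default:      return CLOSED;
        }
    }

    // What happens to a region once it reaches CLOSED: an offlined region
    // leaves the regionsInTransition Map, any other goes back to unassigned.
    static String afterClosed(boolean offlined) {
        return offlined ? "remove from regionsInTransition" : "unassigned";
    }
}
```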
> Doubly-assigned regions redux
> -----------------------------
>
> Key: HBASE-1104
> URL: https://issues.apache.org/jira/browse/HBASE-1104
> Project: Hadoop HBase
> Issue Type: Bug
> Environment: pset cluster with TRUNK.
> Reporter: stack
> Assignee: Jim Kellerman
> Fix For: 0.19.0
>
>
> Testing, I see doubly assigned regions. Below is from master log for
> TestTable,0000135598,1230761605500.
> {code}
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_SPLIT:
> TestTable,0000116170,1230761152219: TestTable,0000116170,1230761152219 split;
> daughters: TestTable,0000116170,1230761605500,
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:35,528 [IPC Server handler 2 on 60000] INFO
> org.apache.hadoop.hbase.master.RegionManager: assigning region
> TestTable,0000135598,1230761605500 to server XX.XX.XX.142:60020
> 2008-12-31 22:13:38,561 [IPC Server handler 6 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> TestTable,0000135598,1230761605500 open on XX.XX.XX.142:60020
> 2008-12-31 22:13:38,562 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode
> 1230759988953 and server XX.XX.XX.142:60020
> 2008-12-31 22:13:44,640 [IPC Server handler 4 on 60000] DEBUG
> org.apache.hadoop.hbase.master.RegionManager: Going to close region
> TestTable,0000135598,1230761605500
> 2008-12-31 22:13:50,441 [IPC Server handler 9 on 60000] INFO
> org.apache.hadoop.hbase.master.RegionManager: assigning region
> TestTable,0000135598,1230761605500 to server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,457 [IPC Server handler 5 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from
> XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [IPC Server handler 5 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> TestTable,0000135598,1230761605500 from XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> TestTable,0000135598,1230761605500 open on XX.XX.XX.139:60020
> 2008-12-31 22:13:53,458 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode
> 1230759988788 and server XX.XX.XX.139:60020
> 2008-12-31 22:13:53,688 [IPC Server handler 6 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_CLOSE:
> TestTable,0000135598,1230761605500 from XX.XX.XX.142:60020
> 2008-12-31 22:13:53,688 [HMaster] DEBUG
> org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessRegionClose
> of TestTable,0000135598,1230761605500, false
> 2008-12-31 22:13:54,263 [IPC Server handler 7 on 60000] INFO
> org.apache.hadoop.hbase.master.RegionManager: assigning region
> TestTable,0000135598,1230761605500 to server XX.XX.XX.141:60020
> 2008-12-31 22:13:57,273 [IPC Server handler 9 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received
> MSG_REPORT_PROCESS_OPEN: TestTable,0000135598,1230761605500 from
> XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [IPC Server handler 0 on 60000] INFO
> org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN:
> TestTable,0000135598,1230761605500 from XX.XX.XX.141:60020
> 2008-12-31 22:14:03,917 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1:
> TestTable,0000135598,1230761605500 open on XX.XX.XX.141:60020
> 2008-12-31 22:14:03,918 [HMaster] INFO
> org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
> TestTable,0000135598,1230761605500 in region .META.,,1 with startcode
> 1230759989031 and server XX.XX.XX.141:60020
> 2008-12-31 22:14:29,350 [RegionManager.metaScanner] DEBUG
> org.apache.hadoop.hbase.master.BaseScanner:
> TestTable,0000135598,1230761605500 no longer has references to
> TestTable,0000116170,1230761152219
> {code}
> See how we choose to assign before we get the close back from the
> regionserver.