Sorry for my poor English. Let me describe the problem again. The Master adds a region server to onlineServers in two cases:
1. A machine joins the cluster, either at cluster startup or when a new machine is added. The Master gets the region server's information through the "regionServerStartup" API and adds it to the onlineServers set.
2. The Master is restarted. The Master gets the region server's information through the "regionServerReport" API and adds it to the onlineServers set. But this only happens while the Master is inside waitForRegionServers(). If a region server reports later than that, the Master takes it for a dead server and reassigns its regions, so the same region ends up open on two different region servers.

I think the late region server should shut itself down and start again, so that it registers through regionServerStartup and not through regionServerReport.

For example, a region that could not be assigned by the balancer:

HMaster logs:
2011-05-23 11:12:10,588 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. to 158-1-101-82,20020,1306117051387
2011-05-23 11:15:20,472 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. state=PENDING_OPEN, ts=1306120330588
2011-05-23 11:15:20,472 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN for too long, reassigning region=hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:15:20,513 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. state=PENDING_OPEN, ts=1306120330588
2011-05-23 11:15:20,513 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. so generated a random one; hri=hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d., src=, dest=158-1-101-82,20020,1306117051387; 2 (online=2, exclude=null) available servers
2011-05-23 11:15:20,513 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. to 158-1-101-82,20020,1306117051387
2011-05-23 11:18:30,473 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. state=PENDING_OPEN, ts=1306120520513
2011-05-23 11:18:30,473 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN for too long, reassigning region=hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:18:30,487 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. state=PENDING_OPEN, ts=1306120520513
2011-05-23 11:18:30,487 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. so generated a random one; hri=hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d., src=, dest=158-1-101-222,20020,1306119315097; 2 (online=2, exclude=null) available servers
2011-05-23 11:18:30,488 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. to 158-1-101-222,20020,1306119315097
2011-05-23 11:18:30,516 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-101-222,20020,1306119315097, region=70541f0abda274708e12570c52aa7f1d
2011-05-23 11:18:30,581 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=158-1-101-222,20020,1306119315097, region=70541f0abda274708e12570c52aa7f1d
2011-05-23 11:18:30,900 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=158-1-101-222,20020,1306119315097, region=70541f0abda274708e12570c52aa7f1d
2011-05-23 11:18:30,900 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED event for 70541f0abda274708e12570c52aa7f1d; deleting unassigned node
2011-05-23 11:18:30,900 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:20000-0x2301a9c63bd0006-0x2301a9c63bd0006-0x2301a9c63bd0006 Deleting existing unassigned node for 70541f0abda274708e12570c52aa7f1d that is in expected state RS_ZK_REGION_OPENED
2011-05-23 11:18:30,930 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:20000-0x2301a9c63bd0006-0x2301a9c63bd0006-0x2301a9c63bd0006 Successfully deleted unassigned node for region 70541f0abda274708e12570c52aa7f1d in expected state RS_ZK_REGION_OPENED

Regionserver logs:
2011-05-23 10:21:37,400 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:04:19,633 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:04:19,633 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:04:19,633 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Attempted open of hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. but already online on this server
2011-05-23 11:09:00,615 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:09:00,615 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:09:00,615 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Attempted open of hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. but already online on this server
2011-05-23 11:12:10,588 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:12:10,588 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:12:10,588 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Attempted open of hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. but already online on this server
2011-05-23 11:15:20,513 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:15:20,513 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d.
2011-05-23 11:15:20,513 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Attempted open of hello,150900,1305944335445.70541f0abda274708e12570c52aa7f1d. but already online on this server
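
To make the suggestion concrete, here is a rough sketch of the behavior I have in mind. It is a simplified model and not actual HBase code: the class MasterSketch, the Reply enum, and the methods finishWaitingForRegionServers and isOnline are made up for illustration. Only the names regionServerStartup, regionServerReport, waitForRegionServers and onlineServers come from the description above, and their real signatures are different.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the Master's server tracking (not HBase code).
class MasterSketch {
    // What the Master tells a region server in response to a call (illustrative only).
    enum Reply { OK, RESTART_AND_REGISTER }

    private final Set<String> onlineServers = new HashSet<>();
    // True while the restarted Master is still inside waitForRegionServers().
    private boolean waitingForRegionServers = true;

    // Case 1: a server joins the cluster and registers itself.
    Reply regionServerStartup(String serverName) {
        onlineServers.add(serverName);
        return Reply.OK;
    }

    // Case 2: a server that was already running reports to a restarted Master.
    Reply regionServerReport(String serverName) {
        if (onlineServers.contains(serverName)) {
            return Reply.OK; // already known, normal heartbeat
        }
        if (waitingForRegionServers) {
            onlineServers.add(serverName); // reported in time, accept it
            return Reply.OK;
        }
        // Reported too late: its regions may already be reassigned, so do not
        // accept it silently. Ask it to restart and come back via regionServerStartup.
        return Reply.RESTART_AND_REGISTER;
    }

    // Hypothetical hook: called when waitForRegionServers() returns.
    void finishWaitingForRegionServers() {
        waitingForRegionServers = false;
    }

    boolean isOnline(String serverName) {
        return onlineServers.contains(serverName);
    }

    public static void main(String[] args) {
        MasterSketch master = new MasterSketch();
        System.out.println(master.regionServerReport("158-1-101-222,20020,1306119315097"));
        master.finishWaitingForRegionServers();
        // This server reports after the waiting window has closed.
        System.out.println(master.regionServerReport("158-1-101-82,20020,1306117051387"));
    }
}
```

With this rule the late server is never counted as online while it still holds its old regions, so the Master would not keep sending it open requests for a region it already serves.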
