[ 
https://issues.apache.org/jira/browse/HBASE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183122#comment-13183122
 ] 

ramkrishna.s.vasudevan commented on HBASE-5155:
-----------------------------------------------

{code}
2012-01-10 11:43:34,303 INFO org.apache.hadoop.hbase.master.ServerManager: 
Received REGION_SPLIT: j9t6,,1326109762514.adcbae41a5024c60c72f5752c6e1c8d4.: 
Daughters; j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7., 
j9t6,23443]5767435g,1326176002507.0b96b5ed4c0426d3b3f13e586179c9bc. from 
linux-129,60020,1326175677339




2012-01-10 12:05:19,122 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Splitting logs 
for linux-129,60020,1326175677339
2012-01-10 12:06:07,153 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
running balancer because processing dead regionserver(s): 
[linux-129,60020,1326175677339]
2012-01-10 12:09:57,865 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Reassigning 7 
region(s) that linux-129,60020,1326175677339 was carrying (skipping 0 
regions(s) that are already in transition)




2012-01-10 12:11:30,988 INFO 
org.apache.hadoop.hbase.master.handler.DisableTableHandler: Attemping to 
disable table j9t6
2012-01-10 12:12:21,513 INFO 
org.apache.hadoop.hbase.master.handler.DisableTableHandler: Disabled table is 
done=true





2012-01-10 12:13:41,624 INFO 
org.apache.hadoop.hbase.master.handler.TableEventHandler: Handling table 
operation C_M_DELETE_TABLE on table j9t6
2012-01-10 12:14:00,811 DEBUG 
org.apache.hadoop.hbase.master.handler.DeleteTableHandler: Deleting region 
j9t6,,1326109762514.adcbae41a5024c60c72f5752c6e1c8d4. from META and FS
2012-01-10 12:14:02,230 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
Deleted region j9t6,,1326109762514.adcbae41a5024c60c72f5752c6e1c8d4. from META
2012-01-10 12:14:07,330 DEBUG 
org.apache.hadoop.hbase.master.handler.DeleteTableHandler: Deleting region 
j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7. from META and FS
2012-01-10 12:14:07,521 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
Deleted region j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7. from META
2012-01-10 12:14:09,860 DEBUG 
org.apache.hadoop.hbase.master.handler.DeleteTableHandler: Deleting region 
j9t6,23443]5767435g,1326176002507.0b96b5ed4c0426d3b3f13e586179c9bc. from META 
and FS
2012-01-10 12:14:10,096 INFO org.apache.hadoop.hbase.catalog.MetaEditor: 
Deleted region 
j9t6,23443]5767435g,1326176002507.0b96b5ed4c0426d3b3f13e586179c9bc. from META








2012-01-10 12:18:11,081 DEBUG 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Offlined and 
split region j9t6,,1326109762514.adcbae41a5024c60c72f5752c6e1c8d4.; checking 
daughter presence
2012-01-10 12:18:46,450 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Fixup; missing 
daughter j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7.
2012-01-10 12:18:46,775 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 
daughter j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7. in region 
.META.,,1, serverInfo=null
2012-01-10 12:18:47,135 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:60000-0x134c5dbd0a60000 Creating (or updating) unassigned node for 
49c3665a4bc656f3f6473659b64798f7 with OFFLINE state
2012-01-10 12:18:47,142 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
No previous transition plan was found (or we are ignoring an existing plan) for 
j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7. so generated a random 
one; hri=j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7., src=, 
dest=linux146,60020,1326169560093; 1 (online=1, exclude=null) available servers
2012-01-10 12:18:47,143 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Assigning region j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7. to 
linux146,60020,1326169560093
2012-01-10 12:18:47,155 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handle region called from node nodeDataChanged
2012-01-10 12:18:47,155 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=RS_ZK_REGION_OPENING, server=linux146,60020,1326169560093, 
region=49c3665a4bc656f3f6473659b64798f7
2012-01-10 12:18:47,202 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handle region called from node nodeDataChanged
2012-01-10 12:18:47,202 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=RS_ZK_REGION_OPENING, server=linux146,60020,1326169560093, 
region=49c3665a4bc656f3f6473659b64798f7
2012-01-10 12:18:47,221 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handle region called from node nodeDataChanged
2012-01-10 12:18:47,221 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=RS_ZK_REGION_OPENED, server=linux146,60020,1326169560093, 
region=49c3665a4bc656f3f6473659b64798f7
2012-01-10 12:18:47,222 DEBUG 
org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
event for j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7. from 
serverName=linux146,60020,1326169560093, load=(requests=0, regions=7, 
usedHeap=30, maxHeap=996); deleting unassigned node
2012-01-10 12:18:47,222 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:60000-0x134c5dbd0a60000 Deleting existing unassigned node for 
49c3665a4bc656f3f6473659b64798f7 that is in expected state RS_ZK_REGION_OPENED
2012-01-10 12:18:47,230 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:60000-0x134c5dbd0a60000 Successfully deleted unassigned node for region 
49c3665a4bc656f3f6473659b64798f7 in expected state RS_ZK_REGION_OPENED
2012-01-10 12:18:47,232 DEBUG 
org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: The master has 
opened the region j9t6,,1326176002507.49c3665a4bc656f3f6473659b64798f7. that 
was online on serverName=linux146,60020,1326169560093, load=(requests=0, 
regions=7, usedHeap=30, maxHeap=996)

2012-01-10 12:19:01,801 INFO 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Fixup; missing 
daughter j9t6,23443]5767435g,1326176002507.0b96b5ed4c0426d3b3f13e586179c9bc.
2012-01-10 12:19:02,261 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Added 
daughter j9t6,23443]5767435g,1326176002507.0b96b5ed4c0426d3b3f13e586179c9bc. in 
region .META.,,1, serverInfo=null
2012-01-10 12:19:02,984 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:60000-0x134c5dbd0a60000 Creating (or updating) unassigned node for 
0b96b5ed4c0426d3b3f13e586179c9bc with OFFLINE state
2012-01-10 12:19:02,992 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
No previous transition plan was found (or we are ignoring an existing plan) for 
j9t6,23443]5767435g,1326176002507.0b96b5ed4c0426d3b3f13e586179c9bc. so 
generated a random one; 
hri=j9t6,23443]5767435g,1326176002507.0b96b5ed4c0426d3b3f13e586179c9bc., src=, 
dest=linux146,60020,1326169560093; 1 (online=1, exclude=null) available servers
2012-01-10 12:19:02,992 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Assigning region 
j9t6,23443]5767435g,1326176002507.0b96b5ed4c0426d3b3f13e586179c9bc. to 
linux146,60020,1326169560093
2012-01-10 12:19:03,062 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handle region called from node nodeDataChanged
2012-01-10 12:19:03,062 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=RS_ZK_REGION_OPENING, server=linux146,60020,1326169560093, 
region=0b96b5ed4c0426d3b3f13e586179c9bc
2012-01-10 12:19:03,107 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handle region called from node nodeDataChanged
2012-01-10 12:19:03,108 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=RS_ZK_REGION_OPENING, server=linux146,60020,1326169560093, 
region=0b96b5ed4c0426d3b3f13e586179c9bc
2012-01-10 12:19:03,164 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handle region called from node nodeDataChanged
2012-01-10 12:19:03,164 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=RS_ZK_REGION_OPENED, server=linux146,60020,1326169560093, 
region=0b96b5ed4c0426d3b3f13e586179c9bc
2012-01-10 12:19:03,165 DEBUG 
org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
event for j9t6,23443]5767435g,1326176002507.0b96b5ed4c0426d3b3f13e586179c9bc. 
from serverName=linux146,60020,1326169560093, load=(requests=11, regions=8, 
usedHeap=33, maxHeap=996); deleting unassigned node
2012-01-10 12:19:03,165 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:60000-0x134c5dbd0a60000 Deleting existing unassigned node for 
0b96b5ed4c0426d3b3f13e586179c9bc that is in expected state RS_ZK_REGION_OPENED
2012-01-10 12:19:03,169 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:60000-0x134c5dbd0a60000 Successfully deleted unassigned node for region 
0b96b5ed4c0426d3b3f13e586179c9bc in expected state RS_ZK_REGION_OPENED
{code}
                
> ServerShutDownHandler And Disable/Delete should not happen parallely leading 
> to recreation of regions that were deleted
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5155
>                 URL: https://issues.apache.org/jira/browse/HBASE-5155
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: ramkrishna.s.vasudevan
>            Priority: Blocker
>
> ServerShutdownHandler races with the disable/delete table handlers.  This
> issue is not caused by TM.
> -> A regionserver goes down.  In our cluster the regionserver holds a lot of
> regions.
> -> A region R1 has two daughters, D1 and D2.
> -> The ServerShutdownHandler gets called, scans META, and gets all the
> user regions.
> -> In parallel, a table is disabled. (No problem in this step.)
> -> Delete table is done.
> -> The table and its regions are deleted, including R1, D1 and D2. (So META
> is cleaned.)
> -> Now ServerShutdownHandler starts processDeadRegion:
> {code}
>  if (hri.isOffline() && hri.isSplit()) {
>       LOG.debug("Offlined and split region " + hri.getRegionNameAsString() +
>         "; checking daughter presence");
>       fixupDaughters(result, assignmentManager, catalogTracker);
> {code}
> As part of fixupDaughters, since the daughters D1 and D2 are missing for R1,
> {code}
>     if (isDaughterMissing(catalogTracker, daughter)) {
>       LOG.info("Fixup; missing daughter " + daughter.getRegionNameAsString());
>       MetaEditor.addDaughter(catalogTracker, daughter, null);
>       // TODO: Log WARN if the regiondir does not exist in the fs.  If its not
>       // there then something wonky about the split -- things will keep going
>       // but could be missing references to parent region.
>       // And assign it.
>       assignmentManager.assign(daughter, true);
> {code}
> we call assign on the daughters.
> After this we continue with the code below.
> {code}
>         if (processDeadRegion(e.getKey(), e.getValue(),
>             this.services.getAssignmentManager(),
>             this.server.getCatalogTracker())) {
>           this.services.getAssignmentManager().assign(e.getKey(), true);
> {code}
> When the SSH scanned META, it saw R1, D1 and D2.
> So as part of the above code, D1 and D2, which were already assigned by
> fixupDaughters, are assigned again by
> {code}
> this.services.getAssignmentManager().assign(e.getKey(), true);
> {code}
> This leads to a ZooKeeper bad-version error that kills the master.
> The more important part is that the regions that were deleted are recreated,
> which I think is more critical.
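
To make the interleaving above concrete, here is a minimal, self-contained Java sketch of the race. It is a toy model and not HBase code: the class StaleScanRaceSketch, the Meta and Region types, and the recheckTable flag are all hypothetical names invented for illustration. The unguarded run replays a META snapshot taken before the delete, so it recreates D1 and D2 and calls assign on each of them twice. The guarded run re-checks that the table still exists before acting on a stale row; that is one possible mitigation under these assumptions, not necessarily the fix adopted for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of the race; none of these names are HBase APIs.
final class StaleScanRaceSketch {
    /** A region row as seen in the shutdown handler's META scan. */
    static final class Region {
        final String name;
        final List<String> daughters; // non-empty only for an offlined, split parent

        Region(String name, String... daughters) {
            this.name = name;
            this.daughters = Arrays.asList(daughters);
        }
    }

    /** Toy stand-in for META plus a flag saying whether the table still exists. */
    static final class Meta {
        final Set<String> regions = new TreeSet<>();
        boolean tableExists = true;

        void deleteTable() {
            regions.clear();
            tableExists = false;
        }
    }

    /** The rows the handler scanned before the delete: parent R1 and daughters D1, D2. */
    static List<Region> snapshot() {
        return Arrays.asList(new Region("R1", "D1", "D2"), new Region("D1"), new Region("D2"));
    }

    /**
     * Walks the (possibly stale) snapshot and returns the sequence of assign calls.
     * With recheckTable=true, rows of a table that no longer exists are skipped.
     */
    static List<String> processDeadServer(Meta meta, List<Region> scanned, boolean recheckTable) {
        List<String> assignCalls = new ArrayList<>();
        for (Region r : scanned) {
            if (recheckTable && !meta.tableExists) {
                continue; // table was deleted after the scan: ignore the stale row
            }
            if (!r.daughters.isEmpty()) {
                // Split parent: re-add and assign any daughter missing from META.
                for (String d : r.daughters) {
                    if (!meta.regions.contains(d)) {
                        meta.regions.add(d);
                        assignCalls.add(d);
                    }
                }
            } else {
                // Ordinary row from the scan: assign it.
                assignCalls.add(r.name);
            }
        }
        return assignCalls;
    }

    public static void main(String[] args) {
        for (boolean guard : new boolean[] {false, true}) {
            Meta meta = new Meta();
            meta.regions.addAll(Arrays.asList("R1", "D1", "D2"));
            List<Region> scanned = snapshot(); // handler scans META first
            meta.deleteTable();                // delete table completes in the meantime
            List<String> calls = processDeadServer(meta, scanned, guard);
            System.out.println("recheckTable=" + guard + " assign calls=" + calls
                + " META=" + meta.regions);
        }
    }
}
```

In the unguarded run the daughters end up back in the toy META and are each assigned twice, which mirrors the two symptoms described above: deleted regions being recreated and a second assign colliding with the first.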


        
