[ 
https://issues.apache.org/jira/browse/HBASE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585872#action_12585872
 ] 

stack commented on HBASE-549:
-----------------------------

It's less likely to happen, but the mess it makes when it does happen causes 
the kind of churn that takes regions offline in ways the logs cannot explain.

> Don't CLOSE region if message is not from server that opened it or is opening 
> it
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-549
>                 URL: https://issues.apache.org/jira/browse/HBASE-549
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.16.0, 0.2.0, 0.1.1, 0.1.0
>            Reporter: stack
>             Fix For: 0.2.0
>
>
> We assign a region to a server.  It takes too long to open (HBASE-505).  
> Region gets assigned to another server.  Meantime original host returns a 
> MSG_REPORT_CLOSE (because other regions opening messes it up moving files on 
> disk out from under it).  We queue a shutdown which marks the region as 
> needing reassignment.  Second server reports in that it successfully opened 
> the region.  Master tells it it should not have opened it.  Churn ensues.
> The fix is to ignore the CLOSE if its reported server/startcode does not 
> match that of the server currently trying to open the region.  The fix is 
> not easy because we do not currently keep a list of server info in 
> unassigned regions.
> Here's a master log snippet showing the problem:
> {code}
> ...
> 2008-03-25 19:16:43,711 INFO org.apache.hadoop.hbase.HMaster: assigning 
> region enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 to server 
> XX.XX.XX.220:60020
> 2008-03-25 19:16:46,725 DEBUG org.apache.hadoop.hbase.HMaster: Received 
> MSG_REPORT_PROCESS_OPEN : 
> enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 from XX.XX.XX.220:60020
> 2008-03-25 19:18:06,411 DEBUG org.apache.hadoop.hbase.HMaster: shutdown 
> scanner looking at enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
> 2008-03-25 19:18:06,811 DEBUG org.apache.hadoop.hbase.HMaster: shutdown 
> scanner looking at enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
> 2008-03-25 19:19:46,841 INFO org.apache.hadoop.hbase.HMaster: assigning 
> region enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 to server 
> XX.XX.XX.221:60020
> 2008-03-25 19:19:49,849 DEBUG org.apache.hadoop.hbase.HMaster: Received 
> MSG_REPORT_PROCESS_OPEN : 
> enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 from XX.XX.XX.221:60020
> 2008-03-25 19:19:56,883 DEBUG org.apache.hadoop.hbase.HMaster: Received 
> MSG_REPORT_CLOSE : enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 from 
> XX.XX.XX.220:60020
> 2008-03-25 19:19:56,883 INFO org.apache.hadoop.hbase.HMaster: 
> XX.XX.XX.220:60020 no longer serving regionname: 
> enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482, startKey: 
> <iLStZ0yTnfVUziYcNVVxWV==>, endKey: <jLB27Q4hKls4tSvp64rJfF==>, 
> encodedName: 1857033608, tableDesc: {name: enwiki_080103, families: 
> {alternate_title:={name: alternate_title, max versions: 3, compression: 
> NONE, in memory: false, max length: 2147483647, bloom filter: none}, 
> alternate_url:={name: alternate_url, max versions: 3, compression: NONE, 
> in memory: false, max length: 2147483647, bloom filter: none}, 
> anchor:={name: anchor, max versions: 3, compression: NONE, in memory: 
> false, max length: 2147483647, bloom filter: none}, misc:={name: misc, 
> max versions: 3, compression: NONE, in memory: false, max length: 
> 2147483647, bloom filter: none}, page:={name: page, max versions: 3, 
> compression: NONE, in memory: false, max length: 2147483647, bloom filter: 
> none}, redirect:={name: redirect, max versions: 3, compression: NONE, in 
> memory: false, max length: 2147483647, bloom filter: none}}}
> 2008-03-25 19:19:56,885 DEBUG org.apache.hadoop.hbase.HMaster: Main 
> processing loop: ProcessRegionClose of 
> enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482, true, false
> 2008-03-25 19:19:56,885 INFO org.apache.hadoop.hbase.HMaster: region closed: 
> enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
> 2008-03-25 19:19:56,887 INFO org.apache.hadoop.hbase.HMaster: reassign 
> region: enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
> 2008-03-25 19:19:57,288 INFO org.apache.hadoop.hbase.HMaster: assigning 
> region enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 to server 
> XX.XX.XX.189:60020
> 2008-03-25 19:20:00,296 DEBUG org.apache.hadoop.hbase.HMaster: Received 
> MSG_REPORT_PROCESS_OPEN : 
> enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 from XX.XX.XX.189:60020
> 2008-03-25 19:20:16,885 DEBUG org.apache.hadoop.hbase.HMaster: Received 
> MSG_REPORT_OPEN : enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 from 
> XX.XX.XX.221:60020
> 2008-03-25 19:20:16,885 DEBUG org.apache.hadoop.hbase.HMaster: region server 
> XX.XX.XX.221:60020 should not have opened region 
> enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
> 2008-03-25 19:23:51,707 DEBUG org.apache.hadoop.hbase.HMaster: shutdown 
> scanner looking at enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
> 2008-03-25 19:23:51,834 DEBUG org.apache.hadoop.hbase.HMaster: shutdown 
> scanner looking at enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
> 2008-03-25 19:23:53,947 INFO org.apache.hadoop.hbase.HMaster: assigning 
> region enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 to server 
> XX.XX.XX.97:60020
> ...
> {code}
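
The check proposed in the description above (ignore a CLOSE whose reported 
server/startcode does not match the server currently trying to open the 
region) could be sketched roughly as follows. This is only an illustration 
and not the actual HMaster code: CloseGuard, ServerId, recordAssignment and 
shouldProcessClose are hypothetical names, and the start codes 1 and 2 are 
made up, since the log does not show them. It also assumes a CLOSE for a 
region with no recorded assignment falls through to normal handling.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical helper for illustration; not part of the HBase code base. */
final class CloseGuard {

  /** Identifies one region server instance: address plus start code. */
  static final class ServerId {
    final String address;
    final long startCode;

    ServerId(String address, long startCode) {
      this.address = address;
      this.startCode = startCode;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof ServerId)) {
        return false;
      }
      ServerId other = (ServerId) o;
      return startCode == other.startCode && address.equals(other.address);
    }

    @Override
    public int hashCode() {
      return Objects.hash(address, startCode);
    }
  }

  // Region name -> server the master most recently assigned the region to.
  // This is the "server info in unassigned regions" the description says
  // is currently not kept.
  private final Map<String, ServerId> pendingOpens = new HashMap<>();

  /** Remember which server is now expected to open the region. */
  void recordAssignment(String regionName, ServerId server) {
    pendingOpens.put(regionName, server);
  }

  /**
   * Returns false when a CLOSE should be ignored because it comes from a
   * server other than the one currently assigned the region.
   * Assumption: a region with no recorded assignment is handled normally.
   */
  boolean shouldProcessClose(String regionName, ServerId reporter) {
    ServerId current = pendingOpens.get(regionName);
    return current == null || current.equals(reporter);
  }

  public static void main(String[] args) {
    String region = "enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482";
    ServerId first = new ServerId("XX.XX.XX.220:60020", 1L);   // made-up start code
    ServerId second = new ServerId("XX.XX.XX.221:60020", 2L);  // made-up start code

    CloseGuard guard = new CloseGuard();
    guard.recordAssignment(region, first);   // 19:16:43 assignment
    guard.recordAssignment(region, second);  // 19:19:46 reassignment

    // 19:19:56 CLOSE from the original host: stale, so it would be ignored.
    System.out.println(guard.shouldProcessClose(region, first));
    // A CLOSE from the server currently opening the region is still honored.
    System.out.println(guard.shouldProcessClose(region, second));
  }
}
```

Replaying the log above through this sketch, the MSG_REPORT_CLOSE from 
XX.XX.XX.220 would be dropped, so the region would not be marked for 
reassignment while XX.XX.XX.221 is still opening it. Comparing the start code 
as well as the address means a restarted server on the same host and port 
counts as a different server.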

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
