Jimmy Xiang created HBASE-8537:
----------------------------------

             Summary: Dead region server pulled in from ZK
                 Key: HBASE-8537
                 URL: https://issues.apache.org/jira/browse/HBASE-8537
             Project: HBase
          Issue Type: Bug
          Components: master
            Reporter: Jimmy Xiang
            Assignee: Jimmy Xiang
            Priority: Minor


When a cluster restarts quickly after a crash, the master still pulls in the 
dead region server instance from ZooKeeper (ZK), even though a new region 
server instance on the same host and port has already reported in.

{noformat}
2013-05-12 18:32:52,996 INFO  [IPC Server handler 6 on 36000] 
org.apache.hadoop.hbase.master.ServerManager: Registering 
server=a1217.halxg.cloudera.com,36020,1368408767773
....
2013-05-12 18:32:54,653 INFO  
[master-a1220.halxg.cloudera.com,36000,1368408767520] 
org.apache.hadoop.hbase.master.HMaster: Registering server found up in zk but 
who has not yet reported in: a1217.halxg.cloudera.com,36020,1368378273768
2013-05-12 18:32:54,653 INFO  
[master-a1220.halxg.cloudera.com,36000,1368408767520] 
org.apache.hadoop.hbase.master.ServerManager: Registering 
server=a1217.halxg.cloudera.com,36020,1368378273768
{noformat}

We should not pull in the second region server instance (the one with 
timestamp 1368378273768) from ZK.  It is actually dead.  We can figure this 
out from the hostname and the port: we can assume that no two region server 
instances can be alive on the same host and port at the same time.  To be 
more cautious, we can check the timestamp as well.  The live instance should 
be the one with the later timestamp, not the one pulled in from ZK.
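
A rough sketch of the proposed check, in plain Java.  This is not the actual 
patch and does not use HBase classes: ZkServerFilter and its methods are 
hypothetical names, and server names are assumed to be plain 
"host,port,timestamp" strings as they appear in the log above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of HBase. */
final class ZkServerFilter {
    private ZkServerFilter() {}

    /** Returns the "host,port" prefix of a "host,port,timestamp" name. */
    static String hostAndPort(String serverName) {
        return serverName.substring(0, serverName.lastIndexOf(','));
    }

    /** Returns the trailing timestamp of a "host,port,timestamp" name. */
    static long timestamp(String serverName) {
        return Long.parseLong(
            serverName.substring(serverName.lastIndexOf(',') + 1));
    }

    /**
     * Returns the ZK entries that still look worth registering.  An entry is
     * skipped when an instance on the same host and port has already
     * reported in with an equal or later timestamp, since the ZK entry is
     * then either already registered or a stale, dead instance.
     */
    static List<String> serversToRegister(List<String> reportedIn,
                                          List<String> foundInZk) {
        // Latest reported timestamp for each host and port.
        Map<String, Long> latestReported = new HashMap<>();
        for (String name : reportedIn) {
            latestReported.merge(hostAndPort(name), timestamp(name), Long::max);
        }

        List<String> result = new ArrayList<>();
        for (String name : foundInZk) {
            Long live = latestReported.get(hostAndPort(name));
            // Keep the ZK entry only if nothing reported in on this host and
            // port, or if the ZK entry is newer than what reported in.
            if (live == null || timestamp(name) > live) {
                result.add(name);
            }
        }
        return result;
    }
}
```

With the two instances from the log above, the entry ending in 1368378273768 
would be dropped because the instance ending in 1368408767773 has already 
reported in on the same host and port.  The sketch does not handle several 
stale ZK entries for one host and port when nothing has reported in yet.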

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
