[
https://issues.apache.org/jira/browse/HBASE-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Purtell resolved HBASE-3442.
-----------------------------------
Resolution: Invalid
Issue wasn't actionable
> Master failing when node disconnects or dies
> --------------------------------------------
>
> Key: HBASE-3442
> URL: https://issues.apache.org/jira/browse/HBASE-3442
> Project: HBase
> Issue Type: Bug
> Components: master, regionserver
> Affects Versions: 0.90.0
> Environment: CentOS 5, HBase 0.90 RC3, Amazon EC2
> Reporter: Justin
> Priority: Minor
>
> We've got our servers running on Amazon EC2 and nodes will go through some
> shutdown scripts if/when we want to take them out of the mix. Ended up
> shutting down one of the nodes, in this case Node98, which caused the
> immediate crash of the master server. Upon restarting the master, it would
> attempt to contact the missing node, and then stop its startup process. I
> believe the node removed itself from the DNS server first, then ran a stop on
> the datanode and regionserver. The missing node was also removed from any
> slave/regionserver list on the master server. I finally put in a bogus entry
> in the /etc/hosts file for the missing node, pointing it back to 127.0.0.1,
> and the master server finally marked it as a dead node, ignored it, and
> finished the startup process.
> I'm going to try to replicate it and save some more logs. The following
> log is the only thing I saved from the first occurrence; it shows the master
> failing to start up while checking for the missing node:
> http://pastebin.com/ZyQMQm91