Apache Hadoop 2.3.0
Sent via the Samsung GALAXY S®4, an AT&T 4G LTE smartphone

-------- Original message --------
From: Azuryy <[email protected]>
Date: 03/14/2014 10:45 PM (GMT-05:00)
To: [email protected]
Subject: Re: HA NN Failover question

Which Hadoop version did you use?

Sent from my iPhone5s

> On March 15, 2014, at 9:29, dlmarion <[email protected]> wrote:
>
> Server 1: NN1 and ZKFC1
> Server 2: NN2 and ZKFC2
> Server 3: Journal1 and ZK1
> Server 4: Journal2 and ZK2
> Server 5: Journal3 and ZK3
> Server 6+: Datanode
>
> All in the same rack. I would expect the ZKFC from the active name node
> server to lose its lock and the other ZKFC to tell the standby namenode that
> it should become active (I’m assuming that’s how it works).
>
> - Dave
>
> From: Juan Carlos [mailto:[email protected]]
> Sent: Friday, March 14, 2014 9:12 PM
> To: [email protected]
> Subject: Re: HA NN Failover question
>
> Hi Dave,
> How many ZooKeeper servers do you have, and where are they?
>
> Juan Carlos Fernández Rodríguez
>
> On 15/03/2014, at 01:21, dlmarion <[email protected]> wrote:
>
> I was doing some testing with HA NN today. I set up two NNs with automatic
> failover (ZKFC) using sshfence. I tested that it is working on both NNs by
> doing ‘kill -9 <pid>’ on the active NN. When I did this on the active node,
> the standby would become the active and everything seemed to work. Next, I
> logged onto the active NN and did a ‘service network stop’ to simulate a
> NIC/network failure. The standby did not become the active in this scenario.
> In fact, it remained in standby mode and complained in the log that it could
> not communicate with (what was) the active NN. I was unable to find anything
> relevant via searches in Google or in Jira. Does anyone have experience
> successfully testing this? I’m hoping that it is just a configuration problem.
>
> FWIW, when the network was restarted on the active NN, it failed over almost
> immediately.
>
> Thanks,
>
> Dave
