I assume NN2 is the standby. Please check that ZKFC2 is alive before you stop the network on NN1.
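
One way to run that check, as a rough sketch only: on Server 2, look for the ZKFC daemon in the output of the JDK's jps tool, where it normally appears under the class name DFSZKFailoverController. The helper names below (zkfc_is_running, check_local_zkfc) are made up for illustration, and the sketch assumes jps is on the PATH and is run as the user that owns the Hadoop daemons.

```python
import subprocess

# Main class name that jps reports for the HDFS ZKFC daemon.
ZKFC_CLASS = "DFSZKFailoverController"


def zkfc_is_running(jps_output: str) -> bool:
    """Return True if any line of jps output names the ZKFC process."""
    for line in jps_output.splitlines():
        parts = line.split()
        # jps prints "<pid> <main class>"; some JVMs show only a pid.
        if len(parts) >= 2 and parts[1] == ZKFC_CLASS:
            return True
    return False


def check_local_zkfc() -> bool:
    """Run jps on this host and report whether a ZKFC is listed.

    Assumption: jps is on PATH and can see the Hadoop JVMs
    (jps only lists JVMs owned by the calling user).
    """
    result = subprocess.run(["jps"], capture_output=True, text=True, check=True)
    return zkfc_is_running(result.stdout)
```

Calling check_local_zkfc() on Server 2 before stopping the network on NN1 would confirm that ZKFC2 is up. Separately, `hdfs haadmin -getServiceState <serviceId>` reports whether a given NameNode is active or standby; the service ID depends on how dfs.ha.namenodes is configured in your cluster.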

> On March 15, 2014, at 10:53, dlmarion <[email protected]> wrote:
> 
> Apache Hadoop 2.3.0
> 
> 
> -------- Original message --------
> From: Azuryy 
> Date: 03/14/2014 10:45 PM (GMT-05:00)
> To: [email protected] 
> Subject: Re: HA NN Failover question 
> 
> Which Hadoop version are you using?
> 
> 
> On March 15, 2014, at 9:29, dlmarion <[email protected]> wrote:
> 
>> Server 1: NN1 and ZKFC1
>> Server 2: NN2 and ZKFC2
>> Server 3: Journal1 and ZK1
>> Server 4: Journal2 and ZK2
>> Server 5: Journal3 and ZK3
>> Server 6+: Datanode
>>  
>> All in the same rack. I would expect the ZKFC from the active name node 
>> server to lose its lock and the other ZKFC to tell the standby namenode that 
>> it should become active (I’m assuming that’s how it works).
>>  
>> - Dave
>>  
>> From: Juan Carlos [mailto:[email protected]] 
>> Sent: Friday, March 14, 2014 9:12 PM
>> To: [email protected]
>> Subject: Re: HA NN Failover question
>>  
>> Hi Dave,
>> How many ZooKeeper servers do you have, and where are they?
>> 
>> Juan Carlos Fernández Rodríguez
>> 
>> On 15/03/2014, at 01:21, dlmarion <[email protected]> wrote:
>> 
>> I was doing some testing with HA NN today. I set up two NNs with active 
>> failover (ZKFC) using sshfence. I tested that it's working on both NNs by 
>> doing ‘kill -9 <pid>’ on the active NN. When I did this on the active node, 
>> the standby would become the active and everything seemed to work. Next, I 
>> logged onto the active NN and did a ‘service network stop’ to simulate a 
>> NIC/network failure. The standby did not become the active in this scenario. 
>> In fact, it remained in standby mode and complained in the log that it could 
>> not communicate with (what was) the active NN. I was unable to find anything 
>> relevant by searching Google or Jira. Does anyone have experience 
>> successfully testing this? I’m hoping that it is just a configuration 
>> problem.
>>  
>> FWIW, when the network was restarted on the active NN, it failed over almost 
>> immediately.
>>  
>> Thanks,
>>  
>> Dave
