Re: [ClusterLabs] Virtual ip resource restarted on node with down network device

Dejan Muhamedagic Tue, 20 Sep 2016 08:40:47 -0700

On Tue, Sep 20, 2016 at 01:13:23PM +0000, Auer, Jens wrote:
> Hi,
> 
> >> I've decided to create two answers for the two problems. The cluster
> >> still fails to relocate the resource after unloading the modules even
> >> with resource-agents 3.9.7
> > From the point of view of the resource agent,
> > you configured it to use a non-existing network.
> > Which it considers to be a configuration error,
> > which is treated by pacemaker as
> > "don't try to restart anywhere
> > but let someone else configure it properly, first".
> > Still, I have yet to see what scenario you are trying to test here.
> > To me, this still looks like "scenario evil admin".  If so, I'd not even
> > try, at least not on the pacemaker configuration level.
> It's not evil admin as this would not make sense. I am trying to find a way 
> to force a failover condition e.g. by simulating a network card defect or 
> network outage without running to the server room every time.


Better to use iptables. Bringing the interface down is not the same
as a network card going bad.
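
A rough sketch of what that could look like, written in Python only so
the commands can be listed and reviewed before anything is run. The
interface name bond0 and the helper name outage_rules are placeholders,
not taken from your setup. The rules drop all traffic entering or
leaving that interface, and the printed commands would have to be run
as root.

```python
def outage_rules(iface, action="insert"):
    """Build iptables commands that drop all traffic on one interface.

    action="insert" adds the DROP rules (simulated outage);
    action="delete" removes the same rules again.
    The interface name is whatever the caller passes in (an assumption).
    """
    flag = {"insert": "-I", "delete": "-D"}[action]
    return [
        ["iptables", flag, "INPUT", "-i", iface, "-j", "DROP"],
        ["iptables", flag, "OUTPUT", "-o", iface, "-j", "DROP"],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them (dry run).
    for cmd in outage_rules("bond0") + outage_rules("bond0", "delete"):
        print(" ".join(cmd))
```

Unlike unloading the driver, the device and its addresses stay in
place, so only connectivity goes away.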

Thanks,

Dejan

> > CONFIDENTIALITY NOTICE:
> > Oh please :-/
> > This is a public mailing list.
> Sorry, this is a standard disclaimer I usually remove. We are forced to add 
> this to e-mails, but I think this is fairly common for commercial companies.
> 
> >> Also the netmask and the ip address are wrong. I have configured the
> >> device to 192.168.120.10 with netmask 192.168.120.10. How does IpAddr2
> >> get the wrong configuration? I have no idea.
> >A netmask of "192.168.120.10" is nonsense.
> >That is the address, not a mask.
> Oops, my fault when writing the e-mail. Obviously this is the address. The 
> configured netmask for the device is 255.255.255.0, but after IPaddr2 brings 
> it up again it is 255.255.255.255 which is not what I configured in the 
> network configuration.
> 
> > Also, according to some posts back,
> > you have configured it in pacemaker with
> > cidr_netmask=32, which is not particularly useful either.
> Thanks for pointing this out. I copied the parameters from the 
> manual/tutorial, but did not think about the values.
> 
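
For reference, 255.255.255.255 is simply a prefix length of 32 written
in dotted form, so the mask seen after the restart presumably comes
from cidr_netmask=32 rather than from the system network
configuration. A minimal check with Python's standard ipaddress
module (the helper name dotted_netmask is made up for illustration):

```python
import ipaddress


def dotted_netmask(address, prefix_len):
    """Return the dotted-quad netmask for address/prefix_len."""
    iface = ipaddress.ip_interface(f"{address}/{prefix_len}")
    return str(iface.netmask)


if __name__ == "__main__":
    # Compare the value from the pacemaker config (32) with the
    # mask configured on the device (255.255.255.0, i.e. /24).
    for prefix in (32, 24):
        print(prefix, dotted_netmask("192.168.120.10", prefix))
```

With cidr_netmask=24 the address would come up with the same mask as
the one configured for the device.
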
> > Again: the IPaddr2 resource agent is supposed to control the assignment
> > of an IP address, hence the name.
> > It is not supposed to create or destroy network interfaces,
> > or configure bonding, or bridges, or anything like that.
> > In fact, it is not even supposed to bring up or down the interfaces,
> > even though for "convenience" it seems to do "ip link set up".
> This is what made me wonder in the beginning. When I bring down the device, 
> this leads to a failure of the resource agent which is exactly what I 
> expected. I did not expect it to bring the device up again, and definitely 
> not to ignore the default network configuration.
> 
> > Monitoring connectivity, or dealing with removed interface drivers,
> > or unplugged devices, or whatnot, has to be dealt with elsewhere.
> I am using a ping daemon for that. 
> 
> > What you did is: down the bond, remove all slave assignments, even
> > remove the driver, and expect the resource agent to "heal" things that
> > it does not know about. It can not.
> I am not expecting the RA to heal anything. How could it? And why would I 
> expect it? In fact I am expecting the opposite, that is, a consistent failure 
> when the device is down. This may also be wrong because you can assign IP 
> addresses to downed devices.
> 
> My initial expectation was that the resource cannot be started when the 
> device is down and then is relocated. I think this is more or less the core 
> functionality of the cluster. I can see a reason why it does not switch to 
> another node when there is a configuration error in the cluster because it is 
> fair to assume that the configuration is identical (wrong) on all nodes. But 
> what happens if the network device is broken? The server would start, fail to 
> assign the ip address and then prevent the whole cluster from working? What 
> happens if the network card breaks while the cluster is running? 
> 
> Best wishes,
>   Jens
> 
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
