Re: [ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

2016-10-07 Thread Martin Schlegel
Thanks for all the responses from Jan, Ulrich and Digimer! We are already using bonded network interfaces, but we are also forced to go across IP subnets. Certain routes between routers can go missing, and have done so in the past. This happened to one of our nodes' public network, which became inaccessible
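
Since the trouble described here is routes silently disappearing between subnets, it can help to watch reachability on both ring networks independently of Corosync's own fault detection. A minimal shell sketch of such a check follows; the peer addresses are placeholders, not the poster's actual ring addresses:

    #!/bin/sh
    # Hypothetical check: print the Corosync ring status and warn when a peer
    # on either ring subnet stops answering ICMP. Addresses are examples only.
    RING0_PEERS="192.168.10.2 192.168.10.3"
    RING1_PEERS="192.168.20.2 192.168.20.3"

    # show the status of each heartbeat ring on this node
    corosync-cfgtool -s

    for peer in $RING0_PEERS $RING1_PEERS; do
        ping -c 1 -W 1 "$peer" >/dev/null 2>&1 \
            || echo "WARNING: no reply from ring peer $peer"
    done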

Re: [ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

2016-10-07 Thread Jan Friesse
Martin Schlegel wrote: Thanks for the confirmation Jan, but this sounds a bit scary to me! Spinning this experiment a bit further ... Would this not also mean that with a passive RRP with 2 rings it only takes 2 different nodes that are not able to communicate on different networks at the
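
For readers following the thread: the state of each ring, and re-enabling a ring once the underlying network problem has been fixed, can be handled with the stock corosync-cfgtool. A minimal sketch, with output naturally varying per cluster:

    # show the status of each ring on the local node (look for FAULTY)
    corosync-cfgtool -s

    # re-enable redundant ring operation cluster-wide after a fault
    corosync-cfgtool -r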

Re: [ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

2016-10-06 Thread Dimitri Maziuk
PS. In security, handling everything at one (high) level is known as a "hard crunchy shell with a soft chewy center". It's not seen as a good thing. -- Dimitri Maziuk, Programmer/sysadmin, BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu

Re: [ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

2016-10-06 Thread Dimitri Maziuk
On 10/06/2016 11:25 AM, Klaus Wenninger wrote: > But it is convenient because all layers on top can be completely > agnostic of the duplication. It's also cheap: failing over a node, especially when taking over involves replaying a database log, or even just re-establishing a bunch of NFS connections,

Re: [ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

2016-10-06 Thread Dimitri Maziuk
On 10/06/2016 09:26 AM, Klaus Wenninger wrote: > Usually one - at least I have, so far - would rather think that having > the awareness of redundancy/cluster as high up as possible in the > protocol/application stack would open up possibilities for more > appropriate reactions. The obvious

Re: [ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

2016-10-06 Thread Martin Schlegel
Thanks for the confirmation Jan, but this sounds a bit scary to me! Spinning this experiment a bit further ... Would this not also mean that with a passive RRP with 2 rings it only takes 2 different nodes that are not able to communicate on different networks at the same time to have all rings

[ClusterLabs] Corosync ring shown faulty between healthy nodes & networks (rrp_mode: passive)

2016-10-04 Thread Martin Schlegel
Hello all, I am trying to understand the following 2 Corosync heartbeat ring failure scenarios I have been testing, and I hope somebody can explain why this behaviour makes sense. Consider the following cluster: * 3x nodes: A, B and C * 2x NICs for each node * Corosync 2.3.5 configured
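
For context, a minimal corosync.conf sketch for such a layout (two rings on separate subnets with rrp_mode: passive) might look like the following. The subnets, node addresses and unicast transport are assumptions for illustration, not the poster's actual configuration:

    totem {
        version: 2
        # redundant ring mode under discussion in this thread
        rrp_mode: passive
        # assumed unicast transport
        transport: udpu

        # hypothetical subnet carrying ring 0
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.10.0
            mcastport: 5405
        }
        # hypothetical subnet carrying ring 1
        interface {
            ringnumber: 1
            bindnetaddr: 192.168.20.0
            mcastport: 5407
        }
    }

    # nodes A, B and C, each reachable on both subnets
    nodelist {
        node {
            nodeid: 1
            ring0_addr: 192.168.10.1
            ring1_addr: 192.168.20.1
        }
        node {
            nodeid: 2
            ring0_addr: 192.168.10.2
            ring1_addr: 192.168.20.2
        }
        node {
            nodeid: 3
            ring0_addr: 192.168.10.3
            ring1_addr: 192.168.20.3
        }
    }

    quorum {
        provider: corosync_votequorum
    }

In passive mode each message is sent over one ring at a time rather than duplicated on all rings, and a ring fault is tracked cluster-wide, which is the behaviour the follow-up messages in this thread discuss.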