Re: [ClusterLabs] [rgmanager] Recovering a failed (but running) server in rgmanager

2016-09-19 Thread Digimer
On 19/09/16 03:13 PM, Digimer wrote:
> On 19/09/16 03:07 PM, Digimer wrote:
>> On 19/09/16 02:39 PM, Digimer wrote:
>>> On 19/09/16 02:30 PM, Jan Pokorný wrote:
>>>> On 18/09/16 15:37 -0400, Digimer wrote:
>>>>> If, for example, a server's definition file is corrupted while the
>>>>> server is

Re: [ClusterLabs] [rgmanager] Recovering a failed (but running) server in rgmanager

2016-09-19 Thread Digimer
On 19/09/16 03:07 PM, Digimer wrote:
> On 19/09/16 02:39 PM, Digimer wrote:
>> On 19/09/16 02:30 PM, Jan Pokorný wrote:
>>> On 18/09/16 15:37 -0400, Digimer wrote:
>>>> If, for example, a server's definition file is corrupted while the
>>>> server is running, rgmanager will put the server into a

Re: [ClusterLabs] [rgmanager] Recovering a failed (but running) server in rgmanager

2016-09-19 Thread Digimer
On 19/09/16 02:30 PM, Jan Pokorný wrote:
> On 18/09/16 15:37 -0400, Digimer wrote:
>> If, for example, a server's definition file is corrupted while the
>> server is running, rgmanager will put the server into a 'failed' state.
>> That's fine and fair.
>
> Please, be more precise. Is it "vm"

Re: [ClusterLabs] Virtual ip resource restarted on node with down network device

2016-09-19 Thread Ken Gaillot
On 09/19/2016 10:04 AM, Jan Pokorný wrote:
> On 19/09/16 10:18 +, Auer, Jens wrote:
>> Ok, after reading the log files again I found
>>
>> Sep 19 10:03:45 MDA1PFP-S01 crmd[7797]: notice: Initiating action 3: stop
>> mda-ip_stop_0 on MDA1PFP-PCS01 (local)
>> Sep 19 10:03:45 MDA1PFP-S01

Re: [ClusterLabs] No DRBD resource promoted to master in Active/Passive setup

2016-09-19 Thread Ken Gaillot
On 09/19/2016 09:48 AM, Auer, Jens wrote:
> Hi,
>
>> Is the network interface being taken down here used for corosync
>> communication? If so, that is a node-level failure, and pacemaker will
>> fence.
>
> We have different connections on each server:
> - A bonded 10GB network card for data

Re: [ClusterLabs] No DRBD resource promoted to master in Active/Passive setup

2016-09-19 Thread Auer, Jens
Hi,
> Is the network interface being taken down here used for corosync
> communication? If so, that is a node-level failure, and pacemaker will
> fence.
We have different connections on each server:
- A bonded 10GB network card for data traffic that will be accessed via a virtual ip managed by

Re: [ClusterLabs] Virtual ip resource restarted on node with down network device

2016-09-19 Thread Auer, Jens
Hi,
>> After the restart ifconfig still shows the device bond0 to be not RUNNING:
>> MDA1PFP-S01 09:07:54 2127 0 ~ # ifconfig
>> bond0: flags=5123 mtu 1500
>> inet 192.168.120.20 netmask 255.255.255.255 broadcast 0.0.0.0
>> ether a6:17:2c:2a:72:fc

Re: [ClusterLabs] Virtual ip resource restarted on node with down network device

2016-09-19 Thread Lars Ellenberg
On Mon, Sep 19, 2016 at 02:57:57PM +0200, Jan Pokorný wrote:
> On 19/09/16 09:15 +, Auer, Jens wrote:
> > After the restart ifconfig still shows the device bond0 to be not RUNNING:
> > MDA1PFP-S01 09:07:54 2127 0 ~ # ifconfig
> > bond0: flags=5123 mtu 1500
> >

Re: [ClusterLabs] Virtual ip resource restarted on node with down network device

2016-09-19 Thread Jan Pokorný
On 19/09/16 09:15 +, Auer, Jens wrote:
> After the restart ifconfig still shows the device bond0 to be not RUNNING:
> MDA1PFP-S01 09:07:54 2127 0 ~ # ifconfig
> bond0: flags=5123 mtu 1500
> inet 192.168.120.20 netmask 255.255.255.255 broadcast 0.0.0.0

Re: [ClusterLabs] where do I find the null fencing device?

2016-09-19 Thread Klaus Wenninger
On 09/17/2016 04:35 PM, Dan Swartzendruber wrote:
>
> I wanted to do some experiments, and the null fencing agent seemed to
> be just what I wanted. I don't find it anywhere, even after
> installing fence-agents-all and cluster-glue (this is on CentOS 7,
> btw...) Thanks...
>
Depending on what