[Linux-HA] Location constraint and standby/offline nodes

2011-05-17 Thread Maxim Ianoglo
Hello, I have three nodes: Node_A, Node_B, Node_C. There is a MySQL resource that should run on Node_A (together with a VIP) or on Node_B; Node_C runs some other resources, but that is not important now. The monitor on-fail option for the resource is "standby". I have stopped MySQL on Node_A, so MySQL and the VIP went from Node_A to N
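A configuration matching this description might look like the following crm shell sketch. Only the `on-fail="standby"` monitor option comes from the message; the resource names, VIP address, and scores are assumptions for illustration:

```
primitive p_mysql ocf:heartbeat:mysql \
    op monitor interval="30s" on-fail="standby"
primitive p_vip ocf:heartbeat:IPaddr2 \
    params ip="192.0.2.100" cidr_netmask="24"
group g_mysql p_vip p_mysql
# Prefer Node_A, allow Node_B, never run on Node_C
location loc_mysql_a g_mysql 100: Node_A
location loc_mysql_b g_mysql 50: Node_B
location loc_mysql_c g_mysql -inf: Node_C
```

With `on-fail="standby"`, a failed monitor puts the whole node into standby, so every resource on that node (not just MySQL) migrates away.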

Re: [Linux-HA] Need HA Help - standby / online not switching automatically

2011-05-17 Thread Randy Katz
Configs as follows, drbd.conf: global { usage-count no; # minor-count dialog-refresh disable-ip-verification } resource r0 { protocol C; syncer { rate 4M; } startup { wfc-timeout 15; degr-wfc-timeout 60; } net { c
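The preview is cut off inside the `net` section. For reference, a commonly seen `net` block in DRBD 8.3-era configurations looks roughly like this; these are generic example settings, not the poster's actual (truncated) values:

```
net {
    cram-hmac-alg sha1;           # authenticate peers
    shared-secret "some-secret";
    # split-brain recovery policies:
    after-sb-0pri discard-younger-primary;
    after-sb-1pri discard-secondary;
}
```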

Re: [Linux-HA] Need HA Help - standby / online not switching automatically

2011-05-17 Thread Michael Schwartzkopff
> If I do on ha2: crm node online ha2.iohost.com it starts the VIP, it will ping, but does not do the DRBD mounts and does not start the web or mysql services. If I then issue crm node online ha1.iohost.com on ha1, it will make ha2 online with all services active! Then if I make ha2 stand

Re: [Linux-HA] Need HA Help - standby / online not switching automatically

2011-05-17 Thread Randy Katz
If I do on ha2: crm node online ha2.iohost.com it starts the VIP, it will ping, but does not do the DRBD mounts and does not start the web or mysql services. If I then issue crm node online ha1.iohost.com on ha1, it will make ha2 online with all services active! Then if I make ha2 standby ha1 wi
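The failover test being described boils down to this command sequence (hostnames taken from the thread; these must run against a live Pacemaker cluster, shown here only as a recipe):

```
crm node standby ha1.iohost.com   # evacuate ha1; resources should move to ha2
crm_mon -1                        # one-shot status: check VIP, DRBD, web, mysql on ha2
crm node online ha1.iohost.com    # bring ha1 back as a failover target
```

If the VIP starts on ha2 but the DRBD mounts and services do not, the usual suspects are missing or wrong ordering/colocation constraints between the DRBD master role, the filesystem, and the services.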

Re: [Linux-HA] Need HA Help - standby / online not switching automatically

2011-05-17 Thread Randy Katz
In the logs on ha2, at the time of crm node standby ha1, I see: May 18 10:32:54 ha2.iohost.com cib: [2378]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-25.raw May 18 10:32:54 ha2.iohost.com cib: [2378]: info: write_cib_contents: Wrote version 0.102.0 of the CIB

[Linux-HA] Need HA Help - standby / online not switching automatically

2011-05-17 Thread Randy Katz
Hi, I'm relatively new to HA, though I have been using Xen and reading this list here and there; now I need some help. I have 2 physical nodes, let's call them node1/node2. In each I have VMs (Xen paravirt: ha1 & ha2). In each VM I have 2 LVs which are DRBD'd (r0 and r1: mysql data and html data). There

Re: [Linux-HA] Antw: Re: Massive amount of log messages after node failure

2011-05-17 Thread Lars Marowsky-Bree
On 2011-05-17T17:16:51, Ulrich Windl wrote: > I think that pacemaker is logging too much all the time, so you can hardly > find out if there really is a problem. For example, external/sbd logs a > message every time the shared disk is OK, that is every 30s or so. It should not - the exter

Re: [Linux-HA] Massive amount of log messages after node failure

2011-05-17 Thread Lars Marowsky-Bree
On 2011-05-17T15:25:22, Sascha Hagedorn wrote: > Hi everyone, > > I have a two node cluster running with pacemaker 1.1.2 and DRBD 8.3.7. It is > an active/active cluster so the DRBD partition is used with OCFS2. For > testing purposes I have configured external/ssh as a Stonith device. I did

[Linux-HA] Antw: Re: Massive amount of log messages after node failure

2011-05-17 Thread Ulrich Windl
Hi! I think that pacemaker is logging too much all the time, so you can hardly find out if there really is a problem. For example, external/sbd logs a message every time the shared disk is OK, that is every 30s or so. In contrast (just as an example), HP ServiceGuard only logs if the status of

[Linux-HA] IPaddr not work correctly on large routing tables on Linux

2011-05-17 Thread Alexander Polyakov
IPaddr does not work correctly with large routing tables on Linux. If the routing table is large (e.g. a full BGP view imported from Quagga), the node cannot remove the public IP address. In `top`, two processes hang: route and grep "remove ip ". Because of this, there is no valid switch
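The slow path here is dumping the entire routing/address table and piping it through `grep` over hundreds of thousands of lines. Asking the kernel about the one address directly avoids the scan entirely; on iproute2 systems that is a single targeted query. The interface and address below are made-up examples, and the last part is a tiny self-contained demonstration of the anchored-grep check over a simulated table dump:

```shell
# Hypothetical address/interface:
# Slow: dump every entry, then grep
#   route -n | grep '^192\.0\.2\.10 '
# Fast: ask the kernel only about the address in question
#   ip -o addr show dev eth0 to 192.0.2.10/32

# Simulated (tiny) table dump; a full BGP view would be ~400k lines:
routes='192.0.2.10      0.0.0.0         255.255.255.255 UH 0 0 0 eth0
10.0.0.0        0.0.0.0         255.0.0.0       U  0 0 0 eth0'

# Anchored match: the pattern can only succeed at the start of a line
count=$(printf '%s\n' "$routes" | grep -c '^192\.0\.2\.10 ')
echo "$count"
```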

Re: [Linux-HA] Massive amount of log messages after node failure

2011-05-17 Thread Dimitri Maziuk
On 5/17/2011 8:25 AM, Sascha Hagedorn wrote: > Hi everyone, ... > - Pulled the HA network cable > - Put it back after a couple of seconds > > Result: > - Node 2 is being restarted > - Load average on Node 1 increases until the system becomes unreachable -

[Linux-HA] Massive amount of log messages after node failure

2011-05-17 Thread Sascha Hagedorn
Hi everyone, I have a two-node cluster running pacemaker 1.1.2 and DRBD 8.3.7. It is an active/active cluster, so the DRBD partition is used with OCFS2. For testing purposes I have configured external/ssh as a STONITH device. I did the following test, resulting in the surviving node becoming
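An external/ssh STONITH device for a two-node test setup is typically declared along these lines (node names are examples; note that external/ssh is only suitable for testing, since it cannot fence a node whose kernel has hung):

```
primitive st_ssh stonith:external/ssh \
    params hostlist="node1 node2" \
    op monitor interval="60s"
clone cl_st_ssh st_ssh
property stonith-enabled="true"
```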

Re: [Linux-HA] Always reload/restart a resource

2011-05-17 Thread Eric Warnke
If the script is dynamically changing smb.conf, why not have the script SIGHUP smbd at the same time? If you want something to happen regularly while the service is active, why not modify the monitor portion of the resource script? Cheers, Eric On 5/13/11 3:52 AM, "Michael Liebl" wrote: >Hello
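Folding the reload into the agent's monitor action, as suggested, could look like this sketch of an OCF-style shell function. The function name, paths, and the mtime-stamp trick are assumptions; a real agent should only signal smbd when the config actually changed, rather than on every monitor cycle:

```
smb_monitor() {
    pid=$(pidof -s smbd) || return 7        # OCF_NOT_RUNNING
    # Hypothetical: if smb.conf is newer than the last applied stamp,
    # tell smbd to re-read its configuration.
    if [ /etc/samba/smb.conf -nt /var/run/smb.conf.applied ]; then
        kill -HUP "$pid" && touch /var/run/smb.conf.applied
    fi
    return 0                                # OCF_SUCCESS
}
```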