Re: [Linux-HA] Harmless log entries

2010-05-20 Thread Andrew Beekhof
On Wed, May 19, 2010 at 5:22 PM, mike mgbut...@nbnet.nb.ca wrote: Andrew Beekhof wrote: which is what my DBA was looking for. He wants mysql to failover if there are 3 successive failures of MySQL but only if those successive failures occur within 15 minutes. You want migration-threshold=3

Re: [Linux-HA] Harmless log entries

2010-05-20 Thread Andrew Beekhof
On Wed, May 19, 2010 at 11:30 AM, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Wed, May 19, 2010 at 10:17 AM, Andrew Beekhof and...@beekhof.net wrote: Also, in monitor available fields for a resource there are: - interval, default 0 Does it mean no monitor at all if I don't

Re: [Linux-HA] Harmless log entries

2010-05-20 Thread Andrew Beekhof
On Wed, May 19, 2010 at 2:49 PM, Vadym Chepkov vchep...@gmail.com wrote: On May 19, 2010, at 8:36 AM, mike wrote: I assume Andrew means 15 minutes * 60 = 900 seconds * 1000 = 900000 milliseconds I gathered that much, I am just surprised, that's it. Do I have to always specify time units
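
The unit conversion discussed in this message can be checked with a few lines of Python. This is only a worked-arithmetic sketch; `to_milliseconds` is a hypothetical helper name chosen for illustration and is not part of any cluster tool.

```python
# Worked arithmetic for the 15-minute window discussed above.
# `to_milliseconds` is a hypothetical helper, not a Pacemaker or Heartbeat API.

SECONDS_PER_MINUTE = 60
MILLISECONDS_PER_SECOND = 1000


def to_milliseconds(minutes: int) -> int:
    """Convert whole minutes to milliseconds."""
    seconds = minutes * SECONDS_PER_MINUTE
    return seconds * MILLISECONDS_PER_SECOND


if __name__ == "__main__":
    # 15 minutes -> 900 seconds -> 900000 milliseconds
    print(to_milliseconds(15))
```

Running the script prints the millisecond value for a 15-minute window.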

Re: [Linux-HA] Harmless log entries

2010-05-20 Thread Vadym Chepkov
On May 20, 2010, at 2:53 AM, Andrew Beekhof wrote: On Wed, May 19, 2010 at 2:49 PM, Vadym Chepkov vchep...@gmail.com wrote: On May 19, 2010, at 8:36 AM, mike wrote: I assume Andrew means 15 minutes * 60 = 900 seconds * 1000 = 900000 milliseconds I gathered that much, I am just

Re: [Linux-HA] Harmless log entries

2010-05-20 Thread mike
So, to see if I understand correctly, a couple of scenarios. Assume a failure-timeout of 15 minutes. 1. Let's assume I have 2 failures within 5 minutes and then no failure for 20 minutes afterwards. After those 20 minutes I have a failure. Are you saying no failover will occur at that point and that
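
The scenario in this message can be walked through with a small simulation. This is a deliberately simplified model, not Pacemaker's implementation: it assumes the fail count is cleared once the failure-timeout has elapsed since the most recent failure, and that failover is triggered when the count reaches migration-threshold. The function name and parameters are hypothetical, and a real cluster may apply the expiry at different moments than this model does.

```python
from typing import Iterable, Optional


def first_failover_time(
    failure_times_min: Iterable[float],
    threshold: int = 3,          # models migration-threshold=3
    timeout_min: float = 15.0,   # models a failure-timeout of 15 minutes
) -> Optional[float]:
    """Return the time (in minutes) of the failure that would trigger
    failover under the simplified model, or None if no failover occurs.

    Assumption: the fail count resets when at least `timeout_min` minutes
    have passed since the previous failure.
    """
    count = 0
    last_failure: Optional[float] = None
    for t in sorted(failure_times_min):
        if last_failure is not None and t - last_failure >= timeout_min:
            count = 0  # earlier failures are treated as expired
        count += 1
        last_failure = t
        if count >= threshold:
            return t
    return None


if __name__ == "__main__":
    # Scenario 1: two failures within 5 minutes, a 20-minute gap, then one more.
    print(first_failover_time([0, 5, 25]))
    # Contrast: three failures inside the 15-minute window.
    print(first_failover_time([0, 5, 10]))
```

Under these assumptions the first call reports no failover, because the two early failures have expired by the time the third arrives, while the second call reports failover at the third failure.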

Re: [Linux-HA] Harmless log entries

2010-05-20 Thread mike
ok, I actually went ahead and did a test on my cluster. The results did not occur as I would have expected. I failed ldirectord twice on the main node. I waited 20 minutes and saw this entry in the log file: May 20 08:23:10 lvsuat1a.intranet.mydomain.com pengine: [6589]: notice: get_failcount:

Re: [Linux-HA] Harmless log entries

2010-05-20 Thread Gianluca Cecchi
On Thu, May 20, 2010 at 2:45 PM, mike mgbut...@nbnet.nb.ca wrote: ok, I actually went ahead and did a test on my cluster. The results did not occur as I would have expected. I failed ldirectord twice on the main node. I waited 20 minutes and saw this entry in the log file: May 20 08:23:10

Re: [Linux-HA] Harmless log entries

2010-05-20 Thread mike
Gianluca Cecchi wrote: On Thu, May 20, 2010 at 2:45 PM, mike mgbut...@nbnet.nb.ca wrote: ok, I actually went ahead and did a test on my cluster. The results did not occur as I would have expected. I failed ldirectord twice on the main node. I waited 20 minutes and saw this entry in the

[Linux-HA] Using multiple stonith resources

2010-05-20 Thread Alexander Fisher
Hi. I'm setting up a two node corosync+pacemaker cluster. My servers each have dual PSUs. I've got them connected to a pair of APC networked PDUs. This gives me the option of using both external/ipmi and external/rackpdu. I've configured both and this seems to work fine. If I unplug the PDUs

[Linux-HA] Heartbeat fails to start services after failure

2010-05-20 Thread Joel Daynes
Hello- I'm using heartbeat 3.0.2 with DRBD on Ubuntu 10.04, to manage MySQL. When I fail from the primary (mysqlha-lucid1) to the secondary (mysqlha-lucid2) node (either manually with hb_takeover or by powering off the primary), the DRBD device gets mounted properly on the secondary, but MySQL

Re: [Linux-HA] Heartbeat fails to start services after failure

2010-05-20 Thread Joel Daynes
I may have solved my own problem. It appears that my init script wasn't executing properly, so I instead used the wrapper provided here: http://dev.mysql.com/doc/refman/5.1/en/ha-heartbeat-drbd.html This seems to be working better. Sorry for the false alarm! Joel On May 20, 2010, at 10:56

Re: [Linux-HA] stonith external/rackpdu question

2010-05-20 Thread Greg Woods
On Thu, 2010-05-20 at 18:30 +0100, Alexander Fisher wrote: I think I'll use IPMI and rackpdu in the same configuration. That is exactly what I will eventually try (assuming I ever get any time to work on my test cluster some more). It is clear that, no matter what I do, I cannot prepare for

[Linux-HA] need config file advice

2010-05-20 Thread Alyssa Hardy
Hi, I use a combination of heartbeat/drbd between two webservers so they can take on each other's site if one of them fails. I mainly have it so that a machine or cable failure will not cause downtime. They both use the same route, though, so it kicks in when the main internet connection fails,