[Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Peter Kruse
Hello, thanks for reading this. As this concerns the ancient v2.0.5, please tell me that this problem cannot happen with a recent version of heartbeat. Problem description: yesterday, in one of our two-node HA clusters, a successful takeover happened in which the failed node was reset; so far so good.

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Andrew Beekhof
On 4/19/07, Peter Kruse [EMAIL PROTECTED] wrote:
> Hello, thanks for reading this. As this concerns the ancient v2.0.5, please tell me that this problem cannot happen with a recent version of heartbeat. Problem description: yesterday, in one of our two-node HA clusters, a successful takeover happened, where

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Peter Kruse
Hi Andrew!
Andrew Beekhof wrote:
> beosrv-c-2 is the failed node right?
it was beosrv-c-1 that failed, beosrv-c-2 took over.
> do you have logs from there too?
attached (messages about Gmain_timeout removed, there were too many of them)
The problem now is that cibadmin -m reports: CIB on

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Andrew Beekhof
On 4/19/07, Peter Kruse [EMAIL PROTECTED] wrote:
> Hi Andrew!
> Andrew Beekhof wrote:
>> beosrv-c-2 is the failed node right?
> it was beosrv-c-1 that failed, beosrv-c-2 took over.
then i'm afraid your use of the dont fence nodes on startup option has come back to haunt you
beosrv-c-1 came up but

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Peter Kruse
Andrew Beekhof wrote:
> then i'm afraid your use of the dont fence nodes on startup option has come back to haunt you
> beosrv-c-1 came up but was not able to find beosrv-c-2 (even though it _was_ running) and because of that option beosrv-c-1 just pretended beosrv-c-2 wasn't running and happily

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Alan Robertson
Peter Kruse wrote:
> Andrew Beekhof wrote:
>> then i'm afraid your use of the dont fence nodes on startup option has come back to haunt you
>> beosrv-c-1 came up but was not able to find beosrv-c-2 (even though it _was_ running) and because of that option beosrv-c-1 just pretended beosrv-c-2 wasn't