Re: [Pacemaker] How to really deal with gateway restarts?

2010-06-15 Thread Andrew Beekhof
On Mon, Jun 14, 2010 at 9:26 PM, Maros Timko tim...@gmail.com wrote: Date: Mon, 14 Jun 2010 08:13:59 +0200 From: Andrew Beekhof and...@beekhof.net To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] How to really deal with gateway restarts?

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Andrew Beekhof
On Mon, Jun 14, 2010 at 4:22 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 7, 2010, at 8:04 AM, Vadym Chepkov wrote: I filed bug 2435, glad to hear it's not me Andrew closed this bug (http://developerbugs.linux-foundation.org/show_bug.cgi?id=2435) as resolved, but I respectfully

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Andreas Kurz
On Tuesday 15 June 2010 08:40:58 Andrew Beekhof wrote: On Mon, Jun 14, 2010 at 4:22 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 7, 2010, at 8:04 AM, Vadym Chepkov wrote: I filed bug 2435, glad to hear it's not me Andrew closed this bug

[Pacemaker] question from NN

2010-06-15 Thread Michail Bogatyrev
Good day! I have 2 servers: Main and Backup. Both servers have 2 Ethernet controllers. One controller is used for LAN, the second for Internet. IP addresses: Main: 192.168.104.101/27 89.151.191.133/29 Backup: 192.168.104.102/27 89.151.191.134/29 shared IP for heartbeat: 192.168.104.100 and

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Andrew Beekhof
On Tue, Jun 15, 2010 at 10:23 AM, Andreas Kurz andreas.k...@linbit.com wrote: On Tuesday 15 June 2010 08:40:58 Andrew Beekhof wrote: On Mon, Jun 14, 2010 at 4:22 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 7, 2010, at 8:04 AM, Vadym Chepkov wrote: I filed bug 2435, glad to hear it's

Re: [Pacemaker] question from NN

2010-06-15 Thread Andrew Beekhof
2010/6/15 Michail Bogatyrev bogat...@mfisoft.ru: Good day! I have 2 servers: Main and Backup. Both servers have 2 Ethernet controllers. One controller is used for LAN, the second for Internet. IP addresses: Main: 192.168.104.101/27 89.151.191.133/29 Backup: 192.168.104.102/27

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Dejan Muhamedagic
Hi, On Tue, Jun 15, 2010 at 10:57:47AM +0200, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 10:23 AM, Andreas Kurz andreas.k...@linbit.com wrote: On Tuesday 15 June 2010 08:40:58 Andrew Beekhof wrote: On Mon, Jun 14, 2010 at 4:22 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 7,

Re: [Pacemaker] UPDATE...2 node cluster with clvm, configuration help needed...

2010-06-15 Thread Dejan Muhamedagic
Hi, On Tue, Jun 15, 2010 at 11:09:15AM +0200, patrik.rappo...@knapp.com wrote: hi guys, my colleague gave me a tip that the stonith resource on node 1, when node 2 is offline, won't work because of a false state (can't reach the asm module of node 2) and so the other resources (vg, lv)

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Andrew Beekhof
On Tue, Jun 15, 2010 at 12:14 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Tue, Jun 15, 2010 at 10:57:47AM +0200, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 10:23 AM, Andreas Kurz andreas.k...@linbit.com wrote: On Tuesday 15 June 2010 08:40:58 Andrew Beekhof wrote: On Mon,

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Dejan Muhamedagic
On Tue, Jun 15, 2010 at 12:30:45PM +0200, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 12:14 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Tue, Jun 15, 2010 at 10:57:47AM +0200, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 10:23 AM, Andreas Kurz andreas.k...@linbit.com

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Andrew Beekhof
On Tue, Jun 15, 2010 at 12:39 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: On Tue, Jun 15, 2010 at 12:30:45PM +0200, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 12:14 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Tue, Jun 15, 2010 at 10:57:47AM +0200, Andrew Beekhof

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Vadym Chepkov
On Jun 15, 2010, at 4:57 AM, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 10:23 AM, Andreas Kurz andreas.k...@linbit.com wrote: On Tuesday 15 June 2010 08:40:58 Andrew Beekhof wrote: On Mon, Jun 14, 2010 at 4:22 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 7, 2010, at 8:04 AM, Vadym

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Andrew Beekhof
On Tue, Jun 15, 2010 at 1:38 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 15, 2010, at 4:57 AM, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 10:23 AM, Andreas Kurz andreas.k...@linbit.com wrote: On Tuesday 15 June 2010 08:40:58 Andrew Beekhof wrote: On Mon, Jun 14, 2010 at 4:22 PM,

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Gianluca Cecchi
On Tue, Jun 15, 2010 at 1:50 PM, Andrew Beekhof and...@beekhof.net wrote: [snip] Score = -inf, plus the patch, plus sequential = true (or unset). Not sure how that looks in shell syntax though. Which patch?

[Pacemaker] after update one node in crm is getting offline CentOS

2010-06-15 Thread Testuser SST
Hi, I have just made an update from heartbeat 2.x to the latest pacemaker with heartbeat and corosync from the clusterlabs repo on a 2 node CentOS-Cluster. (uninstall the heartbeat rpm, yum install from the new repo) The Cluster is holding one IP-resource. When I start the first node with

[Pacemaker] Solved ! Fwd: after update one node in crm is getting offline CentOS

2010-06-15 Thread Testuser SST
Original Message Date: Tue, 15 Jun 2010 14:13:19 +0200 From: Testuser SST fatcha...@gmx.de To: pacemaker@oss.clusterlabs.org Subject: [Pacemaker] after update one node in crm is getting offline CentOS Hi, I have just made an update from heartbeat 2.x to the latest pacemaker

[Pacemaker] Solved !!! Fwd: after update one node in crm is getting offline CentOS

2010-06-15 Thread Testuser SST
Hi, I'm sorry, but the problem was caused by some kind of watchdog script which stopped the heartbeat service. Kind Regards f_c Original Message Date: Tue, 15 Jun 2010 14:13:19 +0200 From: Testuser SST fatcha...@gmx.de To: pacemaker@oss.clusterlabs.org Subject:

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Vadym Chepkov
On Jun 15, 2010, at 6:14 AM, Dejan Muhamedagic wrote: Hi, On Tue, Jun 15, 2010 at 10:57:47AM +0200, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 10:23 AM, Andreas Kurz andreas.k...@linbit.com wrote: On Tuesday 15 June 2010 08:40:58 Andrew Beekhof wrote: On Mon, Jun 14, 2010 at 4:22 PM,

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Vadym Chepkov
On Jun 15, 2010, at 8:11 AM, Gianluca Cecchi wrote: On Tue, Jun 15, 2010 at 1:50 PM, Andrew Beekhof and...@beekhof.net wrote: [snip] Score = -inf, plus the patch, plus sequential = true (or unset). Not sure how that looks in shell syntax though. Which patch?

Re: [Pacemaker] SBD Fencing daemon: explain me more clear

2010-06-15 Thread Lars Marowsky-Bree
On 2010-06-14T17:24:16, Aleksey Zholdak alek...@zholdak.com wrote: Hi Aleksey, Can anybody explain, more clearly than the official and (IMHO) outdated page http://www.linux-ha.org/wiki/SBD_Fencing, the following: what timeouts must I specify if my multipath needs from 90 to 160 secs to be switched

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Andrew Beekhof
On Tue, Jun 15, 2010 at 2:57 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 15, 2010, at 7:50 AM, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 1:38 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 15, 2010, at 4:57 AM, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 10:23 AM, Andreas

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Vadym Chepkov
On Tue, Jun 15, 2010 at 9:14 AM, Andrew Beekhof and...@beekhof.net wrote: On Tue, Jun 15, 2010 at 2:57 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 15, 2010, at 7:50 AM, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 1:38 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 15, 2010, at

Re: [Pacemaker] SBD Fencing daemon: explain me more clear

2010-06-15 Thread Aleksey Zholdak
Can anybody explain, more clearly than the official and (IMHO) outdated page http://www.linux-ha.org/wiki/SBD_Fencing, the following: what timeouts must I specify if my multipath needs from 90 to 160 secs to be switched off the dead path... The timeouts below may be wrong because sometimes node1 kills node2

Re: [Pacemaker] SBD Fencing daemon: explain me more clear

2010-06-15 Thread Lars Marowsky-Bree
On 2010-06-15T16:32:12, Aleksey Zholdak alek...@zholdak.com wrote: Timeout (watchdog) : 180 Timeout (allocate) : 2 Timeout (loop) : 10 Timeout (msgwait) : 200 But I see, that node1 resets node2 (or vice versa, or each other) when it does not update its slot for 10 seconds... sbd

Re: [Pacemaker] SBD Fencing daemon: explain me more clear

2010-06-15 Thread Gianluca Cecchi
On Tue, Jun 15, 2010 at 3:36 PM, Lars Marowsky-Bree l...@novell.com wrote: On 2010-06-15T16:32:12, Aleksey Zholdak alek...@zholdak.com wrote: [snip] Why is the MPIO scenario so slow? These questions need to be put to the mptsas developers (novell + hp) You should really file a service

Re: [Pacemaker] SBD Fencing daemon: explain me more clear

2010-06-15 Thread Aleksey Zholdak
Can you elaborate on "does not update its slot for 10 seconds" more clearly? Unfortunately, how sbd works is not described in detail anywhere, so many things can only be guessed ... So I could have misunderstood the logic of how it works ... And anyway - I've finally got confused. If I set 180 secs to

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Gianluca Cecchi
On Tue, Jun 15, 2010 at 2:48 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 15, 2010, at 8:11 AM, Gianluca Cecchi wrote: On Tue, Jun 15, 2010 at 1:50 PM, Andrew Beekhof and...@beekhof.net wrote: [snip] Score = -inf, plus the patch, plus sequential = true (or unset). Not sure how that

Re: [Pacemaker] SBD Fencing daemon: explain me more clear

2010-06-15 Thread Lars Marowsky-Bree
On 2010-06-15T17:04:39, Aleksey Zholdak alek...@zholdak.com wrote: Can you elaborate on "does not update its slot for 10 seconds" more clearly? Unfortunately, how sbd works is not described in detail anywhere, Uhm, what is unclear about http://www.linux-ha.org/wiki/SBD_Fencing ? It does explain how

Re: [Pacemaker] SBD Fencing daemon: explain me more clear

2010-06-15 Thread Aleksey Zholdak
Lars, Uhm, what is unclear about http://www.linux-ha.org/wiki/SBD_Fencing ? It does explain how sbd works (not all of the timeouts though). Exactly! It does not explain loop timeout, for example... Depending on the watchdog device you are using, it is conceivable that it refuses to accept a

Re: [Pacemaker] SBD Fencing daemon: explain me more clear

2010-06-15 Thread Lars Marowsky-Bree
On 2010-06-15T17:32:51, Aleksey Zholdak alek...@zholdak.com wrote: Uhm, what is unclear about http://www.linux-ha.org/wiki/SBD_Fencing ? It does explain how sbd works (not all of the timeouts though). Exactly! It does not explain loop timeout, for example... The loop timeout is just the time

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Andrew Beekhof
On Tue, Jun 15, 2010 at 4:08 PM, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Tue, Jun 15, 2010 at 2:48 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 15, 2010, at 8:11 AM, Gianluca Cecchi wrote: On Tue, Jun 15, 2010 at 1:50 PM, Andrew Beekhof and...@beekhof.net wrote: [snip]

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Gianluca Cecchi
On Tue, Jun 15, 2010 at 4:43 PM, Andrew Beekhof and...@beekhof.net wrote: But that is for 1.1 branch that is not considered as stable... No, existing functionality is very stable. It's just the new features that might have some extra corner cases we've not seen exercised yet. Put it

Re: [Pacemaker] How to really deal with gateway restarts?

2010-06-15 Thread Maros Timko
I thought the dampen attribute could help with some of the options, but actually it does not. It should do. Hard to say without any logs from the two machines. Unfortunately I don't have the log files here; I can provide them if that would help. Are you sure dampen should help here? From my testing it

Re: [Pacemaker] crm node delete

2010-06-15 Thread Maros Timko
On Fri, Jun 11, 2010 at 03:45:19PM +0100, Maros Timko wrote: Hi all, using heartbeat stack. I have a system with one node offline: Last updated: Fri Jun 11 13:52:40 2010 Stack: Heartbeat Current DC: vsp7.example.com (ba6d6332-71dd-465b-a030-227bcd31a25f) - partition with

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Vadym Chepkov
On Jun 15, 2010, at 9:26 AM, Vadym Chepkov wrote: what about this part? what do I need to do to prevent them from running on different nodes for sure? You can't have it both ways. Either they have to run on the same node or they can remain active when one or more die. Although you

Re: [Pacemaker] abrupt power failure problem

2010-06-15 Thread Bernd Schubert
On Tuesday 15 June 2010, Schaefer, Diane E wrote: Hi, We are having trouble with our two node cluster after one node experiences an abrupt power failure. The resources do not seem to start on the remaining node (ie DRBD resources do not promote to master). In the log we notice: Jan

Re: [Pacemaker] abrupt power failure problem

2010-06-15 Thread Schaefer, Diane E
Thanks for the idea. Is there any way to automatically recover resources without manual intervention? Diane

Re: [Pacemaker] abrupt power failure problem

2010-06-15 Thread Bernd Schubert
Hello Diane, the problem is that pacemaker is not allowed to take over resources until stonith succeeds, as it simply does not know about the state of the other server. Let's assume the other node would still be up and running, would have mounted a shared storage device and would write to it,

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Andrew Beekhof
On Tue, Jun 15, 2010 at 5:13 PM, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Tue, Jun 15, 2010 at 4:43 PM, Andrew Beekhof and...@beekhof.net wrote: But that is for 1.1 branch that is not considered as stable... No, existing functionality is very stable. It's just the new features

Re: [Pacemaker] VirtualDomain/DRBD live migration with pacemaker...

2010-06-15 Thread Dennis J.
On 06/14/2010 11:01 PM, Vadym Chepkov wrote: On Mon, Jun 14, 2010 at 4:37 PM, Erich Weiler wei...@soe.ucsc.edu wrote: Hi All, We have this interesting problem I was hoping someone could shed some light on. Basically, we have 2 servers acting as a pacemaker cluster for DRBD and VirtualDomain

Re: [Pacemaker] VirtualDomain/DRBD live migration with pacemaker...

2010-06-15 Thread Michael Schwartzkopff
On Tuesday, 15 June 2010 20:25:09, Dennis J. wrote: (...) Has anybody played with this yet: http://www.linux-kvm.com/content/qemu-kvm-012-adds-block-migration-feature Technically something like this should make it possible to do a live migration even when not using shared storage. I

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Vadym Chepkov
On Jun 15, 2010, at 3:36 PM, Dejan Muhamedagic wrote: Hi, On Tue, Jun 15, 2010 at 08:45:37AM -0400, Vadym Chepkov wrote: On Jun 15, 2010, at 6:14 AM, Dejan Muhamedagic wrote: Hi, On Tue, Jun 15, 2010 at 10:57:47AM +0200, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 10:23 AM,

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Dejan Muhamedagic
On Tue, Jun 15, 2010 at 01:50:06PM +0200, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 1:38 PM, Vadym Chepkov vchep...@gmail.com wrote: On Jun 15, 2010, at 4:57 AM, Andrew Beekhof wrote: On Tue, Jun 15, 2010 at 10:23 AM, Andreas Kurz andreas.k...@linbit.com wrote: On Tuesday 15 June

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Dejan Muhamedagic
On Tue, Jun 15, 2010 at 12:53:07PM -0400, Vadym Chepkov wrote: On Jun 15, 2010, at 9:26 AM, Vadym Chepkov wrote: what about this part? what do I need to do to prevent them from running on different nodes for sure? You can't have it both ways. Either they have to run on the same

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Dejan Muhamedagic
On Tue, Jun 15, 2010 at 03:41:17PM -0400, Vadym Chepkov wrote: On Jun 15, 2010, at 3:36 PM, Dejan Muhamedagic wrote: Hi, On Tue, Jun 15, 2010 at 08:45:37AM -0400, Vadym Chepkov wrote: On Jun 15, 2010, at 6:14 AM, Dejan Muhamedagic wrote: Hi, On Tue, Jun 15, 2010 at

Re: [Pacemaker] crm node delete

2010-06-15 Thread Dejan Muhamedagic
Hi, On Tue, Jun 15, 2010 at 05:09:14PM +0100, Maros Timko wrote: On Fri, Jun 11, 2010 at 03:45:19PM +0100, Maros Timko wrote: Hi all, using heartbeat stack. I have a system with one node offline: Last updated: Fri Jun 11 13:52:40 2010 Stack: Heartbeat Current DC:

Re: [Pacemaker] abrupt power failure problem

2010-06-15 Thread Dejan Muhamedagic
Hi, On Tue, Jun 15, 2010 at 01:15:08PM -0600, Dan Urist wrote: I've recently had exactly the same thing happen. One (highly kludgey!) solution I've considered is hacking a custom version of the stonith IPMI agent that would check whether the node was at all reachable following a stonith

Re: [Pacemaker] abrupt power failure problem

2010-06-15 Thread Dan Urist
On Tue, 15 Jun 2010 22:08:37 +0200 Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Tue, Jun 15, 2010 at 01:15:08PM -0600, Dan Urist wrote: I've recently had exactly the same thing happen. One (highly kludgey!) solution I've considered is hacking a custom version of the stonith IPMI

Re: [Pacemaker] abrupt power failure problem

2010-06-15 Thread Dejan Muhamedagic
Hi, On Tue, Jun 15, 2010 at 02:25:51PM -0600, Dan Urist wrote: On Tue, 15 Jun 2010 22:08:37 +0200 Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Tue, Jun 15, 2010 at 01:15:08PM -0600, Dan Urist wrote: I've recently had exactly the same thing happen. One (highly kludgey!)

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Dejan Muhamedagic
On Tue, Jun 15, 2010 at 04:44:31PM -0400, Vadym Chepkov wrote: On Jun 15, 2010, at 3:55 PM, Dejan Muhamedagic wrote: On Tue, Jun 15, 2010 at 03:41:17PM -0400, Vadym Chepkov wrote: On Jun 15, 2010, at 3:36 PM, Dejan Muhamedagic wrote: Hi, On Tue, Jun 15, 2010 at 08:45:37AM

Re: [Pacemaker] Shouldn't colocation -inf: be mandatory?

2010-06-15 Thread Vadym Chepkov
On Jun 15, 2010, at 5:26 PM, Dejan Muhamedagic wrote: On Tue, Jun 15, 2010 at 04:44:31PM -0400, Vadym Chepkov wrote: On Jun 15, 2010, at 3:55 PM, Dejan Muhamedagic wrote: On Tue, Jun 15, 2010 at 03:41:17PM -0400, Vadym Chepkov wrote: On Jun 15, 2010, at 3:36 PM, Dejan Muhamedagic

Re: [Pacemaker] abrupt power failure problem

2010-06-15 Thread Bernd Schubert
On Tuesday 15 June 2010, Dejan Muhamedagic wrote: Hi, On Tue, Jun 15, 2010 at 02:25:51PM -0600, Dan Urist wrote: On Tue, 15 Jun 2010 22:08:37 +0200 Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Tue, Jun 15, 2010 at 01:15:08PM -0600, Dan Urist wrote: I've recently had

Re: [Pacemaker] SBD Fencing daemon: explain me more clear

2010-06-15 Thread Aleksey Zholdak
You'd have a message in the logs about the driver rejecting the timeout, I think. That's what I see in the logs: sles2 sbd: [5059]: notice: Using watchdog device: /dev/watchdog sles2 sbd: [5059]: info: Set watchdog timeout to 180 seconds. sles2 kernel: [ 68.552201] hpwdt: New timer passed in