[Pacemaker] drbd connection

2013-06-17 Thread andreas graeper
Hi, as described in a tutorial, I tried to kill -9 corosync on the active node (n1), but the other node (n2) failed to demote drbd. After corosync was started again on n1, drbd on n2 was left unmanaged, although /proc/drbd on both nodes looked good: Connected and UpToDate. How can a resource get managed in such a situation

Re: [Pacemaker] drbd connection

2013-06-17 Thread andreas graeper
Small correction: n2 failed to promote drbd! When I run `drbdadm connect r0` on both nodes, it looks to me as if the connection state can change from StandAlone to WFConnection only if the other node is currently StandAlone. The two nodes are never in WFConnection at the same time. Thanks, andreas
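The connection states named in this thread (Connected, StandAlone, WFConnection) are reported in the `cs:` field of /proc/drbd, next to the roles (`ro:`) and disk states (`ds:`). The following is a minimal Python sketch for checking them on each node, assuming the DRBD 8.x status-line layout. The sample lines and the helper names `parse_proc_drbd` and `is_healthy` are illustrative assumptions, not output or tooling taken from the thread.

```python
import re

# Per-resource status line as printed by DRBD 8.x in /proc/drbd, e.g.
#  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
# Lines without cs:/ro:/ds: fields (version banner, counters) are skipped.
STATUS_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)")


def parse_proc_drbd(text):
    """Return one dict per DRBD minor found in /proc/drbd-style text."""
    resources = []
    for line in text.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            minor, cs, ro, ds = match.groups()
            resources.append({
                "minor": int(minor),
                "cs": cs,               # connection state
                "ro": ro.split("/"),    # [local role, peer role]
                "ds": ds.split("/"),    # [local disk state, peer disk state]
            })
    return resources


def is_healthy(resource):
    """True only if the peers are connected and both disks are up to date."""
    return (resource["cs"] == "Connected"
            and resource["ds"] == ["UpToDate", "UpToDate"])


# Hypothetical sample lines (assumptions, not taken from the thread).
SAMPLE_GOOD = " 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"
SAMPLE_STANDALONE = " 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----"
```

On a real node the text would come from reading /proc/drbd. A resource that stays in StandAlone or WFConnection on both sides is not healthy by this check, even if each local disk reports UpToDate.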

Re: [Pacemaker] drbd connection

2013-06-17 Thread Digimer
My guess is you don't have (working) fencing/stonith? Can you pastebin your 'pcs config show' please? Also, 'drbdadm dump' please. digimer On 06/17/2013 09:32 AM, andreas graeper wrote: hi, i tried as i found in tutorial to kill -9 corosync on active node (n1). but the other node (n2) failed

Re: [Pacemaker] drbd connection

2013-06-17 Thread Digimer
On 06/17/2013 09:53 AM, andreas graeper wrote: hi, i will not have a stonith-device. i can test for a day a 'expert power control 8212', but in the end i will stay without. This is an extremely flawed approach. Clustering with shared storage and without stonith will certainly cause data loss

Re: [Pacemaker] drbd connection

2013-06-17 Thread Digimer
If you look in your logs when you try to connect the two nodes, you will likely see a message like "split-brain detected, dropping connection". This is the result of not using fencing, as you created a condition where both nodes went StandAlone and Primary. To prevent this, you need to set up

[Pacemaker] IPaddr2 route problem on active

2013-06-17 Thread Longina Przybyszewska
Hi, I have a 2-node active/passive setup with drbd/file system/IP failover (Ubuntu-12.04-2, Linux 3.5.0-34-generic). After IP failover is established on the active node, the mount client on the active node still uses the real IP address instead of the alias IP. I use a standard simple configuration: --- primitive

[Pacemaker] drbd-fence-by-handler

2013-06-17 Thread andreas graeper
hi, <rsc_location rsc="ms_drbd" id="drbd-fence-by-handler-r0-ms_drbd"> <rule role="Master" score="-INFINITY" id="drbd-fence-by-handler-r0-rule-ms_drbd"> <expression attribute="#uname" operation="ne" value="n1" id="drbd-fence-by-handler-r0-expr-ms_drbd"/> </rule> </rsc_location> what's this? thanks andreas
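Read as a Pacemaker location constraint, the quoted rule gives the Master role of ms_drbd a score of -INFINITY on every node whose name (#uname) is not n1, so only n1 may be promoted. The id prefix suggests the constraint was placed by DRBD's fence-peer handler, but that is an assumption; the snippet does not say. The sketch below evaluates the rule in Python for illustration only. The function names are hypothetical, and it handles just the `eq`/`ne` operations on `#uname`, not Pacemaker's full rule syntax.

```python
import xml.etree.ElementTree as ET

# The constraint from the message, with its XML markup restored.
CONSTRAINT = """
<rsc_location rsc="ms_drbd" id="drbd-fence-by-handler-r0-ms_drbd">
  <rule role="Master" score="-INFINITY" id="drbd-fence-by-handler-r0-rule-ms_drbd">
    <expression attribute="#uname" operation="ne" value="n1"
                id="drbd-fence-by-handler-r0-expr-ms_drbd"/>
  </rule>
</rsc_location>
"""


def expression_matches(expr, node_name):
    """Evaluate one #uname expression against a node name (eq/ne only)."""
    if expr.get("attribute") != "#uname":
        raise ValueError("this sketch only understands #uname expressions")
    operation, value = expr.get("operation"), expr.get("value")
    if operation == "eq":
        return node_name == value
    if operation == "ne":
        return node_name != value
    raise ValueError("unsupported operation: %r" % operation)


def nodes_banned_from_master(constraint_xml, node_names):
    """Return the nodes on which the rule forbids the Master role."""
    root = ET.fromstring(constraint_xml)
    banned = set()
    for rule in root.findall("rule"):
        if rule.get("role") != "Master" or rule.get("score") != "-INFINITY":
            continue
        expressions = rule.findall("expression")
        for name in node_names:
            # All expressions in a rule must match (the default is AND).
            if all(expression_matches(e, name) for e in expressions):
                banned.add(name)
    return banned
```

For a two-node cluster with nodes n1 and n2, `nodes_banned_from_master(CONSTRAINT, ["n1", "n2"])` yields only n2, which is consistent with n2 failing to promote drbd while the constraint is present.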

Re: [Pacemaker] drbd connection

2013-06-17 Thread Elmar Marschke
On 17.06.2013 15:59, Digimer wrote: On 06/17/2013 09:53 AM, andreas graeper wrote: hi, i will not have a stonith-device. i can test for a day a 'expert power control 8212', but in the end i will stay without. This is an extremely flawed approach. Clustering with shared storage and without

[Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Jon Eisenstein
tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable, scriptable procedure for replacing a dead (guaranteed no longer running) server with another one without needing to take the remaining cluster members down. I'm trying to build a Pacemaker solution using Percona

Re: [Pacemaker] Starting Pacemaker Cluster Manager: [FAILED]

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 3:09 AM, Colin Blair cbl...@technicacorp.com wrote: All, Newbie here. I am trying to create a two-node cluster with the following: Ubuntu Server 11.10 Pacemaker 1.1.5 Corosync Cluster Engine 1.3.0 CMAN I am unable to start Pacemaker. CMAN seems to run with

Re: [Pacemaker] Is there any character which must not be used for an attribute name?

2013-06-17 Thread yusuke iida
Hi, Andrew. I used libqb installed from source; the version is tag v0.14.4. I read the Pacemaker code. default_ping_set(1) is concatenated with CRM_meta_ and becomes CRM_meta_default_ping_set(1). It failed when it was passed to xmlCtxtReadDoc(). Jun 5 14:43:13 vm1 crmd[22669]:

Re: [Pacemaker] Is there any character which must not be used for an attribute name?

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 11:42 AM, yusuke iida yusk.i...@gmail.com wrote: Hi, Andrew. I used libqb installed from source; the version is tag v0.14.4. I read the Pacemaker code. default_ping_set(1) is concatenated with CRM_meta_ and becomes CRM_meta_default_ping_set(1). It failed when

Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 7:19 AM, Jon Eisenstein j...@animoto.com wrote: tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable, scriptable procedure for replacing a dead (guaranteed no longer running) server with another one without needing to take the remaining cluster

Re: [Pacemaker] Weired resource-stickiness behavior

2013-06-17 Thread Andrew Beekhof
On 14/06/2013, at 3:52 PM, Xiaomin Zhang zhangxiao...@gmail.com wrote: Hi, Andrew: If I cut down the network connection of the running node by: service network stop, crm status will show me the node is put into OFFLINE status. The affected resource can also be failed over to another

Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Jon Eisenstein
On Jun 17, 2013, at 11:31 PM, Andrew Beekhof and...@beekhof.net wrote: On 18/06/2013, at 7:19 AM, Jon Eisenstein j...@animoto.com wrote: tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable, scriptable procedure for replacing a dead (guaranteed no longer running)

Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 1:46 PM, Jon Eisenstein j...@animoto.com wrote: On Jun 17, 2013, at 11:31 PM, Andrew Beekhof and...@beekhof.net wrote: On 18/06/2013, at 7:19 AM, Jon Eisenstein j...@animoto.com wrote: tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable,

Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Jon Eisenstein
On Jun 18, 2013, at 12:12 AM, Andrew Beekhof and...@beekhof.net wrote: On 18/06/2013, at 1:46 PM, Jon Eisenstein j...@animoto.com wrote: On Jun 17, 2013, at 11:31 PM, Andrew Beekhof and...@beekhof.net wrote: On 18/06/2013, at 7:19 AM, Jon Eisenstein j...@animoto.com wrote: tl;dr

Re: [Pacemaker] Pacemaker issues on Amazon EC2

2013-06-17 Thread Andrew Beekhof
On 18/06/2013, at 2:23 PM, Jon Eisenstein j...@animoto.com wrote: On Jun 18, 2013, at 12:12 AM, Andrew Beekhof and...@beekhof.net wrote: On 18/06/2013, at 1:46 PM, Jon Eisenstein j...@animoto.com wrote: On Jun 17, 2013, at 11:31 PM, Andrew Beekhof and...@beekhof.net wrote: On