Re: [Pacemaker] DRBD - mixed master/slave oddity

2010-07-26 Thread Andrew Beekhof
On Mon, Jul 19, 2010 at 9:54 PM, William Seligman
selig...@nevis.columbia.edu wrote:
 I had an unusual experience in setting up a Pacemaker+DRBD configuration that I
 thought I'd offer for comment.

 I have two nodes on my HA cluster; for now, let's call them Main and Assistant.
 I have two DRBD resources, Admin and Work. I wanted my standard resource
 allocation to be that Admin would be master on Main and slave on Assistant, and
 Work be master on Assistant and slave on Main.

I assume DRBD supports this scenario normally?


 Setting up Admin to be Master on Main posed no problems; it was textbook (I'm
 omitting the location statements that set the IP groups to prefer to be on
 specific nodes):

 configure primitive AdminDrbd ocf:linbit:drbd params drbd_resource=admin \
   op monitor interval=60s
 configure master Admin AdminDrbd meta master-max=1 master-node-max=1 \
   clone-max=2 clone-node-max=1 notify=true globally-unique=false
 configure colocation AdminWithMainIP inf: Admin:Master MainIPGroup

 But things didn't work when I tried to set it up for Work:

 configure primitive WorkDrbd ocf:linbit:drbd params drbd_resource=work \
   op monitor interval=60s
 configure master Work WorkDrbd meta master-max=1 master-node-max=1 \
   clone-max=2 clone-node-max=1 notify=true globally-unique=false
 configure colocation WorkWithAssistantIP inf: Work:Master AssistantIPGroup

 What happened was that Work would not be promoted to Master on either Main or
 Assistant. The log file just said failed to promote without any other failure
 indication.

Which process logged that?
If it came from drbd then it's almost certainly still a DRBD issue.

 I knew the problem wasn't with DRBD, since if I turned off corosync
 and just used the DRBD service with drbdadm, I could set Work to primary on the
 assistant node.

 If I forced Work to be master on the Main node:

 configure colocation WorkWithMainIP inf: Work:Master MainIPGroup

 ... it worked fine, promoting Work on the Main node.

 The solution turned out to be not to use infinity in that final colocation
 statement:

 configure colocation WorkPrefersAssistant 1000: Work:Master AssistantIPGroup

 Then Pacemaker promoted Work on Assistant with no complaints. So it didn't do it
 when forced, but only did it when asked.

 Any thoughts?

Order matters...

configure colocation someid score: A:master B

What this says is: promote A where B is running.
If B is running somewhere A cannot be a master, then A will remain a slave.


I suspect what you really wanted was:

configure colocation WorkPrefersAssistant inf: AssistantIPGroup Work:Master
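
As a sketch of the two orderings, using the resource names from this thread
(the constraint ids below are made up for illustration, and this assumes
nothing else in the configuration pins Work to a node):

Place the Work master wherever AssistantIPGroup is already running:

configure colocation WorkMasterFollowsIP inf: Work:Master AssistantIPGroup

Place AssistantIPGroup wherever Work has been promoted:

configure colocation IPFollowsWorkMaster inf: AssistantIPGroup Work:Master

Only one of the two should be configured at a time. Afterwards, crm_mon -1
should show which node holds the Master role for Work.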



 Versions:

 corosync-1.2.5-1.3.el5
 pacemaker-1.0.9.1-1.el5
 heartbeat-3.0.3-2.el5
 drbd-8.3.8-29.el5

 I'll post the complete configuration files if anyone is interested.


 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker

 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker



[Pacemaker] DRBD - mixed master/slave oddity

2010-07-19 Thread William Seligman
I had an unusual experience in setting up a Pacemaker+DRBD configuration that I
thought I'd offer for comment.

I have two nodes on my HA cluster; for now, let's call them Main and Assistant.
I have two DRBD resources, Admin and Work. I wanted my standard resource
allocation to be that Admin would be master on Main and slave on Assistant, and
Work be master on Assistant and slave on Main.

Setting up Admin to be Master on Main posed no problems; it was textbook (I'm
omitting the location statements that set the IP groups to prefer to be on
specific nodes):

configure primitive AdminDrbd ocf:linbit:drbd params drbd_resource=admin \
   op monitor interval=60s
configure master Admin AdminDrbd meta master-max=1 master-node-max=1 \
   clone-max=2 clone-node-max=1 notify=true globally-unique=false
configure colocation AdminWithMainIP inf: Admin:Master MainIPGroup

But things didn't work when I tried to set it up for Work:

configure primitive WorkDrbd ocf:linbit:drbd params drbd_resource=work \
   op monitor interval=60s
configure master Work WorkDrbd meta master-max=1 master-node-max=1 \
   clone-max=2 clone-node-max=1 notify=true globally-unique=false
configure colocation WorkWithAssistantIP inf: Work:Master AssistantIPGroup

What happened was that Work would not be promoted to Master on either Main or
Assistant. The log file just said failed to promote without any other failure
indication. I knew the problem wasn't with DRBD, since if I turned off corosync
and just used the DRBD service with drbdadm, I could set Work to primary on the
assistant node.

If I forced Work to be master on the Main node:

configure colocation WorkWithMainIP inf: Work:Master MainIPGroup

... it worked fine, promoting Work on the Main node.

The solution turned out to be not to use infinity in that final colocation
statement:

configure colocation WorkPrefersAssistant 1000: Work:Master AssistantIPGroup

Then Pacemaker promoted Work on Assistant with no complaints. So it didn't do it
when forced, but only did it when asked.

Any thoughts? 

Versions:

corosync-1.2.5-1.3.el5
pacemaker-1.0.9.1-1.el5
heartbeat-3.0.3-2.el5
drbd-8.3.8-29.el5

I'll post the complete configuration files if anyone is interested.

