Re: [Linux-HA] resources don't migrate when node is declared dead?!?

Jean-Francois Malouin Wed, 30 Jul 2008 11:39:54 -0700

* ZiLioN ZilLioN <[EMAIL PROTECTED]> [20080729 19:03]:
> 
> 
> 
> > Date: Tue, 29 Jul 2008 17:27:38 -0400
> > From: [EMAIL PROTECTED]
> > To: [email protected]
> > Subject: [Linux-HA] resources don't migrate when node is declared dead?!?
> > 
> > My cluster contains 2 active/passive nodes with one drbd master/slave
> > resource and one group resource which itself contains 7 resources. I
> > want the m/s and group to be colocated, and when the master loses its
> > ping the slave should be promoted, but nothing happened when I
> > pulled the ethernet cable... Here's what the constraints look like in
> > the cib right now:
> >


[...]

> 
> >        <rsc_location id="drbd_id:connected" rsc="ms-drbd_id">
> >          <rule role="master" id="drbd_id:connected:rule" score_attribute="pingd">
> >            <expression id="drbd_id:connected-rule-1" attribute="pingd" operation="defined"/>
> >          </rule>
> 
> If your score_attribute="pingd" and your pingd scaling factor is 100, then 
> having access to one node is worth 100, two nodes are worth 200, and so on.
> If you have no connectivity, the pingd value (and so the score) is 0.
> 
> 
> >        </rsc_location>
> >        <rsc_location id="cli-prefer-mysql_id" rsc="mysql_id">
> >          <rule id="cli-prefer-rule-mysql_id" score="INFINITY">
> >            <expression id="cli-prefer-expr-mysql_id" attribute="#uname" operation="eq" value="feeble-1" type="string"/>
> >          </rule>
> 
> If group_id runs on the same machine as ms-drbd_id, you cannot also pin 
> mysql_id to feeble-1, because mysql_id belongs to the group group_id. 
> group_id runs on the same machine as ms-drbd_id, and ms-drbd_id 
> runs on the node with the best connectivity.
> 
> >        </rsc_location>
> >        <rsc_location id="cli-prefer-drbd_id:0" rsc="drbd_id:0">
> >          <rule id="cli-prefer-rule-drbd_id:0" score="INFINITY">
> >            <expression id="cli-prefer-expr-drbd_id:0" attribute="#uname" operation="eq" value="feeble-0" type="string"/>
> >          </rule>
> >        </rsc_location>
> >        <rsc_location id="cli-prefer-drbd_id:1" rsc="drbd_id:1">
> >          <rule id="cli-prefer-rule-drbd_id:1" score="INFINITY">
> >            <expression id="cli-prefer-expr-drbd_id:1" attribute="#uname" operation="eq" value="feeble-0" type="string"/>
> >          </rule>
> >        </rsc_location>
> >      </constraints>

I have no clue where these constraints come from. I certainly didn't
put them in manually... maybe the crm did while I was messing things up.
In any case I have removed those and replaced the pingd constraint
as the doc on linux-ha.org suggests:

<rsc_location id="ms-drbd_id:connected" rsc="ms-drbd_id">
  <rule id="ms-drbd_id:connected:rule" score="-INFINITY" boolean_op="or">
    <expression id="ms-drbd_id:connected:expr:undefined" attribute="pingd" operation="not_defined"/>
    <expression id="ms-drbd_id:connected:expr:zero" attribute="pingd" operation="lte" value="0"/>
  </rule>
</rsc_location>

and when I pull the cable on the master all the resources migrate to
the other node. I'm happy with that, but I face another problem and
I'll start a new thread on it rather than continuing here.
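
To make the effect of the rule above concrete, here is a minimal Python
sketch of how it can be read. It is only an illustration, not Pacemaker's
actual code: the function names are made up for this example, the
multiplier of 100 is the scaling factor mentioned in the quoted reply, and
INFINITY is taken as 1000000, the value Pacemaker uses for it. The sketch
assumes a rule that does not match contributes nothing (0) to the node's
score.

```python
from typing import Mapping, Optional

INFINITY = 1_000_000  # Pacemaker's numeric value for INFINITY


def pingd_value(reachable_hosts: int, multiplier: int = 100) -> int:
    """Hypothetical helper: the value pingd publishes for a node,
    i.e. number of reachable ping hosts times the multiplier."""
    return reachable_hosts * multiplier


def connectivity_rule_score(node_attrs: Mapping[str, str]) -> int:
    """Hypothetical helper: score the ms-drbd_id:connected rule adds
    for one node. With boolean_op="or" the rule matches when pingd is
    not defined OR pingd <= 0, and a matching rule scores -INFINITY."""
    raw: Optional[str] = node_attrs.get("pingd")
    not_defined = raw is None
    lte_zero = raw is not None and int(raw) <= 0
    return -INFINITY if (not_defined or lte_zero) else 0


# Example: feeble-0 has lost its link, feeble-1 still reaches one ping host.
nodes = {
    "feeble-0": {"pingd": str(pingd_value(0))},
    "feeble-1": {"pingd": str(pingd_value(1))},
}
for name, attrs in nodes.items():
    print(name, connectivity_rule_score(attrs))
```

Under these assumptions the node with the pulled cable gets -INFINITY for
ms-drbd_id, so the resource cannot stay there, which matches the migration
described above.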

> > ~# showscores.sh 
> > Resource            Score     Node            Stickiness #Fail Fail-Stickiness 
> > -1000000_(master)   -INFINITY ptest[11262]:   100        -1001
> > 1000000_(master)    INFINITY  ptest[11262]:   100        -1001
> > 100_(master)        100       ptest[11262]:   100        -1001
> > 175_(master)        175       ptest[11262]:   100        -1001
> > 76_(master)         76        ptest[11262]:   100        -1001
> 
> I don't understand this. Where are the resource names? And the node names?

I updated the showscores script as found on
http://hg.clusterlabs.org/pacemaker/dev/file/tip/contrib/showscores.sh

and now my scores look like:

Resource            Score     Node            Stickiness #Fail Fail-Stickiness 
apache_id           0         feeble-0        100        0 -1001 
apache_id           600       feeble-1        100        0 -1001
drbd_id:0           -INFINITY feeble-0        100        0 -1001
drbd_id:0           75        feeble-1        100        0 -1001
drbd_id:0_(master)  INFINITY  feeble-1        100        0 -1001
drbd_id:1           76        feeble-1        100        0 -1001
drbd_id:1           INFINITY  feeble-0        100        0 -1001
drbd_id:1_(master)  75        feeble-0        100        0 -1001
drbd_id:2           -INFINITY feeble-0        100        0 -1001
drbd_id:2           -INFINITY feeble-1        100        0 -1001
fs_id               0         feeble-0        100        0 -1001
fs_id               600       feeble-1        100        0 -1001
group_id            0         feeble-0        100        0 -1001
group_id            600       feeble-1        100        0 -1001
ip_id               0         feeble-0        100        0 -1001
ip_id               600       feeble-1        100        0 -1001
mysql_id            0         feeble-0        100        0 -1001
mysql_id            INFINITY  feeble-1        100        0 -1001
nfs_common-id       0         feeble-0        100        0 -1001
nfs_common-id       600       feeble-1        100        0 -1001
nfs_kernel-id       0         feeble-0        100        0 -1001
nfs_kernel-id       600       feeble-1        100        0 -1001

I do not understand yet how exactly the scores are calculated for a
master-slave resource and I don't know if the showscores script deals
with them correctly...


> 
> > 
> > Any ideas?
> 
> Before pulling the ethernet cable, try:
> 
> cibadmin -Q -o nodes
> 
> Can you see all the nodes of your cluster?
> Today in my cluster only one of the two nodes was recognized.

both nodes are there.

> 
> 
> I do not think I have helped a lot, as I do not have much
> experience with cib.xml myself. I hope the comments are at least of
> some use to you.

thanks anyway.
jf
-- 
<° ><
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
