On 05/09/2011 09:07 PM, luca bianchi wrote:
> Hi guys,
>   I'm having trouble with my two-server cluster (pacemaker + corosync),
>   which runs some services on top of 3 instances replicated by DRBD.
>
>   The problem: when I unplug the ethernet cable, the Master/Slave role
>   doesn't change, so the services cannot start on the server that is still
>   connected to the network.
>   If I instead simulate degraded connectivity (using iptables), the
>   switchover works well.
>
>   My running config is attached below, and I have a couple of questions:
>
>   - Why is the attribute value of the MS resources "10000"? Is it a default
>   value?
>   - How can I fix my problem?
>
>   When I unplug the cable, I would like all the MS resources to become
>   MASTER on the well-connected node.
>
>   Thank you for your help
>
>
>   ---------------------------------------------------------------------------
>   Configuration
>   ---------------------------------------------------------------------------
>   node alfa
>   node beta
>   primitive ClusterIP ocf:heartbeat:IPaddr2 \
>      params ip="192.168.3.10" cidr_netmask="24" nic="eth0" iflabel="0" \
>      op monitor interval="2s"
>   primitive WebSite ocf:heartbeat:apache \
>      params configfile="/etc/httpd/conf/httpd.conf" \
>      op monitor start-delay="15s" interval="60s" \
>      op start interval="0" timeout="40s" \
>      op stop interval="0" timeout="60s"
>   primitive drbd_freeswitch ocf:linbit:drbd \
>      params drbd_resource="r2" \
>      op monitor interval="30s" \
>      op start interval="15" timeout="240s" \
>      op stop interval="0" timeout="100s"
>   primitive drbd_logAlfa ocf:linbit:drbd \
>      params drbd_resource="r0" \
>      op monitor interval="30s" \
>      op start interval="15" timeout="240s" \
>      op stop interval="0" timeout="100s"
>   primitive drbd_logBeta ocf:linbit:drbd \
>      params drbd_resource="r1" \
>      op monitor interval="30s" \
>      op start interval="15" timeout="240s" \
>      op stop interval="0" timeout="100s"
>   primitive freeswitch lsb:freeswitch \
>      op monitor interval="60s" \
>      op start interval="0" timeout="90s" \
>      op stop interval="0" timeout="100s"
>   primitive fs_drbd_freeswitch ocf:heartbeat:Filesystem \
>      params device="/dev/drbd2" directory="/data" fstype="ext3" \
>      op monitor interval="20s" timeout="40s" \
>      op start interval="15" timeout="60s" \
>      op stop interval="0" timeout="60s"
>   primitive fs_drbd_logAlfa ocf:heartbeat:Filesystem \
>      params device="/dev/drbd0" directory="/log_alfa" fstype="ext3" \
>      op monitor interval="20s" timeout="40s" \
>      op start interval="15" timeout="60s" \
>      op stop interval="0" timeout="60s"
>   primitive fs_drbd_logBeta ocf:heartbeat:Filesystem \
>      params device="/dev/drbd1" directory="/log_beta" fstype="ext3" \
>      op monitor interval="20s" timeout="40s" \
>      op start interval="15" timeout="60s" \
>      op stop interval="0" timeout="60s"
>   primitive pingd ocf:pacemaker:ping \
>      params host_list="alfa beta 192.168.3.100 192.168.3.1" multiplier="1000" attempts="2" \
>      op monitor interval="3s" timeout="60s" \
>      op start interval="0" timeout="60s" \
>      op stop interval="0" timeout="20s"
>   primitive resMON ocf:pacemaker:ClusterMon \
>      operations $id="resMON-operations" \
>      op monitor interval="180" timeout="20" \
>      op start interval="0" timeout="90s" \
>      op stop interval="0" timeout="100s" \
>      params htmlfile="/data/srv/www/cluster-info/index.html" extra_options="--snmp-trap 192.168.25.49"
>   group gr_freeswitch fs_drbd_freeswitch ClusterIP freeswitch resMON WebSite \
>      meta resource-stickiness="50"
>   ms ms_drbd_freeswitch drbd_freeswitch \
>      meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false"
>   ms ms_drbd_logAlfa drbd_logAlfa \
>      meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false"
>   ms ms_drbd_logBeta drbd_logBeta \
>      meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false"
>   clone pingdClone pingd \
>      meta globally-unique="false"
>   location lo_gr_freeswitch gr_freeswitch \
>      rule $id="lo_gr_freeswitch-rule" 100: #uname eq alfa \
>      rule $id="lo_gr_freeswitch-rule-0" -25000: not_defined pingd or pingd lte 1000 \
>      rule $id="lo_gr_freeswitch-rule-1" pingd: defined pingd
>   location ms_logAlfa__on__alfa ms_drbd_logAlfa \
>      rule $id="ms_logAlfa__on__alfa-rule" $role="master" 2000: #uname eq alfa
>   location ms_logBeta__on__beta ms_drbd_logBeta \
>      rule $id="ms_logBeta__on__beta-rule" $role="master" 2000: #uname eq beta
>   colocation freeswitch_on_drbd inf: gr_freeswitch ms_drbd_freeswitch:Master
>   colocation fs_logAlfa__on__drbd_logAlfa inf: fs_drbd_logAlfa ms_drbd_logAlfa:Master
>   colocation fs_logBeta__on__drbd_logBeta inf: fs_drbd_logBeta ms_drbd_logBeta:Master
>   order freeswitch_after_drbd inf: ms_drbd_freeswitch:promote gr_freeswitch:start
>   order fs_logAlfa__after__drbd_logAlfa inf: ms_drbd_logAlfa:promote fs_drbd_logAlfa:start
>   order fs_logBeta__after__drbd_logBeta inf: ms_drbd_logBeta:promote fs_drbd_logBeta:start
>   property $id="cib-bootstrap-options" \
>      stonith-enabled="false" \
>      default-resource-stickiness="1" \
>      no-quorum-policy="ignore" \
>      dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>      cluster-infrastructure="openais" \
>      expected-quorum-votes="2"
>   rsc_defaults $id="rsc-options" \
>      resource-stickiness="1"
>
>   ------------------------------------------------------------------------
>   Log when both nodes have cable connected
>   ------------------------------------------------------------------------
>   crm_mon -A1
>   ============
>   Last updated: Mon May  9 11:24:18 2011
>   Stack: openais
>   Current DC: alfa - partition with quorum
>   Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
>   2 Nodes configured, 2 expected votes
>   7 Resources configured.
>   ============
>
>   Online: [ alfa beta ]
>
>    Resource Group: gr_freeswitch
>        fs_drbd_freeswitch   (ocf::heartbeat:Filesystem):   Started alfa
>        ClusterIP   (ocf::heartbeat:IPaddr2):   Started alfa
>        freeswitch   (lsb:freeswitch):   Started alfa
>        resMON   (ocf::pacemaker:ClusterMon):   Started alfa
>        WebSite   (ocf::heartbeat:apache):   Started alfa
>    Master/Slave Set: ms_drbd_logAlfa [drbd_logAlfa]
>        Masters: [ alfa ]
>        Slaves: [ beta ]
>    Master/Slave Set: ms_drbd_logBeta [drbd_logBeta]
>        Masters: [ beta ]
>        Slaves: [ alfa ]
>    fs_drbd_logAlfa   (ocf::heartbeat:Filesystem):   Started alfa
>    fs_drbd_logBeta   (ocf::heartbeat:Filesystem):   Started beta
>    Master/Slave Set: ms_drbd_freeswitch [drbd_freeswitch]
>        Masters: [ alfa ]
>        Slaves: [ beta ]
>    Clone Set: pingdClone [pingd]
>        Started: [ alfa beta ]
>
>   Node Attributes:
>   * Node alfa:
>       + master-drbd_freeswitch:0           : 10000
>       + master-drbd_logAlfa:0              : 10000
>       + master-drbd_logBeta:0              : 10000
>       + pingd                              : 4000
>   * Node beta:
>       + master-drbd_freeswitch:1           : 10000
>       + master-drbd_logAlfa:1              : 10000
>       + master-drbd_logBeta:1              : 10000
>       + pingd                              : 4000
>
>   ---------------------------------------------------------------
>   Log when only "beta" has the cable connected
>   ---------------------------------------------------------------
>   Last updated: Mon May  9 11:27:09 2011
>   Stack: openais
>   Current DC: beta - partition WITHOUT quorum
>   Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
>   2 Nodes configured, 2 expected votes
>   7 Resources configured.
>   ============
>
>   Online: [ beta ]
>   OFFLINE: [ alfa ]
>
>    Master/Slave Set: ms_drbd_logAlfa [drbd_logAlfa]
>        Slaves: [ beta ]
>        Stopped: [ drbd_logAlfa:1 ]
>    Master/Slave Set: ms_drbd_logBeta [drbd_logBeta]
>        Masters: [ beta ]
>        Stopped: [ drbd_logBeta:0 ]
>    fs_drbd_logBeta   (ocf::heartbeat:Filesystem):   Started beta
>    Master/Slave Set: ms_drbd_freeswitch [drbd_freeswitch]
>        Slaves: [ beta ]
>        Stopped: [ drbd_freeswitch:1 ]
>    Clone Set: pingdClone [pingd]
>        Started: [ beta ]
>        Stopped: [ pingd:0 ]
>
>   Node Attributes:
>   * Node beta:
>       + master-drbd_freeswitch:0           : 10000
>       + master-drbd_logAlfa:0              : 10000
>       + master-drbd_logBeta:1              : 10000
>       + pingd                              : 3000         : Connectivity is degraded (Expected=4000)
>
>   Failed actions:
>       drbd_logAlfa:0_promote_0 (node=beta, call=1324, rc=-2, status=Timed Out): unknown exec error
>       drbd_freeswitch:0_promote_0 (node=beta, call=1331, rc=-2, status=Timed Out): unknown exec error
>       drbd_logAlfa:1_promote_0 (node=beta, call=1354, rc=-2, status=Timed Out): unknown exec error
>       drbd_freeswitch:1_promote_0 (node=beta, call=1355, rc=-2, status=Timed Out): unknown exec error
>
>
> :: SYSNET TELEMATICA srl ::
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
Hi,

The problem might be this line:

  order freeswitch_after_drbd inf: ms_drbd_freeswitch:promote gr_freeswitch:start

I would delete it and order the service after something else.

I had a similar problem after following a manual I found on the internet: the
nodes were stuck in a loop, always trying to start the service.
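
As a side note on the pingd values in the quoted crm_mon output, here is a
minimal sketch of how they come about and how the three rules of
lo_gr_freeswitch would score each node. The host list, multiplier and rule
scores are copied from the quoted config; the function names are made up for
illustration, and the sketch assumes that the scores of all matching rules are
simply added and that an unplugged node can still ping its own address.

```python
# Values copied from the quoted config; helper names are hypothetical.
HOST_LIST = ["alfa", "beta", "192.168.3.100", "192.168.3.1"]
MULTIPLIER = 1000


def pingd_value(reachable_hosts):
    """ocf:pacemaker:ping sets pingd = (number of reachable hosts) * multiplier."""
    return MULTIPLIER * sum(1 for host in HOST_LIST if host in reachable_hosts)


def lo_gr_freeswitch_score(node, pingd):
    """Sum the three rules of lo_gr_freeswitch for one node.

    pingd is None when the attribute is not defined on that node.
    Assumption: the scores of all matching rules are simply added.
    """
    score = 0
    if node == "alfa":                  # rule: 100: #uname eq alfa
        score += 100
    if pingd is None or pingd <= 1000:  # rule: -25000: not_defined pingd or pingd lte 1000
        score -= 25000
    if pingd is not None:               # rule: pingd: defined pingd
        score += pingd
    return score


if __name__ == "__main__":
    everything = set(HOST_LIST)
    beta_view = everything - {"alfa"}   # beta after alfa's cable is pulled
    alfa_view = {"alfa"}                # assumption: alfa still answers its own ping

    for node, view in [("alfa", everything), ("beta", everything),
                       ("beta", beta_view), ("alfa", alfa_view)]:
        value = pingd_value(view)
        print(node, value, lo_gr_freeswitch_score(node, value))
```

Under these assumptions beta still gets 3000 for the group after the cable is
pulled, which matches the "Expected=4000" warning in the quoted output. So the
location rules do not look like the thing blocking the failover; the quoted
output instead shows the promote operations timing out on beta.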

Bye
Mario
