Hi guys,
 I'm having trouble with my two-node cluster (Pacemaker + Corosync), which
 runs some services on top of three DRBD-replicated resources.

 The problem: when I unplug the Ethernet cable from one node, the
 Master/Slave roles don't change, so the services cannot start on the node
 that is still connected to the network.
 If I instead simulate degraded connectivity (using iptables), the
 switchover works fine.

 My running configuration is below. I have two questions:

 - Why is the master attribute value of the MS resources "10000"? Is it a
   default value?
 - How can I fix my problem?

 When I unplug the cable, I would like all the MS resources to become
 Master on the node that is still connected.

 Thank you for your help


 ---------------------------------------------------------------------------
 Configuration
 ---------------------------------------------------------------------------
 node alfa
 node beta
 primitive ClusterIP ocf:heartbeat:IPaddr2 \
    params ip="192.168.3.10" cidr_netmask="24" nic="eth0" iflabel="0" \
    op monitor interval="2s"
 primitive WebSite ocf:heartbeat:apache \
    params configfile="/etc/httpd/conf/httpd.conf" \
    op monitor start-delay="15s" interval="60s" \
    op start interval="0" timeout="40s" \
    op stop interval="0" timeout="60s"
 primitive drbd_freeswitch ocf:linbit:drbd \
    params drbd_resource="r2" \
    op monitor interval="30s" \
    op start interval="15" timeout="240s" \
    op stop interval="0" timeout="100s"
 primitive drbd_logAlfa ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="30s" \
    op start interval="15" timeout="240s" \
    op stop interval="0" timeout="100s"
 primitive drbd_logBeta ocf:linbit:drbd \
    params drbd_resource="r1" \
    op monitor interval="30s" \
    op start interval="15" timeout="240s" \
    op stop interval="0" timeout="100s"
 primitive freeswitch lsb:freeswitch \
    op monitor interval="60s" \
    op start interval="0" timeout="90s" \
    op stop interval="0" timeout="100s"
 primitive fs_drbd_freeswitch ocf:heartbeat:Filesystem \
    params device="/dev/drbd2" directory="/data" fstype="ext3" \
    op monitor interval="20s" timeout="40s" \
    op start interval="15" timeout="60s" \
    op stop interval="0" timeout="60s"
 primitive fs_drbd_logAlfa ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/log_alfa" fstype="ext3" \
    op monitor interval="20s" timeout="40s" \
    op start interval="15" timeout="60s" \
    op stop interval="0" timeout="60s"
 primitive fs_drbd_logBeta ocf:heartbeat:Filesystem \
    params device="/dev/drbd1" directory="/log_beta" fstype="ext3" \
    op monitor interval="20s" timeout="40s" \
    op start interval="15" timeout="60s" \
    op stop interval="0" timeout="60s"
 primitive pingd ocf:pacemaker:ping \
    params host_list="alfa beta 192.168.3.100 192.168.3.1" multiplier="1000" attempts="2" \
    op monitor interval="3s" timeout="60s" \
    op start interval="0" timeout="60s" \
    op stop interval="0" timeout="20s"
 primitive resMON ocf:pacemaker:ClusterMon \
    operations $id="resMON-operations" \
    op monitor interval="180" timeout="20" \
    op start interval="0" timeout="90s" \
    op stop interval="0" timeout="100s" \
    params htmlfile="/data/srv/www/cluster-info/index.html" extra_options="--snmp-trap 192.168.25.49"
 group gr_freeswitch fs_drbd_freeswitch ClusterIP freeswitch resMON WebSite \
    meta resource-stickiness="50"
 ms ms_drbd_freeswitch drbd_freeswitch \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false"
 ms ms_drbd_logAlfa drbd_logAlfa \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false"
 ms ms_drbd_logBeta drbd_logBeta \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false"
 clone pingdClone pingd \
    meta globally-unique="false"
 location lo_gr_freeswitch gr_freeswitch \
    rule $id="lo_gr_freeswitch-rule" 100: #uname eq alfa \
    rule $id="lo_gr_freeswitch-rule-0" -25000: not_defined pingd or pingd lte 1000 \
    rule $id="lo_gr_freeswitch-rule-1" pingd: defined pingd
 location ms_logAlfa__on__alfa ms_drbd_logAlfa \
    rule $id="ms_logAlfa__on__alfa-rule" $role="master" 2000: #uname eq alfa
 location ms_logBeta__on__beta ms_drbd_logBeta \
    rule $id="ms_logBeta__on__beta-rule" $role="master" 2000: #uname eq beta
 colocation freeswitch_on_drbd inf: gr_freeswitch ms_drbd_freeswitch:Master
 colocation fs_logAlfa__on__drbd_logAlfa inf: fs_drbd_logAlfa ms_drbd_logAlfa:Master
 colocation fs_logBeta__on__drbd_logBeta inf: fs_drbd_logBeta ms_drbd_logBeta:Master
 order freeswitch_after_drbd inf: ms_drbd_freeswitch:promote gr_freeswitch:start
 order fs_logAlfa__after__drbd_logAlfa inf: ms_drbd_logAlfa:promote fs_drbd_logAlfa:start
 order fs_logBeta__after__drbd_logBeta inf: ms_drbd_logBeta:promote fs_drbd_logBeta:start
 property $id="cib-bootstrap-options" \
    stonith-enabled="false" \
    default-resource-stickiness="1" \
    no-quorum-policy="ignore" \
    dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2"
 rsc_defaults $id="rsc-options" \
    resource-stickiness="1"
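
 To make the scoring in the configuration above concrete, here is a minimal
 Python sketch of how I understand the pingd attribute and the three
 lo_gr_freeswitch rules to combine. It is not Pacemaker code: the function
 names are made up for illustration, and it assumes that the ping agent sets
 pingd to multiplier times the number of reachable hosts and that the scores
 of all matching rules in a location constraint are added together.

```python
# Illustrative sketch only; names are hypothetical, not a Pacemaker API.
MULTIPLIER = 1000
HOST_LIST = ["alfa", "beta", "192.168.3.100", "192.168.3.1"]


def pingd_value(reachable_hosts):
    """Assumed behaviour of ocf:pacemaker:ping: multiplier * reachable hosts."""
    return MULTIPLIER * sum(1 for host in HOST_LIST if host in reachable_hosts)


def group_location_score(node, pingd):
    """Sum the lo_gr_freeswitch rules for one node.

    pingd=None stands for "attribute not defined" on that node.
    """
    score = 0
    if node == "alfa":                    # rule: 100: #uname eq alfa
        score += 100
    if pingd is None or pingd <= 1000:    # rule: -25000: not_defined pingd or pingd lte 1000
        score += -25000
    if pingd is not None:                 # rule: pingd: defined pingd
        score += pingd
    return score


if __name__ == "__main__":
    both_up = pingd_value(HOST_LIST)
    alfa_unplugged = pingd_value(["beta", "192.168.3.100", "192.168.3.1"])
    print("pingd, all hosts reachable:", both_up)
    print("pingd on beta, alfa unplugged:", alfa_unplugged)
    print("gr_freeswitch score on beta:", group_location_score("beta", alfa_unplugged))
```

 With all four hosts reachable this gives pingd = 4000, and with alfa
 unreachable it gives 3000 on beta, which matches the values crm_mon reports
 below.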

 ------------------------------------------------------------------------
 Log when both nodes have cable connected
 ------------------------------------------------------------------------
 crm_mon -A1
 ============
 Last updated: Mon May  9 11:24:18 2011
 Stack: openais
 Current DC: alfa - partition with quorum
 Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
 2 Nodes configured, 2 expected votes
 7 Resources configured.
 ============

 Online: [ alfa beta ]

  Resource Group: gr_freeswitch
      fs_drbd_freeswitch   (ocf::heartbeat:Filesystem):   Started alfa
      ClusterIP   (ocf::heartbeat:IPaddr2):   Started alfa
      freeswitch   (lsb:freeswitch):   Started alfa
      resMON   (ocf::pacemaker:ClusterMon):   Started alfa
      WebSite   (ocf::heartbeat:apache):   Started alfa
  Master/Slave Set: ms_drbd_logAlfa [drbd_logAlfa]
      Masters: [ alfa ]
      Slaves: [ beta ]
  Master/Slave Set: ms_drbd_logBeta [drbd_logBeta]
      Masters: [ beta ]
      Slaves: [ alfa ]
  fs_drbd_logAlfa   (ocf::heartbeat:Filesystem):   Started alfa
  fs_drbd_logBeta   (ocf::heartbeat:Filesystem):   Started beta
  Master/Slave Set: ms_drbd_freeswitch [drbd_freeswitch]
      Masters: [ alfa ]
      Slaves: [ beta ]
  Clone Set: pingdClone [pingd]
      Started: [ alfa beta ]

 Node Attributes:
 * Node alfa:
     + master-drbd_freeswitch:0           : 10000    
     + master-drbd_logAlfa:0              : 10000    
     + master-drbd_logBeta:0              : 10000    
     + pingd                              : 4000      
 * Node beta:
     + master-drbd_freeswitch:1           : 10000    
     + master-drbd_logAlfa:1              : 10000    
     + master-drbd_logBeta:1              : 10000    
     + pingd                              : 4000      

 ---------------------------------------------------------------
 Log when only "beta" has the cable connected
 ---------------------------------------------------------------
 Last updated: Mon May  9 11:27:09 2011
 Stack: openais
 Current DC: beta - partition WITHOUT quorum
 Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
 2 Nodes configured, 2 expected votes
 7 Resources configured.
 ============

 Online: [ beta ]
 OFFLINE: [ alfa ]

  Master/Slave Set: ms_drbd_logAlfa [drbd_logAlfa]
      Slaves: [ beta ]
      Stopped: [ drbd_logAlfa:1 ]
  Master/Slave Set: ms_drbd_logBeta [drbd_logBeta]
      Masters: [ beta ]
      Stopped: [ drbd_logBeta:0 ]
  fs_drbd_logBeta   (ocf::heartbeat:Filesystem):   Started beta
  Master/Slave Set: ms_drbd_freeswitch [drbd_freeswitch]
      Slaves: [ beta ]
      Stopped: [ drbd_freeswitch:1 ]
  Clone Set: pingdClone [pingd]
      Started: [ beta ]
      Stopped: [ pingd:0 ]

 Node Attributes:
 * Node beta:
     + master-drbd_freeswitch:0           : 10000    
     + master-drbd_logAlfa:0              : 10000    
     + master-drbd_logBeta:1              : 10000    
     + pingd                              : 3000         : Connectivity is degraded (Expected=4000)

 Failed actions:
     drbd_logAlfa:0_promote_0 (node=beta, call=1324, rc=-2, status=Timed Out): unknown exec error
     drbd_freeswitch:0_promote_0 (node=beta, call=1331, rc=-2, status=Timed Out): unknown exec error
     drbd_logAlfa:1_promote_0 (node=beta, call=1354, rc=-2, status=Timed Out): unknown exec error
     drbd_freeswitch:1_promote_0 (node=beta, call=1355, rc=-2, status=Timed Out): unknown exec error

