Serge Dubrouski escribió:
On Tue, Oct 21, 2008 at 4:12 AM, Adrian Chapela
<[EMAIL PROTECTED]> wrote:
Again I am testing with a more simple config file, with only one
master/slave resource.

In this case the resource becomes to be a Master but in the slave server the
DRBD is not running. The modules is not loaded and crm_mon says that drbd is
running as Slave:

Master/Slave Set: ms-drbd0
  drbd0:0     (ocf::heartbeat:drbd):  Master debianquagga2
  drbd0:1     (ocf::heartbeat:drbd):  Started debianquagga1

Resource configuration:
<resources>
<master_slave id="ms-drbd0">
                      <meta_attributes id="ma-ms-drbd0">
                              <attributes>
                                      <nvpair id="ma-ms-drbd0-1"
name="clone_max" value="2"/>
                                      <nvpair id="ma-ms-drbd0-2"
name="clone_node_max" value="1"/>
                                      <nvpair id="ma-ms-drbd0-3"
name="master_max" value="1"/>
                                      <nvpair id="ma-ms-drbd0-4"
name="master_node_max" value="1"/>
                                      <nvpair id="ma-ms-drbd0-5"
name="notify" value="yes"/>
                                      <nvpair id="ma-ms-drbd0-6"
name="globally_unique" value="false"/>
                              </attributes>
                      </meta_attributes>
                      <primitive id="drbd0" class="ocf" provider="heartbeat"
type="drbd">
                              <instance_attributes id="ia-drbd0">
                                      <attributes>
                                              <nvpair id="ia-drbd0-1"
name="drbd_resource" value="mail_disk"/>
                                      </attributes>
                              </instance_attributes>
                              <operations>
                                      <op id="op-ms-drbd2-1" name="monitor"
interval="59s" timeout="60s" start_delay="30s" role="Master"/>
                                      <op id="op-ms-drbd2-2" name="monitor"
interval="60s" timeout="60s" start_delay="30s" role="Slave"/>
                              </operations>

                      </primitive>
</master_slave>
</resources>

Why heartbeat is not monitoring the service in the slave node ?

It does. I think that your slave is in Standalone mode in DRBD, check
it with drbdadm cstate.
Adrian Chapela escribió:
Hello,

I am doing new tests. I am doing this tests to improve an old config and
to try to understand best multistate resources.

I can't  make it to works, my configuration is always bad.... I don't know
anything. I am using 2.1.4 release.

The cluster is only doing a notify action and then stop action. If I
started drbd myself, the cluster bring drbd in an Uncofigured resources
state. I don't know the reason. Should I specify a start action ?

[snip]
I found a better script but it wasn't multi state RA. I modified the script to fit my Multi State needs. I attached to this mail and you could download the last release (Now is the same as attached version) from: http://code.adrianchapela.net/heartbeat/drbd_HA

Could you have a look ? Could you test it ?

Now I have solved my problem of start the drbd resource in two nodes and one of them become a Master. I have now some troubles with the failover (Old Master is down....then Slave node detecst it but it doesn't do anything to be the new Master... ). I suppose an error in location rules.

Thank you
#!/bin/sh
#
# License: GNU General Public License (GPL)
# Author:  Martin Fick
# Date:    04/19/07
# Origin:  Hacked together from many other drbd and ocf scripts
#
# Adapted to be a MultState RA by: Adrian Chapela
# Date:    21/10/08
#
#       This script manages a drbd device
#
#       It can make a drbd device primary or secondary
#
#       Be sure to only allow this resource to run on the
#       two specific nodes where your drbd device is setup.
#
#       usage: $0 {start|stop|status|monitor|meta-data|promote|demote}
#
#
#       OCF parameters are as below
#       OCF_RESKEY_drbd_resource
#
#######################################################################
#
#
#    ocf_logi is a custom error log made by Martin Fick. I (Adrian Chapela) 
modified this function to log to another filename 'drbd_HA.log'.
#    To mantain compatibility with other Hearbeat RA, I added ocf_log in all 
locations which ocf_logi is used. This duplicates entries in logs, but I 
considered that is more 
#    positive to us to debug OCF script and Heartbeat configuration (to be it 
was..).
#
# We use the STATUS_CODES from here
#. /usr/lib/heartbeat/ocf-shellfuncs

# The next lines are to be more compatible with next releases of Heartbeat
if [ -n "$OCF_DEBUG_LIBRARY" ]; then
        . $OCF_DEBUG_LIBRARY
else
        . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
fi

USAGE="usage: $0 {start|stop|status|monitor|meta-data|promote|demote}";

#######################################################################

#HA_D=/etc/ha.d
#. ${HA_D}/shellfuncs


meta_data() {
        cat <<END
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="drbd">
 <version>0.0</version>
 <longdesc lang="en">
This script manages a drbd device
It can  make a drbd device primary or secondary
 </longdesc>
 <shortdesc lang="en">OCF MultiState Resource Agent compliant drbd 
script.</shortdesc>

 <parameters>

  <parameter name="drbd_resource" unique="1" required="1">
   <longdesc lang="en">
The drbd resource is a resource defined in /etc/drbd.conf
   </longdesc>
   <shortdesc lang="en">drbd resource</shortdesc>
    <content type="string" default="" />
   </parameter>

 </parameters>

 <actions>
  <action name="start"   timeout="1m" />
  <action name="stop"    timeout="1m" />
  <action name="monitor" depth="10"  timeout="1m" interval="5s" 
start-delay="1m" />
  <action name="meta-data"  timeout="1m" />
  <action name="promote"  timeout="1m" />
  <action name="demote"  timeout="1m" />
 </actions>
</resource-agent>
END
        exit $OCF_SUCCESS
}

ocf_logi() { # type msg
#       if [ "$1" != "err" ] ; then return ; fi
        shift
        echo `date`" - $@" >> /var/log/drbd_HA.log
}

drbd_reload() {
    drbd_stop || return
    drbd_start
}

drbd_stop() {
  #
  #     Is the device already secondary?
  #
#    drbd_status
#    if [ $? = $OCF_NOT_RUNNING ]; then  exit $OCF_NOT_RUNNING; fi

    #stop="$($DRBDADM secondary $RES 2>&1)"
    stop="$($DRBDADM down $RES 2>&1)"

    drbd_status ; rc=$?
    if [ $rc = $OCF_NOT_RUNNING ]; then  exit 0; fi

    ocf_logi err "$RES stop failed: ($rc)"
    ocf_log err "$RES stop failed: ($rc)"
    ocf_logi err "$stop"
    ocf_log err "$stop"

    return 1
}

drbd_start() {
    drbd_status
    if [ $? = $OCF_SUCCESS ]; then  return $OCF_SUCCESS; fi

       if is_drbd_enabled; then
            : OK
        else
            do_cmd modprobe -s drbd `$DRBDADM sh-mod-parms` || {
                    ocf_logi err "Can not load the drbd module."$'\n';
                    ocf_log err "Can not load the drbd module."$'\n';
                    return $OCF_ERR_GENERIC
            }
            ocf_logi debug "$RES start: Module loaded."
            ocf_log debug "$RES start: Module loaded."
        fi


        if [ "$STATE" != "Secondary" ]; then
                $DRBDADM up $RES
        fi

        # try several times, in case heartbeat deadtime
        # was smaller than drbd ping time
        #try=6
        #while true; do
        #       start="$($DRBDADM primary $RES 2>&1)" && break
        #       let "--try" || break
        #       sleep 1
        #done

        drbd_status ; rc=$?
        if [ $rc = $OCF_SUCCESS ]; then  return $OCF_SUCCESS; fi

        ocf_logi err "$RES start failed: ($rc)"
        ocf_logi err "$start"
        ocf_log err "$RES start failed: ($rc)"
        ocf_log err "$start"
        return $rc
}

drbd_status() {
        ST=$( $DRBDADM state $RES 2>&1 )
        STATE=${ST%/*}
        if [ "$STATE" = "Primary" ]; then
            echo "Primary - running as Master"
                rc=$OCF_RUNNING_MASTER
        elif [ "$STATE" = "Secondary" ]; then
            echo "Secondary - Slave"
                rc=$OCF_SUCCESS
        else
            echo "$ST"
                rc=$OCF_NOT_RUNNING
        fi
        return $rc
}

drbd_promote() {
        if is_drbd_enabled; then
            : OK
        else
            ocf_logi err "drbd is not enabled"
            ocf_log err "drbd is not enabled"
            return $OCF_ERR_GENERIC
        fi

        drbd_status

        if [ "$STATE" != "Secondary" ]; then
            ocf_logi err $RES" DRBD is not prepared to be a Master in this node 
;)"
            ocf_log err $RES" DRBD is not prepared to be a Master in this node 
;)"
            return $OCF_ERR_GENERIC

        else
             if $DRBDADM primary $RES ; then
                # TODO: WORK AROUND because drbdadm has a bug and
                # reports success even if it failed :-(
                drbd_status
                if [ "$STATE" = "Primary" ]; then
                       ocf_logi info "$RES promote: primary succeeded"
                       ocf_log info "$RES promote: primary succeeded"
                       return $OCF_SUCCESS
                else
                       ocf_logi err "$RES promote: Not primary despite drbdadm 
call."
                       ocf_log err "$RES promote: Not primary despite drbdadm 
call."
                fi
             else
                ocf_logi err "$RES promote: Failed with exit code $?."
                ocf_log err "$RES promote: Failed with exit code $?."
             fi
           return $OCF_ERR_GENERIC
        fi
}

drbd_demote() {
        # Always I test drbd module
        if is_drbd_enabled; then
            : OK
        else
            ocf_logi err "drbd is not enabled"
            ocf_log err "drbd is not enabled"
            return $OCF_ERR_SUCCESS
        fi

        drbd_status

        if [ "$STATE" = "Secondary" ]; then
            ocf_logi err $RES" demote: already secondary"
            ocf_log err $RES" demote: already secondary"
            return $OCF_SUCCESS
        fi
        
        if [ "$STATE" = "Not configured" ]; then
            ocf_logi debug "$RESOURCE demote: already stopped"
            ocf_log debug "$RESOURCE demote: already stopped"
            return $OCF_NOT_RUNNING
        fi

        # TODO: this is a _force_ operation. we may need to kill higher
        # levels (or switch them to r/o) to be able to demote drbd.
        # figure out how...

        if $DRBDADM primary $RES ; then
                sleep 2
                drbd_status
                if [ "$STATE" = "Primary" ]; then
                      ocf_logi err "$RESOURCE demote: still primary!"
                      ocf_log err "$RESOURCE demote: still primary!"
                      return $OCF_ERR_GENERIC
                fi

                ocf_logi debug "$RESOURCE demote: succeeded"
                ocf_log debug "$RESOURCE demote: succeeded"
                return $OCF_SUCCESS
        else
                ocf_logi err "$RESOURCE demote: Failed with exit code $?."
                ocf_log err "$RESOURCE demote: Failed with exit code $?."
               return $OCF_ERR_GENERIC
        fi

        return $OCF_SUCCESS
}

drbd_monitor() {
  drbd_status
}


do_cmd() {
        local cmd="$*"
        ocf_logi Debug "$RES: Calling $cmd"
        ocf_log Debug "$RES: Calling $cmd"
        local cmd_out=$($cmd 2>&1)
        ret=$?

        if [ $ret -ne 0 ]; then
                ocf_logi err "$RES: Called $cmd"
                ocf_logi err "$RES: Exit code $ret"
                ocf_logi err "$RES: Command output: $cmd_out"
                ocf_log err "$RES: Called $cmd"
                ocf_log err "$RES: Exit code $ret"
                ocf_log err "$RES: Command output: $cmd_out"

        else
                ocf_logi debug "$RES: Exit code $ret"
                ocf_logi debug "$RES: Command output: $cmd_out"
                ocf_log debug "$RES: Exit code $ret"
                ocf_log debug "$RES: Command output: $cmd_out"
        fi

        echo $cmd_out

        return $ret
}

is_drbd_enabled () {
        if [ -f /proc/drbd ]; then
                return 0
        fi
        return 1
}

usage() {
  echo $USAGE >&2
}

#
#       Make a drbd device primary or secondary
#

DEFAULTFILE="/etc/default/drbd"
DRBDADM="/sbin/drbdadm"

if [ -f $DEFAULTFILE ]; then
  . $DEFAULTFILE
fi

if
  [ $# -ne 1 ]
then
  usage
  exit $OCF_ERR_ARGS
fi

RES="$OCF_RESKEY_drbd_resource"


  case $1 in
    info)       cat <<-!INFO
        Abstract=DRBD Device Manager
        Argument=DRBD Resource Name
        Description:
        A DRBD device is a network raid block device which can have a primary 
and a secondary backing
        device each on a separate machine. DRBD will keep them in sync.
        Please rerun with the meta-data command for a list of \\
        valid arguments and their defaults.
        !INFO
        exit $OCF_SUCCESS;;
  esac

case $1 in
  start|stop|status|monitor|reload|promote|demote)
        ocf_logi Debug "$RES: $1"
        ocf_log Debug "$RES: $1"
        cmd_out=$($cmd 2>&1)
                drbd_$1 ;;
  meta-data)    meta_data;;
  usage)        usage; exit $OCF_SUCCESS;;
  validate-all|notify)  exit 0;;
  *)            usage
                exit $OCF_ERR_ARGS
                ;;
esac


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Reply via email to