On Friday 14 October 2016 04:00 AM, Andy Zhou wrote:


Done. Now it shows the following.

    [root@h2 ovs]# crm configure show
    node 1: h1 \
            attributes
    node 2: h2
    primitive ClusterIP IPaddr2 \
            params ip=10.33.75.200 cidr_netmask=32 \
            op start interval=0s timeout=20s \
            op stop interval=0s timeout=20s \
            op monitor interval=30s
    primitive ovndb ocf:ovn:ovndb-servers \
            op start interval=0s timeout=30s \
            op stop interval=0s timeout=20s \
            op promote interval=0s timeout=50s \
            op demote interval=0s timeout=50s \
            op monitor interval=1min \
            meta
    ms ovndb-master ovndb \
            meta notify=true
    colocation colocation-ovndb-master-ClusterIP-INFINITY inf: ovndb-master:Started ClusterIP:Master
    order order-ClusterIP-ovndb-master-mandatory inf: ClusterIP:start ovndb-master:start
    property cib-bootstrap-options: \
            have-watchdog=false \
            dc-version=1.1.13-10.el7_2.4-44eb2dd \
            cluster-infrastructure=corosync \
            cluster-name=mycluster \
            stonith-enabled=false
    property ovn_ovsdb_master_server: \




My installation does not have ocf-tester. There is a program called ocft with a test option, but I am not sure whether it is a suitable replacement. If not, how could I get the ocf-tester program? I ran the ocft program and got the following output. I am not sure what it means.

[root@h2 ovs]# ocft test -n test-ovndb -o master_ip 10.0.0.1 /usr/share/openvswitch/scripts/ovndb-servers.ocf

ERROR: cases directory not found.



I have attached ocf-tester to this mail. I guess it's a standalone script. If it does not work, I think it's better not to pursue it any further, as we have another way to find out.
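Going by the checks near the top of the attached script, it wants an OCF_ROOT directory (it falls back to /usr/lib/ocf) and an agent path that exists, and its usage text gives parameters as -o name=value. A small preflight along those lines might save a round trip. This is only a sketch: the function name ocf_tester_preflight and the location ./ocf-tester for the saved attachment are assumptions, not something from the script or the thread.

```shell
# Preflight for the attached ocf-tester script (a sketch, not part of the script).
# Mirrors two checks the script performs itself: OCF_ROOT must be a directory
# and the agent path must exist.
ocf_tester_preflight() {
    agent=$1
    ocf_root=${OCF_ROOT:-/usr/lib/ocf}
    if [ ! -d "$ocf_root" ]; then
        echo "OCF_ROOT not found: $ocf_root"
        return 1
    fi
    if [ ! -e "$agent" ]; then
        echo "agent not found: $agent"
        return 1
    fi
    echo "ok"
}

# Example (the ./ocf-tester path is an assumption; adjust to where the
# attachment was saved). Note the name=value form for -o:
#   ocf_tester_preflight /usr/share/openvswitch/scripts/ovndb-servers.ocf &&
#   sh ./ocf-tester -v -n test-ovndb -o master_ip=10.0.0.1 \
#       /usr/share/openvswitch/scripts/ovndb-servers.ocf
```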



    Alternately, you can check if the ovsdb servers are started
    properly by running

    /usr/share/openvswitch/scripts/ovn-ctl --db-sb-sync-from=10.0.0.1
    --db-nb-sync-from=10.0.0.1 start_ovsdb


The output is as follows. Should we use --db-sb-sync-from-addr instead?
[root@h2 ovs]# /usr/share/openvswitch/scripts/ovn-ctl --db-sb-sync-from=10.0.0.1 --db-nb-sync-from=10.0.0.1 start_ovsdb

/usr/share/openvswitch/scripts/ovn-ctl: unknown option "--db-sb-sync-from=10.0.0.1" (use --help for help)

/usr/share/openvswitch/scripts/ovn-ctl: unknown option "--db-nb-sync-from=10.0.0.1" (use --help for help)

'ovn-ctl' runs without any error message after I fixed the command line parameter.

I am sorry for the misinformation, Andy. What you ran is correct. Could you check the status of the ovsdb servers on h2 after running the above command, using the following commands?

ovn-ctl status_ovnnb
ovn-ctl status_ovnsb

Both of the above commands should return "running/backup". If you look at the function ovsdb_server_start() in the OCF script, we wait indefinitely until the DB servers are started. Since the 'start' action on h2 times out, I suspect that the servers are not starting properly.
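Since the agent waits without a limit of its own, a bounded version of the same check can be handy when testing by hand. The following is a minimal sketch, assuming the status commands print exactly the state string (for example running/backup) on stdout; the helper name wait_for_state is made up for illustration and is not part of ovn-ctl or the OCF script.

```shell
# Poll a status command until it prints the expected state or the deadline passes.
# Usage: wait_for_state EXPECTED TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for_state() {
    expected=$1; timeout=$2; shift 2
    elapsed=0
    state=""
    while [ "$elapsed" -lt "$timeout" ]; do
        # Run the status command; errors are ignored, only stdout is compared.
        state=$("$@" 2>/dev/null)
        if [ "$state" = "$expected" ]; then
            echo "$state"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "timed out; last state: ${state:-none}"
    return 1
}

# Example on h2 (ovn-ctl path as used earlier in this thread):
#   wait_for_state running/backup 30 /usr/share/openvswitch/scripts/ovn-ctl status_ovnnb
#   wait_for_state running/backup 30 /usr/share/openvswitch/scripts/ovn-ctl status_ovnsb
```

If the state never shows up, the last line printed gives whatever the status command reported most recently, which may hint at why 'start' does not finish.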


    I think this is mostly due to the crm configuration. Once you add
    the 'ms' and 'colocation' resources, you should be able to
    overcome this problem.

No, ovndb still failed to launch on h2.

[root@h2 ovs]# crm status
Last updated: Thu Oct 13 11:27:42 2016          Last change: Thu Oct 13 11:17:25 2016 by root via cibadmin on h2
Stack: corosync
Current DC: h2 (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
2 nodes and 3 resources configured

Online: [ h1 h2 ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started h1
 Master/Slave Set: ovndb-master [ovndb]
     Masters: [ h1 ]
     Stopped: [ h2 ]

Failed Actions:
* ovndb_start_0 on h2 'unknown error' (1): call=39, status=Timed Out, exitreason='none',
    last-rc-change='Wed Oct 12 14:43:03 2016', queued=0ms, exec=30003
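
The failed action reports exec=30003, which appears to line up with the 30s start timeout configured for ovndb. One way to see where the time goes is to run the agent's start action by hand under the same limit, in the same manner as the attached ocf-tester (bash $agent $action with the OCF_* variables set). A minimal sketch, assuming GNU timeout is installed; the helper name run_agent_action and the default resource name test-ovndb are made up for illustration.

```shell
# Run one OCF action directly, bounded by a time limit, and report how it ended.
# Usage: run_agent_action AGENT ACTION LIMIT_SECONDS
run_agent_action() {
    agent=$1; action=$2; limit=$3
    # Agent output goes to stderr so that stdout carries only the report line.
    OCF_ROOT=${OCF_ROOT:-/usr/lib/ocf} \
    OCF_RESOURCE_INSTANCE=${OCF_RESOURCE_INSTANCE:-test-ovndb} \
        timeout "$limit" bash "$agent" "$action" >&2
    rc=$?
    # GNU timeout exits with 124 when it had to kill the command.
    if [ "$rc" -eq 124 ]; then
        echo "$action: timed out after ${limit}s"
    else
        echo "$action: rc=$rc"
    fi
    return "$rc"
}

# Example on h2 (parameter name master_ip as used earlier in this thread):
#   export OCF_RESKEY_master_ip=10.0.0.1
#   run_agent_action /usr/share/openvswitch/scripts/ovndb-servers.ocf start 30
```

Setting HA_debug=1 beforehand, as the -d option of ocf-tester does, may produce more output from the agent.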

    I have never tried colocating two resources with the ClusterIP
    resource. Just for testing, is it possible to drop the WebServer
    resource?

Done.  It did not make any difference that I can see.


#!/bin/sh
#
#       $Id: ocf-tester,v 1.2 2006/08/14 09:38:20 andrew Exp $
#
# Copyright (c) 2006 Novell Inc, Andrew Beekhof
#                    All Rights Reserved.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#

LRMD=/usr/lib/heartbeat/lrmd
LRMADMIN=/usr/sbin/lrmadmin
DATADIR=/usr/share
METADATA_LINT="xmllint --noout --valid -"

# set some common meta attributes, which are expected to be
# present by resource agents
export OCF_RESKEY_CRM_meta_timeout=20000  # 20 seconds timeout
export OCF_RESKEY_CRM_meta_interval=10000  # reset this for probes

num_errors=0

info() {
    [ "$quiet" -eq 1 ] && return
    echo "$*"
}
debug() {
    [ "$verbose" -eq 0 ] && return
    echo "$*"
}
usage() {
    # make sure to output errors on stderr
    [ "x$1" = "x0" ] || exec >&2

    echo "Tool for testing if a cluster resource is OCF compliant"
    echo ""
    echo "Usage: ocf-tester [-Lh] -n resource_name [-o name=value]* /full/path/to/resource/agent"
    echo ""
    echo "Options:"
    echo "  -h                  This text"
    echo "  -v                  Be verbose while testing"
    echo "  -q                  Be quiet while testing"
    echo "  -d                  Turn on RA debugging"
    echo "  -n name             Name of the resource"
    echo "  -o name=value       Name and value of any parameters required by the agent"
    echo "  -L                  Use lrmadmin/lrmd for tests"
    exit $1
}

assert() {
    rc=$1; shift
    target=$1; shift
    msg=$1; shift
    local targetrc matched

    if [ $# = 0 ]; then
        exit_code=0
    else
        exit_code=$1; shift
    fi

    for targetrc in `echo $target | tr ':' ' '`; do
        [ $rc -eq $targetrc ] && matched=1
    done
    if [ "$matched" != 1 ]; then
        num_errors=`expr $num_errors + 1`
        echo "* rc=$rc: $msg"
        if [ $exit_code != 0 ]; then
            [ -n "$command_output" ] && cat<<EOF
$command_output
EOF
            echo "Aborting tests"
            exit $exit_code
        fi
    fi
    command_output=""
}

done=0
ra_args=""
verbose=0
quiet=0
while test "$done" = "0"; do
    case "$1" in
        -n) OCF_RESOURCE_INSTANCE=$2; ra_args="$ra_args OCF_RESOURCE_INSTANCE=$2"; shift; shift;;
        -o) name=${2%%=*}; value=${2#*=}; 
                lrm_ra_args="$lrm_ra_args $2";
                ra_args="$ra_args OCF_RESKEY_$name='$value'"; shift; shift;;
        -L) use_lrmd=1; shift;;
        -v) verbose=1; shift;;
        -d) export HA_debug=1; shift;;
        -q) quiet=1; shift;;
        -?|--help) usage 0;;
        --version) echo "UNKNOWN"; exit 0;;
        -*) echo "unknown option: $1" >&2; usage 1;;
        *) done=1;;
    esac
done

if [ "x" = "x$OCF_ROOT" ]; then
    if [ -d /usr/lib/ocf ]; then
        export OCF_ROOT=/usr/lib/ocf
    else
        echo "You must supply the location of OCF_ROOT (common location is /usr/lib/ocf)" >&2
        usage 1
    fi
fi

if [ "x" = "x$OCF_RESOURCE_INSTANCE" ]; then
    echo "You must give your resource a name, set OCF_RESOURCE_INSTANCE" >&2
    usage 1
fi

agent=$1
if [ -z "$agent" ] || [ ! -e "$agent" ]; then
    echo "You must provide the full path to your resource agent" >&2
    usage 1
fi
installed_rc=5
stopped_rc=7
has_demote=1
has_promote=1

start_lrmd() {
        lrmd_timeout=0
        lrmd_interval=0
        lrmd_target_rc=EVERYTIME
        lrmd_started=""
        $LRMD -s 2>/dev/null
        rc=$?
        if [ $rc -eq 3 ]; then
                lrmd_started=1
                $LRMD &
                sleep 1
                $LRMD -s 2>/dev/null
        else
                return $rc
        fi
}
add_resource() {
        $LRMADMIN -A $OCF_RESOURCE_INSTANCE \
                ocf \
                `basename $agent` \
                $(basename `dirname $agent`) \
                $lrm_ra_args > /dev/null
}
del_resource() {
        $LRMADMIN -D $OCF_RESOURCE_INSTANCE
}
parse_lrmadmin_output() {
        awk '
BEGIN{ rc=1; }
/Waiting for lrmd to callback.../ { n=1; next; }
n==1 && /----------------operation--------------/ { n++; next; }
n==2 && /return code:/ { rc=$0; sub("return code: *","",rc); next }
n==2 && /---------------------------------------/ {
        n++;
        next;
}
END{
        if( n!=3 ) exit 1;
        else exit rc;
}
'
}
exec_resource() {
        op="$1"
        args="$2"
        $LRMADMIN -E $OCF_RESOURCE_INSTANCE \
                $op $lrmd_timeout $lrmd_interval \
                $lrmd_target_rc \
                $args | parse_lrmadmin_output
}

if [ "$use_lrmd" = 1 ]; then
        echo "Using lrmd/lrmadmin for all tests"
        start_lrmd || {
                echo "could not start lrmd" >&2
                exit 1
        }
        trap '
                [ "$lrmd_started" = 1 ] && $LRMD -k
        ' EXIT
        add_resource || {
                echo "failed to add resource to lrmd" >&2
                exit 1
        }
fi

lrm_test_command() {
        action="$1"
        msg="$2"
        debug "$msg"
        exec_resource $action "$lrm_ra_args"
}

test_permissions() {
    action=meta-data
    debug ${1:-"Testing permissions with uid nobody"}
    su nobody -s /bin/sh $agent $action > /dev/null
}

test_metadata() {
    action=meta-data
    msg=${1:-"Testing: $action"}
    debug $msg
    bash $agent $action | (cd $DATADIR/resource-agents && $METADATA_LINT)
    rc=$?
    #echo rc: $rc
    return $rc
}

test_command() {
    action=$1; shift
    export __OCF_ACTION=$action
    msg=${1:-"Testing: $action"}
    if [ "$use_lrmd" = 1 ]; then
        lrm_test_command $action "$msg"
        return $?
    fi
    #echo Running: "export $ra_args; bash $agent $action 2>&1 > /dev/null"
    if [ $verbose -eq 0 ]; then
        command_output=`bash $agent $action 2>&1`
    else
        debug $msg
        bash $agent $action
    fi
    rc=$?
    #echo rc: $rc
    return $rc
}

# Begin tests
info "Beginning tests for $agent..."

if [ ! -f $agent ]; then
    assert 7 0 "Could not find file: $agent"
fi

if [ `id -u` = 0 ]; then
        test_permissions
        assert $? 0 "Your agent has too restrictive permissions: should be 755"
else
        echo "WARN: Can't check agent's permissions because we're not root; they should be 755"
fi

test_metadata
assert $? 0 "Your agent produces meta-data which does not conform to ra-api-1.dtd"

OCF_TESTER_FAIL_HAVE_BINARY=1
export OCF_TESTER_FAIL_HAVE_BINARY
test_command meta-data
rc=$?
if [ $rc -eq 3 ]; then
    assert $rc 0 "Your agent does not support the meta-data action"
else
    assert $rc 0 "The meta-data action cannot fail and must return 0"
fi
unset OCF_TESTER_FAIL_HAVE_BINARY

ra_args="export $ra_args"
eval $ra_args
test_command validate-all
rc=$?
if [ $rc -eq 3 ]; then
    assert $rc 0 "Your agent does not support the validate-all action"
elif [ $rc -ne 0 ]; then
    assert $rc 0 "Validation failed.  Did you supply enough options with -o ?" 1
    usage $rc
fi

test_command monitor "Checking current state"
rc=$?
if [ $rc -eq 3 ]; then
    assert $rc 7 "Your agent does not support the monitor action" 1

elif [ $rc -eq 8 ]; then
    test_command demote "Cleanup, demote"
    assert $? 0 "Your agent was a master and could not be demoted" 1

    test_command stop "Cleanup, stop"
    assert $? 0 "Your agent was a master and could not be stopped" 1

elif [ $rc -ne 7 ]; then
    test_command stop
    assert $? 0 "Your agent was active and could not be stopped" 1
fi

test_command monitor
assert $? $stopped_rc "Monitoring a stopped resource should return $stopped_rc"

OCF_TESTER_FAIL_HAVE_BINARY=1
export OCF_TESTER_FAIL_HAVE_BINARY
OCF_RESKEY_CRM_meta_interval=0
test_command monitor
assert $? $stopped_rc:$installed_rc "The initial probe for a stopped resource should return $stopped_rc or $installed_rc even if all binaries are missing"
unset OCF_TESTER_FAIL_HAVE_BINARY
OCF_RESKEY_CRM_meta_interval=20000

test_command start
assert $? 0 "Start failed.  Did you supply enough options with -o ?" 1

test_command monitor
assert $? 0 "Monitoring an active resource should return 0"

OCF_RESKEY_CRM_meta_interval=0
test_command monitor
assert $? 0 "Probing an active resource should return 0"
OCF_RESKEY_CRM_meta_interval=20000

test_command notify
rc=$?
if [ $rc -eq 3 ]; then
    info "* Your agent does not support the notify action (optional)"
else
    assert $rc 0 "The notify action cannot fail and must return 0"
fi

test_command demote "Checking for demote action"
if [ $? -eq 3 ]; then
    has_demote=0
    info "* Your agent does not support the demote action (optional)"
fi

test_command promote "Checking for promote action"
if [ $? -eq 3 ]; then
    has_promote=0
    info "* Your agent does not support the promote action (optional)"
fi

if [ $has_promote -eq 1 -a $has_demote -eq 1 ]; then
    test_command demote "Testing: demotion of started resource"
    assert $? 0 "Demoting a start resource should not fail"

    test_command promote
    assert $? 0 "Promote failed"

    test_command demote
    assert $? 0 "Demote failed" 1

    test_command demote "Testing: demotion of demoted resource"
    assert $? 0 "Demoting a demoted resource should not fail"

    test_command promote "Promoting resource"
    assert $? 0 "Promote failed" 1

    test_command promote "Testing: promotion of promoted resource"
    assert $? 0 "Promoting a promoted resource should not fail"

    test_command demote "Demoting resource"
    assert $? 0 "Demote failed" 1

elif [ $has_promote -eq 0 -a $has_demote -eq 0 ]; then
    info "* Your agent does not support master/slave (optional)"

else
    echo "* Your agent partially supports master/slave"
    num_errors=`expr $num_errors + 1`
fi

test_command stop
assert $? 0 "Stop failed" 1

test_command monitor
assert $? $stopped_rc "Monitoring a stopped resource should return $stopped_rc"

test_command start "Restarting resource..."
assert $? 0 "Start failed" 1

test_command monitor
assert $? 0 "Monitoring an active resource should return 0"

test_command start "Testing: starting a started resource"
assert $? 0 "Starting a running resource is required to succeed"

test_command monitor
assert $? 0 "Monitoring an active resource should return 0"

test_command stop "Stopping resource"
assert $? 0 "Stop could not clean up after multiple starts" 1

test_command monitor
assert $? $stopped_rc "Monitoring a stopped resource should return $stopped_rc"

test_command stop "Testing: stopping a stopped resource"
assert $? 0 "Stopping a stopped resource is required to succeed"

test_command monitor
assert $? $stopped_rc "Monitoring a stopped resource should return $stopped_rc"

test_command migrate_to "Checking for migrate_to action"
rc=$?
if [ $rc -ne 3 ]; then
    test_command migrate_from "Checking for migrate_from action"
fi
if [ $? -eq 3 ]; then
    info "* Your agent does not support the migrate action (optional)"
fi

test_command reload "Checking for reload action"
if [ $? -eq 3 ]; then
    info "* Your agent does not support the reload action (optional)"
fi

if [ $num_errors -gt 0 ]; then
    echo "Tests failed: $agent failed $num_errors tests" >&2
    exit 1
else 
    echo $agent passed all tests
    exit 0
fi

# vim:et:ts=8:sw=4
