On Wed, 2010-12-22 at 17:27 +0100, Dejan Muhamedagic wrote:
> On Wed, Dec 22, 2010 at 02:57:53PM +0100, Holger Teutsch wrote:
> > On Wed, 2010-12-22 at 10:37 +0100, Dejan Muhamedagic wrote:
> > > On Wed, Dec 22, 2010 at 09:57:40AM +0100, Holger Teutsch wrote:
> > > > On Tue, 2010-12-21 at 19:03 +0100, Dejan Muhamedagic wrote:
> > > > > Hi,
> > > > > 
> > > > > On Tue, Dec 21, 2010 at 05:30:52PM +0100, Holger Teutsch wrote:
> > > > > > Hi,
> > > > > > I would like to submit a libvirt based stonith plugin for review and
> > > > > > possible inclusion to glue.
> > > > > > The plugin uses the client of libvirtd (i.e. virsh) _in the virtual
> > > > > > machines_ and connects remotely to libvirtd on the hypervisor.
> > > > > > Therefore it works with whatever transport or hypervisor that libvirt
> > > > > > supports or will support.
> > > > > 
> > > > > Just a note that the reset command should try to boot the host in
> > > > > case it was down too. No objections here to the rest of the code.
> > > > 
> > > > As a data center guy I would not expect this. In particular when startup
> > > > fencing comes into play.
> > > > When I _power down_ a cluster member for good reasons and start only one
> > > > node I would not like the other one to be powered on automatigically.
> > > > The power switch is the ultimate thing we control all this stuff 
> > > 
> > > If you want to keep the node down why not use the poweroff action
> > > for stonith?
> > > 
> > 
> > Unfortunately libvirt has no state "powered on / not running" or
> > "persistent power off".
> > I'm pretty sure that e.g. HP's ilo/ipmi implementation of "reset" would
> > not power on but would be ignored on a powered off machine. So that
> > might not be an issue with "real" servers.
> 
> riloe and ipmi do pay attention to the power state and act
> correspondingly, that is turn power on if the host was powered
> off and reset otherwise.
> 
> > With a previous version of the script on my KVM test cluster startup
> > fencing of pacemaker powered on a stopped machine and I think that is
> > not what you want.
> 
> Well, that's what STONITH requires and that's how all other
> stonith plugins behave.

OK, I have commented out the logic.
Be it a pacemaker or a stonith problem: From a data center operations
perspective I consider this behavior absolutely strange. You really have
to pull the power cords to be sure that powered off servers stay off.
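
To make the reset semantics under discussion concrete, here is a minimal
sketch. It assumes stand-in functions (fake_stop, fake_start, do_reset and
the DOMAIN_STATE variable are hypothetical and only simulate a domain); the
real plugin calls virsh instead. The return codes of fake_stop follow the
convention used by libvirt_stop in the script below.

```shell
#!/bin/sh
# Sketch only: fake_stop/fake_start stand in for the real virsh calls.
# fake_stop returns 0 (stopped), 1 (error) or 2 (was already stopped).
DOMAIN_STATE=${DOMAIN_STATE:-off}

fake_stop() {
    case $DOMAIN_STATE in
        running) DOMAIN_STATE=off; return 0 ;;
        off)     return 2 ;;
        *)       return 1 ;;
    esac
}

fake_start() {
    DOMAIN_STATE=running
    return 0
}

do_reset() {
    fake_stop
    rc=$?
    [ $rc = 1 ] && return 1
    # STONITH semantics: power on even if the domain was already off
    # (rc = 2), so there is no early return for that case.
    fake_start
}

do_reset
echo "rc=$? state=$DOMAIN_STATE"
```

Starting from a powered-off domain, the sketch still ends with the domain
running, which is exactly the behavior I find surprising from an operations
point of view.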

> 
> > > > > Any chance to support more than one host?
> > > > 
> > > > I reasoned about this as well but as we can not assume 'host name' ==
> > > > 'domain id' that means domain_id has to be a list as well (with defaults
> > > > or partial defaults). 
> > > 
> > > IIRC, there was one stonith agent which does this kind of
> > > mapping. Alternatively, perhaps drop domain_id and allow
> > > appending it in the hostlist (as in external/xen0), i.e.
> > > "node1[:domain_id] ...".
> > > 
> > > > I will think again about feasibility with not overcomplicated code.
> > > 
> > > This should reduce the configuration, so I think it's worth the
> > > effort.
> > 
> > Will go with your proposal.
> 
> Great.
> 
> Cheers,
> 
> Dejan
> 

The updated version:
- holger

#!/bin/sh
#
# External STONITH module for a libvirt managed hypervisor (kvm/Xen).
# Uses libvirt as a STONITH device to control guest.
#
# Copyright (c) 2010 Holger Teutsch <[email protected]>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
#

# start a domain
libvirt_start() {
    out=$($VIRSH -c $hypervisor_uri start $domain_id 2>&1)
    if [ $? -eq 0 ]
    then
        ha_log.sh notice "Domain $domain_id was started"
        return 0
    fi

    if echo "$out" | grep -i 'Domain is already active' > /dev/null 2>&1
    then
        ha_log.sh notice "Domain $domain_id is already active"
        return 0
    fi

    ha_log.sh err "Failed to start domain $domain_id"
    ha_log.sh err "$out"
    return 1
}

# stop a domain
# return
#   0: success
#   1: error
#   2: was already stopped
libvirt_stop() {
    out=$($VIRSH -c $hypervisor_uri destroy $domain_id 2>&1)
    if [ $? -eq 0 ]
    then
        ha_log.sh notice "Domain $domain_id was stopped"
        return 0
    fi

    if echo "$out" | grep -i 'domain is not running' > /dev/null 2>&1
    then
        ha_log.sh notice "Domain $domain_id is already stopped"
        return 2
    fi

    ha_log.sh err "Failed to stop domain $domain_id"
    ha_log.sh err "$out"
    return 1
}

# get status of stonith device (*NOT* of the domain).
# If we can retrieve some info from the hypervisor
# the stonith device is OK.
libvirt_status() {
    out=$($VIRSH -c $hypervisor_uri version 2>&1)
    if [ $? -eq 0 ]
    then
        out=`echo "$out" | tail -1`
        ha_log.sh notice "$hypervisor_uri: $out"
        return 0
    fi

    ha_log.sh err "Failed to get status for $hypervisor_uri"
    ha_log.sh err "$out"
    return 1
}

# check config and set variables
# does not return on error
libvirt_check_config() {
    VIRSH=`which virsh 2>/dev/null`

    if [ ! -x "$VIRSH" ]
    then
        ha_log.sh err "virsh not installed"
        exit 1
    fi

    if [ -z "$hostlist" -o -z "$hypervisor_uri" ]
    then
        ha_log.sh err "hostlist or hypervisor_uri missing; check configuration"
        exit 1
    fi
}

# set variable domain_id for the host specified as arg
libvirt_set_domain_id ()
{
    for h in $hostlist
    do
        case $h in
            $1:*)
            domain_id=`expr $h : '.*:\(.*\)'`
            return
            ;;

            $1)
            domain_id=$1
            return
        esac
    done

    ha_log.sh err "Should never happen: Called for host $1 but $1 is not in $hostlist."
    exit 1
}

libvirt_info() {
cat << LVIRTXML
<parameters>
<parameter name="hostlist" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
List of hostname[:domain_id]..
</shortdesc>
<longdesc lang="en">
List of controlled hosts: hostname[:domain_id]..
The optional domain_id defaults to the hostname. 
</longdesc>
</parameter>

<parameter name="hypervisor_uri" required="1">
<content type="string" />
<shortdesc lang="en">
Hypervisor URI
</shortdesc>
<longdesc lang="en">
URI for connection to the hypervisor.
driver[+transport]://[usern...@][hostlist][:port]/[path][?extraparameters]
e.g.
qemu+ssh://my_kvm_server.mydomain.my/system   (uses ssh for root)
xen://my_kvm_server.mydomain.my/              (uses TLS for client)

virsh must be installed (e.g. libvirt-client package) and access control must
be configured for your selected URI.
</longdesc>
</parameter>
</parameters>
LVIRTXML
exit 0
}

#############
# Main code #
#############

# don't fool yourself when testing with stonith(8)
# and transport ssh
unset SSH_AUTH_SOCK

# support , as a separator as well
hostlist=`echo $hostlist| sed -e 's/,/ /g'`

case $1 in
    gethosts)
    hostnames=`echo $hostlist|sed -e 's/:[^: ]*//g'`
    for h in $hostnames
    do
        echo $h
    done
    exit 0
    ;;

    on)
    libvirt_check_config
    libvirt_set_domain_id $2

    libvirt_start
    exit $?
    ;;

    off)
    libvirt_check_config
    libvirt_set_domain_id $2

    libvirt_stop
    [ $? = 1 ] && exit 1
    exit 0
    ;;

    reset)
    # libvirt has no reset so we do a power cycle
    libvirt_check_config
    libvirt_set_domain_id $2

    libvirt_stop
    rc=$?
    [ $rc = 1 ] && exit 1

    # stonith reset seems to require a power on even if it was off
    # before so the next line is commented out
    # [ $rc = 2 ] && exit 0

    sleep 2
    libvirt_start
    exit $?
    ;;

    status)
    libvirt_check_config
    libvirt_status
    exit $?
    ;;

    getconfignames)
    echo "hostlist hypervisor_uri"
    exit 0
    ;;

    getinfo-devid)
    echo "libvirt STONITH device"
    exit 0
    ;;

    getinfo-devname)
    echo "libvirt STONITH external device"
    exit 0
    ;;

    getinfo-devdescr)
    echo "libvirt-based Linux host reset for Xen/KVM guest domain through hypervisor"
    exit 0
    ;;

    getinfo-devurl)
    echo "http://libvirt.org/uri.html http://linux-ha.org/wiki";
    exit 0
    ;;

    getinfo-xml)
    libvirt_info
    exit 0
    ;;

    *)
    exit 1
    ;;
esac
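
For reviewers who want to check the hostname[:domain_id] mapping without a
hypervisor at hand, the lookup can be sketched in isolation. This is only an
illustration: domain_for is a hypothetical helper that is not part of the
plugin, and the node and domain names are made up.

```shell
#!/bin/sh
# Sketch only: domain_for is a hypothetical helper, not part of the plugin.
# Resolve the libvirt domain id for a host from a hostlist whose entries
# look like "hostname[:domain_id]", separated by spaces or commas.
domain_for() {
    host=$1
    for entry in $(echo "$2" | sed -e 's/,/ /g')
    do
        case $entry in
            "$host":*) echo "${entry#*:}"; return 0 ;;
            "$host")   echo "$host"; return 0 ;;
        esac
    done
    return 1
}

hostlist="node1:vm-a,node2 node3:vm-c"
domain_for node1 "$hostlist"   # entry with an explicit domain id
domain_for node2 "$hostlist"   # domain id defaults to the hostname
domain_for node9 "$hostlist" || echo "node9 is not in hostlist"
```

Unlike libvirt_set_domain_id, the sketch reports an unknown host through its
return code instead of exiting, which makes it easier to try out
interactively.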



_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
