[Linux-ha-dev] New STONITH plug-in to be usable by Xen and KVM

Yasumasa OZAKI Thu, 30 Sep 2010 03:39:03 -0700

Hi Dejan,

Thank you for reply.


I correct it based on your opinion because I think that your opinion is 
correct. wait a little, please.

- The stop of the domain is judged from the virsh returns, and 
CheckIfDead function is removed.

- RunCommand function does, and shortens the refactoring.

- Other points are corrected based on your opinion.

Thanks,
Yasumasa OZAKI

On Thu, 23 Sep 2010 17:08:12 +0200, Dejan Muhamedagic wrote:
> Hi Yasumasa-san,
> 
> On Fri, Sep 10, 2010 at 03:17:56PM +0900, Yasumasa OZAKI wrote:
>> Hi,
>>
>> I made the STONITH plug-in that can be used by both Xen and KVM.
> 
> Thanks! Comments below.
> 
>> I would like to hear any opinion.
>>
>> Best regards,
>> Yasumasa OZAKI
>>
>>
>>
> 
>> #!/bin/sh
>> #
>> # External STONITH module for Xen/KVM hypervisor through ssh.
>> # Uses Xen/KVM hypervisor as a STONITH device to control guest.
>> #
>> # Copyright (c) 2010 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> #
>>
>> STOP_COMMAND="virsh destroy"
>> START_COMMAND="virsh start"
>> DUMP_COMMAND="virsh dump"
>> SSH_COMMAND="/usr/bin/ssh -q -x -n"
>>
>> # Rewrite the hostlist to accept "," as a delimeter for hostnames too.
>> hostlist=`echo $hostlist | tr ',' ' '`
>>
>> CheckIfDead() {
>>     for j in 1 2 3 4 5
>>     do
>>         if ! ping -w1 -c1 "$1" >/dev/null 2>&1
>>         then
>>             return 0
>>         fi
>>         sleep 1
>>     done
>>     
>>     return 1
>> }
> 
> I'd rather have this one removed. If virsh returns a meaningful
> exit code, then we can rely on that, right? Or use domstate?
> 
>> CheckHostList() {
>>     if [ "x" = "x$hostlist" ]
>>     then
>>         ha_log.sh err "hostlist isn't set"
>>         exit 1
>>     fi
>> }
>>
>> CheckHypervisor() {
>>     if [ "x" = "x$hypervisor" ]
>>     then
>>         ha_log.sh err "hypervisor isn't set"
>>         exit 1
>>     fi
>> }
>>
>> RunCommand() {
>>     CheckHostList
>>     CheckHypervisor
>>     
>>     for node in $hostlist
>>     do
>>         if [ "$node" != "$1" ]
>>         then
>>             continue
>>         fi
> 
> You can abbreviate code like this into a more compact
> 
>       [ "$node" != "$1" ] && continue
> 
>>         case $2 in
>>             stop)
>>                 if [ "x$run_dump" != "x" ]
>>                 then
>>                     #Need to run core dump
>>                     if [ "x$dump_dir" != "x" ]
>>                     then
>>                         TIMESTAMP=`date +%Y-%m%d-%H%M.%S`
>>                         DOMAINNAME=`printf "%s" $node`
>>                         COREFILE=$dump_dir/$TIMESTAMP-$DOMAINNAME.core
>>                         #Run core dump
>>                         command_result="$($SSH_COMMAND $hypervisor " \
>> pgrep -f ^virsh_$node >/dev/null 2>&1; \
>> if [ \$? = 0 ]; then echo RUNNING; exit; fi; \
>> mkdir -p $dump_dir; \
>> if [ \$? != 0 ]; then echo MKDIR_FAILED; exit; fi; \
>> (exec -a virsh_$node $DUMP_COMMAND $node $COREFILE >/dev/null 2>&1); \
>> if [ \$? != 0 ]; then echo DUMP_FAILED; exit; fi; \
>> echo OK")"
>>                         ssh_result=$?
>>                         if [ $ssh_result = 0 ]
>>                         then
>>                             case "$command_result" in
>>                                 RUNNING)
>>                                     ha_log.sh info "Dump is already running"
>>                                     exit 0
>>                                     ;;
>>                                 SUSPEND_FAILED)
>>                                     ha_log.sh err "Failed to suspend domain 
>> $node"
>>                                     ;;
>>                                 MKDIR_FAILED)
>>                                     ha_log.sh err "Failed to create 
>> directory $dump_dir"
>>                                     ;;
>>                                 DUMP_FAILED)
>>                                     ha_log.sh err "Failed to core dump 
>> domain $node to $COREFILE"
>>                                     ;;
>>                                 OK)
>>                                     ha_log.sh notice "Domain $node dumped to 
>> $COREFILE"
>>                                     ;;
>>                             esac
>>                         else
>>                             ha_log.sh err "Couldn't connect to hypervisor 
>> $hypervisor"
>>                         fi
>>                     else
>>                         ha_log.sh err "dump_dir isn't set"
>>                     fi
>>                 fi
>>
>>                 command_result=$($SSH_COMMAND $hypervisor "((sleep 2; 
>> $STOP_COMMAND $node) >/dev/null 2>&1 &); echo \$?")
>>                 ssh_result=$?
>>                 if [ $ssh_result = 0 ]
>>                 then
>>                     if [ $command_result = 0 ]
>>                     then
>>                         ha_log.sh notice "Domain $node is stoped"
>>                     else
>>                         ha_log.sh err "Failed to stop domain $node"
>>                     fi
>>                 else
>>                     ha_log.sh err "Couldn't connect to hypervisor 
>> $hypervisor"
>>                 fi
>>                 break;;
>>             start)
>>                 command_result=$($SSH_COMMAND $hypervisor "((sleep 2; 
>> $START_COMMAND $node) >/dev/null 2>&1 &); echo \$?")
>>                 ssh_result=$?
>>                 if [ $ssh_result = 0 ]
>>                 then
>>                     if [ $command_result = 0 ]
>>                     then
>>                         ha_log.sh notice "Domain $node is started"
>>                     else
>>                         ha_log.sh err "Failed to start domain $node"
>>                     fi
>>                 else
>>                     ha_log.sh err "Couldn't connect to hypervisor 
>> $hypervisor"
>>                 fi
>>                 break;;
>>         esac
>>         exit 0
>>     done
>> }
> 
> This function is too long. Can you try to refactor and split it
> into several. Also, the stop and start good candidates to be
> folded into one function.
> 
>> # Main code
>>
>> case $1 in
>> gethosts)
>>     CheckHostList
>>     
>>     for node in $hostlist ; do
>>         echo $node
>>     done
>>     exit 0
>>     ;;
>> on)
>>     RunCommand $2 start
>>     exit $?
>>     ;;
>> off)
>>     if RunCommand $2 stop
>>     then
>>         if CheckIfDead $2
>>         then 
>>             exit 0
>>         fi
>>     fi
>>            
>>     exit 1
>>     ;;
>> reset)
>>     RunCommand $2 stop
>>         
>>     if CheckIfDead $2
>>     then
>>         RunCommand $2 start 
>>         exit 0
>>     fi
>>
>>     exit 1
>>     ;;
>> status)
>>     exit 0
>>     ;;
>> getconfignames)
>>     echo "hostlist hypervisor"
>>     exit 0
>>     ;;
>> getinfo-devid)
>>     echo "virsh STONITH device"
>>     exit 0
>>     ;;
>> getinfo-devname)
>>     echo "virsh STONITH external device"
>>     exit 0
>>     ;;
>> getinfo-devdescr)
>>     echo "ssh-based Linux host reset for Xen/KVM guest domain trough 
>> hypervisor"
>>     echo "Fine for testing, but not really suitable for production!"
> 
> Well, it could be used in production too, if you replace the
> ping part.
> 
>>     exit 0
>>     ;;
>> getinfo-devurl)
>>     echo "http://openssh.org http://www.xensource.com/ 
>> http://linux-ha.org/wiki";
> 
> I think that this should be reduced to just xensource.
> 
>>     exit 0
>>     ;;
>> getinfo-xml)
>>     cat << SSHXML
>> <parameters>
>> <parameter name="hostlist" unique="1" required="1">
>> <content type="string" />
>> <shortdesc lang="en">
>> Hostlist
>> </shortdesc>
>> <longdesc lang="en">
>> The list of controlled nodes.
>> For example: "node1 node2"
>> </longdesc>
>> </parameter>
>> <parameter name="hypervisor" unique="1" required="1">
>> <content type="string" />
>> <shortdesc lang="en">
>> Hypervisor hostname
>> </shortdesc>
>> <longdesc lang="en">
>> Host name to execute hypervisor. Root user shall be able to ssh to that node.
> 
> Using public key authentication. This may also be a security
> issue, but let's leave that to users to decide.
> 
>> </longdesc>
>> </parameter>
>> <parameter name="run_dump" unique="0" required="0">
>> <content type="string" />
>> <shortdesc lang="en">
>> Run core dump
>> </shortdesc>
>> <longdesc lang="en">
>> If set plugin will call "virsh dump" before killing guest domain
>> </longdesc>
>> </parameter>
>> <parameter name="dump_dir" unique="1" required="0">
>> <content type="string" />
>> <shortdesc lang="en">
>> Run dump core with the specified directory 
>> When the "run_dump" parameter is set, this parameter is indispensable 
>> </shortdesc>
>> <longdesc lang="en">
>> This parameter can indicate the dump destination.
>> Should be set as a full path format, ex.) "/var/log/dump"
>> The above example would dump the core, like;
>> /var/log/dump/2009-0316-1403.37-GuestDomain.core
>> </longdesc>
>> </parameter>
>> </parameters>
>> SSHXML
>>     exit 0
>>     ;;
>> *)
>>     exit 1
>>     ;;
>> esac
> 
> Cheers,
> 
> Dejan

_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

[Linux-ha-dev] New STONITH plug-in to be usable by Xen and KVM

Reply via email to