Hi Dejan,
Thank you for reply.
I correct it based on your opinion because I think that your opinion is
correct. wait a little, please.
- The stop of the domain is judged from the virsh returns, and
CheckIfDead function is removed.
- RunCommand function does, and shortens the refactoring.
- Other points are corrected based on your opinion.
Thanks,
Yasumasa OZAKI
On Thu, 23 Sep 2010 17:08:12 +0200, Dejan Muhamedagic wrote:
> Hi Yasumasa-san,
>
> On Fri, Sep 10, 2010 at 03:17:56PM +0900, Yasumasa OZAKI wrote:
>> Hi,
>>
>> I made the STONITH plug-in that can be used by both Xen and KVM.
>
> Thanks! Comments below.
>
>> I would like to hear any opinion.
>>
>> Best regards,
>> Yasumasa OZAKI
>>
>>
>>
>
>> #!/bin/sh
>> #
>> # External STONITH module for Xen/KVM hypervisor through ssh.
>> # Uses Xen/KVM hypervisor as a STONITH device to control guest.
>> #
>> # Copyright (c) 2010 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> #
>>
>> STOP_COMMAND="virsh destroy"
>> START_COMMAND="virsh start"
>> DUMP_COMMAND="virsh dump"
>> SSH_COMMAND="/usr/bin/ssh -q -x -n"
>>
>> # Rewrite the hostlist to accept "," as a delimeter for hostnames too.
>> hostlist=`echo $hostlist | tr ',' ' '`
>>
>> CheckIfDead() {
>> for j in 1 2 3 4 5
>> do
>> if ! ping -w1 -c1 "$1" >/dev/null 2>&1
>> then
>> return 0
>> fi
>> sleep 1
>> done
>>
>> return 1
>> }
>
> I'd rather have this one removed. If virsh returns a meaningful
> exit code, then we can rely on that, right? Or use domstate?
>
>> CheckHostList() {
>> if [ "x" = "x$hostlist" ]
>> then
>> ha_log.sh err "hostlist isn't set"
>> exit 1
>> fi
>> }
>>
>> CheckHypervisor() {
>> if [ "x" = "x$hypervisor" ]
>> then
>> ha_log.sh err "hypervisor isn't set"
>> exit 1
>> fi
>> }
>>
>> RunCommand() {
>> CheckHostList
>> CheckHypervisor
>>
>> for node in $hostlist
>> do
>> if [ "$node" != "$1" ]
>> then
>> continue
>> fi
>
> You can abbreviate code like this into a more compact
>
> [ "$node" != "$1" ] && continue
>
>> case $2 in
>> stop)
>> if [ "x$run_dump" != "x" ]
>> then
>> #Need to run core dump
>> if [ "x$dump_dir" != "x" ]
>> then
>> TIMESTAMP=`date +%Y-%m%d-%H%M.%S`
>> DOMAINNAME=`printf "%s" $node`
>> COREFILE=$dump_dir/$TIMESTAMP-$DOMAINNAME.core
>> #Run core dump
>> command_result="$($SSH_COMMAND $hypervisor " \
>> pgrep -f ^virsh_$node >/dev/null 2>&1; \
>> if [ \$? = 0 ]; then echo RUNNING; exit; fi; \
>> mkdir -p $dump_dir; \
>> if [ \$? != 0 ]; then echo MKDIR_FAILED; exit; fi; \
>> (exec -a virsh_$node $DUMP_COMMAND $node $COREFILE >/dev/null 2>&1); \
>> if [ \$? != 0 ]; then echo DUMP_FAILED; exit; fi; \
>> echo OK")"
>> ssh_result=$?
>> if [ $ssh_result = 0 ]
>> then
>> case "$command_result" in
>> RUNNING)
>> ha_log.sh info "Dump is already running"
>> exit 0
>> ;;
>> SUSPEND_FAILED)
>> ha_log.sh err "Failed to suspend domain
>> $node"
>> ;;
>> MKDIR_FAILED)
>> ha_log.sh err "Failed to create
>> directory $dump_dir"
>> ;;
>> DUMP_FAILED)
>> ha_log.sh err "Failed to core dump
>> domain $node to $COREFILE"
>> ;;
>> OK)
>> ha_log.sh notice "Domain $node dumped to
>> $COREFILE"
>> ;;
>> esac
>> else
>> ha_log.sh err "Couldn't connect to hypervisor
>> $hypervisor"
>> fi
>> else
>> ha_log.sh err "dump_dir isn't set"
>> fi
>> fi
>>
>> command_result=$($SSH_COMMAND $hypervisor "((sleep 2;
>> $STOP_COMMAND $node) >/dev/null 2>&1 &); echo \$?")
>> ssh_result=$?
>> if [ $ssh_result = 0 ]
>> then
>> if [ $command_result = 0 ]
>> then
>> ha_log.sh notice "Domain $node is stoped"
>> else
>> ha_log.sh err "Failed to stop domain $node"
>> fi
>> else
>> ha_log.sh err "Couldn't connect to hypervisor
>> $hypervisor"
>> fi
>> break;;
>> start)
>> command_result=$($SSH_COMMAND $hypervisor "((sleep 2;
>> $START_COMMAND $node) >/dev/null 2>&1 &); echo \$?")
>> ssh_result=$?
>> if [ $ssh_result = 0 ]
>> then
>> if [ $command_result = 0 ]
>> then
>> ha_log.sh notice "Domain $node is started"
>> else
>> ha_log.sh err "Failed to start domain $node"
>> fi
>> else
>> ha_log.sh err "Couldn't connect to hypervisor
>> $hypervisor"
>> fi
>> break;;
>> esac
>> exit 0
>> done
>> }
>
> This function is too long. Can you try to refactor and split it
> into several. Also, the stop and start good candidates to be
> folded into one function.
>
>> # Main code
>>
>> case $1 in
>> gethosts)
>> CheckHostList
>>
>> for node in $hostlist ; do
>> echo $node
>> done
>> exit 0
>> ;;
>> on)
>> RunCommand $2 start
>> exit $?
>> ;;
>> off)
>> if RunCommand $2 stop
>> then
>> if CheckIfDead $2
>> then
>> exit 0
>> fi
>> fi
>>
>> exit 1
>> ;;
>> reset)
>> RunCommand $2 stop
>>
>> if CheckIfDead $2
>> then
>> RunCommand $2 start
>> exit 0
>> fi
>>
>> exit 1
>> ;;
>> status)
>> exit 0
>> ;;
>> getconfignames)
>> echo "hostlist hypervisor"
>> exit 0
>> ;;
>> getinfo-devid)
>> echo "virsh STONITH device"
>> exit 0
>> ;;
>> getinfo-devname)
>> echo "virsh STONITH external device"
>> exit 0
>> ;;
>> getinfo-devdescr)
>> echo "ssh-based Linux host reset for Xen/KVM guest domain trough
>> hypervisor"
>> echo "Fine for testing, but not really suitable for production!"
>
> Well, it could be used in production too, if you replace the
> ping part.
>
>> exit 0
>> ;;
>> getinfo-devurl)
>> echo "http://openssh.org http://www.xensource.com/
>> http://linux-ha.org/wiki"
>
> I think that this should be reduced to just xensource.
>
>> exit 0
>> ;;
>> getinfo-xml)
>> cat << SSHXML
>> <parameters>
>> <parameter name="hostlist" unique="1" required="1">
>> <content type="string" />
>> <shortdesc lang="en">
>> Hostlist
>> </shortdesc>
>> <longdesc lang="en">
>> The list of controlled nodes.
>> For example: "node1 node2"
>> </longdesc>
>> </parameter>
>> <parameter name="hypervisor" unique="1" required="1">
>> <content type="string" />
>> <shortdesc lang="en">
>> Hypervisor hostname
>> </shortdesc>
>> <longdesc lang="en">
>> Host name to execute hypervisor. Root user shall be able to ssh to that node.
>
> Using public key authentication. This may also be a security
> issue, but let's leave that to users to decide.
>
>> </longdesc>
>> </parameter>
>> <parameter name="run_dump" unique="0" required="0">
>> <content type="string" />
>> <shortdesc lang="en">
>> Run core dump
>> </shortdesc>
>> <longdesc lang="en">
>> If set plugin will call "virsh dump" before killing guest domain
>> </longdesc>
>> </parameter>
>> <parameter name="dump_dir" unique="1" required="0">
>> <content type="string" />
>> <shortdesc lang="en">
>> Run dump core with the specified directory
>> When the "run_dump" parameter is set, this parameter is indispensable
>> </shortdesc>
>> <longdesc lang="en">
>> This parameter can indicate the dump destination.
>> Should be set as a full path format, ex.) "/var/log/dump"
>> The above example would dump the core, like;
>> /var/log/dump/2009-0316-1403.37-GuestDomain.core
>> </longdesc>
>> </parameter>
>> </parameters>
>> SSHXML
>> exit 0
>> ;;
>> *)
>> exit 1
>> ;;
>> esac
>
> Cheers,
>
> Dejan
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/