On Fri, Jun 04, 2010 at 07:14:12AM -0400, Vadym Chepkov wrote:
> I actually submitted a patch to linux-ha-dev as described
Either I missed it, or it never got there. Perhaps you're not
subscribed to the list? At any rate, the patch is already in the
repository.
Cheers,
Dejan
> On Jun 4, 2010, at 7:04 AM, Dejan Muhamedagic wrote:
>
> > Hi,
> >
> > On Thu, Jun 03, 2010 at 06:52:09PM -0400, Vadym Chepkov wrote:
> >> Hi
> >>
> >> There is a bug in stonith/plugins/external/rackpdu in cluster-glue-1.0.5
> >>
> >> It doesn't check if snmpset was successful or not :
> >>
> >> SendCommand() {
> >>
> >>     local host=$1
> >>     local command=$2
> >>
> >>     GetOutletNumber $host
> >>     local outlet=$?
> >>
> >>     if [ $outlet -gt 0 ]; then
> >>         local set_result=`snmpset -v1 -c $community $pduip $oid.$outlet i $command 2>&1`
> >>         local check_result=`echo "$set_result" | grep "Timeout"`
> >>
> >>         if [ ! -z "$check_result" ]; then
> >>             ha_log.sh err "Write SNMP value $oid.$outlet=$command. Result: $set_result"
> >>         fi
> >>
> >>         return 0
> >>     else
> >>         return 1
> >>     fi
> >> }
> >>
> >> Here is what happens:
> >>
> >> + '[' 1 -gt 0 ']'
> >> ++ snmpset -v1 -c private 10.10.10.10 .1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.1 i 2
> >> + local 'set_result=Error in packet.
> >> Reason: (genError) A general failure occured'
> >> ++ echo 'Error in packet.
> >> Reason: (genError) A general failure occured'
> >> ++ grep Timeout
> >> + local check_result=
> >> + '[' '!' -z '' ']'
> >> + return 0
> >> + exit 0
> >>
> >> so stonith agent says it was successful when it was not :(
> >>
> >> instead of grepping for "Timeout" (why?)
> >
> > Don't know. Yes, that's strange. I left that check in anyway. Can
> > you simulate a timeout and see what snmpset returns (exit
> > code)?
> >
> >> it should check if exit status was 0, then it was successful
> >> 2 - failed and not recoverable
> >
> > Yes, fixed now. Also snmpwalk for gethosts.
> >
> >> 1 - you can possibly retry.
> >>
> >> The last one, unfortunately, usually happens when somebody is
> >> already logged in to the PDU (via http or telnet)
> >
> > Well, we could retry, but that's probably going to be in vain.
> > That needs to be documented.
> >
> > Can you please test the changes. You can pull the new version
> > from the repository for testing.
> >
> > Many thanks for the report.
> >
> > Dejan
>
>
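The exit-status handling discussed above can be sketched as a small retry
wrapper. This is only an illustration under assumptions: the meanings of the
exit codes (0 = success, 1 = possibly retryable, 2 = not recoverable) are taken
from this thread, and retry_on_status_1 and its max_tries argument are
hypothetical names, not part of rackpdu. As noted above, retrying may well be
in vain while somebody is logged in to the PDU.

```shell
# Run a command, retrying only while it exits with status 1.
# Usage: retry_on_status_1 MAX_TRIES command [args...]
# (hypothetical helper for illustration; not part of the rackpdu plugin)
retry_on_status_1() {
    local max_tries="$1"
    shift
    local try=1
    local rc
    while :; do
        rc=0
        # Capture the exit status without tripping "set -e" in the caller.
        "$@" || rc=$?
        # 0 means success; any status other than 1 is treated as final.
        if [ "$rc" -ne 1 ]; then
            return "$rc"
        fi
        # Status 1: give up once the allowed number of attempts is used.
        if [ "$try" -ge "$max_tries" ]; then
            return 1
        fi
        try=$((try + 1))
    done
}
```

A caller could wrap the real command, for example
retry_on_status_1 3 snmpset -v1 -c "$community" "$pduip" "$oid.$outlet" i "$command",
and would probably want a short pause between attempts.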
> I actually submitted a patch to the linux-ha-dev list as described on the
> clusterlabs site; I guess it never got there.
> I attach it now. I assume the original author didn't realize
>
> local result=`command`
>
> always returns 0, no matter what the command's outcome was. A timeout does
> generate exit code 1.
>
>
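The point about "local result=`command`" can be reproduced with a short,
self-contained script. It is a sketch, not the plugin code: fake_snmpset is a
hypothetical stand-in that always fails with status 2, so no PDU or net-snmp
installation is needed.

```shell
# Stand-in for a failing snmpset call (hypothetical; always exits with 2).
fake_snmpset() {
    echo "Error in packet."
    return 2
}

# Declaration and assignment combined: $? is the status of the "local"
# builtin itself, so the failure of the command is masked.
masked() {
    local out="$(fake_snmpset 2>&1)"
    local rc=$?
    echo "$rc"
}

# Declaration first, assignment second: the status of a plain assignment
# is the status of the command substitution, so the failure is visible.
# "|| rc=$?" records it and also keeps the script safe under "set -e".
preserved() {
    local out
    local rc=0
    out="$(fake_snmpset 2>&1)" || rc=$?
    echo "$rc"
}

masked      # prints 0
preserved   # prints 2
```

This is the same split that the patch applies: declare the variable with
"local" on its own line, then assign and test the exit status.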
Subject: [PATCH] Check exit codes of snmp utils
Date: Fri, 04 Jun 2010 07:09:12 -0400
From: Vadym Chepkov <[email protected]>
To: [email protected]
>
> # HG changeset patch
> # User Vadym Chepkov <[email protected]>
> # Date 1275609966 14400
> # Node ID 955b957b9e64c83cff9a0e793922143f573cc712
> # Parent 5385c0d6c83668cd970161b2862282570b3cf92a
> Check exit codes of snmp utils
>
> diff -r 5385c0d6c836 -r 955b957b9e64 lib/plugins/stonith/external/rackpdu
> --- a/lib/plugins/stonith/external/rackpdu Tue May 25 15:35:38 2010 +0200
> +++ b/lib/plugins/stonith/external/rackpdu Thu Jun 03 20:06:06 2010 -0400
> @@ -68,7 +68,12 @@
> # Get outlet number from device
>
> local outlet_num=1
> - local snmp_result=`snmpwalk -v1 -c $community $pduip $names_oid 2>&1`
> + local snmp_result
> + snmp_result=`snmpwalk -v1 -c $community $pduip $names_oid 2>&1`
> + if [ $? -ne 0 ]; then
> + ha_log.sh err "Outlet number not found for node $nodename. Result: $snmp_result"
> + return 0
> + fi
>
> local names=`echo "$snmp_result" | cut -f2 -d'"' | tr ' ' '_' | tr '\012' ' '`
>
> @@ -95,11 +100,11 @@
> local outlet=$?
>
> if [ $outlet -gt 0 ]; then
> - local set_result=`snmpset -v1 -c $community $pduip $oid.$outlet i $command 2>&1`
> - local check_result=`echo "$set_result" | grep "Timeout"`
> -
> - if [ ! -z "$check_result" ]; then
> + local set_result
> + set_result=`snmpset -v1 -c $community $pduip $oid.$outlet i $command 2>&1`
> + if [ $? -ne 0 ]; then
> ha_log.sh err "Write SNMP value $oid.$outlet=$command. Result: $set_result"
> + return 1
> fi
>
> return 0
> @@ -116,9 +121,7 @@
> gethosts)
> if [ "$hostlist" = "AUTO" ]; then
> snmp_result=`snmpwalk -v1 -c $community $pduip $names_oid 2>&1`
> - snmp_check=`echo "$snmp_result" | grep "Timeout"`
> -
> - if [ ! -z "$snmp_check" ]; then
> + if [ $? -ne 0 ]; then
> ha_log.sh err "Cannot read list of nodes from device. Result: $snmp_result"
> exit 1
> else
>
>
>
> Vadym
>
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems