On Fri, Jun 04, 2010 at 01:04:35PM +0200, Dejan Muhamedagic wrote:
> Hi,
>
> On Thu, Jun 03, 2010 at 06:52:09PM -0400, Vadym Chepkov wrote:
> > Hi
> >
> > There is a bug in stonith/plugins/external/rackpdu in cluster-glue-1.0.5
> >
> > It doesn't check whether snmpset succeeded:
> >
> > SendCommand() {
> >
> >     local host=$1
> >     local command=$2
> >
> >     GetOutletNumber $host
> >     local outlet=$?
> >
> >     if [ $outlet -gt 0 ]; then
> >         local set_result=`snmpset -v1 -c $community $pduip $oid.$outlet i $command 2>&1`
> >         local check_result=`echo "$set_result" | grep "Timeout"`
> >
> >         if [ ! -z "$check_result" ]; then
> >             ha_log.sh err "Write SNMP value $oid.$outlet=$command. Result: $set_result"
> >         fi
> >
> >         return 0
> >     else
> >         return 1
> >     fi
> > }
> >
> > Here is what happens:
> >
> > + '[' 1 -gt 0 ']'
> > ++ snmpset -v1 -c private 10.10.10.10 .1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.1 i 2
> > + local 'set_result=Error in packet.
> > Reason: (genError) A general failure occured'
> > ++ echo 'Error in packet.
> > Reason: (genError) A general failure occured'
> > ++ grep Timeout
> > + local check_result=
> > + '[' '!' -z '' ']'
> > + return 0
> > + exit 0
> >
> > so stonith agent says it was successful when it was not :(
> >
> > instead of grepping for "Timeout" (why?)
>
> Don't know. Yes, that's strange. I left that check in anyway. Can
> you simulate a timeout and see what snmpset returns (exit
> code)?
>
> > it should check the exit status: 0 means it was successful,
> > 2 means it failed and is not recoverable
>
> Yes, fixed now. Also snmpwalk for gethosts.

One more thing: Somebody recently reported that a multiple-outlet
operation would erroneously report success when a link between
two PDUs didn't work. It seems very likely that they actually ran
into this bug.

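
For reference, a minimal sketch of the exit-status check discussed above.
This is not the committed fix, only an illustration under a few
assumptions: log_err is a hypothetical stand-in for "ha_log.sh err", the
SNMPSET variable is a hypothetical hook so the command can be swapped out
for testing, and the default community, address and OID are simply the
values from the trace. Note that in bash "local var=`cmd`" returns the
status of "local", not of cmd, so the assignment is kept separate here.

```shell
#!/bin/sh
# Sketch only. SNMPSET and log_err are hypothetical, not part of rackpdu.
SNMPSET=${SNMPSET:-snmpset}

# Defaults copied from the trace in this thread; the real plugin gets
# these from its configuration.
community=${community:-private}
pduip=${pduip:-10.10.10.10}
oid=${oid:-.1.3.6.1.4.1.318.1.1.12.3.3.1.1.4}

log_err() {
    # Stand-in for "ha_log.sh err"
    echo "ERROR: $*" >&2
}

# send_command OUTLET COMMAND
# Returns 0 only if the SNMP set command itself exited with status 0.
send_command() {
    outlet=$1
    command=$2

    # Reject a missing or non-positive outlet number.
    [ "$outlet" -gt 0 ] 2>/dev/null || return 1

    # Plain assignment (no "local") so $? is the exit status of the command.
    set_result=$($SNMPSET -v1 -c "$community" "$pduip" "$oid.$outlet" i "$command" 2>&1)
    rc=$?

    if [ "$rc" -ne 0 ]; then
        log_err "Write SNMP value $oid.$outlet=$command failed (rc=$rc). Result: $set_result"
        return 1
    fi
    return 0
}
```

With this shape, the "Error in packet" case from the trace would be
reported as a failure as long as snmpset exits non-zero for it, without
having to match on any particular error string.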
Thanks,
Dejan