On Fri, Jun 04, 2010 at 02:05:25PM +0100, Alexander Fisher wrote:
> On 4 June 2010 12:09, Dejan Muhamedagic <[email protected]> wrote:
> > On Fri, Jun 04, 2010 at 01:04:35PM +0200, Dejan Muhamedagic wrote:
> >> Hi,
> >>
> >> On Thu, Jun 03, 2010 at 06:52:09PM -0400, Vadym Chepkov wrote:
> >> > Hi
> >> >
> >> > There is a bug in stonith/plugins/external/rackpdu in cluster-glue-1.0.5
> >> >
> >> > It doesn't check if snmpset was successful or not :
> >> >
> >> > SendCommand() {
> >> >
> >> >     local host=$1
> >> >     local command=$2
> >> >
> >> >     GetOutletNumber $host
> >> >     local outlet=$?
> >> >
> >> >     if [ $outlet -gt 0 ]; then
> >> >         local set_result=`snmpset -v1 -c $community $pduip $oid.$outlet i $command 2>&1`
> >> >         local check_result=`echo "$set_result" | grep "Timeout"`
> >> >
> >> >         if [ ! -z "$check_result" ]; then
> >> >             ha_log.sh err "Write SNMP value $oid.$outlet=$command. Result: $set_result"
> >> >         fi
> >> >
> >> >         return 0
> >> >     else
> >> >         return 1
> >> >     fi
> >> > }
> >> >
> >> > Here is what happens:
> >> >
> >> > + '[' 1 -gt 0 ']'
> >> > ++ snmpset -v1 -c private 10.10.10.10 .1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.1 i 2
> >> > + local 'set_result=Error in packet.
> >> > Reason: (genError) A general failure occured'
> >> > ++ echo 'Error in packet.
> >> > Reason: (genError) A general failure occured'
> >> > ++ grep Timeout
> >> > + local check_result=
> >> > + '[' '!' -z '' ']'
> >> > + return 0
> >> > + exit 0
> >> >
> >> > so stonith agent says it was successful when it was not :(
> >> >
> >> > instead of grepping for "Timeout" (why?)
> >>
> >> Don't know. Yes, that's strange. I left that check in anyway. Can
> >> you simulate a timeout and see what snmpset returns (exit
> >> code)?
> >>
> >> > it should check the exit status: 0 means it was successful,
> >> > 2 means it failed and is not recoverable
> >>
> >> Yes, fixed now. Also snmpwalk for gethosts.
> >
> > One more thing: Somebody recently reported that a multiple outlet
> > operation would erroneously report success in case a link between
> > two PDUs didn't work. It seems very likely that they actually ran
> > into this bug.
> 
> I'm not convinced they did.  When you link outlets across PDUs, they
> are synchronised using an IP multicast address that has to be
> configured.

I'm not inclined either way, but since the exit code from snmpset
has never been verified...  Anyway, it's worth checking. If it
doesn't work, then APC should mention that it's a best effort
feature.
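
For illustration, here is a minimal sketch of how SendCommand() could
verify the exit code of snmpset instead of grepping for "Timeout". It
is not the committed fix, only an assumption of how such a check might
look. log_err is a hypothetical stand-in for ha_log.sh, and
GetOutletNumber, $community, $pduip and $oid are expected to come from
the plugin, as in the quoted code. One detail: with "local var=`cmd`"
the value of $? is that of "local", not of cmd, so the declaration is
kept separate from the assignment.

```shell
# Hypothetical stand-in for "ha_log.sh err", so the sketch runs anywhere.
log_err() { echo "ERROR: $*" >&2; }

SendCommand() {
    local host=$1
    local command=$2
    # Declare first, assign later: "local x=`cmd`" would hide cmd's status.
    local outlet set_result rc

    # As in the quoted plugin, the outlet number is passed back as the
    # return status of GetOutletNumber (0 means "not found").
    GetOutletNumber "$host"
    outlet=$?

    if [ "$outlet" -le 0 ]; then
        return 1
    fi

    set_result=$(snmpset -v1 -c "$community" "$pduip" "$oid.$outlet" i "$command" 2>&1)
    rc=$?

    # Any non-zero status from snmpset is treated as a failed operation.
    if [ "$rc" -ne 0 ]; then
        log_err "Write SNMP value $oid.$outlet=$command failed (rc=$rc). Result: $set_result"
        return 1
    fi

    return 0
}
```

With this shape, the genError case from the trace above would end in
"return 1" rather than "return 0", whatever text snmpset prints.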

Cheers,

Dejan

> 
> Regards,
> Alex
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems