On Fri, Jun 04, 2010 at 02:05:25PM +0100, Alexander Fisher wrote:
> On 4 June 2010 12:09, Dejan Muhamedagic <[email protected]> wrote:
> > On Fri, Jun 04, 2010 at 01:04:35PM +0200, Dejan Muhamedagic wrote:
> >> Hi,
> >>
> >> On Thu, Jun 03, 2010 at 06:52:09PM -0400, Vadym Chepkov wrote:
> >> > Hi
> >> >
> >> > There is a bug in stonith/plugins/external/rackpdu in cluster-glue-1.0.5
> >> >
> >> > It doesn't check if snmpset was successful or not :
> >> >
> >> > SendCommand() {
> >> >
> >> >     local host=$1
> >> >     local command=$2
> >> >
> >> >     GetOutletNumber $host
> >> >     local outlet=$?
> >> >
> >> >     if [ $outlet -gt 0 ]; then
> >> >         local set_result=`snmpset -v1 -c $community $pduip $oid.$outlet i $command 2>&1`
> >> >         local check_result=`echo "$set_result" | grep "Timeout"`
> >> >
> >> >         if [ ! -z "$check_result" ]; then
> >> >             ha_log.sh err "Write SNMP value $oid.$outlet=$command. Result: $set_result"
> >> >         fi
> >> >
> >> >         return 0
> >> >     else
> >> >         return 1
> >> >     fi
> >> > }
> >> >
> >> > Here is what happens:
> >> >
> >> > + '[' 1 -gt 0 ']'
> >> > ++ snmpset -v1 -c private 10.10.10.10 .1.3.6.1.4.1.318.1.1.12.3.3.1.1.4.1 i 2
> >> > + local 'set_result=Error in packet.
> >> > Reason: (genError) A general failure occured'
> >> > ++ echo 'Error in packet.
> >> > Reason: (genError) A general failure occured'
> >> > ++ grep Timeout
> >> > + local check_result=
> >> > + '[' '!' -z '' ']'
> >> > + return 0
> >> > + exit 0
> >> >
> >> > so stonith agent says it was successful when it was not :(
> >> >
> >> > instead of grepping for "Timeout" (why?)
> >>
> >> Don't know. Yes, that's strange. I left that check in anyway. Can
> >> you simulate a timeout and see what snmpset returns (exit
> >> code)?
> >>
> >> > it should check if exit status was 0, then it was successful
> >> > 2 - failed and not recoverable
> >>
> >> Yes, fixed now. Also snmpwalk for gethosts.
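For illustration, a minimal sketch of the exit-status check discussed
above. It is not the change that was actually committed to rackpdu. The
names set_outlet, SNMPSET_CMD and log_err are hypothetical and exist
only for this example; log_err stands in for "ha_log.sh err". The sketch
assumes, as reported in this thread, that snmpset exits non-zero when
the write fails.

```shell
#!/bin/sh
# Hypothetical sketch: set_outlet, SNMPSET_CMD and log_err are illustrative
# names, not part of the rackpdu plugin.

# Allow the command to be overridden so the logic can be tried without a PDU.
SNMPSET_CMD=${SNMPSET_CMD:-snmpset}

log_err() {
    # Stand-in for "ha_log.sh err"; writes to stderr so the sketch runs anywhere.
    echo "ERROR: $*" >&2
}

# set_outlet PDUIP COMMUNITY OID OUTLET COMMAND
# Returns 0 only if snmpset itself exited with status 0, 1 otherwise.
set_outlet() {
    pduip=$1
    community=$2
    oid=$3
    outlet=$4
    command=$5

    # Plain assignment (no "local" on the same line), so the exit status
    # seen by "if" is the one from snmpset and not from "local".
    if set_result=$("$SNMPSET_CMD" -v1 -c "$community" "$pduip" \
                        "$oid.$outlet" i "$command" 2>&1); then
        return 0
    else
        rc=$?
        log_err "Write SNMP value $oid.$outlet=$command failed (rc=$rc). Result: $set_result"
        return 1
    fi
}
```

With the values from the trace above, a call would look like
"set_outlet 10.10.10.10 private .1.3.6.1.4.1.318.1.1.12.3.3.1.1.4 1 2".
Any failure, a timeout or a genError alike, then shows up as a non-zero
return instead of being filtered through a grep for "Timeout".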
> >
> > One more thing: Somebody recently reported that a multiple outlet
> > operation would erroneously report success in case a link between
> > two PDUs didn't work. It seems very likely that they actually ran
> > into this bug.
>
> I'm not convinced they did. When you link outlets across PDUs, they
> are synchronised using an IP multicast address that has to be
> configured.

I'm not inclined either way, but since the exit code from snmpset
has never been verified... Anyway, it's worth checking. If it
doesn't work, then APC should mention that it's a best-effort
feature.

Cheers,

Dejan

>
> Regards,
> Alex
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
