This is almost certainly a bug in the BMC. The change in your patch should have no effect: this is the start of a send, and the BMC interface should be idle at that point, so calling smi_timeout() will only result in another extraneous read from the IPMI interface (and, of course, a slightly longer delay).
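To make that point concrete, here is a toy user-space model, not the driver code. Every name in it (toy_io, toy_read_status, toy_timeout_poll, TOY_STATUS_BUSY) is made up for illustration, and the assumption is simply that a timeout poll on an idle interface amounts to looking at the status register and finding nothing to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the BMC interface; NOT the kernel's struct si_sm_io. */
struct toy_io {
    unsigned int status_reads;  /* how many times the status register was read */
    unsigned char status;       /* value the toy status register returns */
};

#define TOY_STATUS_BUSY 0x01    /* made-up "interface has work pending" bit */

/* Read the toy status register and count the access. */
static unsigned char toy_read_status(struct toy_io *io)
{
    assert(io != NULL);
    io->status_reads++;
    return io->status;
}

/*
 * Toy version of a timeout poll: check the status register and report
 * whether there is anything to do. On an idle interface the only side
 * effect is the register read itself.
 */
static bool toy_timeout_poll(struct toy_io *io)
{
    return (toy_read_status(io) & TOY_STATUS_BUSY) != 0;
}
```

With a zero-initialized struct toy_io (an idle interface), one call to toy_timeout_poll() reports no work and leaves status_reads at 1, so in this model the call is observable only as one extra register read.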
I would guess that adding an extra read is working around the problem. Before polling was reduced, it read a whole lot more from the interface and probably covered the BMC bug. You can test this by replacing that "smi_timeout()" added in your patch with "smi_info->io->inputb(smi_info->io, 1)", which will do the read from the status register.

-corey

On 01/10/2011 06:49 PM, Brian De Wolf wrote:
> Hello, in last October I upgraded to 2.6.35 on a Sun Fire X4100 and found that
> starting the watchdog no longer worked. It produces this output when
> started:
>
> Oct 21 15:50:14 stephen watchdog[4725]: starting daemon (5.6):
> Oct 21 15:50:14 stephen watchdog[4725]: int=30s realtime=yes sync=no soft=no
> mla=0 mem=0
> Oct 21 15:50:14 stephen watchdog[4725]: ping: no machine to check
> Oct 21 15:50:14 stephen watchdog[4725]: file: no file to check
> Oct 21 15:50:14 stephen watchdog[4725]: pidfile: no server process to check
> Oct 21 15:50:14 stephen watchdog[4725]: interface: no interface to check
> Oct 21 15:50:14 stephen watchdog[4725]: test=none(0) repair=none
> alive=/dev/watchdog heartbeat=none temp=none to=root no_act=no
> Oct 21 15:50:14 stephen kernel: IPMI message handler: BMC returned incorrect
> response, expected netfn 7 cmd 22, got netfn 7 cmd 24
> Oct 21 15:50:14 stephen kernel: IPMI Watchdog: response: Error ff on cmd 22
> Oct 21 15:50:14 stephen watchdog[4725]: write watchdog device gave error 22 =
> 'Invalid argument'!
> Oct 21 15:51:15 stephen kernel: IPMI message handler: BMC returned incorrect
> response, expected netfn 7 cmd 35, got netfn 7 cmd 22
> Oct 21 15:51:15 stephen kernel: IPMI message handler: BMC returned incorrect
> response, expected netfn 7 cmd 22, got netfn 7 cmd 35
>
>
> After some bisecting, I found that the patch that causes this is a
> patch to reduce ipmi polling:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3326f4f2276791561af1fd5f2020be0186459813
>
> Unfortunately, the system is unstable if I reverse this patch. It
> crashes with "kernel BUG at kernel/timer.c:851!" (I can provide this
> output as requested)
>
>
> I originally sent this directly to Matthew Garrett but he hasn't been
> responsive for the last month or two, and I would like to eventually be
> able to upgrade to a new kernel without losing functionality. Matthew
> provided a workaround patch, but it still produced error output
> infrequently. He said it wasn't clean enough for upstream, but
> hopefully it will give some indication to what he found the problem to
> be:
>
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c
> b/drivers/char/ipmi/ipmi_si_intf.c
> index e829053..3f1e856 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -316,6 +316,7 @@ static int unload_when_empty = 1;
>  static int add_smi(struct smi_info *smi);
>  static int try_smi_init(struct smi_info *smi);
>  static void cleanup_one_si(struct smi_info *to_clean);
> +static void smi_timeout(unsigned long data);
>
>  static ATOMIC_NOTIFIER_HEAD(xaction_notifier_list);
>  static int register_xaction_notifier(struct notifier_block *nb)
> @@ -896,6 +897,7 @@ static void sender(void *send_info,
>  #endif
>
>  	mod_timer(&smi_info->si_timer, jiffies + SI_TIMEOUT_JIFFIES);
> +	smi_timeout((unsigned long)smi_info);
>
>  	if (smi_info->thread)
>  		wake_up_process(smi_info->thread);
>
> ------------------------------------------------------------------------------
> Gaining the trust of online customers is vital for the success of any company
> that requires sensitive data to be transmitted over the Web. Learn how to
> best implement a security strategy that keeps consumers' information secure
> and instills the confidence they need to proceed with transactions.
> http://p.sf.net/sfu/oracle-sfdevnl
> _______________________________________________
> Openipmi-developer mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/openipmi-developer
