#1017: "rx FIFO overrun" prevents traffic from flowing
------------------------------------+---------------------------------------
Reporter: [EMAIL PROTECTED] | Owner:
Type: defect | Status: new
Priority: critical | Milestone:
Component: madwifi: driver | Version: trunk
Resolution: | Keywords:
Patch_attached: 0 |
------------------------------------+---------------------------------------
Comment (by [EMAIL PROTECTED]):
A little more information on this bug.
I don't know why I am the only one who sees an NMI error before the
cascade of FIFO overruns. Looking at the kernel source, I see that the
message is printed by the function mem_parity_error(), in traps_64.c.
However, a comment in the code just before the call to
mem_parity_error() suggests that identifying the NMI as a parity error
is unreliable:
{{{
/* AK: following checks seem to be broken on modern chipsets. FIXME */
if (reason & 0x80)
        mem_parity_error(reason, regs);
if (reason & 0x40)
        io_check_error(reason, regs);
}}}
mem_parity_error() itself reads
{{{
static __kprobes void
mem_parity_error(unsigned char reason, struct pt_regs * regs)
{
        printk(KERN_EMERG "Uhhuh. NMI received for unknown reason %02x.\n",
                reason);
        printk(KERN_EMERG "You have some hardware problem, likely on the PCI bus.\n");

#if defined(CONFIG_EDAC)
        if(edac_handler_set()) {
                edac_atomic_assert_error();
                return;
        }
#endif

        if (panic_on_unrecovered_nmi)
                panic("NMI: Not continuing");

        printk(KERN_EMERG "Dazed and confused, but trying to continue\n");

        /* Clear and disable the memory parity error line. */
        reason = (reason & 0xf) | 4;
        outb(reason, 0x61);
}
}}}
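To make the bit tests concrete, here is a small user-space sketch in C that
mirrors the two mask checks and the write-back arithmetic from the code
quoted above. The struct nmi_decode type and the decode_nmi_reason() and
check_ticket_value() helpers are hypothetical names made up for this
illustration; they are not part of the kernel. For the reason byte a0
reported in this ticket, bit 0x80 is set and bit 0x40 is clear, so only the
mem_parity_error() branch is taken, and the value that would be written back
to port 0x61 is 0x04.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of the kernel. */
struct nmi_decode {
    bool parity_branch;   /* bit 0x80 set: mem_parity_error() would be called */
    bool iochk_branch;    /* bit 0x40 set: io_check_error() would be called */
    uint8_t port61_write; /* value mem_parity_error() would write to port 0x61 */
};

/* Mirror the two mask tests and the write-back arithmetic of the quoted code. */
struct nmi_decode decode_nmi_reason(uint8_t reason)
{
    struct nmi_decode d;

    d.parity_branch = (reason & 0x80) != 0;
    d.iochk_branch = (reason & 0x40) != 0;
    d.port61_write = (uint8_t)((reason & 0x0f) | 0x04);
    return d;
}

/* Check the value from this ticket's log: reason a0. */
void check_ticket_value(void)
{
    struct nmi_decode d = decode_nmi_reason(0xa0);

    assert(d.parity_branch);        /* 0xa0 & 0x80 is non-zero */
    assert(!d.iochk_branch);        /* 0xa0 & 0x40 is zero */
    assert(d.port61_write == 0x04); /* (0xa0 & 0x0f) | 4 */
}
```

Note that a0 also has bit 0x20 set, which the quoted code does not test.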
My current kernel prints timestamps (the number of seconds since the last
wakeup from suspend), which show that the NMI always occurs shortly before
the overruns begin:
{{{
[ 8878.847008] thinkpad_acpi: unknown LID-related hotkey event: 0x500c
[ 8892.428671] thinkpad_acpi: unknown LID-related hotkey event: 0x5009
[11084.863266] Uhhuh. NMI received for unknown reason a0.
[11084.863277] You have some hardware problem, likely on the PCI bus.
[11084.863281] Dazed and confused, but trying to continue
[11087.090955] wifi0: rx FIFO overrun; resetting
[11089.444806] wifi0: rx FIFO overrun; resetting
[11091.798682] wifi0: rx FIFO overrun; resetting
[11094.152548] wifi0: rx FIFO overrun; resetting
[11096.608767] wifi0: rx FIFO overrun; resetting
[11098.962629] wifi0: rx FIFO overrun; resetting
}}}
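The timing can also be checked mechanically. In the excerpt above, the first
overrun follows the NMI by about 2.23 seconds, and the overruns then repeat
every 2.35 to 2.46 seconds. The following is a minimal sketch in C that
extracts the bracketed timestamp from each log line and computes the gaps
between consecutive lines. The function names parse_timestamp() and
timestamp_gaps() are hypothetical, and the sketch assumes every line starts
with a "[seconds.microseconds]" prefix as in the log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse the "[ssss.uuuuuu]" prefix of a kernel log line.
 * Returns 1 and stores the seconds in *out on success, 0 otherwise.
 */
int parse_timestamp(const char *line, double *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, " [ %lf ]", out) == 1;
}

/*
 * Store the gap in seconds between each consecutive pair of lines.
 * gaps must have room for n - 1 entries.
 * Returns the number of gaps written, or -1 if a line has no timestamp.
 */
int timestamp_gaps(const char *const *lines, size_t n, double *gaps)
{
    double prev, cur;
    size_t i;

    assert(lines != NULL && gaps != NULL);
    if (n == 0)
        return 0;
    if (!parse_timestamp(lines[0], &prev))
        return -1;
    for (i = 1; i < n; i++) {
        if (!parse_timestamp(lines[i], &cur))
            return -1;
        gaps[i - 1] = cur - prev;
        prev = cur;
    }
    return (int)(n - 1);
}
```

Passing the NMI line followed by the overrun lines to timestamp_gaps()
yields the delay from the NMI to the first reset in gaps[0] and the interval
between resets in the remaining entries.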
This error condition also disturbs other devices on my system, not just
the wi-fi: it produces spurious "clicks" from the Wacom device (this is a
tablet PC). The behavior is consistent enough that I can now tell when the
error condition has started even when I am not actively using the wi-fi.
--
Ticket URL: <http://madwifi.org/ticket/1017#comment:102>
madwifi.org <http://madwifi.org/>
Multiband Atheros Driver for Wireless Fidelity
_______________________________________________
Madwifi-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/madwifi-tickets