#1017: "rx FIFO overrun" prevents traffic from flowing
------------------------------------+---------------------------------------
      Reporter:  [EMAIL PROTECTED]  |       Owner:       
          Type:  defect             |      Status:  new  
      Priority:  critical           |   Milestone:       
     Component:  madwifi: driver    |     Version:  trunk
    Resolution:                     |    Keywords:       
Patch_attached:  0                  |  
------------------------------------+---------------------------------------
Comment (by [EMAIL PROTECTED]):

 A little more information on this bug.

 I don't know why I am the only one who sees an NMI error before the
 cascade of FIFO overruns.  Peeking into the kernel source I see that this
 message is produced by the function mem_parity_error(), in traps_64.c.
 However, there is a comment in the code before the call to
 mem_parity_error() which seems to indicate that the identification of the
 NMI as a parity error is suspect:

 {{{
         /* AK: following checks seem to be broken on modern chipsets. FIXME */

         if (reason & 0x80)
                 mem_parity_error(reason, regs);
         if (reason & 0x40)
                 io_check_error(reason, regs);
 }}}


 mem_parity_error() itself reads

 {{{
 static __kprobes void
 mem_parity_error(unsigned char reason, struct pt_regs * regs)
 {
         printk(KERN_EMERG "Uhhuh. NMI received for unknown reason %02x.\n",
                 reason);
         printk(KERN_EMERG "You have some hardware problem, likely on the PCI bus.\n");

 #if defined(CONFIG_EDAC)
         if(edac_handler_set()) {
                 edac_atomic_assert_error();
                 return;
         }
 #endif

         if (panic_on_unrecovered_nmi)
                 panic("NMI: Not continuing");

         printk(KERN_EMERG "Dazed and confused, but trying to continue\n");

         /* Clear and disable the memory parity error line. */
         reason = (reason & 0xf) | 4;
         outb(reason, 0x61);
 }

 }}}
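
 As a rough aid for reading the reason byte, the sketch below applies the
 same two bit tests as the quoted checks and reproduces the value that
 mem_parity_error() writes back to port 0x61.  The macro and function names
 are hypothetical, and the meanings given for bits 0x20 and 0x10 assume the
 conventional PC/AT layout of port 0x61; they are not taken from the kernel
 source quoted here.

```c
#include <stdio.h>

/* Hypothetical names for the bits of the NMI reason byte (port 0x61).
 * 0x80 and 0x40 are the two bits tested in the quoted kernel code;
 * 0x20 and 0x10 follow the conventional PC/AT layout (an assumption). */
#define NMI_REASON_PARITY   0x80  /* leads to mem_parity_error() */
#define NMI_REASON_IOCHK    0x40  /* leads to io_check_error() */
#define NMI_REASON_TIMER2   0x20  /* timer 2 output, not an error flag */
#define NMI_REASON_REFRESH  0x10  /* refresh toggle, not an error flag */

/* Nonzero if the quoted code would call mem_parity_error(). */
int nmi_is_parity(unsigned char reason)
{
    return (reason & NMI_REASON_PARITY) != 0;
}

/* Nonzero if the quoted code would call io_check_error(). */
int nmi_is_iochk(unsigned char reason)
{
    return (reason & NMI_REASON_IOCHK) != 0;
}

/* Value that the quoted mem_parity_error() writes back to port 0x61:
 * keep the low nibble and set bit 2. */
unsigned char nmi_clear_value(unsigned char reason)
{
    return (unsigned char)((reason & 0xf) | 4);
}

/* Print a one-line summary of a reason byte. */
void print_nmi_reason(unsigned char reason)
{
    printf("reason %02x: parity=%d iochk=%d other=%02x write-back=%02x\n",
           reason,
           nmi_is_parity(reason),
           nmi_is_iochk(reason),
           reason & ~(NMI_REASON_PARITY | NMI_REASON_IOCHK) & 0xff,
           nmi_clear_value(reason));
}
```

 For the reason a0 that appears in the kernel log in this comment,
 nmi_is_parity() is true, nmi_is_iochk() is false, the only other bit set is
 0x20, and the write-back value is 04.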


 My current kernel has timestamps enabled (showing the number of seconds
 since the last wakeup from suspend), so one can see that the NMI is always
 close in time to the overruns:

 {{{
 [ 8878.847008] thinkpad_acpi: unknown LID-related hotkey event: 0x500c
 [ 8892.428671] thinkpad_acpi: unknown LID-related hotkey event: 0x5009
 [11084.863266] Uhhuh. NMI received for unknown reason a0.
 [11084.863277] You have some hardware problem, likely on the PCI bus.
 [11084.863281] Dazed and confused, but trying to continue
 [11087.090955] wifi0: rx FIFO overrun; resetting
 [11089.444806] wifi0: rx FIFO overrun; resetting
 [11091.798682] wifi0: rx FIFO overrun; resetting
 [11094.152548] wifi0: rx FIFO overrun; resetting
 [11096.608767] wifi0: rx FIFO overrun; resetting
 [11098.962629] wifi0: rx FIFO overrun; resetting
 }}}
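
 To put a number on "close in time", the gap can be computed from the
 timestamps.  The following is only a minimal sketch: the function names are
 hypothetical, and it assumes each log line starts with a bracketed
 timestamp in seconds, as in the excerpt above.

```c
#include <stdio.h>
#include <string.h>

/* Parse the "[seconds.micros]" prefix of a kernel log line.
 * Returns the timestamp in seconds, or -1.0 if the line has none. */
double log_timestamp(const char *line)
{
    double t;

    if (sscanf(line, " [%lf]", &t) == 1)
        return t;
    return -1.0;
}

/* Seconds between the first "NMI received" line and the first
 * "rx FIFO overrun" line after it.  Returns -1.0 if either is missing. */
double nmi_to_overrun_delay(const char *const lines[], int n)
{
    double nmi = -1.0;
    int i;

    for (i = 0; i < n; i++) {
        double t = log_timestamp(lines[i]);

        if (t < 0)
            continue;               /* line without a timestamp */
        if (nmi < 0 && strstr(lines[i], "NMI received"))
            nmi = t;                /* remember the first NMI */
        else if (nmi >= 0 && strstr(lines[i], "rx FIFO overrun"))
            return t - nmi;         /* first overrun after the NMI */
    }
    return -1.0;
}
```

 Applied to the excerpt above, the first overrun follows the NMI message by
 roughly 2.2 seconds.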

 Also, this error condition disturbs other devices on my system, not just
 the wi-fi:  it also leads to some spurious "clicks" from the Wacom device
 (this is a tablet PC).  This behavior is consistent enough that now I can
 tell when the error condition has started even if I am not actively using
 the wi-fi.

-- 
Ticket URL: <http://madwifi.org/ticket/1017#comment:102>
madwifi.org <http://madwifi.org/>
Multiband Atheros Driver for Wireless Fidelity