On Tue, Oct 6, 2009 at 1:26 AM, Ben Hutchings <[email protected]> wrote:
> On Mon, 2009-10-05 at 14:05 +0200, Jens-Michael Hoffmann wrote:
> > On Monday, 5. October 2009 01:29:39 Ben Hutchings wrote:
> > > On Mon, 2009-10-05 at 00:15 +0100, Antonio Marcos López Alonso wrote:
> > > > > Is this a new problem or did it occur with earlier kernel versions?
> > > >
> > > > No, it also happened in previous versions. But at least
> > > > irqpoll/irqfixup worked pretty well. Now the behavior seems to have
> > > > gotten worse even with these kernel options.
> > > >
> > > > > Can you try to reproduce this without the nvidia or virtualbox
> > > > > modules loaded?
> > > >
> > > > I can but just to make things faster:
> > > >
> > > > Jens-Michael,
> > > >
> > > > Have you got any nvidia/virtualbox modules loaded on your host? Just
> > > > to rule them out...
> > >
> > > The warning message shows all loaded modules, and those aren't
> > > included, so this question is answered.
> > >
> > > I had a look at the code and the values in the 'transmit timed out'
> > > message, and it seems that the NIC has reported a transmit completion
> > > but this hasn't been handled. Perhaps another device sharing its IRQ
> > > is misbehaving and causing the IRQ to be disabled. Please can you send
> > > more of the kernel log from before the TX watchdog warning? Also, if
> > > this happens again, please send the contents of /proc/interrupts.
> >
> > /proc/interrupts:
> >            CPU0       CPU1
> >   0:         42          0   IO-APIC-edge      timer
> >   1:          0         82   IO-APIC-edge      i8042
> >   8:          0          0   IO-APIC-edge      rtc0
> >   9:          0          0   IO-APIC-fasteoi   acpi
> >  14:          0        109   IO-APIC-edge      ide0
> >  17:          5        581   IO-APIC-fasteoi   firewire_ohci
> >  18:     350432   19112694   IO-APIC-fasteoi   eth1
> >  20:       6365     143067   IO-APIC-fasteoi   sata_via
> >  21:          0          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb2, uhci_hcd:usb3, uhci_hcd:usb4, uhci_hcd:usb5
> >  23:     150373    6348653   IO-APIC-fasteoi   eth2
> [...]
>
> OK, that seems to rule out my first hypothesis.
>
> Could you try adding 'noapic' to the kernel command line?
>
Sure. Here is /proc/interrupts with irqpoll + noapic (not enough time has
passed yet to reproduce the failure):
           CPU0
  0:         32   XT-PIC-XT   timer
  1:       1084   XT-PIC-XT   i8042
  2:          0   XT-PIC-XT   cascade
  3:          1   XT-PIC-XT
  4:      23569   XT-PIC-XT   ehci_hcd:usb1, uhci_hcd:usb8
  5:      63936   XT-PIC-XT   ehci_hcd:usb2, uhci_hcd:usb4, uhci_hcd:usb5, uhci_hcd:usb7, eth0
  6:          5   XT-PIC-XT   floppy
  7:        412   XT-PIC-XT   parport0
  8:          0   XT-PIC-XT   rtc0
  9:          0   XT-PIC-XT   acpi
 10:        409   XT-PIC-XT   nvidia
 11:      30797   XT-PIC-XT   uhci_hcd:usb3, uhci_hcd:usb6, sata_via, HDA Intel
 12:      15090   XT-PIC-XT   i8042
 14:       3043   XT-PIC-XT   ide0
 15:          0   XT-PIC-XT   ide1
NMI:          0   Non-maskable interrupts
LOC:     117311   Local timer interrupts
SPU:          0   Spurious interrupts
RES:          0   Rescheduling interrupts
CAL:          0   Function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
THR:          0   Threshold APIC interrupts
ERR:          1
MIS:          0
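
When comparing dumps like the one above, a small script can list the IRQ
lines that carry more than one device (here eth0 ends up on IRQ 5 together
with four USB controllers). This is only a sketch, not something used in
this thread: the function name shared_irqs is made up for illustration, and
the regex assumes the "number: counts chip devices" layout shown above,
which may differ on other kernel versions.

```python
import re

# Hypothetical helper: not part of any kernel tool, just an illustration.
def shared_irqs(text):
    """Return {irq: [device, ...]} for IRQ lines that list several devices."""
    shared = {}
    for line in text.splitlines():
        # Numbered lines look like "5:  63936  XT-PIC-XT  dev1, dev2"
        # (one count column per CPU, then the chip name, then the devices).
        m = re.match(r"\s*(\d+):\s+((?:\d+\s+)+)(\S+)\s*(.*)$", line)
        if not m:
            continue  # CPU header, NMI/LOC/ERR summary rows, blank lines
        irq, _counts, _chip, names = m.groups()
        devices = [d.strip() for d in names.split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(irq)] = devices
    return shared

# Three lines copied from the dump above.
sample = """\
  5:      63936   XT-PIC-XT   ehci_hcd:usb2, uhci_hcd:usb4, uhci_hcd:usb5, uhci_hcd:usb7, eth0
 10:        409   XT-PIC-XT   nvidia
 11:      30797   XT-PIC-XT   uhci_hcd:usb3, uhci_hcd:usb6, sata_via, HDA Intel
"""

for irq, devs in sorted(shared_irqs(sample).items()):
    print(irq, ", ".join(devs))
```

On a live system the same function could be fed the contents of
/proc/interrupts instead of the pasted sample.
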
Antonio