Re: Interrupt storm with MSI in combination with em1
On Thursday 05 May 2011 22:22:15 Jack Vogel wrote:
> On Thu, May 5, 2011 at 1:17 PM, Daan Vreeken d...@vehosting.nl wrote:
> > Hi Peter,
> >
> > On Thursday 05 May 2011 21:28:02 Peter Jeremy wrote:
> > > On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
> > > > Not yet. I'll reboot the machine later today when I have physical access to it to check the BIOS version. I'll keep you informed as soon as I get another storm going.
> > >
> > > Depending on the quality of your BIOS (competence of the vendor), you might find that kenv(8) reports the BIOS version without needing a reboot. (Look at smbios.bios.* in the output).
> >
> > ...
> > smbios.bios.version=0303
> > ...
> >
> > Version 0402 is the latest and greatest, so it's time to upgrade. According to Asus it "Improves system stability", so let's see if this 'cures' IRQ 16.
>
> Cool, thanks for the update! Good luck.

I've updated the BIOS and let the machine run for a couple of hours with MSI/MSIX enabled. After 3 hours of uptime I see the storm again. Here are the first couple of lines of output of top -S :

last pid: 33218;  load averages:  0.47,  0.35,  0.33    up 0+03:52:10  16:42:52
317 processes: 6 running, 289 sleeping, 22 waiting
CPU:  0.4% user,  0.0% nice,  0.5% system, 11.6% interrupt, 87.5% idle
Mem: 280M Active, 176M Inact, 1797M Wired, 8572K Cache, 32M Buf, 5545M Free
Swap: 500M Total, 500M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root        4 171 ki31     0K    64K CPU0    0 893:17 351.95% idle
   12 root       23 -80    -     0K   368K WAIT    2  18:37  50.39% intr

One core is spending half its time handling interrupts. /var/log/messages doesn't show any new message since the storm started.

vmstat -i now shows :

# vmstat -i
interrupt                          total       rate
irq3: uart1                       917384         63
irq16: ehci0                   809547235      55608
irq23: ehci1                     1751385        120
cpu0:timer                      16380717       1125
irq256: em0:rx 0                 1651907        113
irq257: em0:tx 0                 1495708        102
irq258: em0:link                       3          0
irq259: em1:rx 0                  397227         27
irq260: em1:tx 0                  257865         17
irq261: em1:link                       6          0
irq262: re0                        10549          0
irq263: ahci0                     290926         19
cpu1:timer                       1160008         79
cpu3:timer                        763939         52
cpu2:timer                       4120133        283
irq272: hdac0                     819282         56
Total                          839564274      57670

Apart from spending far too much time handling interrupts, the machine works fine, so I'll let it run in case anyone wants me to try something on it.

As a next step to try to isolate the problem, I could create a kernel with MSI/MSIX enabled, but with a modified 'em' driver that doesn't try to attach the MSI/MSIX interrupts, to see whether the problem is really related to the network cards or not. If anyone has a better idea, I'm all ears :)

Regards,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
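For anyone who wants to watch for the storm without eyeballing the table, here is a minimal sketch in Python that scans `vmstat -i` output and reports the sources whose rate is above a cut-off. The function name `find_storms`, the 10000 interrupts/sec threshold and the shortened sample are assumptions made for illustration; only the row layout (name, total, rate) is taken from the output in this message.

```python
import re

# A row is "<name> <total> <rate>". The name may contain spaces itself
# (for example "irq256: em0:rx 0"), so the two counters are anchored at
# the end of the line and the name takes whatever is left.
ROW = re.compile(r"^(?P<name>.+?)\s+(?P<total>\d+)\s+(?P<rate>\d+)\s*$")


def find_storms(vmstat_text, threshold=10000):
    """Return (name, rate) for every source above `threshold` per second.

    `threshold` is a hypothetical cut-off, not a value used by FreeBSD.
    """
    storms = []
    for line in vmstat_text.splitlines():
        match = ROW.match(line.strip())
        if match is None or match.group("name") == "Total":
            continue  # skips the header, blank lines and the summary row
        rate = int(match.group("rate"))
        if rate > threshold:
            storms.append((match.group("name"), rate))
    return storms


# A few rows copied from the output in this message.
SAMPLE = """\
interrupt                          total       rate
irq3: uart1                       917384         63
irq16: ehci0                   809547235      55608
cpu0:timer                      16380717       1125
irq256: em0:rx 0                 1651907        113
irq259: em1:rx 0                  397227         27
Total                          839564274      57670
"""

if __name__ == "__main__":
    for name, rate in find_storms(SAMPLE):
        print(f"{name}: {rate} interrupts/sec")
```

On the sample rows this singles out irq16 only; the em0/em1 MSI-X vectors stay far below the cut-off, which matches what the table shows.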
Re: Interrupt storm with MSI in combination with em1
----- Original Message -----
From: Daan Vreeken d...@vehosting.nl
> # vmstat -i
> interrupt                          total       rate
> irq3: uart1                       917384         63
> irq16: ehci0                   809547235      55608

Have you tried removing USB from the kernel? USB seems to be a common cause of this behaviour, and here at least removing it from the kernel fixes it in all cases, assuming you don't need it for something.
Re: Interrupt storm with MSI in combination with em1
Hi Steven,

On Friday 06 May 2011 17:20:15 Steven Hartland wrote:
> From: Daan Vreeken d...@vehosting.nl
> > # vmstat -i
> > interrupt                          total       rate
> > irq3: uart1                       917384         63
> > irq16: ehci0                   809547235      55608
>
> Have you tried removing USB from the kernel? USB seems to be a common cause of this behaviour, and here at least removing it from the kernel fixes it in all cases, assuming you don't need it for something.

No, I haven't tried that yet. I could disable USB to run some tests, but I'll eventually need it enabled again. I'll wait for a couple of hours to see if anyone can come up with a test to run on the machine while the interrupt is still storming. After that I'll reboot it with USB disabled.

Thanks,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
Re: Interrupt storm with MSI in combination with em1
I don't see why you are blaming em; you can see it's on MSIX vectors that are NOT storming. It's something with USB, as noted. Trying to disable em from using MSIX is in exactly the wrong direction IMHO.

Jack

On Fri, May 6, 2011 at 8:32 AM, Daan Vreeken d...@vehosting.nl wrote:
> Hi Steven,
>
> On Friday 06 May 2011 17:20:15 Steven Hartland wrote:
> > From: Daan Vreeken d...@vehosting.nl
> > > # vmstat -i
> > > interrupt                          total       rate
> > > irq3: uart1                       917384         63
> > > irq16: ehci0                   809547235      55608
> >
> > Have you tried removing USB from the kernel? USB seems to be a common cause of this behaviour, and here at least removing it from the kernel fixes it in all cases, assuming you don't need it for something.
>
> No, I haven't tried that yet. I could disable USB to run some tests, but I'll eventually need it enabled again. I'll wait for a couple of hours to see if anyone can come up with a test to run on the machine while the interrupt is still storming. After that I'll reboot it with USB disabled.
>
> Thanks,
> --
> Daan Vreeken
> VEHosting
> http://VEHosting.nl
> tel: +31-(0)40-7113050 / +31-(0)6-46210825
> KvK nr: 17174380
Re: Interrupt storm with MSI in combination with em1
On 5/6/2011 11:02 AM, Daan Vreeken wrote:
> One core is spending half its time handling interrupts. /var/log/messages doesn't show any new message since the storm started.
>
> vmstat -i now shows :
>
> # vmstat -i
> interrupt                          total       rate
> irq3: uart1                       917384         63
> irq16: ehci0                   809547235      55608
>
> Apart from spending far too much time handling interrupts, the machine works fine, so I'll let it run in case anyone wants me to try something on it.

Do you have any USB devices plugged in? I.e., what does usbconfig show? Also, what USB settings do you have in the BIOS? I would try disabling USB legacy mode and things like 80-64 translation.

---Mike

--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
Re: Interrupt storm with MSI in combination with em1
Hi Jack,

On Friday 06 May 2011 17:36:52 Jack Vogel wrote:
> On Fri, May 6, 2011 at 8:32 AM, Daan Vreeken d...@vehosting.nl wrote:
> > Hi Steven,
> >
> > On Friday 06 May 2011 17:20:15 Steven Hartland wrote:
> > > From: Daan Vreeken d...@vehosting.nl
> > > > # vmstat -i
> > > > interrupt                          total       rate
> > > > irq3: uart1                       917384         63
> > > > irq16: ehci0                   809547235      55608
> > >
> > > Have you tried removing USB from the kernel? USB seems to be a common cause of this behaviour, and here at least removing it from the kernel fixes it in all cases, assuming you don't need it for something.
> >
> > No, I haven't tried that yet. I could disable USB to run some tests, but I'll eventually need it enabled again. I'll wait for a couple of hours to see if anyone can come up with a test to run on the machine while the interrupt is still storming. After that I'll reboot it with USB disabled.
>
> I don't see why you are blaming em; you can see it's on MSIX vectors that are NOT storming. It's something with USB, as noted. Trying to disable em from using MSIX is in exactly the wrong direction IMHO.

I'm not blaming this on 'em' per se. The only thing I've noticed in the tests that I've done so far is that I haven't seen the storms with MSI/MSIX disabled. With respect to the devices on IRQ 16, disabling MSI/MSIX only seems to change the way interrupts are delivered to the em0/em1 cards. (This is what made me suspect the 'em' driver.)

At this moment any of the devices on IRQ 16 (including the PCI bridge itself) could be the source of the problem. I'm just trying to find a way to isolate the problem, either by finding a way to prove it is NOT device X, or by finding a way to prove it IS device Y.

I'll reboot the machine in a couple of minutes with USB disabled.

Regards,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
Re: Interrupt storm with MSI in combination with em1
On Friday, May 06, 2011 11:36:52 am Jack Vogel wrote:
> I don't see why you are blaming em, you can see its on MSIX vectors that are NOT storming, its something with USB as noted. Trying to disable em from using MSIX is in exactly the wrong direction IMHO.

In the past Intel host bridges have exhibited very brain damaged behavior where em interrupts could trigger false interrupts on USB controllers. These host bridges did this because they assumed that if the interrupt line was masked in the I/O APIC, then the OS must be using the 8259A PICs and not the I/O APICs, so it would forward the interrupt down to the 8259A's in the ICH, and the effect was to trigger an interrupt on the line shared with the USB controllers, creating phantom USB interrupts for each em(4) interrupt.

FreeBSD triggered this because when using INTx and not using Scott's INTR_FAST changes, the kernel would mask em(4)'s interrupt in the I/O APIC, which triggered the buggy behavior in the bridge.

If for some reason em(4) is asserting both the INTx interrupt and the MSI-X interrupt now, then since the INTx interrupt is not enabled in FreeBSD, the I/O APIC pin will be masked and any INTx assertions would trigger similar phantom interrupts if this bridge was similarly broken.

So given that, is there any chance that em(4) could still be asserting its INTx line (or the simulated INTx line rather) when MSI-X is being used?

--
John Baldwin
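One way to look into the question above from userland is to read the PCI Command/Status word of em0 and em1 and check the two INTx-related bits. The sketch below, in Python, only does the decoding. The bit positions come from the PCI configuration space layout (Command in the low 16 bits at offset 0x04, bit 10 = Interrupt Disable; Status in the high 16 bits, bit 3 = Interrupt Status). The function name and the two sample values are made up for illustration, and reading the word with something like `pciconf -r em1 0x04` is an assumption about how one would obtain it, not something tested on this machine.

```python
# Bit positions from the PCI configuration header (offset 0x04).
INTX_DISABLE = 1 << 10   # Command register, bit 10: Interrupt Disable
INTX_STATUS = 1 << 3     # Status register, bit 3: Interrupt Status


def decode_cmd_status(dword):
    """Decode the 32-bit value at config offset 0x04.

    The low half is the Command register, the high half is Status.
    A device drives its INTx line only while Interrupt Status is set
    and Interrupt Disable is clear.
    """
    command = dword & 0xFFFF
    status = (dword >> 16) & 0xFFFF
    disabled = bool(command & INTX_DISABLE)
    pending = bool(status & INTX_STATUS)
    return {
        "command": command,
        "status": status,
        "intx_disabled": disabled,
        "intx_pending": pending,
        "intx_asserted": pending and not disabled,
    }


if __name__ == "__main__":
    # Hypothetical register values, NOT read from the machine in this thread.
    for value in (0x00100407, 0x00180007):
        print(hex(value), decode_cmd_status(value))
```

If a card running on MSI-X showed `intx_asserted` as true during a storm, that would support the phantom-interrupt theory; if `intx_disabled` is set on both cards, the cause is more likely elsewhere on IRQ 16.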
Re: Interrupt storm with MSI in combination with em1
Hi Jack, On Thursday 05 May 2011 02:25:39 Jack Vogel wrote: OK, but the reason you see the multiple cases of irq 16 is that's the bridge, once you are using MSIX, as vmstat shows, its using other vectors. Can you capture the messages file with the actual storm happening? I'll do that as soon as I witness another storm. Right now the system has been up over half a day (with MSI/MSIX enabled) and everything seems to be working as it should. I noticed some complaints about checksums in the dmesg, have you checked on BIOS upgrades or something like that on your motherboard? Not yet. I'll reboot the machine later today when I have physical access to it to check the BIOS version. I'll keep you informed as soon as I get another storm going. On Wed, May 4, 2011 at 4:27 PM, Daan Vreeken d...@vehosting.nl wrote: On Thursday 05 May 2011 00:15:43 you wrote: This all looks completely kosher, what IRQ is the storm on?? IRQ 16. Further down this email there is a list of devices that share the IRQ according to 'dmesg'. On Wed, May 4, 2011 at 3:04 PM, Daan Vreeken d...@vehosting.nl wrote: Hi, On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote: Will you please set it back to a default and then boot and capture the message for me? No problem. Here's the output with MSI/MSIX enabled : http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt I've also added the output of vmstat -i a couple of minutes after a reboot with MSI enabled : http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt Note that in the above vmstat -i dump the interrupt storm hasn't started yet. For some reason the storm doesn't always start directly at boot. I haven't been able (yet) to pinpoint what's triggering it to start. On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote: Hi Jack, Wednesday 04 May 2011 19:46:05 Jack Vogel wrote: Who makes your motherboard? 
The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm. The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with : hw.pci.enable_msix=0 hw.pci.enable_msi=0 .. in loader.conf. With those lines in loader.conf, MSI and MSIX is disabled, both cards work like they should and there is no interrupt storm. With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16. Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm. (But please correct me if I'm wrong :) What can I do to help track this problem down? According to dmesg the following devices share IRQ 16 : pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0 em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1 vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0 ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0 em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4 pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0 During a storm vmstat -i shows a rate of about 220.000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output. 
As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts perhaps? Any tips on how to debug and/or fix this? The full output of dmesg can be found here : http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt And the
Re: Interrupt storm with MSI in combination with em1
Hi Peter,

On Thursday 05 May 2011 21:28:02 Peter Jeremy wrote:
> On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
> > Not yet. I'll reboot the machine later today when I have physical access to it to check the BIOS version. I'll keep you informed as soon as I get another storm going.
>
> Depending on the quality of your BIOS (competence of the vendor), you might find that kenv(8) reports the BIOS version without needing a reboot. (Look at smbios.bios.* in the output).

Great! I didn't know that :)

# kenv
...
smbios.bios.reldate=07/15/2010
...
smbios.bios.version=0303
...
smbios.planar.maker=ASUSTeK Computer INC.
smbios.planar.product=P7H55-M LX

Version 0402 is the latest and greatest, so it's time to upgrade. According to Asus it "Improves system stability", so let's see if this 'cures' IRQ 16.

Thanks,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
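For checking several machines at once, the kenv lookup can be scripted. This is a minimal Python sketch that picks the smbios.bios.* entries out of captured kenv output; the function name `bios_info` is an assumption, the sample is the excerpt from this message, and stripping double quotes is a precaution in case a given kenv prints quoted values.

```python
def bios_info(kenv_text):
    """Return the smbios.bios.* variables found in kenv output as a dict."""
    info = {}
    for line in kenv_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.startswith("smbios.bios."):
            # Drop surrounding double quotes if the value carries them.
            info[key] = value.strip('"')
    return info


# Excerpt copied from the kenv output in this message.
SAMPLE = """\
smbios.bios.reldate=07/15/2010
smbios.bios.version=0303
smbios.planar.maker=ASUSTeK Computer INC.
smbios.planar.product=P7H55-M LX
"""

if __name__ == "__main__":
    info = bios_info(SAMPLE)
    print(info)
    # "0402" is the version named in this message as the latest one.
    print("upgrade needed:", info.get("smbios.bios.version") != "0402")
```

In practice the text would come from running kenv on the target host; the sample string just stands in for that here.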
Re: Interrupt storm with MSI in combination with em1
Cool, thanks for the update! Good luck.

Jack

On Thu, May 5, 2011 at 1:17 PM, Daan Vreeken d...@vehosting.nl wrote:
> Hi Peter,
>
> On Thursday 05 May 2011 21:28:02 Peter Jeremy wrote:
> > On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
> > > Not yet. I'll reboot the machine later today when I have physical access to it to check the BIOS version. I'll keep you informed as soon as I get another storm going.
> >
> > Depending on the quality of your BIOS (competence of the vendor), you might find that kenv(8) reports the BIOS version without needing a reboot. (Look at smbios.bios.* in the output).
>
> Great! I didn't know that :)
>
> # kenv
> ...
> smbios.bios.reldate=07/15/2010
> ...
> smbios.bios.version=0303
> ...
> smbios.planar.maker=ASUSTeK Computer INC.
> smbios.planar.product=P7H55-M LX
>
> Version 0402 is the latest and greatest, so it's time to upgrade. According to Asus it "Improves system stability", so let's see if this 'cures' IRQ 16.
>
> Thanks,
> --
> Daan Vreeken
> VEHosting
> http://VEHosting.nl
> tel: +31-(0)40-7113050 / +31-(0)6-46210825
> KvK nr: 17174380
Re: Interrupt storm with MSI in combination with em1
On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
> Not yet. I'll reboot the machine later today when I have physical access to it to check the BIOS version. I'll keep you informed as soon as I get another storm going.

Depending on the quality of your BIOS (competence of the vendor), you might find that kenv(8) reports the BIOS version without needing a reboot. (Look at smbios.bios.* in the output).

--
Peter Jeremy
Interrupt storm with MSI in combination with em1
Hi All,

I've just updated a machine to -current (r221321) and since then I'm seeing an interrupt storm on irq 16. The storm goes away when I disable MSI/MSIX with the following lines in loader.conf :

hw.pci.enable_msix=0
hw.pci.enable_msi=0

According to dmesg the following devices share IRQ 16 :

pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0
em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4
pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0

During a storm vmstat -i shows a rate of about 220,000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output.

As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts? Any tips on how to debug and/or fix this?

The full output of dmesg can be found here :
http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt

And the full output of pciconf -lv is here :
http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt

Regards,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
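The list of devices sharing IRQ 16 was put together from dmesg by hand; a small script could do the same grouping. Below is a minimal Python sketch, assuming attach lines of the form "<dev>: ... irq N at device ..." as in the excerpt in this message. The function name `devices_by_irq` is made up for illustration, and the sample lines are shortened copies of the ones quoted here.

```python
import re
from collections import defaultdict

# Device name up to the first colon, then "irq <n> at device" later on.
DEV_IRQ = re.compile(r"^(?P<dev>\w+): .*\birq (?P<irq>\d+) at device")


def devices_by_irq(dmesg_text):
    """Map each legacy IRQ number to the devices that attached to it."""
    shared = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = DEV_IRQ.match(line)
        if match:
            shared[int(match.group("irq"))].append(match.group("dev"))
    return dict(shared)


# Shortened copies of the attach lines quoted in this message.
SAMPLE = """\
pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xbc00-0xbc07 irq 16 at device 2.0 on pci0
ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0
em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f irq 16 at device 0.0 on pci4
pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0
"""

if __name__ == "__main__":
    for irq, devices in sorted(devices_by_irq(SAMPLE).items()):
        print(f"irq {irq}: {', '.join(devices)}")
```

Note that this only reflects the legacy INTx routing reported at attach time; devices that later switch to MSI/MSI-X still show up here under their original IRQ.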
Re: Interrupt storm with MSI in combination with em1
Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm.

This is the second report in a matter of a week perhaps about a problematic motherboard, I would like to know who makes them.

Thanks,

Jack

On Wed, May 4, 2011 at 8:34 AM, Daan Vreeken d...@vehosting.nl wrote:
> Hi All,
>
> I've just updated a machine to -current (r221321) and since then I'm seeing an interrupt storm on irq 16. The storm goes away when I disable MSI/MSIX with the following lines in loader.conf :
>
> hw.pci.enable_msix=0
> hw.pci.enable_msi=0
>
> According to dmesg the following devices share IRQ 16 :
>
> pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
> em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1
> vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
> ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0
> em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4
> pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0
>
> During a storm vmstat -i shows a rate of about 220,000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output.
>
> As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts? Any tips on how to debug and/or fix this?
>
> The full output of dmesg can be found here :
> http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt
>
> And the full output of pciconf -lv is here :
> http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt
>
> Regards,
> --
> Daan Vreeken
> VEHosting
> http://VEHosting.nl
> tel: +31-(0)40-7113050 / +31-(0)6-46210825
> KvK nr: 17174380
Re: Interrupt storm with MSI in combination with em1
Hi Jack,

Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:
> Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm.

The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with :

hw.pci.enable_msix=0
hw.pci.enable_msi=0

.. in loader.conf.

With those lines in loader.conf, MSI and MSIX are disabled, both cards work like they should and there is no interrupt storm. With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16.

Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm. (But please correct me if I'm wrong :)

What can I do to help track this problem down?

> > According to dmesg the following devices share IRQ 16 :
> >
> > pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
> > em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1
> > vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
> > ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0
> > em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4
> > pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0
> >
> > During a storm vmstat -i shows a rate of about 220,000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output.
> >
> > As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts? Any tips on how to debug and/or fix this?
> >
> > The full output of dmesg can be found here :
> > http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt
> >
> > And the full output of pciconf -lv is here :
> > http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt

Regards,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
Re: Interrupt storm with MSI in combination with em1
Will you please set it back to a default and then boot and capture the message for me? Thank you, Jack On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote: Hi Jack, Wednesday 04 May 2011 19:46:05 Jack Vogel wrote: Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm. The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with : hw.pci.enable_msix=0 hw.pci.enable_msi=0 .. in loader.conf. With those lines in loader.conf, MSI and MSIX is disabled, both cards work like they should and there is no interrupt storm. With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16. Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm. (But please correct me if I'm wrong :) What can I do to help track this problem down? According to dmesg the following devices share IRQ 16 : pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0 em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1 vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0 ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0 em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4 pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0 During a storm vmstat -i shows a rate of about 220.000 interrupts/sec. 
MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output. As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts perhaps? Any tips on how to debug and/or fix this? The full output of dmesg can be found here : http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt And the full output of pciconf -lv is here : http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt Regards, -- Daan Vreeken VEHosting http://VEHosting.nl tel: +31-(0)40-7113050 / +31-(0)6-46210825 KvK nr: 17174380 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Interrupt storm with MSI in combination with em1
Hi, On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote: Will you please set it back to a default and then boot and capture the message for me? No problem. Here's the output with MSI/MSIX enabled : http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt I've also added the output of vmstat -i a couple of minutes after a reboot with MSI enabled : http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt Note that in the above vmstat -i dump the interrupt storm hasn't started yet. For some reason the storm doesn't always start directly at boot. I haven't been able (yet) to pinpoint what's triggering it to start. On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote: Hi Jack, Wednesday 04 May 2011 19:46:05 Jack Vogel wrote: Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm. The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with : hw.pci.enable_msix=0 hw.pci.enable_msi=0 .. in loader.conf. With those lines in loader.conf, MSI and MSIX is disabled, both cards work like they should and there is no interrupt storm. With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16. Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm. (But please correct me if I'm wrong :) What can I do to help track this problem down? 
According to dmesg the following devices share IRQ 16 : pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0 em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1 vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0 ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0 em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4 pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0 During a storm vmstat -i shows a rate of about 220.000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output. As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts perhaps? Any tips on how to debug and/or fix this? The full output of dmesg can be found here : http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt And the full output of pciconf -lv is here : http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt Regards, -- Daan Vreeken VEHosting http://VEHosting.nl tel: +31-(0)40-7113050 / +31-(0)6-46210825 KvK nr: 17174380 Regards, -- Daan Vreeken VEHosting http://VEHosting.nl tel: +31-(0)40-7113050 / +31-(0)6-46210825 KvK nr: 17174380 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Interrupt storm with MSI in combination with em1
This all looks completely kosher, what IRQ is the storm on?? Jack On Wed, May 4, 2011 at 3:04 PM, Daan Vreeken d...@vehosting.nl wrote: Hi, On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote: Will you please set it back to a default and then boot and capture the message for me? No problem. Here's the output with MSI/MSIX enabled : http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt I've also added the output of vmstat -i a couple of minutes after a reboot with MSI enabled : http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt Note that in the above vmstat -i dump the interrupt storm hasn't started yet. For some reason the storm doesn't always start directly at boot. I haven't been able (yet) to pinpoint what's triggering it to start. On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote: Hi Jack, Wednesday 04 May 2011 19:46:05 Jack Vogel wrote: Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm. The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with : hw.pci.enable_msix=0 hw.pci.enable_msi=0 .. in loader.conf. With those lines in loader.conf, MSI and MSIX is disabled, both cards work like they should and there is no interrupt storm. With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16. Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm. (But please correct me if I'm wrong :) What can I do to help track this problem down? 
Re: Interrupt storm with MSI in combination with em1
Right, it was you Wiktor :) Oh, so yours is sort of a special case.

Thanks,

Jack

On Wed, May 4, 2011 at 3:27 PM, Wiktor Niesiobedzki b...@vink.pl wrote:

2011/5/4 Jack Vogel jfvo...@gmail.com:

This is the second report in a matter of a week perhaps about a problematic motherboard, I would like to know who makes them.

Just for the record, the motherboard with which I had problems (I guess my problem is the one referred to here) is VIA EPIA SN1. It's nothing new, and probably rarely used with additional PCIe cards, as this is an embedded-like creature.

Cheers,

Wiktor Niesiobedzki
Re: Interrupt storm with MSI in combination with em1
2011/5/4 Jack Vogel jfvo...@gmail.com:

This is the second report in a matter of a week perhaps about a problematic motherboard, I would like to know who makes them.

Just for the record, the motherboard with which I had problems (I guess my problem is the one referred to here) is VIA EPIA SN1. It's nothing new, and probably rarely used with additional PCIe cards, as this is an embedded-like creature.

Cheers,

Wiktor Niesiobedzki
Re: Interrupt storm with MSI in combination with em1
On Thursday 05 May 2011 00:15:43 you wrote:

This all looks completely kosher, what IRQ is the storm on??

IRQ 16. Further down this email there is a list of devices that share the IRQ according to 'dmesg'.

On Wed, May 4, 2011 at 3:04 PM, Daan Vreeken d...@vehosting.nl wrote:

Hi,

On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:

Will you please set it back to a default and then boot and capture the message for me?

No problem. Here's the output with MSI/MSIX enabled:
http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt

I've also added the output of vmstat -i a couple of minutes after a reboot with MSI enabled:
http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt

Note that in the above vmstat -i dump the interrupt storm hasn't started yet. For some reason the storm doesn't always start directly at boot. I haven't been able (yet) to pinpoint what's triggering it to start.

On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote:

Hi Jack,

On Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:

Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm.

The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with:

hw.pci.enable_msix=0
hw.pci.enable_msi=0

.. in loader.conf. With those lines in loader.conf, MSI and MSIX are disabled, both cards work like they should and there is no interrupt storm. With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16. Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm. (But please correct me if I'm wrong :)

What can I do to help track this problem down?

According to dmesg, the following devices share IRQ 16:

pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0
em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4
pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0

During a storm, vmstat -i shows a rate of about 220,000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output. As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts?

Any tips on how to debug and/or fix this?

The full output of dmesg can be found here:
http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt

And the full output of pciconf -lv is here:
http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt

Thanks,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
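Spotting the storm from vmstat -i, as described in the message above, can also be scripted. The following is a rough Python sketch, not an established tool: it assumes the usual three-column layout (source name, total, rate), and the function names and the 10000/sec threshold are arbitrary choices made for illustration. The sample rows are copied from the vmstat -i output posted earlier in this thread. Keep in mind that the rate column is an average since boot, so a storm that started late shows up with a lower rate than its instantaneous value.

```python
def parse_vmstat_i(text):
    """Return {source: (total, rate)} from `vmstat -i` style output.

    The source name may itself contain spaces ("irq16: ehci0"), so the
    two numeric columns are split off from the right-hand side.
    """
    rows = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue  # blank or malformed line
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()):
            continue  # header line: "interrupt total rate"
        rows[name.strip()] = (int(total), int(rate))
    return rows


def storm_suspects(rows, threshold=10000):
    """List sources whose average rate is at or above `threshold` per second.

    The threshold is an assumption for this sketch; pick one that suits
    the normal load of the machine being checked.
    """
    return sorted(
        name
        for name, (_total, rate) in rows.items()
        if name != "Total" and rate >= threshold
    )


# Rows taken from the vmstat -i output quoted earlier in the thread.
SAMPLE_VMSTAT = """\
interrupt                          total       rate
irq16: ehci0                   809547235      55608
cpu0:timer                      16380717       1125
irq263: ahci0                     290926         19
Total                          839564274      57670
"""

if __name__ == "__main__":
    for name in storm_suspects(parse_vmstat_i(SAMPLE_VMSTAT)):
        print("possible interrupt storm on", name)
```

Running the check twice a few seconds apart and comparing the totals would give an instantaneous rate instead of the since-boot average.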
Re: Interrupt storm with MSI in combination with em1
OK, but the reason you see the multiple cases of irq 16 is that that's the bridge; once you are using MSIX, as vmstat shows, it's using other vectors.

Can you capture the messages file with the actual storm happening? I noticed some complaints about checksums in the dmesg, have you checked on BIOS upgrades or something like that on your motherboard?

Regards,

Jack

On Wed, May 4, 2011 at 4:27 PM, Daan Vreeken d...@vehosting.nl wrote:

On Thursday 05 May 2011 00:15:43 you wrote:

This all looks completely kosher, what IRQ is the storm on??

IRQ 16. Further down this email there is a list of devices that share the IRQ according to 'dmesg'.

On Wed, May 4, 2011 at 3:04 PM, Daan Vreeken d...@vehosting.nl wrote:

Hi,

On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:

Will you please set it back to a default and then boot and capture the message for me?

No problem. Here's the output with MSI/MSIX enabled:
http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt

I've also added the output of vmstat -i a couple of minutes after a reboot with MSI enabled:
http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt

Note that in the above vmstat -i dump the interrupt storm hasn't started yet. For some reason the storm doesn't always start directly at boot. I haven't been able (yet) to pinpoint what's triggering it to start.

On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote:

Hi Jack,

On Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:

Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm.

The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with:

hw.pci.enable_msix=0
hw.pci.enable_msi=0

.. in loader.conf. With those lines in loader.conf, MSI and MSIX are disabled, both cards work like they should and there is no interrupt storm.