Re: Interrupt storm with MSI in combination with em1

2011-05-06 Thread Daan Vreeken
On Thursday 05 May 2011 22:22:15 Jack Vogel wrote:
 On Thu, May 5, 2011 at 1:17 PM, Daan Vreeken d...@vehosting.nl wrote:
  Hi Peter,
 
  On Thursday 05 May 2011 21:28:02 Peter Jeremy wrote:
   On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
   Not yet. I'll reboot the machine later today when I have physical
   access to it to check the BIOS version. I'll keep you informed as
   soon as I get another storm going.
  
   Depending on the quality of your BIOS (competence of the vendor), you
   might find that kenv(8) reports the BIOS version without needing a
   reboot.
   (Look at smbios.bios.* in the output).
  ...
  smbios.bios.version=0303
  ...
  Version 0402 is the latest and greatest, so it's time to upgrade.
  According to Asus it "Improves system stability", so let's see if this
  'cures' IRQ 16.

 Cool, thanks for the update! Good luck.

I've updated the BIOS and let the machine run for a couple of hours with 
MSI/MSIX enabled. After 3 hours of uptime I see the storm again.

Here are the first couple of lines of output of top -S :

last pid: 33218;  load averages:  0.47,  0.35,  0.33    up 0+03:52:10  16:42:52
317 processes: 6 running, 289 sleeping, 22 waiting
CPU:  0.4% user,  0.0% nice,  0.5% system, 11.6% interrupt, 87.5% idle
Mem: 280M Active, 176M Inact, 1797M Wired, 8572K Cache, 32M Buf, 5545M Free
Swap: 500M Total, 500M Free
  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root         4 171 ki31     0K    64K CPU0    0 893:17 351.95% idle
   12 root        23 -80    -     0K   368K WAIT    2  18:37  50.39% intr

One core is spending half its time handling interrupts.
/var/log/messages doesn't show any new messages since the storm
started. vmstat -i now shows:

# vmstat -i
interrupt                          total       rate
irq3: uart1                       917384         63
-- irq16: ehci0                809547235      55608
irq23: ehci1                     1751385        120
cpu0:timer                      16380717       1125
irq256: em0:rx 0                 1651907        113
irq257: em0:tx 0                 1495708        102
irq258: em0:link                       3          0
irq259: em1:rx 0                  397227         27
irq260: em1:tx 0                  257865         17
irq261: em1:link                       6          0
irq262: re0                        10549          0
irq263: ahci0                     290926         19
cpu1:timer                       1160008         79
cpu3:timer                        763939         52
cpu2:timer                       4120133        283
irq272: hdac0                     819282         56
Total                          839564274      57670

Apart from spending far too much time handling interrupts, the machine works 
fine, so I'll let it run in case anyone wants me to try something on it.
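
For reading a table like this mechanically, here is a minimal Python sketch. It is only an illustration: the function names parse_vmstat_i and busiest_source are made up for this example, and the sample text is a subset of the rows posted above.

```python
SAMPLE = """\
interrupt                          total       rate
irq3: uart1                       917384         63
irq16: ehci0                   809547235      55608
irq23: ehci1                     1751385        120
cpu0:timer                      16380717       1125
irq256: em0:rx 0                 1651907        113
Total                          839564274      57670
"""


def parse_vmstat_i(text):
    """Return {source: (total, rate)} from `vmstat -i`-style text."""
    sources = {}
    for line in text.splitlines():
        # The source name may contain spaces, so split from the right.
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()):
            continue  # header line or anything else non-numeric
        name = name.strip()
        if name == "Total":
            continue  # summary row, not an interrupt source
        sources[name] = (int(total), int(rate))
    return sources


def busiest_source(sources):
    """Return (name, share of the summed rate) for the highest-rate source."""
    name = max(sources, key=lambda n: sources[n][1])
    total_rate = sum(rate for _, rate in sources.values())
    return name, sources[name][1] / total_rate


name, share = busiest_source(parse_vmstat_i(SAMPLE))
print(f"{name} accounts for {share:.0%} of the listed interrupt rate")
```

On the rows used here it picks out irq16: ehci0 as the source with by far the highest rate.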

As a next step to try to isolate the problem I could create a kernel with 
MSI/MSIX enabled, but with a modified 'em' driver so it doesn't try to attach 
the MSI/MSIX interrupts to see if the problem is really related to the 
network cards or not.
If anyone has a better idea, I'm all ears :)


Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Interrupt storm with MSI in combination with em1

2011-05-06 Thread Steven Hartland
- Original Message - 
From: Daan Vreeken d...@vehosting.nl


# vmstat -i
interrupt                          total       rate
irq3: uart1                       917384         63
-- irq16: ehci0                809547235      55608


Have you tried removing USB from the kernel?

USB seems to be a common cause of this behaviour, and here at least
removing it from the kernel fixes it in all cases, assuming you don't
need it for something.






Re: Interrupt storm with MSI in combination with em1

2011-05-06 Thread Daan Vreeken
Hi Steven,

On Friday 06 May 2011 17:20:15 Steven Hartland wrote:
 From: Daan Vreeken d...@vehosting.nl

  # vmstat -i
  interrupt                          total       rate
  irq3: uart1                       917384         63
  -- irq16: ehci0                809547235      55608

 Have you tried removing USB from the kernel?

 USB seems to be a common cause of this behaviour, and here at least
 removing it from the kernel fixes it in all cases, assuming you don't
 need it for something.

No, I haven't tried that yet. I could disable USB to run some tests, but I'll 
eventually need it enabled again.
I'll wait for a couple of hours to see if anyone can come up with a test to 
run on the machine while the interrupt is still storming. After that I'll 
reboot it with USB disabled.


Thanks,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380


Re: Interrupt storm with MSI in combination with em1

2011-05-06 Thread Jack Vogel
I don't see why you are blaming em; you can see it's on MSIX vectors
that are NOT storming. It's something with USB, as noted. Trying to
disable em from using MSIX is exactly the wrong direction, IMHO.

Jack





Re: Interrupt storm with MSI in combination with em1

2011-05-06 Thread Mike Tancsa
On 5/6/2011 11:02 AM, Daan Vreeken wrote:
 One core is spending half its time handling interrupts.
 /var/log/messages doesn't show any new messages since the storm
 started. vmstat -i now shows:
 
   # vmstat -i
   interrupt                          total       rate
   irq3: uart1                       917384         63
   -- irq16: ehci0                809547235      55608
 
 Apart from spending far too much time handling interrupts, the machine works 
 fine, so I'll let it run in case anyone wants me to try something on it.
 


Do you have any USB devices plugged in? I.e., what does
usbconfig

show?

Also, what USB settings do you have in the BIOS? I would try disabling
USB legacy mode and things like 80-64 translation.

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: Interrupt storm with MSI in combination with em1

2011-05-06 Thread Daan Vreeken
Hi Jack,

On Friday 06 May 2011 17:36:52 Jack Vogel wrote:
 On Fri, May 6, 2011 at 8:32 AM, Daan Vreeken d...@vehosting.nl wrote:
  Hi Steven,
 
  On Friday 06 May 2011 17:20:15 Steven Hartland wrote:
   From: Daan Vreeken d...@vehosting.nl
  
    # vmstat -i
    interrupt                          total       rate
    irq3: uart1                       917384         63
    -- irq16: ehci0                809547235      55608
  
   Have you tried removing USB from the kernel?
  
   USB seems to be a common cause of this behaviour, and here at least
   removing it from the kernel fixes it in all cases, assuming you don't
   need it for something.
 
  No, I haven't tried that yet. I could disable USB to run some tests, but
  I'll eventually need it enabled again.
  I'll wait for a couple of hours to see if anyone can come up with a test
  to run on the machine while the interrupt is still storming. After that
  I'll reboot it with USB disabled.

 I don't see why you are blaming em; you can see it's on MSIX vectors
 that are NOT storming. It's something with USB, as noted. Trying to
 disable em from using MSIX is exactly the wrong direction, IMHO.

I'm not blaming this on 'em' per se. The only thing I've noticed in the tests 
that I've done so far is that I haven't seen the storms with MSI/MSIX 
disabled. With respect to the devices on IRQ 16, disabling MSI/MSIX only 
seems to change the way interrupts are delivered to the em0/em1 cards.
(This is what made me suspect the 'em' driver.)

At this moment any of the devices on IRQ 16 (including the PCI bridge itself) could 
be the source of the problem. I'm just trying to find a way to isolate the 
problem, either by finding a way to prove it is NOT device X, or by finding a 
way to prove it IS device Y.

I'll reboot the machine in a couple of minutes with USB disabled.


Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380


Re: Interrupt storm with MSI in combination with em1

2011-05-06 Thread John Baldwin
On Friday, May 06, 2011 11:36:52 am Jack Vogel wrote:
 I don't see why you are blaming em; you can see it's on MSIX vectors
 that are NOT storming. It's something with USB, as noted. Trying to
 disable em from using MSIX is exactly the wrong direction, IMHO.

In the past Intel host bridges have exhibited very brain damaged behavior 
where em interrupts could trigger false interrupts on USB controllers.
These host bridges did this because they assumed that if the interrupt
line was masked in the I/O APIC, then the OS must be using the 8259A
PICs and not the I/O APICs, so it would forward the interrupt down to
the 8259A's in the ICH, and the effect was to trigger an interrupt on
the line shared with the USB controllers creating phantom USB interrupts
for each em(4) interrupt.

FreeBSD triggered this because when using INTx and not using Scott's
INTR_FAST changes, the kernel would mask em(4)'s interrupt in the I/O APIC
which triggered the buggy behavior in the bridge.

If for some reason em(4) is asserting both the INTx interrupt and the
MSI-X interrupt now, then since the INTx interrupt is not enabled in
FreeBSD, the I/O APIC pin will be masked and any INTx assertions would
trigger similar phantom interrupts if this bridge was similarly broken.

So given that, is there any chance that em(4) could still be asserting
its INTx line (or the simulated INTx line rather) when MSI-X is being
used?
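
One rough way to look at the "phantom interrupt per em(4) interrupt" idea from the counters alone is to compare how much the shared line grew against how much the em vectors grew between two vmstat -i snapshots. The Python sketch below is only an illustration: phantom_ratio is a made-up helper, the "before" snapshot is assumed to be all zeros (counters at boot), and the "after" numbers are the totals posted earlier in this thread. A ratio near 1 would be consistent with one phantom interrupt per em(4) interrupt, while a much larger one would suggest something else is keeping the line asserted. Either way it is a hint, not proof.

```python
# Totals copied from the vmstat -i output posted in this thread.
AFTER = {
    "irq16: ehci0": 809547235,
    "irq256: em0:rx 0": 1651907,
    "irq257: em0:tx 0": 1495708,
    "irq258: em0:link": 3,
    "irq259: em1:rx 0": 397227,
    "irq260: em1:tx 0": 257865,
    "irq261: em1:link": 6,
}
BEFORE = {}  # assumption: every counter was 0 at boot


def phantom_ratio(before, after, shared="irq16: ehci0", driver="em"):
    """Growth of the shared legacy line per interrupt on the driver's vectors."""

    def delta(name):
        return after.get(name, 0) - before.get(name, 0)

    # "irq256: em0:rx 0" -> "em0:rx 0"; keep only the driver's own vectors.
    driver_delta = sum(
        delta(name)
        for name in after
        if name.split(": ", 1)[-1].startswith(driver)
    )
    if driver_delta == 0:
        return None  # nothing to compare against
    return delta(shared) / driver_delta


ratio = phantom_ratio(BEFORE, AFTER)
print(f"irq16 interrupts per em interrupt: {ratio:.1f}")
```

With real data it would be better to take both snapshots while the storm is running, since the storm does not always start at boot.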

-- 
John Baldwin


Re: Interrupt storm with MSI in combination with em1

2011-05-05 Thread Daan Vreeken
Hi Jack,

On Thursday 05 May 2011 02:25:39 Jack Vogel wrote:
 OK, but the reason you see the multiple cases of irq 16 is that's the
 bridge; once you are using MSIX, as vmstat shows, it's using other vectors.

 Can you capture the messages file with the actual storm happening?

I'll do that as soon as I witness another storm. Right now the system has been 
up over half a day (with MSI/MSIX enabled) and everything seems to be working 
as it should.

 I noticed some complaints about checksums in the dmesg, have you
 checked on BIOS upgrades or something like that on your motherboard?

Not yet. I'll reboot the machine later today when I have physical access to it 
to check the BIOS version. I'll keep you informed as soon as I get another 
storm going.


 On Wed, May 4, 2011 at 4:27 PM, Daan Vreeken d...@vehosting.nl wrote:
  On Thursday 05 May 2011 00:15:43 you wrote:
   This all looks completely kosher,  what IRQ is the storm on??
 
  IRQ 16. Further down this email there is a list of devices that share the
  IRQ according to 'dmesg'.
 
    According to dmesg the following devices share IRQ 16 :
    pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
    em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f
       mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd
       irq 16 at device 0.0 on pci1
    vgapci0: VGA-compatible display port 0xbc00-0xbc07
       mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on
       pci0
    ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff
       irq 16 at device 26.0 on pci0
    em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f
       mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd
       irq 16 at device 0.0 on pci4
    pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0

Re: Interrupt storm with MSI in combination with em1

2011-05-05 Thread Daan Vreeken
Hi Peter,

On Thursday 05 May 2011 21:28:02 Peter Jeremy wrote:
 On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
 Not yet. I'll reboot the machine later today when I have physical access
  to it to check the BIOS version. I'll keep you informed as soon as I get
  another storm going.

 Depending on the quality of your BIOS (competence of the vendor), you
 might find that kenv(8) reports the BIOS version without needing a reboot.
 (Look at smbios.bios.* in the output).

Great! I didn't know that :)

# kenv
...
smbios.bios.reldate=07/15/2010
...
smbios.bios.version=0303   
...
smbios.planar.maker=ASUSTeK Computer INC.
smbios.planar.product=P7H55-M LX


Version 0402 is the latest and greatest, so it's time to upgrade. According 
to Asus it "Improves system stability", so let's see if this 'cures' IRQ 16.
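
A small Python sketch of the kenv check described above, in case someone wants to script it. The helper name smbios_bios is invented for this example, the sample text is the kenv output quoted above (with the elided lines left out), and the quote stripping is there because kenv may print values in double quotes.

```python
KENV_SAMPLE = """\
smbios.bios.reldate=07/15/2010
smbios.bios.version=0303   
smbios.planar.maker=ASUSTeK Computer INC.
smbios.planar.product=P7H55-M LX
"""

LATEST = "0402"  # latest BIOS version mentioned in this thread


def smbios_bios(kenv_text):
    """Collect smbios.bios.* variables from kenv-style KEY=VALUE text."""
    prefix = "smbios.bios."
    info = {}
    for line in kenv_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.startswith(prefix):
            # Drop optional surrounding quotes and any padding spaces.
            info[key[len(prefix):]] = value.strip().strip('"').strip()
    return info


bios = smbios_bios(KENV_SAMPLE)
needs_update = bios["version"] != LATEST
print(bios["version"], "needs update" if needs_update else "is current")
```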


Thanks,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380


Re: Interrupt storm with MSI in combination with em1

2011-05-05 Thread Jack Vogel
Cool, thanks for the update! Good luck.

Jack





Re: Interrupt storm with MSI in combination with em1

2011-05-05 Thread Peter Jeremy
On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
Not yet. I'll reboot the machine later today when I have physical access to it 
to check the BIOS version. I'll keep you informed as soon as I get another 
storm going.

Depending on the quality of your BIOS (competence of the vendor), you
might find that kenv(8) reports the BIOS version without needing a reboot.
(Look at smbios.bios.* in the output).

-- 
Peter Jeremy




Interrupt storm with MSI in combination with em1

2011-05-04 Thread Daan Vreeken
Hi All,

I've just updated a machine to -current (r221321) and since then I'm seeing an 
interrupt storm on irq 16. The storm goes away when I disable MSI/MSIX with 
the following lines in loader.conf :

hw.pci.enable_msix=0
hw.pci.enable_msi=0
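
To check from a script whether a loader.conf carries these two overrides, something like the following Python sketch could be used. loader_tunables and msi_disabled are hypothetical helpers, and treating a missing tunable as "1" (enabled) is an assumption about the defaults.

```python
LOADER_CONF = """\
# workaround for the interrupt storm on irq 16
hw.pci.enable_msix=0
hw.pci.enable_msi=0
"""


def loader_tunables(text):
    """Parse loader.conf-style KEY=VALUE lines, ignoring '#' comments."""
    tunables = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        key, sep, value = line.partition("=")
        if sep:
            tunables[key.strip()] = value.strip().strip('"')
    return tunables


def msi_disabled(tunables):
    """True only if both MSI and MSI-X are switched off."""
    return (tunables.get("hw.pci.enable_msi", "1") == "0"
            and tunables.get("hw.pci.enable_msix", "1") == "0")


print("MSI/MSI-X disabled:", msi_disabled(loader_tunables(LOADER_CONF)))
```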

According to dmesg the following devices share IRQ 16 :

pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f
   mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd
   irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xbc00-0xbc07
   mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on
   pci0
ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff
   irq 16 at device 26.0 on pci0
em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f
   mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd
   irq 16 at device 0.0 on pci4
pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0

During a storm vmstat -i shows a rate of about 220,000 interrupts/sec. MSI 
interrupt delivery to both 'em0' and 'em1' seems to work correctly during a 
storm, as I see their counters increase normally in the vmstat -i output.

As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the 
e1000 driver is causing this problem. Could it be that the driver forgets to 
clear/mask legacy interrupts when attaching the MSI interrupts?

Any tips on how to debug and/or fix this?


The full output of dmesg can be found here :
http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt

And the full output of pciconf -lv is here :
http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt


Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380


Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel
Who makes your motherboard? The problem you are having is that MSIX AND
MSI are both failing as em0 comes up, so it falls back to Legacy interrupt
mode, and must be having some issue with sharing the line, causing the storm.

This is the second report in a matter of a week perhaps about a problematic
motherboard, I would like to know who makes them.

Thanks,

Jack





Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Daan Vreeken
Hi Jack,

Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:
 Who makes your motherboard? The problem you are having is that MSIX AND
 MSI are both failing as em0 comes up, so it falls back to Legacy interrupt
 mode, and must be having some issue with sharing the line, causing the storm.

The motherboard is an Asus P7H55-M.
Sorry, I should have mentioned that the dmesg output is from booting with :

 hw.pci.enable_msix=0
 hw.pci.enable_msi=0

.. in loader.conf.

With those lines in loader.conf, MSI and MSIX are disabled, both cards work 
like they should and there is no interrupt storm.

With MSI/MSIX enabled, both cards work like they should and I see the counters 
of the MSI interrupts increase (in small amounts, like they should), but at 
boot-time an interrupt storm starts on 'legacy' IRQ 16.

Because the only difference between disabling/enabling MSI/MSIX seems to be in 
the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the 
dmesg, I'm suspecting 'em1' is causing the storm.
(But please correct me if I'm wrong :)

What can I do to help track this problem down?

 


Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380


Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel
Will you please set it back to a default and then boot and capture the
message for me?

Thank you,

Jack





Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Daan Vreeken
Hi,

On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:
 Will you please set it back to a default and then boot and capture the
 message for me?

No problem. Here's the output with MSI/MSIX enabled :
http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt

I've also added the output of vmstat -i a couple of minutes after a reboot 
with MSI enabled :
http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt

Note that in the above vmstat -i dump the interrupt storm hasn't started 
yet. For some reason the storm doesn't always start directly at boot. I 
haven't been able (yet) to pinpoint what's triggering it to start.




Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel
This all looks completely kosher. What IRQ is the storm on?

Jack




Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel
Right, it was you Wiktor :)  Oh, so yours is sort of a special case.

Thanks,

Jack


On Wed, May 4, 2011 at 3:27 PM, Wiktor Niesiobedzki b...@vink.pl wrote:

 2011/5/4 Jack Vogel jfvo...@gmail.com:
  This is the second report in a matter of a week perhaps about a
  problematic motherboard, I would like to know who makes them.

 Just for the record, the motherboard with which I had problems (I
 guess my problem is here referred) is VIA EPIA SN1. It's nothing
 new, and probably rarely used with additional PCIe cards, as this is
 embedded-like creature.

 Cheers,

 Wiktor Niesiobedzki



Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Wiktor Niesiobedzki
2011/5/4 Jack Vogel jfvo...@gmail.com:
 This is the second report in a matter of a week perhaps about a problematic
 motherboard, I would like to know who makes them.

Just for the record, the motherboard with which I had problems (I
guess my problem is the one referred to here) is a VIA EPIA SN1. It's
nothing new, and it is probably rarely used with additional PCIe cards,
as it is an embedded-like creature.

Cheers,

Wiktor Niesiobedzki


Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Daan Vreeken
On Thursday 05 May 2011 00:15:43 you wrote:
 This all looks completely kosher,  what IRQ is the storm on??

IRQ 16. Further down this email there is a list of devices that share the IRQ 
according to 'dmesg'.
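
For anyone who wants to build such a list from a saved dmesg file, the grouping can be done mechanically. The following is only a rough sketch in Python: the function name is made up, it assumes every probe message sits on a single line (the lines quoted in this thread are wrapped by the mail client), and the resource ranges in the sample are replaced by "..." instead of being reproduced.

```python
import re
from collections import defaultdict

# Device name at the start of the line, "irq N at device" later on the line.
PROBE = re.compile(r"^(?P<dev>[a-z]+\d+): .*\birq (?P<irq>\d+) at device\b")


def devices_by_irq(dmesg_text):
    """Map each IRQ number to the device names that report it while probing."""
    shared = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = PROBE.match(line)
        if match:
            shared[int(match.group("irq"))].append(match.group("dev"))
    return dict(shared)


# Abbreviated probe lines. The last line is invented for illustration:
# only its IRQ number (irq23: ehci1) appears in the thread.
SAMPLE = """pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
em0: <Intel(R) PRO/1000 Network Connection 7.2.3> port ... irq 16 at device 0.0 on pci1
ehci0: <Intel PCH USB 2.0 controller USB-B> mem ... irq 16 at device 26.0 on pci0
ehci1: <made-up controller line> mem ... irq 23 at device 29.0 on pci0
"""
```

devices_by_irq(SAMPLE) groups pcib1, em0 and ehci0 under IRQ 16 and ehci1 under IRQ 23. Any IRQ with more than one device in its list is shared.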


 On Wed, May 4, 2011 at 3:04 PM, Daan Vreeken d...@vehosting.nl wrote:
  Hi,
 
  On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:
   Will you please set it back to a default and then boot and capture the
   message for me?
 
  No problem. Here's the output with MSI/MSIX enabled :
 
  http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt
 
  I've also added the output of vmstat -i a couple of minutes after a
  reboot
  with MSI enabled :
 http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt
 
  Note that in the above vmstat -i dump the interrupt storm hasn't
  started yet. For some reason the storm doesn't always start directly at
  boot. I haven't been able (yet) to pinpoint what's triggering it to
  start.
 
   On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote:
    Hi Jack,

    Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:
     Who makes your motherboard? The problem you are having is that MSIX
     AND MSI are both failing as em0 comes up, so it falls back to Legacy
     interrupt mode, and must be having some issue with sharing the line,
     causing the storm.

    The motherboard is an Asus P7H55-M.

    Sorry, I should have mentioned that the dmesg output is from booting
    with :
     hw.pci.enable_msix=0
     hw.pci.enable_msi=0
    .. in loader.conf.

    With those lines in loader.conf, MSI and MSIX is disabled, both
    cards work like they should and there is no interrupt storm.

    With MSI/MSIX enabled, both cards work like they should and I see the
    counters of the MSI interrupts increase (in small amounts, like they
    should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16.

    Because the only difference between disabling/enabling MSI/MSIX seems
    to be in the way em0/em1 are used, and because 'em1' shares IRQ 16
    according to the dmesg, I'm suspecting 'em1' is causing the storm.
    (But please correct me if I'm wrong :)

    What can I do to help track this problem down?

      According to dmesg the following devices share IRQ 16 :

      pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
      em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f
        mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd
        irq 16 at device 0.0 on pci1
      vgapci0: VGA-compatible display port 0xbc00-0xbc07
        mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
      ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff
        irq 16 at device 26.0 on pci0
      em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f
        mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd
        irq 16 at device 0.0 on pci4
      pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0

      During a storm vmstat -i shows a rate of about 220.000
      interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems
      to work correctly during a storm, as I see their counters increase
      normally in the vmstat -i output.

      As only 'em0' and 'em1' seem to be using MSI interrupts, my guess
      is that the e1000 driver is causing this problem. Could it be that
      the driver forgets to clear/mask legacy interrupts when attaching
      the MSI interrupts perhaps?

      Any tips on how to debug and/or fix this?


      The full output of dmesg can be found here :
      http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt

      And the full output of pciconf -lv is here :
      http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt


Thanks,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380


Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel
OK, but the reason you see the multiple cases of irq 16 is that that's the
bridge; once you are using MSIX, as vmstat shows, it's using other vectors.
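
To make that distinction visible in a vmstat -i listing, the entries can be split by IRQ number. This is a small sketch in Python, not anything from the driver. The function name is invented, and the cut-off of 256 is an assumption based on entries such as "irq256: em0:rx 0" reported earlier in this thread; it may not hold on other systems or releases.

```python
import re

# Assumption: MSI/MSI-X vectors are numbered from 256 upward, as the
# "irq256: em0:rx 0" entries reported in this thread suggest.
FIRST_MSI_IRQ = 256

IRQ_NAME = re.compile(r"^irq(\d+):")


def interrupt_kind(name):
    """Classify a vmstat -i interrupt name as 'legacy', 'msi' or 'other'."""
    match = IRQ_NAME.match(name)
    if not match:
        return "other"  # for example a per-CPU timer entry
    return "msi" if int(match.group(1)) >= FIRST_MSI_IRQ else "legacy"
```

With this split, "irq16: ehci0" is classified as a legacy line and "irq256: em0:rx 0" as an MSI vector, which matches the picture of the storm sitting on the shared legacy line while em0 and em1 take their interrupts elsewhere.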

Can you capture the messages file with the actual storm happening?

I noticed some complaints about checksums in the dmesg. Have you
checked for BIOS upgrades or something like that for your motherboard?

Regards,

Jack

