Re: regression: msk0 watchdog timeout and interrupt storm
06.02.2014 21:12, Boris Samorodov wrote:
> 06.02.2014 06:00, Yonghyeon PYUN wrote:
>> On Sat, Feb 01, 2014 at 12:18:59PM +0400, Boris Samorodov wrote:
>>> Hi Yonghyeon and All,
>>>
>>> (this time it's a CURRENT issue)
>>>
>>> 31.10.2013 17:33, Boris Samorodov wrote:
>>>> 30.10.2013 06:16, Yonghyeon PYUN wrote:
>>>>> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
>>>>
>>>>>> From time to time I use a notebook and boot FreeBSD from USB
>>>>>> stick. FreeBSD 9.2-i386 works OK. So I tried to use
>>>>>> FreeBSD 10.0-i386 BETA2 and the network adapter works for
>>>>>> some 10-15 seconds and then stops with diagnostic message
>>>>>> "msk0:watchdog timeout". I've found similar case at
>>>>>> freebsd-current@ with no workaround. Yes, there is an
>>>>>> interrupt storm as well.
>>>>>
>>>>> There had been no functional changes for very long time so I'm not
>>>>> sure what's going on here. I've attached local change I have at
>>>>> this moment but I'm afraid it wouldn't address the issue above.
>>>>>
>>>>> I recall jhb also reported interrupt storm in the past but the root
>>>>> cause was not identified yet. Could you change msk_intr() and let
>>>>> me know which interrupt is firing?
>>>>
>>>> I've yet to organize a build.
>>>>
>>>>>> Here is some additional info:
>>>>>> -
>>>>>> mskc0@pci0:3:0:0: class=0x02 card=0xff501179 chip=0x435511ab rev=0x12 hdr=0x00
>>>>>>     vendor     = 'Marvell Technology Group Ltd.'
>>>>>>     device     = '88E8040T PCI-E Fast Ethernet Controller'
>>>>>>     class      = network
>>>>>>     subclass   = ethernet
>>>>>>     cap 01[48] = powerspec 3 supports D0 D1 D2 D3 current D0
>>>>>>     cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
>>>>>>     cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
>>>>>>                  speed 2.5(2.5) ASPM disabled(L0s/L1)
>>>>>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>>>>>     ecap 0003[130] = Serial 1 b8b063681e00
>>>>>> -
>>>>
>>>> Meanwhile some more investigations, "vmstat -i" for calm and storm:
>>>> -
>>>> interrupt               total  rate
>>>> irq1: atkbd0             1025     2
>>>> irq9: acpi0               204     0
>>>> irq14: ata0               327     0
>>>> irq16: uhci0+             246     0
>>>> irq20: hpet0            22472    52
>>>> irq23: uhci2 ehci1      10341    24
>>>> irq256: hdac0              52     0
>>>> irq257: mskc0             258     0
>>>> irq258: ahci0             221     0
>>>> Total                   35146    81
>>>> -
>>>> interrupt               total  rate
>>>> irq1: atkbd0             1508     2
>>>> irq9: acpi0               234     0
>>>> irq14: ata0               409     0
>>>> irq16: uhci0+             246     0
>>>> irq20: hpet0            72288   131
>>>> irq23: uhci2 ehci1      10846    19
>>>> irq256: hdac0              52     0
>>>> irq257: mskc0         4419760  8021
>>>> irq258: ahci0             221     0
>>>> Total                 4505564  8177
>>>> -
>>>>
>>>> And "vmstat -w1" for calm and storm:
>>>> -
>>>> procs    memory     page                  disks       faults         cpu
>>>> r b w    avm    fre  flt re pi po   fr sr mm0 ad0     in   sy     cs us sy  id
>>>> 0 0 0 206928 956040  277  0  2  0  330  4   0   0    117  476    454  0  1  99
>>>> 0 0 0 206928 956036    0  0  0  0    8  4   0   0     50  123    137  0  0 100
>>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     47  120     92  0  1  99
>>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     43  123    119  0  1  99
>>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     55  132    123  0  1  99
>>>> 0 0 0 206928 956004    0  0  0  0    0  4   0   0     68  123    185  0  1  99
>>>> 0 0 0 206928 956036    0  0  0  0    8  4   0   0     86  123    266  0  1  99
>>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     44  125    124  0  0 100
>>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     64  128    164  0  1  99
>>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     42  131    101  0  1  99
>>>> -
>>>> procs    memory     page                  disks       faults         cpu
>>>> r b w    avm    fre  flt re pi po   fr sr mm0 ad0     in   sy     cs us sy  id
>>>> 0 0 0 213648 954676  104  0  1  0  121  4   0   0  22299  204  44262  0 10  90
>>>> 0 0 0 213648 954672    0  0  0  0    8  4   0   0 112259  123 222379  0 44  56
>>>> 0 0 0 213648 954672    0  0  0  0    0  4   0
Re: regression: msk0 watchdog timeout and interrupt storm
06.02.2014 06:00, Yonghyeon PYUN wrote:
> On Sat, Feb 01, 2014 at 12:18:59PM +0400, Boris Samorodov wrote:
>> Hi Yonghyeon and All,
>>
>> (this time it's a CURRENT issue)
>>
>> 31.10.2013 17:33, Boris Samorodov wrote:
>>> 30.10.2013 06:16, Yonghyeon PYUN wrote:
>>>> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
>>>
>>>>> From time to time I use a notebook and boot FreeBSD from USB
>>>>> stick. FreeBSD 9.2-i386 works OK. So I tried to use
>>>>> FreeBSD 10.0-i386 BETA2 and the network adapter works for
>>>>> some 10-15 seconds and then stops with diagnostic message
>>>>> "msk0:watchdog timeout". I've found similar case at
>>>>> freebsd-current@ with no workaround. Yes, there is an
>>>>> interrupt storm as well.
>>>>
>>>> There had been no functional changes for very long time so I'm not
>>>> sure what's going on here. I've attached local change I have at
>>>> this moment but I'm afraid it wouldn't address the issue above.
>>>>
>>>> I recall jhb also reported interrupt storm in the past but the root
>>>> cause was not identified yet. Could you change msk_intr() and let
>>>> me know which interrupt is firing?
>>>
>>> I've yet to organize a build.
>>>
>>>>> Here is some additional info:
>>>>> -
>>>>> mskc0@pci0:3:0:0: class=0x02 card=0xff501179 chip=0x435511ab rev=0x12 hdr=0x00
>>>>>     vendor     = 'Marvell Technology Group Ltd.'
>>>>>     device     = '88E8040T PCI-E Fast Ethernet Controller'
>>>>>     class      = network
>>>>>     subclass   = ethernet
>>>>>     cap 01[48] = powerspec 3 supports D0 D1 D2 D3 current D0
>>>>>     cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
>>>>>     cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
>>>>>                  speed 2.5(2.5) ASPM disabled(L0s/L1)
>>>>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>>>>     ecap 0003[130] = Serial 1 b8b063681e00
>>>>> -
>>>
>>> Meanwhile some more investigations, "vmstat -i" for calm and storm:
>>> -
>>> interrupt               total  rate
>>> irq1: atkbd0             1025     2
>>> irq9: acpi0               204     0
>>> irq14: ata0               327     0
>>> irq16: uhci0+             246     0
>>> irq20: hpet0            22472    52
>>> irq23: uhci2 ehci1      10341    24
>>> irq256: hdac0              52     0
>>> irq257: mskc0             258     0
>>> irq258: ahci0             221     0
>>> Total                   35146    81
>>> -
>>> interrupt               total  rate
>>> irq1: atkbd0             1508     2
>>> irq9: acpi0               234     0
>>> irq14: ata0               409     0
>>> irq16: uhci0+             246     0
>>> irq20: hpet0            72288   131
>>> irq23: uhci2 ehci1      10846    19
>>> irq256: hdac0              52     0
>>> irq257: mskc0         4419760  8021
>>> irq258: ahci0             221     0
>>> Total                 4505564  8177
>>> -
>>>
>>> And "vmstat -w1" for calm and storm:
>>> -
>>> procs    memory     page                  disks       faults         cpu
>>> r b w    avm    fre  flt re pi po   fr sr mm0 ad0     in   sy     cs us sy  id
>>> 0 0 0 206928 956040  277  0  2  0  330  4   0   0    117  476    454  0  1  99
>>> 0 0 0 206928 956036    0  0  0  0    8  4   0   0     50  123    137  0  0 100
>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     47  120     92  0  1  99
>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     43  123    119  0  1  99
>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     55  132    123  0  1  99
>>> 0 0 0 206928 956004    0  0  0  0    0  4   0   0     68  123    185  0  1  99
>>> 0 0 0 206928 956036    0  0  0  0    8  4   0   0     86  123    266  0  1  99
>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     44  125    124  0  0 100
>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     64  128    164  0  1  99
>>> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     42  131    101  0  1  99
>>> -
>>> procs    memory     page                  disks       faults         cpu
>>> r b w    avm    fre  flt re pi po   fr sr mm0 ad0     in   sy     cs us sy  id
>>> 0 0 0 213648 954676  104  0  1  0  121  4   0   0  22299  204  44262  0 10  90
>>> 0 0 0 213648 954672    0  0  0  0    8  4   0   0 112259  123 222379  0 44  56
>>> 0 0 0 213648 954672    0  0  0  0    0  4   0   0 111792  123 221489  0 43  57
>>> 0 0 0 213648 954672    1  0  0  0    0  4   0   0 109887  183 217754  0 43  57
>>> 0 0 0 213648 954668
Re: regression: msk0 watchdog timeout and interrupt storm
On Sat, Feb 01, 2014 at 12:18:59PM +0400, Boris Samorodov wrote:
> Hi Yonghyeon and All,
>
> (this time it's a CURRENT issue)
>
> 31.10.2013 17:33, Boris Samorodov wrote:
> > 30.10.2013 06:16, Yonghyeon PYUN wrote:
> >> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
> >
> >>> From time to time I use a notebook and boot FreeBSD from USB
> >>> stick. FreeBSD 9.2-i386 works OK. So I tried to use
> >>> FreeBSD 10.0-i386 BETA2 and the network adapter works for
> >>> some 10-15 seconds and then stops with diagnostic message
> >>> "msk0:watchdog timeout". I've found similar case at
> >>> freebsd-current@ with no workaround. Yes, there is an
> >>> interrupt storm as well.
> >>
> >> There had been no functional changes for very long time so I'm not
> >> sure what's going on here. I've attached local change I have at
> >> this moment but I'm afraid it wouldn't address the issue above.
> >>
> >> I recall jhb also reported interrupt storm in the past but the root
> >> cause was not identified yet. Could you change msk_intr() and let
> >> me know which interrupt is firing?
> >
> > I've yet to organize a build.
> >
> >>> Here is some additional info:
> >>> -
> >>> mskc0@pci0:3:0:0: class=0x02 card=0xff501179 chip=0x435511ab rev=0x12 hdr=0x00
> >>>     vendor     = 'Marvell Technology Group Ltd.'
> >>>     device     = '88E8040T PCI-E Fast Ethernet Controller'
> >>>     class      = network
> >>>     subclass   = ethernet
> >>>     cap 01[48] = powerspec 3 supports D0 D1 D2 D3 current D0
> >>>     cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
> >>>     cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
> >>>                  speed 2.5(2.5) ASPM disabled(L0s/L1)
> >>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
> >>>     ecap 0003[130] = Serial 1 b8b063681e00
> >>> -
> >
> > Meanwhile some more investigations, "vmstat -i" for calm and storm:
> > -
> > interrupt               total  rate
> > irq1: atkbd0             1025     2
> > irq9: acpi0               204     0
> > irq14: ata0               327     0
> > irq16: uhci0+             246     0
> > irq20: hpet0            22472    52
> > irq23: uhci2 ehci1      10341    24
> > irq256: hdac0              52     0
> > irq257: mskc0             258     0
> > irq258: ahci0             221     0
> > Total                   35146    81
> > -
> > interrupt               total  rate
> > irq1: atkbd0             1508     2
> > irq9: acpi0               234     0
> > irq14: ata0               409     0
> > irq16: uhci0+             246     0
> > irq20: hpet0            72288   131
> > irq23: uhci2 ehci1      10846    19
> > irq256: hdac0              52     0
> > irq257: mskc0         4419760  8021
> > irq258: ahci0             221     0
> > Total                 4505564  8177
> > -
> >
> > And "vmstat -w1" for calm and storm:
> > -
> > procs    memory     page                  disks       faults         cpu
> > r b w    avm    fre  flt re pi po   fr sr mm0 ad0     in   sy     cs us sy  id
> > 0 0 0 206928 956040  277  0  2  0  330  4   0   0    117  476    454  0  1  99
> > 0 0 0 206928 956036    0  0  0  0    8  4   0   0     50  123    137  0  0 100
> > 0 0 0 206928 956036    0  0  0  0    0  4   0   0     47  120     92  0  1  99
> > 0 0 0 206928 956036    0  0  0  0    0  4   0   0     43  123    119  0  1  99
> > 0 0 0 206928 956036    0  0  0  0    0  4   0   0     55  132    123  0  1  99
> > 0 0 0 206928 956004    0  0  0  0    0  4   0   0     68  123    185  0  1  99
> > 0 0 0 206928 956036    0  0  0  0    8  4   0   0     86  123    266  0  1  99
> > 0 0 0 206928 956036    0  0  0  0    0  4   0   0     44  125    124  0  0 100
> > 0 0 0 206928 956036    0  0  0  0    0  4   0   0     64  128    164  0  1  99
> > 0 0 0 206928 956036    0  0  0  0    0  4   0   0     42  131    101  0  1  99
> > -
> > procs    memory     page                  disks       faults         cpu
> > r b w    avm    fre  flt re pi po   fr sr mm0 ad0     in   sy     cs us sy  id
> > 0 0 0 213648 954676  104  0  1  0  121  4   0   0  22299  204  44262  0 10  90
> > 0 0 0 213648 954672    0  0  0  0    8  4   0   0 112259  123 222379  0 44  56
> > 0 0 0 213648 954672    0  0  0  0    0  4   0   0 111792  123 221489  0 43  57
> > 0 0 0 213648 954672    1  0  0  0    0  4   0   0 109887  183 217754  0 43  57
> > 0 0 0 213648 954668    2  0  0  0    0  4   0   0 109543
Re: regression: msk0 watchdog timeout and interrupt storm
Hi Yonghyeon and All,

(this time it's a CURRENT issue)

31.10.2013 17:33, Boris Samorodov wrote:
> 30.10.2013 06:16, Yonghyeon PYUN wrote:
>> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
>
>>> From time to time I use a notebook and boot FreeBSD from USB
>>> stick. FreeBSD 9.2-i386 works OK. So I tried to use
>>> FreeBSD 10.0-i386 BETA2 and the network adapter works for
>>> some 10-15 seconds and then stops with diagnostic message
>>> "msk0:watchdog timeout". I've found similar case at
>>> freebsd-current@ with no workaround. Yes, there is an
>>> interrupt storm as well.
>>
>> There had been no functional changes for very long time so I'm not
>> sure what's going on here. I've attached local change I have at
>> this moment but I'm afraid it wouldn't address the issue above.
>>
>> I recall jhb also reported interrupt storm in the past but the root
>> cause was not identified yet. Could you change msk_intr() and let
>> me know which interrupt is firing?
>
> I've yet to organize a build.
>
>>> Here is some additional info:
>>> -
>>> mskc0@pci0:3:0:0: class=0x02 card=0xff501179 chip=0x435511ab rev=0x12 hdr=0x00
>>>     vendor     = 'Marvell Technology Group Ltd.'
>>>     device     = '88E8040T PCI-E Fast Ethernet Controller'
>>>     class      = network
>>>     subclass   = ethernet
>>>     cap 01[48] = powerspec 3 supports D0 D1 D2 D3 current D0
>>>     cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
>>>     cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
>>>                  speed 2.5(2.5) ASPM disabled(L0s/L1)
>>>     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>>     ecap 0003[130] = Serial 1 b8b063681e00
>>> -
>
> Meanwhile some more investigations, "vmstat -i" for calm and storm:
> -
> interrupt               total  rate
> irq1: atkbd0             1025     2
> irq9: acpi0               204     0
> irq14: ata0               327     0
> irq16: uhci0+             246     0
> irq20: hpet0            22472    52
> irq23: uhci2 ehci1      10341    24
> irq256: hdac0              52     0
> irq257: mskc0             258     0
> irq258: ahci0             221     0
> Total                   35146    81
> -
> interrupt               total  rate
> irq1: atkbd0             1508     2
> irq9: acpi0               234     0
> irq14: ata0               409     0
> irq16: uhci0+             246     0
> irq20: hpet0            72288   131
> irq23: uhci2 ehci1      10846    19
> irq256: hdac0              52     0
> irq257: mskc0         4419760  8021
> irq258: ahci0             221     0
> Total                 4505564  8177
> -
>
> And "vmstat -w1" for calm and storm:
> -
> procs    memory     page                  disks       faults         cpu
> r b w    avm    fre  flt re pi po   fr sr mm0 ad0     in   sy     cs us sy  id
> 0 0 0 206928 956040  277  0  2  0  330  4   0   0    117  476    454  0  1  99
> 0 0 0 206928 956036    0  0  0  0    8  4   0   0     50  123    137  0  0 100
> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     47  120     92  0  1  99
> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     43  123    119  0  1  99
> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     55  132    123  0  1  99
> 0 0 0 206928 956004    0  0  0  0    0  4   0   0     68  123    185  0  1  99
> 0 0 0 206928 956036    0  0  0  0    8  4   0   0     86  123    266  0  1  99
> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     44  125    124  0  0 100
> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     64  128    164  0  1  99
> 0 0 0 206928 956036    0  0  0  0    0  4   0   0     42  131    101  0  1  99
> -
> procs    memory     page                  disks       faults         cpu
> r b w    avm    fre  flt re pi po   fr sr mm0 ad0     in   sy     cs us sy  id
> 0 0 0 213648 954676  104  0  1  0  121  4   0   0  22299  204  44262  0 10  90
> 0 0 0 213648 954672    0  0  0  0    8  4   0   0 112259  123 222379  0 44  56
> 0 0 0 213648 954672    0  0  0  0    0  4   0   0 111792  123 221489  0 43  57
> 0 0 0 213648 954672    1  0  0  0    0  4   0   0 109887  183 217754  0 43  57
> 0 0 0 213648 954668    2  0  0  0    0  4   0   0 109543  146 216963  0 44  56
> 0 0 0 213648 954668    0  0  0  0    0  4   0   0 110142  123 218187  0 45  55
> 0 0 0 213648 954660  472  0  0  0  474  4   0   0 109340  717 216674  0 42  57
> 0 0 0 213648 954656    2  0  0  0    0  4   0   0 109459  147 216831
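
The two "vmstat -i" snapshots quoted in this thread give only cumulative totals and boot-averaged rates. A rough sketch of how the mskc0 rate between the snapshots could be estimated from them is below. It assumes that the "rate" column is the total divided by uptime in seconds, so that uptime is approximately total / rate; the helper names are hypothetical and the numbers are copied from the "Total" and "irq257: mskc0" rows.

```python
# Rough estimate of the mskc0 interrupt rate between the two "vmstat -i"
# snapshots quoted in the thread. Assumption: the "rate" column is the total
# divided by uptime in seconds, so uptime can be approximated as total / rate.


def estimated_uptime(total, rate):
    """Approximate uptime in seconds from a cumulative total and its average rate."""
    return total / rate


def rate_between(count_before, count_after, seconds):
    """Average interrupts per second over the interval between two snapshots."""
    return (count_after - count_before) / seconds


# Values copied from the "Total" and "irq257: mskc0" rows of the two snapshots.
calm = {"mskc0": 258, "total": 35146, "total_rate": 81}
storm = {"mskc0": 4419760, "total": 4505564, "total_rate": 8177}

# Time between snapshots, estimated from the two approximate uptimes.
elapsed = (estimated_uptime(storm["total"], storm["total_rate"])
           - estimated_uptime(calm["total"], calm["total_rate"]))

# mskc0 interrupts per second over that interval, and its share of all
# interrupts counted since boot in the storm snapshot.
msk_rate = rate_between(calm["mskc0"], storm["mskc0"], elapsed)
msk_share = storm["mskc0"] / storm["total"]

print(f"approx. seconds between snapshots: {elapsed:.0f}")
print(f"approx. mskc0 interrupts/s in that interval: {msk_rate:.0f}")
print(f"mskc0 share of all interrupts since boot: {msk_share:.1%}")
```

Because the reported rates are rounded and the storm may have started partway through the interval, this is only an order-of-magnitude figure. The "in" column of the storm-time "vmstat -w1" output (roughly 110000 per second) is the more direct measurement.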