Re: regression: msk0 watchdog timeout and interrupt storm

2014-02-09 Thread Boris Samorodov
06.02.2014 21:12, Boris Samorodov wrote:
> 06.02.2014 06:00, Yonghyeon PYUN wrote:
>> On Sat, Feb 01, 2014 at 12:18:59PM +0400, Boris Samorodov wrote:
>>> Hi Yonghyeon and All,
>>>
>>> (this time it's a CURRENT issue)
>>>
>>> 31.10.2013 17:33, Boris Samorodov wrote:
>>>> 30.10.2013 06:16, Yonghyeon PYUN wrote:
>>>>> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
>>>>
>>>>>> From time to time I use a notebook and boot FreeBSD from a USB
>>>>>> stick. FreeBSD 9.2-i386 works OK. So I tried
>>>>>> FreeBSD 10.0-i386 BETA2, and the network adapter works for
>>>>>> some 10-15 seconds and then stops with the diagnostic message
>>>>>> "msk0:watchdog timeout". I've found a similar case on
>>>>>> freebsd-current@ with no workaround. Yes, there is an
>>>>>> interrupt storm as well.
>>>>>
>>>>> There have been no functional changes for a very long time, so I'm not
>>>>> sure what's going on here.  I've attached the local change I have at
>>>>> the moment, but I'm afraid it won't address the issue above.
>>>>>
>>>>> I recall jhb also reported an interrupt storm in the past, but the root
>>>>> cause has not been identified yet.  Could you change msk_intr() and let
>>>>> me know which interrupt is firing?
>>>>
>>>> I've yet to organize a build.
>>>>
>>>>>> Here is some additional info:
>>>>>> -
>>>>>> mskc0@pci0:3:0:0:   class=0x02 card=0xff501179 chip=0x435511ab
>>>>>> rev=0x12 hdr=0x00
>>>>>> vendor = 'Marvell Technology Group Ltd.'
>>>>>> device = '88E8040T PCI-E Fast Ethernet Controller'
>>>>>> class  = network
>>>>>> subclass   = ethernet
>>>>>> cap 01[48] = powerspec 3  supports D0 D1 D2 D3  current D0
>>>>>> cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
>>>>>> cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
>>>>>>  speed 2.5(2.5) ASPM disabled(L0s/L1)
>>>>>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>>>>> ecap 0003[130] = Serial 1 b8b063681e00
>>>>>> -
>>>>
>>>> Meanwhile some more investigations, "vmstat -i" for calm and storm:
>>>> -
>>>> interrupt                    total   rate
>>>> irq1: atkbd0                  1025      2
>>>> irq9: acpi0                    204      0
>>>> irq14: ata0                    327      0
>>>> irq16: uhci0+                  246      0
>>>> irq20: hpet0                 22472     52
>>>> irq23: uhci2 ehci1           10341     24
>>>> irq256: hdac0                   52      0
>>>> irq257: mskc0                  258      0
>>>> irq258: ahci0                  221      0
>>>> Total                        35146     81
>>>> -
>>>> interrupt                    total   rate
>>>> irq1: atkbd0                  1508      2
>>>> irq9: acpi0                    234      0
>>>> irq14: ata0                    409      0
>>>> irq16: uhci0+                  246      0
>>>> irq20: hpet0                 72288    131
>>>> irq23: uhci2 ehci1           10846     19
>>>> irq256: hdac0                   52      0
>>>> irq257: mskc0              4419760   8021
>>>> irq258: ahci0                  221      0
>>>> Total                      4505564   8177
>>>> -

>>>> And "vmstat -w1" for calm and storm:
>>>> -
>>>> procs     memory              page                disks       faults           cpu
>>>> r b w     avm     fre   flt  re  pi  po    fr  sr mm0 ad0     in   sy     cs us sy  id
>>>> 0 0 0  206928  956040   277   0   2   0   330   4   0   0    117  476    454  0  1  99
>>>> 0 0 0  206928  956036     0   0   0   0     8   4   0   0     50  123    137  0  0 100
>>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     47  120     92  0  1  99
>>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     43  123    119  0  1  99
>>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     55  132    123  0  1  99
>>>> 0 0 0  206928  956004     0   0   0   0     0   4   0   0     68  123    185  0  1  99
>>>> 0 0 0  206928  956036     0   0   0   0     8   4   0   0     86  123    266  0  1  99
>>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     44  125    124  0  0 100
>>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     64  128    164  0  1  99
>>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     42  131    101  0  1  99
>>>> -
>>>> procs     memory              page                disks       faults           cpu
>>>> r b w     avm     fre   flt  re  pi  po    fr  sr mm0 ad0     in   sy     cs us sy  id
>>>> 0 0 0  213648  954676   104   0   1   0   121   4   0   0  22299  204  44262  0 10  90
>>>> 0 0 0  213648  954672     0   0   0   0     8   4   0   0 112259  123 222379  0 44  56
>>>> 0 0 0  213648  954672     0   0   0   0     0   4   0

Re: regression: msk0 watchdog timeout and interrupt storm

2014-02-06 Thread Boris Samorodov
06.02.2014 06:00, Yonghyeon PYUN wrote:
> On Sat, Feb 01, 2014 at 12:18:59PM +0400, Boris Samorodov wrote:
>> Hi Yonghyeon and All,
>>
>> (this time it's a CURRENT issue)
>>
>> 31.10.2013 17:33, Boris Samorodov wrote:
>>> 30.10.2013 06:16, Yonghyeon PYUN wrote:
>>>> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
>>>
>>>>> From time to time I use a notebook and boot FreeBSD from a USB
>>>>> stick. FreeBSD 9.2-i386 works OK. So I tried
>>>>> FreeBSD 10.0-i386 BETA2, and the network adapter works for
>>>>> some 10-15 seconds and then stops with the diagnostic message
>>>>> "msk0:watchdog timeout". I've found a similar case on
>>>>> freebsd-current@ with no workaround. Yes, there is an
>>>>> interrupt storm as well.
>>>>
>>>> There have been no functional changes for a very long time, so I'm not
>>>> sure what's going on here.  I've attached the local change I have at
>>>> the moment, but I'm afraid it won't address the issue above.
>>>>
>>>> I recall jhb also reported an interrupt storm in the past, but the root
>>>> cause has not been identified yet.  Could you change msk_intr() and let
>>>> me know which interrupt is firing?
>>>
>>> I've yet to organize a build.
>>>
>>>>> Here is some additional info:
>>>>> -
>>>>> mskc0@pci0:3:0:0:   class=0x02 card=0xff501179 chip=0x435511ab
>>>>> rev=0x12 hdr=0x00
>>>>> vendor = 'Marvell Technology Group Ltd.'
>>>>> device = '88E8040T PCI-E Fast Ethernet Controller'
>>>>> class  = network
>>>>> subclass   = ethernet
>>>>> cap 01[48] = powerspec 3  supports D0 D1 D2 D3  current D0
>>>>> cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
>>>>> cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
>>>>>  speed 2.5(2.5) ASPM disabled(L0s/L1)
>>>>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>>>> ecap 0003[130] = Serial 1 b8b063681e00
>>>>> -
>>>
>>> Meanwhile some more investigations, "vmstat -i" for calm and storm:
>>> -
>>> interrupt                    total   rate
>>> irq1: atkbd0                  1025      2
>>> irq9: acpi0                    204      0
>>> irq14: ata0                    327      0
>>> irq16: uhci0+                  246      0
>>> irq20: hpet0                 22472     52
>>> irq23: uhci2 ehci1           10341     24
>>> irq256: hdac0                   52      0
>>> irq257: mskc0                  258      0
>>> irq258: ahci0                  221      0
>>> Total                        35146     81
>>> -
>>> interrupt                    total   rate
>>> irq1: atkbd0                  1508      2
>>> irq9: acpi0                    234      0
>>> irq14: ata0                    409      0
>>> irq16: uhci0+                  246      0
>>> irq20: hpet0                 72288    131
>>> irq23: uhci2 ehci1           10846     19
>>> irq256: hdac0                   52      0
>>> irq257: mskc0              4419760   8021
>>> irq258: ahci0                  221      0
>>> Total                      4505564   8177
>>> -
>>>
>>> And "vmstat -w1" for calm and storm:
>>> -
>>> procs     memory              page                disks       faults           cpu
>>> r b w     avm     fre   flt  re  pi  po    fr  sr mm0 ad0     in   sy     cs us sy  id
>>> 0 0 0  206928  956040   277   0   2   0   330   4   0   0    117  476    454  0  1  99
>>> 0 0 0  206928  956036     0   0   0   0     8   4   0   0     50  123    137  0  0 100
>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     47  120     92  0  1  99
>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     43  123    119  0  1  99
>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     55  132    123  0  1  99
>>> 0 0 0  206928  956004     0   0   0   0     0   4   0   0     68  123    185  0  1  99
>>> 0 0 0  206928  956036     0   0   0   0     8   4   0   0     86  123    266  0  1  99
>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     44  125    124  0  0 100
>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     64  128    164  0  1  99
>>> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     42  131    101  0  1  99
>>> -
>>> procs     memory              page                disks       faults           cpu
>>> r b w     avm     fre   flt  re  pi  po    fr  sr mm0 ad0     in   sy     cs us sy  id
>>> 0 0 0  213648  954676   104   0   1   0   121   4   0   0  22299  204  44262  0 10  90
>>> 0 0 0  213648  954672     0   0   0   0     8   4   0   0 112259  123 222379  0 44  56
>>> 0 0 0  213648  954672     0   0   0   0     0   4   0   0 111792  123 221489  0 43  57
>>> 0 0 0  213648  954672     1   0   0   0     0   4   0   0 109887  183 217754  0 43  57
>>> 0 0 0  213648  954668

Re: regression: msk0 watchdog timeout and interrupt storm

2014-02-05 Thread Yonghyeon PYUN
On Sat, Feb 01, 2014 at 12:18:59PM +0400, Boris Samorodov wrote:
> Hi Yonghyeon and All,
> 
> (this time it's a CURRENT issue)
> 
> 31.10.2013 17:33, Boris Samorodov wrote:
> > 30.10.2013 06:16, Yonghyeon PYUN wrote:
> >> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
> > 
> >>> From time to time I use a notebook and boot FreeBSD from a USB
> >>> stick. FreeBSD 9.2-i386 works OK. So I tried
> >>> FreeBSD 10.0-i386 BETA2, and the network adapter works for
> >>> some 10-15 seconds and then stops with the diagnostic message
> >>> "msk0:watchdog timeout". I've found a similar case on
> >>> freebsd-current@ with no workaround. Yes, there is an
> >>> interrupt storm as well.
> >>
> >> There have been no functional changes for a very long time, so I'm not
> >> sure what's going on here.  I've attached the local change I have at
> >> the moment, but I'm afraid it won't address the issue above.
> >>
> >> I recall jhb also reported an interrupt storm in the past, but the root
> >> cause has not been identified yet.  Could you change msk_intr() and let
> >> me know which interrupt is firing?
> > 
> > I've yet to organize a build.
> > 
> >>> Here is some additional info:
> >>> -
> >>> mskc0@pci0:3:0:0:   class=0x02 card=0xff501179 chip=0x435511ab
> >>> rev=0x12 hdr=0x00
> >>> vendor = 'Marvell Technology Group Ltd.'
> >>> device = '88E8040T PCI-E Fast Ethernet Controller'
> >>> class  = network
> >>> subclass   = ethernet
> >>> cap 01[48] = powerspec 3  supports D0 D1 D2 D3  current D0
> >>> cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
> >>> cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
> >>>  speed 2.5(2.5) ASPM disabled(L0s/L1)
> >>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
> >>> ecap 0003[130] = Serial 1 b8b063681e00
> >>> -
> > 
> > Meanwhile some more investigations, "vmstat -i" for calm and storm:
> > -
> > interrupt                    total   rate
> > irq1: atkbd0                  1025      2
> > irq9: acpi0                    204      0
> > irq14: ata0                    327      0
> > irq16: uhci0+                  246      0
> > irq20: hpet0                 22472     52
> > irq23: uhci2 ehci1           10341     24
> > irq256: hdac0                   52      0
> > irq257: mskc0                  258      0
> > irq258: ahci0                  221      0
> > Total                        35146     81
> > -
> > interrupt                    total   rate
> > irq1: atkbd0                  1508      2
> > irq9: acpi0                    234      0
> > irq14: ata0                    409      0
> > irq16: uhci0+                  246      0
> > irq20: hpet0                 72288    131
> > irq23: uhci2 ehci1           10846     19
> > irq256: hdac0                   52      0
> > irq257: mskc0              4419760   8021
> > irq258: ahci0                  221      0
> > Total                      4505564   8177
> > -
> > 
> > And "vmstat -w1" for calm and storm:
> > -
> > procs     memory              page                disks       faults           cpu
> > r b w     avm     fre   flt  re  pi  po    fr  sr mm0 ad0     in   sy     cs us sy  id
> > 0 0 0  206928  956040   277   0   2   0   330   4   0   0    117  476    454  0  1  99
> > 0 0 0  206928  956036     0   0   0   0     8   4   0   0     50  123    137  0  0 100
> > 0 0 0  206928  956036     0   0   0   0     0   4   0   0     47  120     92  0  1  99
> > 0 0 0  206928  956036     0   0   0   0     0   4   0   0     43  123    119  0  1  99
> > 0 0 0  206928  956036     0   0   0   0     0   4   0   0     55  132    123  0  1  99
> > 0 0 0  206928  956004     0   0   0   0     0   4   0   0     68  123    185  0  1  99
> > 0 0 0  206928  956036     0   0   0   0     8   4   0   0     86  123    266  0  1  99
> > 0 0 0  206928  956036     0   0   0   0     0   4   0   0     44  125    124  0  0 100
> > 0 0 0  206928  956036     0   0   0   0     0   4   0   0     64  128    164  0  1  99
> > 0 0 0  206928  956036     0   0   0   0     0   4   0   0     42  131    101  0  1  99
> > -
> > procs     memory              page                disks       faults           cpu
> > r b w     avm     fre   flt  re  pi  po    fr  sr mm0 ad0     in   sy     cs us sy  id
> > 0 0 0  213648  954676   104   0   1   0   121   4   0   0  22299  204  44262  0 10  90
> > 0 0 0  213648  954672     0   0   0   0     8   4   0   0 112259  123 222379  0 44  56
> > 0 0 0  213648  954672     0   0   0   0     0   4   0   0 111792  123 221489  0 43  57
> > 0 0 0  213648  954672     1   0   0   0     0   4   0   0 109887  183 217754  0 43  57
> > 0 0 0  213648  954668     2   0   0   0     0   4   0   0 109543

Re: regression: msk0 watchdog timeout and interrupt storm

2014-02-01 Thread Boris Samorodov
Hi Yonghyeon and All,

(this time it's a CURRENT issue)

31.10.2013 17:33, Boris Samorodov wrote:
> 30.10.2013 06:16, Yonghyeon PYUN wrote:
>> On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
> 
>>> From time to time I use a notebook and boot FreeBSD from a USB
>>> stick. FreeBSD 9.2-i386 works OK. So I tried
>>> FreeBSD 10.0-i386 BETA2, and the network adapter works for
>>> some 10-15 seconds and then stops with the diagnostic message
>>> "msk0:watchdog timeout". I've found a similar case on
>>> freebsd-current@ with no workaround. Yes, there is an
>>> interrupt storm as well.
>>
>> There have been no functional changes for a very long time, so I'm not
>> sure what's going on here.  I've attached the local change I have at
>> the moment, but I'm afraid it won't address the issue above.
>>
>> I recall jhb also reported an interrupt storm in the past, but the root
>> cause has not been identified yet.  Could you change msk_intr() and let
>> me know which interrupt is firing?
> 
> I've yet to organize a build.
> 
>>> Here is some additional info:
>>> -
>>> mskc0@pci0:3:0:0:   class=0x02 card=0xff501179 chip=0x435511ab
>>> rev=0x12 hdr=0x00
>>> vendor = 'Marvell Technology Group Ltd.'
>>> device = '88E8040T PCI-E Fast Ethernet Controller'
>>> class  = network
>>> subclass   = ethernet
>>> cap 01[48] = powerspec 3  supports D0 D1 D2 D3  current D0
>>> cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
>>> cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1)
>>>  speed 2.5(2.5) ASPM disabled(L0s/L1)
>>> ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
>>> ecap 0003[130] = Serial 1 b8b063681e00
>>> -
> 
> Meanwhile some more investigations, "vmstat -i" for calm and storm:
> -
> interrupt                    total   rate
> irq1: atkbd0                  1025      2
> irq9: acpi0                    204      0
> irq14: ata0                    327      0
> irq16: uhci0+                  246      0
> irq20: hpet0                 22472     52
> irq23: uhci2 ehci1           10341     24
> irq256: hdac0                   52      0
> irq257: mskc0                  258      0
> irq258: ahci0                  221      0
> Total                        35146     81
> -
> interrupt                    total   rate
> irq1: atkbd0                  1508      2
> irq9: acpi0                    234      0
> irq14: ata0                    409      0
> irq16: uhci0+                  246      0
> irq20: hpet0                 72288    131
> irq23: uhci2 ehci1           10846     19
> irq256: hdac0                   52      0
> irq257: mskc0              4419760   8021
> irq258: ahci0                  221      0
> Total                      4505564   8177
> -
> 
> And "vmstat -w1" for calm and storm:
> -
> procs     memory              page                disks       faults           cpu
> r b w     avm     fre   flt  re  pi  po    fr  sr mm0 ad0     in   sy     cs us sy  id
> 0 0 0  206928  956040   277   0   2   0   330   4   0   0    117  476    454  0  1  99
> 0 0 0  206928  956036     0   0   0   0     8   4   0   0     50  123    137  0  0 100
> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     47  120     92  0  1  99
> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     43  123    119  0  1  99
> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     55  132    123  0  1  99
> 0 0 0  206928  956004     0   0   0   0     0   4   0   0     68  123    185  0  1  99
> 0 0 0  206928  956036     0   0   0   0     8   4   0   0     86  123    266  0  1  99
> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     44  125    124  0  0 100
> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     64  128    164  0  1  99
> 0 0 0  206928  956036     0   0   0   0     0   4   0   0     42  131    101  0  1  99
> -
> procs     memory              page                disks       faults           cpu
> r b w     avm     fre   flt  re  pi  po    fr  sr mm0 ad0     in   sy     cs us sy  id
> 0 0 0  213648  954676   104   0   1   0   121   4   0   0  22299  204  44262  0 10  90
> 0 0 0  213648  954672     0   0   0   0     8   4   0   0 112259  123 222379  0 44  56
> 0 0 0  213648  954672     0   0   0   0     0   4   0   0 111792  123 221489  0 43  57
> 0 0 0  213648  954672     1   0   0   0     0   4   0   0 109887  183 217754  0 43  57
> 0 0 0  213648  954668     2   0   0   0     0   4   0   0 109543  146 216963  0 44  56
> 0 0 0  213648  954668     0   0   0   0     0   4   0   0 110142  123 218187  0 45  55
> 0 0 0  213648  954660   472   0   0   0   474   4   0   0 109340  717 216674  0 42  57
> 0 0 0  213648  954656     2   0   0   0     0   4   0   0 109459  147 216831
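
A note on reading the two "vmstat -i" snapshots quoted in this thread:
subtracting the calm totals from the storm totals, source by source,
shows which device accounts for the storm. The Python sketch below is
not from the thread. The function names parse_vmstat_i and
interrupt_deltas are made up for illustration, and the sample text is
the quoted output with the fused name/total columns separated by hand,
which is an assumption about the original layout.

```python
def parse_vmstat_i(text):
    """Parse `vmstat -i` style text into {source name: total count}."""
    counts = {}
    for line in text.strip().splitlines():
        # The source name may contain spaces ("irq23: uhci2 ehci1"),
        # so split off the two numeric columns from the right.
        parts = line.rsplit(None, 2)
        if len(parts) != 3:
            continue
        name, total, rate = parts
        if not (total.isdigit() and rate.isdigit()):
            continue  # header line ("interrupt total rate")
        if name.strip() == "Total":
            continue  # summary row, not an interrupt source
        counts[name.strip()] = int(total)
    return counts


def interrupt_deltas(before, after):
    """Per-source growth in total interrupts between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


# Sample data: the calm and storm snapshots from the thread, with the
# columns re-separated by hand (assumed layout).
CALM = """
interrupt            total  rate
irq1: atkbd0          1025     2
irq9: acpi0            204     0
irq14: ata0            327     0
irq16: uhci0+          246     0
irq20: hpet0         22472    52
irq23: uhci2 ehci1   10341    24
irq256: hdac0           52     0
irq257: mskc0          258     0
irq258: ahci0          221     0
Total                35146    81
"""

STORM = """
interrupt            total  rate
irq1: atkbd0          1508     2
irq9: acpi0            234     0
irq14: ata0            409     0
irq16: uhci0+          246     0
irq20: hpet0         72288   131
irq23: uhci2 ehci1   10846    19
irq256: hdac0           52     0
irq257: mskc0      4419760  8021
irq258: ahci0          221     0
Total              4505564  8177
"""

deltas = interrupt_deltas(parse_vmstat_i(CALM), parse_vmstat_i(STORM))
worst = max(deltas, key=deltas.get)
print(f"{worst}: +{deltas[worst]}")  # prints "irq257: mskc0: +4419502"
```

On this data the mskc0 line dominates every other source by roughly two
orders of magnitude, which matches the "in" column of the storm
"vmstat -w1" output jumping from about 50 to about 110000 per second.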