Re: [LEDE-DEV] [OpenWrt-Devel] Wifi-related kernel-oops on mt7621 after 4.14 update

2018-05-14 Thread Kristian Evensen
Hello,

On Wed, Apr 18, 2018 at 11:34 AM, Kristian Evensen
 wrote:
> I will keep an eye on this router, just in case, but it seems the
> problem is gone. Thanks for fixing it so fast!

The router (WG3526) has been running fine for a while now, but after
changing configuration from client to access point (for both
interfaces) and updating I started seeing kernel oopses + reboot loops
again. The error message is as follows:

[   30.665207] CPU 1 Unable to handle kernel paging request at virtual
address ea09a0dd, epc == 8f3a060c, ra == 8ed06fac
[   30.676034] Oops[#1]:
[   30.678341] CPU: 1 PID: 27 Comm: kworker/u8:1 Not tainted 4.14.37 #0
[   30.684852] Workqueue: phy0 ieee80211_ibss_leave [mac80211]
[   30.690409] task: 8fce8000 task.stack: 8fce4000
[   30.694922] $ 0   :  0001 7ac0ae80 0020
[   30.700149] $ 4   : 8ec4cbc0 8ee83c20 ea099ae0 8f79f400
[   30.705373] $ 8   :  80452db0 0007 00096a93
[   30.710593] $12   :  0264 000390fa 77f5d3c0
[   30.715812] $16   : 8ec4d560 8f581000 8ee83480 8ec4cbc0
[   30.721033] $20   :   8056 fffe
[   30.726252] $24   : 0fa3 80058514
[   30.731475] $28   : 8fce4000 8fce5ce8 8057 8ed06fac
[   30.736697] Hi: 000a
[   30.739562] Lo: 6669
[   30.742470] epc   : 8f3a060c 0x8f3a060c
[   30.746401] ra: 8ed06fac sta_set_sinfo+0xcc/0xbb0 [mac80211]
[   30.752380] Status: 11007c03 KERNEL EXL IE
[   30.756561] Cause : 4088 (ExcCode 02)
[   30.760550] BadVA : ea09a0dd
[   30.763418] PrId  : 0001992f (MIPS 1004Kc)
[   30.767492] Modules linked in: rt2800pci rt2800mmio rt2800lib
qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib
rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp
nf_conntrack_ipv6p
[   30.838554]  nf_nat_snmp_basic nf_nat_sip nf_nat_redirect
nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4
nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4
nf_flow_table
[   30.909546]  ip_set_hash_netport ip_set_hash_netnet ip_set_hash_net
ip_set_hash_netportnet ip_set_hash_mac ip_set_hash_ipportnet
ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark
ip_set_hash_ip ip_sd
[   30.979947]  ehci_platform sd_mod scsi_mod ehci_hcd
gpio_button_hotplug usbcore nls_base usb_common mii
[   30.989345] Process kworker/u8:1 (pid: 27, threadinfo=8fce4000,
task=8fce8000, tls=)
[   30.997742] Stack : 0200  8fce5d78  8fce5d78
80081240 8f79f400 8f581000
[   31.006092] 8ee83480 8ec4cbc0 8f581130  8056
fffe 8057 8ed06fac
[   31.014443]  8f581000 8fc06000 8ee834ac 8f581000
8ec4cbc0 8f581000 8f79f400
[   31.022793] 8ee83480 8ee834ac 8ec4cbc0  8056
8ed07a10 8f148a80 8007be74
[   31.031143]   8fce5d70 8fce5d70 
8f581000 8fc06000 8ed07ac0
[   31.039491] ...
[   31.041940] Call Trace:
[   31.044386] [<8f3a060c>] 0x8f3a060c
[   31.047866] Code: 000630c0  02063021  94f40002 <90d205fd> 00e0b025
1682  3253  2414001f  96d50004
[   31.057611]
[   31.059362] ---[ end trace 7868a781b307fb50 ]---
[   31.068983] Kernel panic - not syncing: Fatal exception
[   31.076144] Rebooting in 3 seconds..

I will try to compile an image with KALLSYMS enabled and see if I can
reproduce the issue. My firmware is based on the latest nightly.
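
Even without symbols, the "Code:" line in the dump above says something about the fault. The word in angle brackets, 90d205fd, is the faulting instruction. The sketch below decodes it as a MIPS I-type instruction and checks that base register plus offset gives the reported bad address. It assumes that $6 in the register dump (ea099ae0) is the base register value at the time of the fault. The helper name decode_itype is made up for illustration.

```python
# Minimal sketch: decode a MIPS32 I-type load and recompute the faulting address.
# decode_itype is a hypothetical helper, not part of any kernel tooling.

LOAD_MNEMONICS = {0x20: "lb", 0x21: "lh", 0x23: "lw", 0x24: "lbu", 0x25: "lhu"}


def decode_itype(word):
    """Split a 32-bit MIPS I-type instruction into its fields."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F          # base register number
    rt = (word >> 16) & 0x1F          # destination register number
    imm = word & 0xFFFF
    if imm & 0x8000:                  # sign-extend the 16-bit offset
        imm -= 0x10000
    return {
        "mnemonic": LOAD_MNEMONICS.get(opcode, "op%#x" % opcode),
        "rs": rs,
        "rt": rt,
        "imm": imm,
    }


insn = decode_itype(0x90D205FD)       # word shown in <...> on the Code: line
a2 = 0xEA099AE0                       # $6 from the register dump
effective = (a2 + insn["imm"]) & 0xFFFFFFFF

print("%s $%d, %#x($%d)" % (insn["mnemonic"], insn["rt"], insn["imm"], insn["rs"]))
print("effective address: %08x" % effective)
```

The recomputed address matches BadVA (ea09a0dd), so the fault is a byte load at offset 0x5fd from whatever pointer was in $6. It does not say which structure that pointer was meant to reference.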

BR,
Kristian

___
Lede-dev mailing list
Lede-dev@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/lede-dev


Re: [LEDE-DEV] [OpenWrt-Devel] Wifi-related kernel-oops on mt7621 after 4.14 update

2018-04-18 Thread Kristian Evensen
Hi,

On Tue, Apr 17, 2018 at 3:34 PM, Kristian Evensen
 wrote:
> Thanks, great. I just started building a new image for my router, will
> test and let you know if I still see the issue.

I think I have finished my testing, at least for now, and it seems the
problem is fixed. I compiled an image with the latest changes to mt76,
installed the image on one of my WG3526-routers showing the issue,
configured both radios as clients and updated the router ~10 times,
rebooted, etc. I did not see the crash, wifi was rock solid. I then
"updated" to the older image without the latest changes and the oops
appeared right away.

I will keep an eye on this router, just in case, but it seems the
problem is gone. Thanks for fixing it so fast!

BR,
Kristian



Re: [LEDE-DEV] [OpenWrt-Devel] Wifi-related kernel-oops on mt7621 after 4.14 update

2018-04-17 Thread Kristian Evensen
On Tue, Apr 17, 2018 at 2:56 PM, Felix Fietkau  wrote:
> On 2018-04-17 13:50, Kristian Evensen wrote:
>> This is with the same image as last time (commit
>> f6e6eadc99c6274207f8f2ebc739063549959a1f) and configuration (radios
>> used as clients). I see that mt76 has been updated during the weekend
>> so I will go ahead and compile a new image with the latest updates.
> I'm about to push another update in a minute. Please wait for that and
> test it. I fixed some more issues in the code.

Thanks, great. I just started building a new image for my router, will
test and let you know if I still see the issue.

BR,
Kristian



Re: [LEDE-DEV] [OpenWrt-Devel] Wifi-related kernel-oops on mt7621 after 4.14 update

2018-04-17 Thread Felix Fietkau
On 2018-04-17 13:50, Kristian Evensen wrote:
> This is with the same image as last time (commit
> f6e6eadc99c6274207f8f2ebc739063549959a1f) and configuration (radios
> used as clients). I see that mt76 has been updated during the weekend
> so I will go ahead and compile a new image with the latest updates.
I'm about to push another update in a minute. Please wait for that and
test it. I fixed some more issues in the code.

- Felix



Re: [LEDE-DEV] [OpenWrt-Devel] Wifi-related kernel-oops on mt7621 after 4.14 update

2018-04-17 Thread Kristian Evensen
Hi,

On Thu, Apr 12, 2018 at 3:28 PM, Kristian Evensen
 wrote:
> Thanks for the pointer. I compiled a new image with KALLSYMS, but now I am
> not able to reproduce the error. Perhaps there was something dirty in
> my build directory. I will keep the KALLSYMS image on the routers and
> keep checking for the error.

The error came back after I updated my router again. Here are the
oopses with KALLSYMS enabled:

[   36.714334] CPU 1 Unable to handle kernel paging request at virtual
address f32f0c10, epc == 8f391304, ra == 8f391304
[   36.724966] Oops[#1]:
[   36.727246] CPU: 1 PID: 33 Comm: kworker/u8:2 Tainted: GW
4.14.32 #0
[   36.734949] Workqueue: phy1 ieee80211_ibss_leave [mac80211]
[   36.740523] task: 8fd48000 task.stack: 8fd36000
[   36.745037] $ 0   :  0001 000e 0001
[   36.750270] $ 4   : 8f37957c   
[   36.755506] $ 8   :  80452970 0001 00122121
[   36.760726] $12   :   0010 77ec6230
[   36.765946] $16   : 8f37957c 8fd37d58 f32f0c10 94573690
[   36.771173] $20   : 0001 0040 00ff 8f378bc0
[   36.776394] $24   : 10d9 8f391218
[   36.781615] $28   : 8fd36000 8fd37d10  8f391304
[   36.786837] Hi: 969d
[   36.789701] Lo: 0110
[   36.792603] epc   : 8f391304 mt76_get_survey+0xec/0x31c [mt76]
[   36.798417] ra: 8f391304 mt76_get_survey+0xec/0x31c [mt76]
[   36.804220] Status: 11007c03 KERNEL EXL IE
[   36.808399] Cause : 4088 (ExcCode 02)
[   36.812389] BadVA : f32f0c10
[   36.815257] PrId  : 0001992f (MIPS 1004Kc)
[   36.819331] Modules linked in: rt2800pci rt2800mmio rt2800lib
qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib
rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp
nf_conntrack_ipv6p
[   36.890390]  nf_nat_snmp_basic nf_nat_sip nf_nat_redirect
nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4
nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4
nf_flow_tablt
[   36.961381]  ip_set_hash_netiface ip_set_hash_netport
ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet
ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip
ip_set_hash_ipport ip_set_hash_ipmarm
[   37.032821]  ohci_hcd ehci_platform sd_mod scsi_mod ehci_hcd
gpio_button_hotplug usbcore nls_base usb_common mii
[   37.043006] Process kworker/u8:2 (pid: 33, threadinfo=8fd36000,
task=8fd48000, tls=)
[   37.051404] Stack : 0001 80051a40 81494dc0  0400
8f57c000  
[   37.059753]  8ec110dc 002f113b  8fc2a500
81494dc0  0001
[   37.068100] 814a2dc0 8fc2a614 94573690  
  
[   37.076449]     
  
[   37.084797] 000c   8ec113a0 0020
8149d2ec 814a2dc0 8fc2a500
[   37.093147] ...
[   37.095597] Call Trace:
[   37.098045] [<8f391304>] mt76_get_survey+0xec/0x31c [mt76]
[   37.103671] [<8ec110dc>]
___ieee80211_start_rx_ba_session+0x15c/0x39c [mac80211]
[   37.27] [<8ec113a0>] __ieee80211_start_rx_ba_session+0x84/0xb8 [mac80211]
[   37.118315] [<8ec1144c>] ieee80211_process_addba_request+0x78/0x8c [mac80211]
[   37.125507] [<8ec152a0>] ieee80211_ibss_leave+0x44c/0x19c8 [mac80211]
[   37.132067] Code: 2610001c  0c116236  02002025 <8e44> 3c058d4f
34a5df3b  00850019  3012  3810
[   37.141817]
[   37.143582] ---[ end trace 5af5293c693da408 ]---
[   37.151753] Kernel panic - not syncing: Fatal exception in interrupt
[   37.160354] Rebooting in 3 seconds..

[   30.252516] CPU 0 Unable to handle kernel paging request at virtual
address eb44a0d5, epc == 8ed40ba4, ra == 8ec86fac
[   30.263189] Oops[#1]:
[   30.265506] CPU: 0 PID: 33 Comm: kworker/u8:2 Tainted: GW
4.14.32 #0
[   30.273244] Workqueue: phy1 ieee80211_ibss_leave [mac80211]
[   30.278811] task: 8fd48000 task.stack: 8fd36000
[   30.283321] $ 0   :  0001 7adc6e80 
[   30.288546] $ 4   : 8f3d8bc0 8fd27c20 eb449ae0 8e03a800
[   30.293766] $ 8   :  80452970 0007 0006edf8
[   30.298985] $12   :  8ee8d0c0 0007 1dcd6501
[   30.304205] $16   : 8f3d9560 8f5b8800 8fd27480 8f3d8bc0
[   30.309425] $20   : 8e03a800  8056 fffe
[   30.314645] $24   :  
[   30.319865] $28   : 8fd36000 8fd37cf8 8056 8ec86fac
[   30.325085] Hi: 329d
[   30.327951] Lo: 010e
[   30.330849] epc   : 8ed40ba4 mt76x2_dma_cleanup+0x478/0x1128 [mt76x2e]
[   30.337408] ra: 8ec86fac sta_set_sinfo+0xcc/0xbb0 [mac80211]
[   30.343386] Status: 11008403 KERNEL EXL IE
[   30.347565] Cause : c088 (ExcCode 02)
[   30.351553] BadVA : eb44a0d5
[   30.354421] PrId  : 0001992f (MIPS 1004Kc)
[   30.358494] Modules linked in: rt2800pci rt2800mmio rt2800lib
qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib
rndis_host qmi_wwan ppp_generic nf_nat_pptp 

Re: [LEDE-DEV] [OpenWrt-Devel] Wifi-related kernel-oops on mt7621 after 4.14 update

2018-04-12 Thread Fushan Wen
> [45729.251928] PC is at tcp_push+0x44/0xfc

This should be fixed in kernel 4.14.32. Try the latest snapshot.



Re: [LEDE-DEV] [OpenWrt-Devel] Wifi-related kernel-oops on mt7621 after 4.14 update

2018-04-12 Thread Kristian Evensen
Hi,

On Thu, Apr 12, 2018 at 1:02 PM, John Crispin  wrote:
> try enabling KALLSYMS to get a verbose stack trace.

Thanks for the pointer. I compiled a new image with KALLSYMS, but now I am
not able to reproduce the error. Perhaps there was something dirty in
my build directory. I will keep the KALLSYMS image on the routers and
keep checking for the error.
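
For readers wondering what KALLSYMS changes in the output: with a symbol table available, a raw address such as "0x8f3e060c" can be printed as symbol+offset. The sketch below shows that lookup as a binary search over sorted symbol start addresses. The start address of mt76_get_survey is derived from a symbolized frame reported elsewhere in this thread (epc 8f391304 printed as mt76_get_survey+0xec/0x31c). The neighbouring symbol names and the resolve helper are hypothetical.

```python
# Minimal sketch of symbol+offset resolution, similar in spirit to what the
# kernel does when KALLSYMS is enabled. Not the kernel's actual implementation.
import bisect

# (start address, name), sorted by start address.
SYMBOLS = [
    (0x8F391000, "hypothetical_prev_symbol"),   # made-up neighbour
    (0x8F391218, "mt76_get_survey"),            # 0x8f391304 - 0xec
    (0x8F391534, "hypothetical_next_symbol"),   # 0x8f391218 + 0x31c, made-up name
]


def resolve(addr, symbols=SYMBOLS):
    """Return 'name+0xoff' for addr, or the raw hex address if no symbol precedes it."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1   # last symbol starting at or below addr
    if i < 0:
        return "%#x" % addr
    start, name = symbols[i]
    return "%s+%#x" % (name, addr - start)


print(resolve(0x8F391304))
print(resolve(0x8F390000))
```

The first call yields the symbolized form. The second falls below every known symbol and stays a bare address, which is how frames look in an oops from an image built without KALLSYMS.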

BR,
Kristian



Re: [LEDE-DEV] [OpenWrt-Devel] Wifi-related kernel-oops on mt7621 after 4.14 update

2018-04-12 Thread John Crispin



On 12/04/18 12:42, Kristian Evensen wrote:

Hello,

I have recently updated some ramips mt7621-devices (ZBT WG3526) to the
latest nightly. Almost everything seems to work fine, but using either
wifi interface in client mode seems to trigger an oops. I see two
different oops-messages:

Message 1:
[   66.442802] CPU 1 Unable to handle kernel paging request at virtual
address e9e9e0d5, epc == 8f3e060c, ra == 8ec86fac
[   66.453460] Oops[#1]:
[   66.455743] CPU: 1 PID: 3679 Comm: wifib Tainted: GW   4.14.32 #0
[   66.462857] task: 8e223200 task.stack: 8e1b4000
[   66.467374] $ 0   :  0001 7abc2e80 0020
[   66.472612] $ 4   : 8ec48bc0 8e76dc20 e9e9dae0 8e1b5848
[   66.477847] $ 8   : 8ec4902c 80452968 00ee4000 ff80
[   66.483061] $12   : 80583f8c 0040  77f0f3c0
[   66.488276] $16   : 8ec49560 8f578000 8e76d480 8ec48bc0
[   66.493493] $20   :  0002 8e1b5cb8 0008
[   66.498711] $24   :  77e74ff0
[   66.503937] $28   : 8e1b4000 8e1b5780  8ec86fac
[   66.509153] Hi: 
[   66.512020] Lo: 0068
[   66.514913] epc   : 8f3e060c 0x8f3e060c
[   66.518866] ra: 8ec86fac sta_set_sinfo+0xcc/0xbb0 [mac80211]
[   66.524843] Status: 11007c03 KERNEL EXL IE
[   66.529015] Cause : 4088 (ExcCode 02)
[   66.533005] BadVA : e9e9e0d5
[   66.535869] PrId  : 0001992f (MIPS 1004Kc)
[   66.539941] Modules linked in: rt2800pci rt2800mmio rt2800lib
qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib
rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp
nf_conntrack_ipv6p
[   66.610889]  nf_nat_snmp_basic nf_nat_sip nf_nat_redirect
nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4
nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4
nf_flow_tablt
[   66.681822]  ip_set_hash_netiface ip_set_hash_netport
ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet
ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip
ip_set_hash_ipport ip_set_hash_ipmarm
[   66.753184]  ohci_hcd ehci_platform sd_mod scsi_mod ehci_hcd
gpio_button_hotplug usbcore nls_base usb_common mii
[   66.763357] Process wifib (pid: 3679, threadinfo=8e1b4000,
task=8e223200, tls=77f10ec0)
[   66.771321] Stack :     
 8e1b5848 8f578000
[   66.779654] 8e76d480 8ec48bc0 8f578130 0002 8e1b5cb8
0008  8ec86fac
[   66.787987] 0100 8e134628 0007 8e1b5b98 8e134628
 8e1b5b90 8ec49014
[   66.796325] 8e76d000  fffe 0002 8e1b5cb8
8ec9e338 8ec315ac 
[   66.804661] 01d2 8058   
8e134628 8e068840 8ec1fb28
[   66.812996] ...
[   66.815446] Call Trace:
[   66.817894] [<8f3e060c>] 0x8f3e060c
[   66.821370] Code: 000630c0  02063021  94f40002 <90d205f5> 00e0b025
1682  3253  2414001f  96d50004
[   66.831098]
[   66.833187] ---[ end trace 8c8a003de3eabcd8 ]---
[   66.841897] Kernel panic - not syncing: Fatal exception
[   66.849317] Rebooting in 3 seconds..
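
When comparing several dumps like the one above, it helps to pull out the same few fields from each. The following is a minimal sketch that extracts the bad address, epc, ra and exception code from the wrapped text of Message 1. The ExcCode names are the standard MIPS32 Cause register encodings. The function name summarize_oops is an assumption for illustration.

```python
# Minimal sketch: extract key fields from a MIPS kernel oops as pasted in mail.
import re

# Standard MIPS32 Cause.ExcCode values (subset).
EXC_NAMES = {
    0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS", 4: "AdEL", 5: "AdES",
    6: "IBE", 7: "DBE", 8: "Sys", 9: "Bp", 10: "RI", 11: "CpU",
    12: "Ov", 13: "Tr",
}

HEADER_RE = re.compile(
    r"virtual\s+address\s+([0-9a-f]+),\s+epc\s+==\s+([0-9a-f]+),"
    r"\s+ra\s+==\s+([0-9a-f]+)"
)
EXC_RE = re.compile(r"ExcCode\s+([0-9a-f]+)")


def summarize_oops(text):
    """Return bad address, epc, ra and exception name found in an oops dump."""
    header = HEADER_RE.search(text)   # \s+ also matches the mail line wrap
    exc = EXC_RE.search(text)
    code = int(exc.group(1), 16)
    return {
        "badva": header.group(1),
        "epc": header.group(2),
        "ra": header.group(3),
        "exc": EXC_NAMES.get(code, "unknown(%d)" % code),
    }


sample = """[   66.442802] CPU 1 Unable to handle kernel paging request at virtual
address e9e9e0d5, epc == 8f3e060c, ra == 8ec86fac
[   66.529015] Cause : 4088 (ExcCode 02)
"""
print(summarize_oops(sample))
```

ExcCode 02 is TLBL, a TLB miss on a load or instruction fetch, which is consistent with a read through an invalid pointer.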

Message 2:
[  132.613293] CPU 0 Unable to handle kernel paging request at virtual
address ea9160d5, epc == 8f2c060c, ra == 8ec86fac
[  132.623927] Oops[#1]:
[  132.626199] CPU: 0 PID: 41 Comm: kworker/u8:3 Tainted: GW
 4.14.32 #0
[  132.633882] Workqueue: phy0 ieee80211_ibss_leave [mac80211]
[  132.639431] task: 8fd48c80 task.stack: 8fd94000
[  132.643933] $ 0   :  0001 7ac52e80 0020
[  132.649141] $ 4   : 8f2d0bc0 8e04dc20 ea915ae0 8f122400
[  132.654350] $ 8   :  80452970 8fc02b00 0005376b
[  132.659558] $12   : 12d8   001c
[  132.664766] $16   : 8f2d1560 8f58a000 8e04d480 8f2d0bc0
[  132.669973] $20   :  0001 8f2d1014 
[  132.675181] $24   : 3b9aca00 
[  132.680390] $28   : 8fd94000 8fd95c88 8ece1618 8ec86fac
[  132.685605] Hi: 07d0
[  132.688473] Lo: 0bb8
[  132.691357] epc   : 8f2c060c 0x8f2c060c
[  132.695235] ra: 8ec86fac sta_set_sinfo+0xcc/0xbb0 [mac80211]
[  132.701212] Status: 11008403 KERNEL EXL IE
[  132.705391] Cause : 4088 (ExcCode 02)
[  132.709380] BadVA : ea9160d5
[  132.712247] PrId  : 0001992f (MIPS 1004Kc)
[  132.716320] Modules linked in: rt2800pci rt2800mmio rt2800lib
qcserial ppp_async option usb_wwan rt2x00pci rt2x00mmio rt2x00lib
rndis_host qmi_wwan ppp_generic nf_nat_pptp nf_conntrack_pptp
nf_conntrack_ipv6p
[  132.787381]  nf_nat_snmp_basic nf_nat_sip nf_nat_redirect
nf_nat_proto_gre nf_nat_masquerade_ipv4 nf_nat_irc nf_conntrack_ipv4
nf_nat_ipv4 nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_log_ipv4
nf_flow_tablt
[  132.858369]  ip_set_hash_netiface ip_set_hash_netport
ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet
ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip
ip_set_hash_ipport ip_set_hash_ipmarm
[  132.929808]  ohci_hcd ehci_platform sd_mod scsi_mod ehci_hcd
gpio_button_hotplug usbcore nls_base usb_common mii
[  132.939989] Process