Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-15 Thread Nikolai Zhubr

15.07.2014 1:42, Jonas Gorski:
[...]

or
bw32(bp, B44_RXMAXLEN, bp->dev->mtu + ETH_HLEN + 8) ?


This is the right one; mtu (the payload) + ETH_HLEN (14 bytes) + 8
(4 bytes for vlan tag, probably 4 extra bytes for custom header
optionally used by broadcom switches)


Ok, tested this. Unfortunately it's still panicking under load (and
seemingly this change made no difference whatsoever):



[  271.21] [ cut here ]
[  271.22] WARNING: at net/core/dev.c:2194 
skb_warn_bad_offload+0xc0/0xe8()
[  271.22] b44: caps=(0x4000, 0x) 
len=377 data_len=0 gso_size=57048 gso_type=32506 ip_summed=0
[  271.24] Modules linked in: pppoe ppp_async iptable_nat b43legacy 
b43 pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 
ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport 
xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT 
xt_LOG xt_CT slhc nf_nat_irc nf_nat_ftp nf_nat nf_defrag_ipv4 
nf_conntrack_irc nf_conntrack_ftp iptable_raw iptable_mangle 
iptable_filter ipt_REJECT ip_tables crc_ccitt compat ip6t_REJECT 
ip6table_raw ip6table_mangle ip6table_filter ip6_tables x_tables 
nf_conntrack_ipv6 nf_conntrack nf_defrag_ipv6 ipv6 arc4 crypto_blkcipher 
leds_gpio gpio_button_hotplug tg3 hwmon bgmac b44 ptp pps_core

[  271.30] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.10.44 #2
[  271.30] Stack :     8030d552 
0036 818201d0 0008

  80272688 802bf23b 0003 8030cd00 818201d0 0008 802bb6e4 

  802bb6dc 8001c204 0003 80019bc4 80299520 0008 80273f28 
8182bc5c
         

         
8182bbe8
  ...
[  271.34] Call Trace:
[  271.34] [80010ca0] show_stack+0x48/0x70
[  271.35] [80019cc0] warn_slowpath_common+0x78/0xa8
[  271.35] [80019d1c] warn_slowpath_fmt+0x2c/0x38
[  271.36] [801b2d10] skb_warn_bad_offload+0xc0/0xe8
[  271.36] [801b68c4] __skb_gso_segment+0x50/0xec
[  271.37] [801de5bc] ip_forward_finish+0x108/0x1bc
[  271.37] [801b3da0] __netif_receive_skb_core+0x46c/0x52c
[  271.38] [81ad41d4] 0x81ad41d4
[  271.38]
[  271.38] ---[ end trace b4f0aa7175b12bf7 ]---
[  271.39] Unhandled kernel unaligned access[#1]:
[  271.39] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: GW 
3.10.44 #2

[  271.39] task: 81820028 ti: 8182a000 task.ti: 8182a000
[  271.39] $ 0   :  0001 81696a48 0028
[  271.39] $ 4   : 2d37d9ee  7088 
[  271.39] $ 8   : 002d 35373137 62323162 5d203766
[  271.39] $12   :  03bf  bc00
[  271.39] $16   : 80e7fec0 0001 0001 0014
[  271.39] $20   :  0008 802bb6e4 
[  271.39] $24   : 0003 80150bcc
[  271.39] $28   : 8182a000 8182bd28 802bb6dc 801ab22c
[  271.39] Hi: 
[  271.39] Lo: 0083
[  271.39] epc   : 80064440 put_page+0x0/0x4c
[  271.39] Tainted: GW
[  271.39] ra: 801ab22c skb_release_data+0xc4/0x118
[  271.39] Status: 1000b803 KERNEL EXL IE
[  271.39] Cause : 00800010
[  271.39] BadVA : 2d37d9ee
[  271.39] PrId  : 00029006 (Broadcom BMIPS3300)
[  271.39] Modules linked in: pppoe ppp_async iptable_nat b43legacy 
b43 pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 
ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport 
xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT 
xt_LOG xt_CT slhc nf_nat_irc nf_nat_ftp nf_nat nf_defrag_ipv4 
nf_conntrack_irc nf_conntrack_ftp iptable_raw iptable_mangle 
iptable_filter ipt_REJECT ip_tables crc_ccitt compat ip6t_REJECT 
ip6table_raw ip6table_mangle ip6table_filter ip6_tables x_tables 
nf_conntrack_ipv6 nf_conntrack nf_defrag_ipv6 ipv6 arc4 crypto_blkcipher 
leds_gpio gpio_button_hotplug tg3 hwmon bgmac b44 ptp pps_core
[  271.39] Process ksoftirqd/0 (pid: 3, threadinfo=8182a000, 
task=81820028, tls=)
[  271.39] Stack : 80e7fec0 801ab294 80e7fec0 7088  
80e7fec0 ffea 801ab2d0

  802bb6e4 80e7fec0 80e5da40 0001 80e7fec0 801de5d4 0850 
80f72ac0
  81b68000 801de4b4 0001 801aa3e4 802bca98 802bca98 802bb6d0 
81abc000
  80e7fec0 801b3da0 0042 81ad0964 81b7df20 801aa3e4 802bb6e4 
8018e658
  010a 01f1 81abc3e8 81abc3c0 0042 80e7fec0 0017 
0187
  ...
[  271.39] Call Trace:
[  271.39] [80064440] put_page+0x0/0x4c
[  271.39] [801ab22c] skb_release_data+0xc4/0x118
[  271.39] [801ab2d0] __kfree_skb+0x14/0xd4
[  271.39] [801de5d4] ip_forward_finish+0x120/0x1bc
[  271.39] [801b3da0] __netif_receive_skb_core+0x46c/0x52c
[  271.39] [81ad41d4] 0x81ad41d4
[  271.39]
[  271.39]
Code: 3c058006  080190c9  24a538e4 8c82 

Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-15 Thread Nikolai Zhubr

15.07.2014 12:04, Nikolai Zhubr:

15.07.2014 1:42, Jonas Gorski:
[...]

or
bw32(bp, B44_RXMAXLEN, bp->dev->mtu + ETH_HLEN + 8) ?


This is the right one; mtu (the payload) + ETH_HLEN (14 bytes) + 8
(4 bytes for vlan tag, probably 4 extra bytes for custom header
optionally used by broadcom switches)


Ok, tested this. Unfortunately it's still panicking under load (and
seemingly this change made no difference whatsoever):


And I've performed yet another experiment. If I insert an additional
router (also running OpenWrt, but Atheros-based) between this WL-500W and
the uplink (with the idea of filtering out any strange or bogus incoming
packets) and redo the same test, I get no panic but instead a silent
spontaneous reboot within a few minutes of reaching 30 Mbit of traffic. I'll
retest this more carefully later, and meanwhile I think:


1. Apparently some (bogus?) packets occasionally coming from the uplink still
confuse the b44 driver and cause panics regardless of my B44_RXMAXLEN
correction.


2. A silent reboot might indicate a hardware problem such as
overheating, although I have the case open and I touched the chips, and
they felt acceptably warm, I think. Another point is that CPU
performance limits the routing capability of this device (when using OpenWrt
at least) to somewhere around 33 Mbit, so running at close to continuous 100%
CPU usage might lead to a watchdog trigger? (Just a random
speculation)



Thank you.
Nikolai





Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-15 Thread Nikolai Zhubr

15.07.2014 23:26, Nikolai Zhubr:
[...]

And I've performed yet another experiment. If I insert an additional
router (running also openwrt but atheros-based) between this WL-500W and
uplink (with the idea to filter out any strange and bogus incoming
packets) and redo the same test, I get no panic but instead a silent
spontaneous reboot in a few minutes after reaching 30mbit traffic. I'll


Here is a slightly different panic, although it also involves
netif_receive_skb_core (and this is still with the additional OpenWrt router
inserted before the uplink):


[  900.72] CPU 0 Unable to handle kernel paging request at virtual 
address 0004, epc == 80119aa0, ra == 8011b2e8

[  900.72] Oops[#1]:
[  900.72] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.10.44 #2
[  900.72] task: 81820028 ti: 8182a000 task.ti: 8182a000
[  900.72] $ 0   :  10003800 80f29a48 
[  900.72] $ 4   : 802be1a0 802bdc1c  fffc
[  900.72] $ 8   : 0384 2b82ea80 00989680 
[  900.72] $12   : 0384 3c87  
[  900.72] $16   : 802be1a0 802bdc1c 802bdc40 7fff
[  900.72] $20   : 0384  2aea8de9 
[  900.72] $24   :  80016dc0
[  900.72] $28   : 8182a000 8182bb50 0384 8011b2e8
[  900.72] Hi: 
[  900.72] Lo: 3c87
[  900.72] epc   : 80119aa0 rb_insert_color+0x2c/0x14c
[  900.72] Not tainted
[  900.72] ra: 8011b2e8 timerqueue_add+0xc0/0x118
[  900.72] Status: 10003802 KERNEL EXL
[  900.72] Cause : 0088
[  900.72] BadVA : 0004
[  900.72] PrId  : 00029006 (Broadcom BMIPS3300)
[  900.72] Modules linked in: pppoe ppp_async iptable_nat b43legacy 
b43 pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 
ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport 
xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT 
xt_LOG xt_CT slhc nf_nat_irc nf_nat_ftp nf_nat nf_defrag_ipv4 
nf_conntrack_irc nf_conntrack_ftp iptable_raw iptable_mangle 
iptable_filter ipt_REJECT ip_tables crc_ccitt compat ip6t_REJECT 
ip6table_raw ip6table_mangle ip6table_filter ip6_tables x_tables 
nf_conntrack_ipv6 nf_conntrack nf_defrag_ipv6 ipv6 arc4 crypto_blkcipher 
leds_gpio gpio_button_hotplug tg3 hwmon bgmac b44 ptp pps_core
[  900.72] Process ksoftirqd/0 (pid: 3, threadinfo=8182a000, 
task=81820028, tls=)
[  900.72] Stack : 802bdc40 7fff 0384  802be1a0 
802bdc10 802bdc40 8003a144

  802be1a0 802c  802c 802c 802bdbe0 2aea8de9 
8003a9c8
   8182bc08 80c52220 80eb93a0 0001 0001 2aea8de9 
0384
  0003 2aea8de9 0384 80f122e4  0007 802c2870 

   80a169b5 802c 802765f4 80276608 80012d00 0001 
00014600
  ...
[  900.72] Call Trace:
[  900.72] [80119aa0] rb_insert_color+0x2c/0x14c
[  900.72] [8011b2e8] timerqueue_add+0xc0/0x118
[  900.72] [8003a144] __run_hrtimer.isra.26+0x7c/0xf8
[  900.72] [8003a9c8] hrtimer_interrupt+0x14c/0x3f4
[  900.72] [80012d00] c0_compare_interrupt+0x74/0xa0
[  900.72] [8005335c] handle_irq_event_percpu+0x64/0x1ec
[  900.72] [80055e60] handle_percpu_irq+0x54/0x84
[  900.72] [80052ce0] generic_handle_irq+0x28/0x44
[  900.72] [8000e24c] do_IRQ+0x1c/0x2c
[  900.72] [8000a3ec] plat_irq_dispatch+0x40/0xb8
[  900.72] [80001448] ret_from_irq+0x0/0x4
[  900.72] [80005590] __copy_user_common+0x248/0x2d8
[  900.72] [801a8830] skb_copy_ubufs+0xec/0x204
[  900.72] [801b3db0] __netif_receive_skb_core+0x47c/0x52c
[  900.72] [81ad41d4] 0x81ad41d4
[  900.72]
[  900.72]
Code: 30660001  14c00047   8c660004 10460016   
10c5    8cc8

[  900.72] ---[ end trace de6e4d131b0441ac ]---
[  900.72] Kernel panic - not syncing: Fatal exception in interrupt




retest this more carefully later, and meanwhile I think:

1. Apparently some (bogus?) packets occasionally coming from the uplink still
confuse the b44 driver and cause panics regardless of my B44_RXMAXLEN
correction.

2. Silent reboot might probably indicate hardware problem like
overheating. Although I have its case open and I touched its chips,
well, they were acceptably warm I think. Another point is that CPU
performance limits routing capability of this device (when using openwrt
at least) somewhere around 33mbit, so getting close to continuous 100%
CPU usage might probably lead to watchdog trigger? (Just a random
speculation)


Thank you.
Nikolai





Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-14 Thread Rafał Miłecki
On 21 June 2014 18:36, Nikolai Zhubr n-a-zh...@yandex.ru wrote:
 [  637.43] [ cut here ]
 [  637.44] WARNING: at net/core/dev.c:2194
 skb_warn_bad_offload+0xc0/0xe8()
 [  637.45] b44: caps=(0x4000, 0x) len=1500
 data_len=0 gso_size=53118 gso_type=59551 ip_summed=0
 [  637.46] Modules linked in: pppoe ppp_async iptable_nat b43legacy b43
 pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE
 cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac
 xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_CT slhc
 nf_nat_irc nf_nat_ftp nf_nat nf_defrag_ipv4 nf_conntrack_irc
 nf_conntrack_ftp iptable_raw iptable_mangle iptable_filter ipt_REJECT
 ip_tables crc_ccitt compat ip6t_REJECT ip6table_raw ip6table_mangle
 ip6table_filter ip6_tables x_tables nf_conntrack_ipv6 nf_conntrack
 nf_defrag_ipv6 ipv6 arc4 crypto_blkcipher leds_gpio gpio_button_hotplug tg3
 hwmon bgmac b44 ptp pps_core
 [  637.52] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.10.36 #1
 [  637.52] Stack :     8030d552 0036
 818201d0 0008
   8026cfd0 802bb23b 0003 8030cd00 818201d0 0008 802b76e4
 
   802b76dc 8001c118 0003 80019ad8 80293ecc 0008 8026e870
 8182bc5c
         
 
         
 8182bbe8
   ...
 [  637.56] Call Trace:
 [  637.56] [80010bb4] show_stack+0x48/0x70
 [  637.57] [80019bd4] warn_slowpath_common+0x78/0xa8
 [  637.57] [80019c30] warn_slowpath_fmt+0x2c/0x38
 [  637.58] [801b27dc] skb_warn_bad_offload+0xc0/0xe8
 [  637.58] [801b6390] __skb_gso_segment+0x50/0xec
 [  637.59] [801de0dc] ip_forward_finish+0x108/0x1bc
 [  637.59] [801b386c] __netif_receive_skb_core+0x46c/0x52c
 [  637.60] [81acc16c] 0x81acc16c
 [  637.60]
 [  637.60] ---[ end trace 2c2a6a28d6589bcc ]---

Any idea anyone? Does above mean b44 provided a corrupted packet? Or
some wrong pointer?
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-14 Thread Nikolai Zhubr

14.07.2014 18:42, Rafał Miłecki:
[...]

[  637.56] Call Trace:
[  637.56] [80010bb4] show_stack+0x48/0x70
[  637.57] [80019bd4] warn_slowpath_common+0x78/0xa8
[  637.57] [80019c30] warn_slowpath_fmt+0x2c/0x38
[  637.58] [801b27dc] skb_warn_bad_offload+0xc0/0xe8
[  637.58] [801b6390] __skb_gso_segment+0x50/0xec
[  637.59] [801de0dc] ip_forward_finish+0x108/0x1bc
[  637.59] [801b386c] __netif_receive_skb_core+0x46c/0x52c
[  637.60] [81acc16c] 0x81acc16c
[  637.60]
[  637.60] ---[ end trace 2c2a6a28d6589bcc ]---


Any idea anyone? Does above mean b44 provided a corrupted packet? Or
some wrong pointer?


One more note: the problem apparently appeared sometime after 10.03.1.
Maybe I could try to bisect to the revision of interest, but doing it
blindly would probably require tons of time, unless someone who knows
what was happening to the driver at that time can offer some
guidance.



Thank you.
Nikolai





Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-14 Thread Felix Fietkau
On 2014-07-14 16:42, Rafał Miłecki wrote:
 On 21 June 2014 18:36, Nikolai Zhubr n-a-zh...@yandex.ru wrote:
 [  637.43] [ cut here ]
 [  637.44] WARNING: at net/core/dev.c:2194
 skb_warn_bad_offload+0xc0/0xe8()
 [  637.45] b44: caps=(0x4000, 0x) len=1500
 data_len=0 gso_size=53118 gso_type=59551 ip_summed=0
 [  637.46] Modules linked in: pppoe ppp_async iptable_nat b43legacy b43
 pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE
 cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac
 xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_CT slhc
 nf_nat_irc nf_nat_ftp nf_nat nf_defrag_ipv4 nf_conntrack_irc
 nf_conntrack_ftp iptable_raw iptable_mangle iptable_filter ipt_REJECT
 ip_tables crc_ccitt compat ip6t_REJECT ip6table_raw ip6table_mangle
 ip6table_filter ip6_tables x_tables nf_conntrack_ipv6 nf_conntrack
 nf_defrag_ipv6 ipv6 arc4 crypto_blkcipher leds_gpio gpio_button_hotplug tg3
 hwmon bgmac b44 ptp pps_core
 [  637.52] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.10.36 #1
 [  637.52] Stack :     8030d552 0036
 818201d0 0008
   8026cfd0 802bb23b 0003 8030cd00 818201d0 0008 802b76e4
 
   802b76dc 8001c118 0003 80019ad8 80293ecc 0008 8026e870
 8182bc5c
         
 
         
 8182bbe8
   ...
 [  637.56] Call Trace:
 [  637.56] [80010bb4] show_stack+0x48/0x70
 [  637.57] [80019bd4] warn_slowpath_common+0x78/0xa8
 [  637.57] [80019c30] warn_slowpath_fmt+0x2c/0x38
 [  637.58] [801b27dc] skb_warn_bad_offload+0xc0/0xe8
 [  637.58] [801b6390] __skb_gso_segment+0x50/0xec
 [  637.59] [801de0dc] ip_forward_finish+0x108/0x1bc
 [  637.59] [801b386c] __netif_receive_skb_core+0x46c/0x52c
 [  637.60] [81acc16c] 0x81acc16c
 [  637.60]
 [  637.60] ---[ end trace 2c2a6a28d6589bcc ]---
 
 Any idea anyone? Does above mean b44 provided a corrupted packet? Or
 some wrong pointer?
It looks to me like the hardware is overwriting the skb shared info (at
the end of the skb data buffer), possibly because the configured maximum
frame length may be too big for the buffer.

If I were to speculate wildly, I would guess that B44_RXMAXLEN refers to
the maximum frame length, not the maximum buffer length - and in the
code, it's being fed with the maximum buffer length.
This would allow the hardware to receive slightly oversized frames which
can corrupt the skb.
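
To make the suspected failure mode concrete, here is a minimal sketch of the arithmetic involved. It is an illustration only, not code from the b44 driver, and the function name and parameters are hypothetical. The idea is that if the hardware is allowed to write more bytes than remain in the receive buffer, the excess lands in whatever follows the buffer, which for an skb is the shared info.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * buf_size:  total size of the receive data buffer in bytes
 * rx_offset: bytes at the start of the buffer used before the frame
 *            itself (e.g. a hardware receive header)
 * frame_len: largest frame the hardware is configured to accept
 *
 * Returns how many bytes would be written past the end of the buffer,
 * or 0 if the largest accepted frame still fits.
 */
static size_t rx_overrun_bytes(size_t buf_size, size_t rx_offset,
                               size_t frame_len)
{
    size_t needed = rx_offset + frame_len;

    return needed > buf_size ? needed - buf_size : 0;
}
```

With made-up numbers, `rx_overrun_bytes(100, 10, 80)` is 0 (the frame fits), while `rx_overrun_bytes(100, 10, 95)` is 5. Any nonzero result would mean the tail of the frame overwrites the memory right behind the data buffer.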

- Felix


Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-14 Thread Jonas Gorski
On Mon, Jul 14, 2014 at 6:23 PM, Felix Fietkau n...@openwrt.org wrote:
 On 2014-07-14 16:42, Rafał Miłecki wrote:
 On 21 June 2014 18:36, Nikolai Zhubr n-a-zh...@yandex.ru wrote:
 [  637.43] [ cut here ]
 [  637.44] WARNING: at net/core/dev.c:2194
 skb_warn_bad_offload+0xc0/0xe8()
 [  637.45] b44: caps=(0x4000, 0x) len=1500
 data_len=0 gso_size=53118 gso_type=59551 ip_summed=0
 [  637.46] Modules linked in: pppoe ppp_async iptable_nat b43legacy b43
 pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE
 cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac
 xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_CT slhc
 nf_nat_irc nf_nat_ftp nf_nat nf_defrag_ipv4 nf_conntrack_irc
 nf_conntrack_ftp iptable_raw iptable_mangle iptable_filter ipt_REJECT
 ip_tables crc_ccitt compat ip6t_REJECT ip6table_raw ip6table_mangle
 ip6table_filter ip6_tables x_tables nf_conntrack_ipv6 nf_conntrack
 nf_defrag_ipv6 ipv6 arc4 crypto_blkcipher leds_gpio gpio_button_hotplug tg3
 hwmon bgmac b44 ptp pps_core
 [  637.52] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.10.36 #1
 [  637.52] Stack :     8030d552 0036
 818201d0 0008
   8026cfd0 802bb23b 0003 8030cd00 818201d0 0008 802b76e4
 
   802b76dc 8001c118 0003 80019ad8 80293ecc 0008 8026e870
 8182bc5c
         
 
         
 8182bbe8
   ...
 [  637.56] Call Trace:
 [  637.56] [80010bb4] show_stack+0x48/0x70
 [  637.57] [80019bd4] warn_slowpath_common+0x78/0xa8
 [  637.57] [80019c30] warn_slowpath_fmt+0x2c/0x38
 [  637.58] [801b27dc] skb_warn_bad_offload+0xc0/0xe8
 [  637.58] [801b6390] __skb_gso_segment+0x50/0xec
 [  637.59] [801de0dc] ip_forward_finish+0x108/0x1bc
 [  637.59] [801b386c] __netif_receive_skb_core+0x46c/0x52c
 [  637.60] [81acc16c] 0x81acc16c
 [  637.60]
 [  637.60] ---[ end trace 2c2a6a28d6589bcc ]---

 Any idea anyone? Does above mean b44 provided a corrupted packet? Or
 some wrong pointer?
 It looks to me like the hardware is overwriting the skb shared info (at
 the end of the skb data buffer), possibly because the configured maximum
 frame length may be too big for the buffer.

 If I were to speculate wildly, I would guess that B44_RXMAXLEN refers to
 the maximum frame length, not the maximum buffer length - and in the
 code, it's being fed with the maximum buffer length.
 This would allow the hardware to receive slightly oversized frames which
 can corrupt the skb.

Since there is a public datasheet[1], this is easily verifiable, and
it looks like you are right:

Receive Maximum Length Register (RcvLength, Offset 0x404):

The value stored in this register specifies the largest valid Ethernet
Frame to be received.

The same is true for the XmtMaxLength register, which is also set too
large (it defaults to 1518).


Jonas

[1]: https://www.broadcom.com/collateral/pg/440X-PG02-R.pdf


Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-14 Thread Nikolai Zhubr

14.07.2014 20:44, Jonas Gorski:
[...]

If I were to speculate wildly, I would guess that B44_RXMAXLEN refers to
the maximum frame length, not the maximum buffer length - and in the
code, it's being fed with the maximum buffer length.
This would allow the hardware to receive slightly oversized frames which
can corrupt the skb.


Since there is a public datasheet[1], this is easily verifiable, and
it looks like you are right:

Receive Maximum Length Register (RcvLength, Offset 0x404):

The value stored in this register specifies the largest valid Ethernet
Frame to be received.


Ok, so I'd suppose
bw32(bp, B44_RXMAXLEN, bp->dev->mtu + ETH_HLEN + 8 + RX_HEADER_LEN)
should instead be
bw32(bp, B44_RXMAXLEN, bp->dev->mtu + ETH_HLEN) ?
or
bw32(bp, B44_RXMAXLEN, bp->dev->mtu + ETH_HLEN + 8) ?
or maybe even
bw32(bp, B44_RXMAXLEN, bp->dev->mtu) ?

Apologies for my ignorance; I just can't wait to test it right away, to
hopefully get it right for BB.



Thank you.
Nikolai



The same is true for the XmtMaxLength register, which is also set too
large (it defaults to 1518).


Jonas

[1]: https://www.broadcom.com/collateral/pg/440X-PG02-R.pdf



Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-14 Thread Rafał Miłecki
On 14 July 2014 18:44, Jonas Gorski j...@openwrt.org wrote:
 On Mon, Jul 14, 2014 at 6:23 PM, Felix Fietkau n...@openwrt.org wrote:
 It looks to me like the hardware is overwriting the skb shared info (at
 the end of the skb data buffer), possibly because the configured maximum
 frame length may be too big for the buffer.

 If I were to speculate wildly, I would guess that B44_RXMAXLEN refers to
 the maximum frame length, not the maximum buffer length - and in the
 code, it's being fed with the maximum buffer length.
 This would allow the hardware to receive slightly oversized frames which
 can corrupt the skb.

 Since there is a public datasheet[1], this is easily verifiable, and
 it looks like you are right:

 Receive Maximum Length Register (RcvLength, Offset 0x404):

 The value stored in this register specifies the largest valid Ethernet
 Frame to be received.

 The same is true for the XmtMaxLength register, which is also set too
 large (it defaults to 1518).

I wonder what's the point of that register if we set length per-skb
anyway (b44_alloc_rx_skb):
ctrl = (DESC_CTRL_LEN & RX_PKT_BUF_SZ);


Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-07-14 Thread Jonas Gorski
On Mon, Jul 14, 2014 at 11:48 PM, Nikolai Zhubr n-a-zh...@yandex.ru wrote:
 14.07.2014 20:44, Jonas Gorski:
 [...]

 If I were to speculate wildly, I would guess that B44_RXMAXLEN refers to
 the maximum frame length, not the maximum buffer length - and in the
 code, it's being fed with the maximum buffer length.
 This would allow the hardware to receive slightly oversized frames which
 can corrupt the skb.


 Since there is a public datasheet[1], this is easily verifiable, and
 it looks like you are right:

 Receive Maximum Length Register (RcvLength, Offset 0x404):

 The value stored in this register specifies the largest valid Ethernet
 Frame to be received.


 Ok, so I'd suppose
 bw32(bp, B44_RXMAXLEN, bp->dev->mtu + ETH_HLEN + 8 + RX_HEADER_LEN)
 should instead be
 bw32(bp, B44_RXMAXLEN, bp->dev->mtu + ETH_HLEN) ?
 or
 bw32(bp, B44_RXMAXLEN, bp->dev->mtu + ETH_HLEN + 8) ?

This is the right one; mtu (the payload) + ETH_HLEN (14 bytes) + 8
(4 bytes for vlan tag, probably 4 extra bytes for custom header
optionally used by broadcom switches)

 or maybe even
 bw32(bp, B44_RXMAXLEN, bp->dev->mtu) ?
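
The arithmetic behind the candidates above can be sketched as follows. This is a standalone illustration, not driver code: the helper names and the `VLAN_TAG_LEN` / `BCM_TAG_LEN` macros are hypothetical, `ETH_HLEN` is redefined locally as 14 to match the figure given above, and the value 28 for `RX_HEADER_LEN` is an assumption that should be checked against b44.h.

```c
#define ETH_HLEN       14  /* Ethernet header length, as stated in the thread */
#define VLAN_TAG_LEN    4  /* hypothetical name: 802.1Q VLAN tag */
#define BCM_TAG_LEN     4  /* hypothetical name: possible Broadcom switch header */
#define RX_HEADER_LEN  28  /* ASSUMPTION: value not given in the thread */

/* Value currently written to B44_RXMAXLEN (includes the RX header). */
static unsigned int rx_max_len_old(unsigned int mtu)
{
    return mtu + ETH_HLEN + 8 + RX_HEADER_LEN;
}

/* Proposed value: payload + Ethernet header + 8 bytes of tag headroom. */
static unsigned int rx_max_len_new(unsigned int mtu)
{
    return mtu + ETH_HLEN + VLAN_TAG_LEN + BCM_TAG_LEN;
}
```

For an MTU of 1500, the proposed formula gives 1522 bytes. Under the assumed `RX_HEADER_LEN`, the old formula gives 1550, so the hardware would have been told to accept frames `RX_HEADER_LEN` bytes longer than intended.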

 Apologies for my ignorance; I just can't wait to test it right away, to
 hopefully get it right for BB.


 Thank you.
 Nikolai

Thanks for testing!


Jonas


Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-06-21 Thread Nikolai Zhubr

21.06.2014 0:23, Rafał Miłecki:
[...]
This time the uplink load was no more than 20 Mbit.
Here is what I got (although it doesn't look very promising to me).
Maybe I should enable some more debugging somewhere?

[  543.432000] Unhandled kernel unaligned access[#1]:
[  543.432000] Cpu 0
[  543.432000] $ 0   :  1000b800 2280a89f 
[  543.432000] $ 4   : 8032e4b0 804e86b6  8033
[  543.432000] $ 8   : 804e86b6  2400 
[  543.432000] $12   : 03bf ac00  0c00
[  543.432000] $16   : 8033 8179ebe0 1000b801 0100
[  543.432000] $20   : 0005 0024 8039 8039
[  543.432000] $24   : 0004 
[  543.432000] $28   : 81824000 81825e18 8031e490 801e3e90
[  543.432000] Hi: fea0
[  543.432000] Lo: 0160
[  543.432000] epc   : 801e3e9c __dst_free+0x2c/0x150
[  543.432000] Tainted: G   O
[  543.432000] ra: 801e3e90 __dst_free+0x20/0x150
[  543.432000] Status: 1000b803KERNEL EXL IE
[  543.432000] Cause : 00800010
[  543.432000] BadVA : 2280a99b
[  543.432000] PrId  : 00029006 (Broadcom BMIPS3300)
[  543.432000] Modules linked in: nf_nat_irc nf_conntrack_irc nf_nat_ftp 
nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat pppoe xt_conntrack 
xt_CT xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 
nf_conntrack pppox ipt_REJECT xt_TCPMSS ipt_LOG xt_comment xt_multiport 
xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp 
x_tables ppp_async ppp_generic slhc b43legacy(O) b43(O) mac80211(O) 
crc_ccitt cfg80211(O) compat(O) arc4 aes_generic crypto_algapi 
switch_robo(O) switch_core(O) diag(O)
[  543.432000] Process ksoftirqd/0 (pid: 3, threadinfo=81824000, 
task=81822060, tls=)
[  543.432000] Stack : 8031e490 80063ab8 1000b802  945ef2d5 
945ef2d5 8179ebe0 80063b88
[  543.432000] 0009 80063bc4 0005 000c 8038f088 
0001 0009 80020200
[  543.432000]  8005426c    
 8039 8039
[  543.432000]  0001    
  8002037c
[  543.432000] 81819e24  800202e4   
81819e24  800202e4

[  543.432000] ...
[  543.432000] Call Trace:
[  543.432000] [801e3e9c] __dst_free+0x2c/0x150
[  543.432000] [80063b88] __rcu_process_callbacks+0x118/0x140
[  543.432000] [80020200] __do_softirq+0xd0/0x1b4
[  543.432000] [8002037c] run_ksoftirqd+0x98/0x154
[  543.432000] [80037678] kthread+0x88/0x90
[  543.432000] [80007cc0] kernel_thread_helper+0x10/0x18
[  543.432000]
[  543.432000]
[  543.432000] Code: 8e22000c  5046  3c02801e 8c4200fc 30420001 
1446  24020002  3c02801e  244240cc

[  543.664000] ---[ end trace f941e3bc313ba83f ]---
[  543.668000] Kernel panic - not syncing: Fatal exception in interrupt





I suppose this is something that should not normally happen, and I'd like to
have it fixed somehow. I haven't tried trunk yet, but I will, if it could
make some difference.


There are tons of updates in trunk, this bug may have been fixed a long
time ago already ;)



Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-06-21 Thread Nikolai Zhubr

21.06.2014 0:23, Rafał Miłecki:

On 20 June 2014 22:12, Nikolai Zhubr n-a-zh...@yandex.ru wrote:
There are tons of updates in trunk, this bug may have been fixed a long
time ago already ;)


Unfortunately, it seems not :/
Also, with the trunk version, the routing speed limit seems to be noticeably
lower (~27 Mbit on trunk compared to ~34 Mbit on 12.09)

Serial log follows (Unaligned access and something)
Please let me know what else I can do to find and fix it :)

Thank you,
Nikolai
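 -----------------------------------------------------
 BARRIER BREAKER (Bleeding Edge, r41293)
 -----------------------------------------------------
  * 1/2 oz Galliano         Pour all ingredients into
  * 4 oz cold Coffee        an irish coffee mug filled
  * 1 1/2 oz Dark Rum       with crushed ice. Stir.
  * 2 tsp. Creme de Cacao
 -----------------------------------------------------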
root@OpenWrt:/# exit
Please press Enter to activate this console.
[  637.43] [ cut here ]
[  637.44] WARNING: at net/core/dev.c:2194 
skb_warn_bad_offload+0xc0/0xe8()
[  637.45] b44: caps=(0x4000, 0x) 
len=1500 data_len=0 gso_size=53118 gso_type=59551 ip_summed=0
[  637.46] Modules linked in: pppoe ppp_async iptable_nat b43legacy 
b43 pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 
ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport 
xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT 
xt_LOG xt_CT slhc nf_nat_irc nf_nat_ftp nf_nat nf_defrag_ipv4 
nf_conntrack_irc nf_conntrack_ftp iptable_raw iptable_mangle 
iptable_filter ipt_REJECT ip_tables crc_ccitt compat ip6t_REJECT 
ip6table_raw ip6table_mangle ip6table_filter ip6_tables x_tables 
nf_conntrack_ipv6 nf_conntrack nf_defrag_ipv6 ipv6 arc4 crypto_blkcipher 
leds_gpio gpio_button_hotplug tg3 hwmon bgmac b44 ptp pps_core

[  637.52] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.10.36 #1
[  637.52] Stack :     8030d552 
0036 818201d0 0008

  8026cfd0 802bb23b 0003 8030cd00 818201d0 0008 802b76e4 

  802b76dc 8001c118 0003 80019ad8 80293ecc 0008 8026e870 
8182bc5c
         

         
8182bbe8
  ...
[  637.56] Call Trace:
[  637.56] [80010bb4] show_stack+0x48/0x70
[  637.57] [80019bd4] warn_slowpath_common+0x78/0xa8
[  637.57] [80019c30] warn_slowpath_fmt+0x2c/0x38
[  637.58] [801b27dc] skb_warn_bad_offload+0xc0/0xe8
[  637.58] [801b6390] __skb_gso_segment+0x50/0xec
[  637.59] [801de0dc] ip_forward_finish+0x108/0x1bc
[  637.59] [801b386c] __netif_receive_skb_core+0x46c/0x52c
[  637.60] [81acc16c] 0x81acc16c
[  637.60]
[  637.60] ---[ end trace 2c2a6a28d6589bcc ]---
[  637.61] Unhandled kernel unaligned access[#1]:
[  637.61] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: GW 3.10.36 #1

[  637.61] task: 81820028 ti: 8182a000 task.ti: 8182a000
[  637.61] $ 0   :  0001 80d72b68 0028
[  637.61] $ 4   : c36ae951  7088 
[  637.61] $ 8   : 002d 36643832 62393835 5d206363
[  637.61] $12   :  03bf  bc00
[  637.61] $16   : 80ea6ce0 0001 0001 0014
[  637.61] $20   :  0008 802b76e4 
[  637.61] $24   : 0003 801507e8
[  637.61] $28   : 8182a000 8182bd28 802b76dc 801aadc0
[  637.61] Hi: 
[  637.61] Lo: 0083
[  637.61] epc   : 80064208 put_page+0x0/0x4c
[  637.61] Tainted: GW
[  637.61] ra: 801aadc0 skb_release_data+0xc4/0x118
[  637.61] Status: 1000b803 KERNEL EXL IE
[  637.61] Cause : 00800010
[  637.61] BadVA : c36ae951
[  637.61] PrId  : 00029006 (Broadcom BMIPS3300)
[  637.61] Modules linked in: pppoe ppp_async iptable_nat b43legacy 
b43 pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 
ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport 
xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT 
xt_LOG xt_CT slhc nf_nat_irc nf_nat_ftp nf_nat nf_defrag_ipv4 
nf_conntrack_irc nf_conntrack_ftp iptable_raw iptable_mangle 
iptable_filter ipt_REJECT ip_tables crc_ccitt compat ip6t_REJECT 
ip6table_raw ip6table_mangle ip6table_filter ip6_tables x_tables 
nf_conntrack_ipv6 nf_conntrack nf_defrag_ipv6 ipv6 arc4 crypto_blkcipher 
leds_gpio gpio_button_hotplug tg3 hwmon bgmac b44 ptp pps_core
[  637.61] Process ksoftirqd/0 (pid: 3, threadinfo=8182a000, 
task=81820028, tls=)
[  637.61] Stack : 80ea6ce0 801aae28 80ea6ce0 7088  
80ea6ce0 ffea 801aae64

  802b76e4 80ea6ce0 80d6f6c0 0001 80ea6ce0 801de0f4 0258 
8088de40
  81a9e000 801ddfd4 0001 80ea6ce0 802b8a98 802b8a98 802b76d0 
81ab5000
  80ea6ce0 801b386c 0183 81ac8964 

[OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-06-20 Thread Nikolai Zhubr

Hello people,

I have an Asus WL-500W router (http://wiki.openwrt.org/toh/asus/wl500w). It 
is also very similar to the WL-500gP.
A few months ago I updated to 12.09. I can't recall now whether it was 
Backfire or Kamikaze before, but I noticed 2 things immediately:


1. The maximum practically achievable download speed increased somewhat
(from approx. 30 Mbit/s to approx. 34 Mbit/s).

2. After reaching (and sustaining) this maximum download speed, the device 
always reboots soon afterwards (absolutely reproducible).


For some time I thought it was just a hardware problem, like bad electrolytic 
capacitors and/or a weak power supply. Finally I opened the router, 
replaced all 3 capacitors (2 of the 3 did appear somewhat damaged), and 
attached a voltmeter to check for undervoltage. Still nothing: the power 
supply is OK, and the reboots still happen.


So I had to turn to the software side, and found 2 new things again:

1. As the uplink load rises towards 34 Mbit/s, softirqs eat up more and 
more CPU, approaching 100%.


2. At some point I get (on a serial link):
[  368.948000] sched: RT throttling activated
[  382.688000] Unhandled kernel unaligned access[#1]:
trim
[  382.932000] Kernel panic - not syncing: Fatal exception in interrupt
[  382.94] Rebooting in 3 seconds..
(Trimmed all in between because there is no debugging info for now)
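
For anyone wanting to put a number on the softirq observation, here is a minimal sketch. It assumes two samples of the aggregate "cpu" line from /proc/stat were captured a few seconds apart (for example over the serial console) and are processed on another machine. The function name is hypothetical; the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the one documented for /proc/stat.

```python
def softirq_share(before: str, after: str) -> float:
    """Fraction of CPU time spent in softirq between two samples of the
    aggregate "cpu" line from /proc/stat."""

    def fields(line: str) -> list:
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        values = [int(x) for x in parts[1:]]
        if len(values) < 7:
            raise ValueError("line has no softirq field")
        # Keep user..steal only; guest time is already counted in user.
        return values[:8]

    deltas = [b - a for a, b in zip(fields(before), fields(after))]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return deltas[6] / total  # index 6 is the softirq column


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    first = "cpu 100 0 50 800 0 10 40 0 0 0"
    second = "cpu 110 0 60 810 0 10 110 0 0 0"
    print(f"softirq share: {softirq_share(first, second):.0%}")
```

A share close to 100% while the link is saturated would match what is described in item 1.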

I suppose this is something that should not normally happen, and I'd 
like to have it fixed somehow. I haven't tried trunk yet, but I will, if 
it could make some difference. I can provide serial logs, compile trunk, 
apply patches, redo testing etc. (As time permits)

Any hints appreciated.

Thank you,
Nikolai
___
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/cgi-bin/mailman/listinfo/openwrt-devel


Re: [OpenWrt-Devel] AA on brcm47xx: Unhandled kernel unaligned access

2014-06-20 Thread Rafał Miłecki
On 20 June 2014 22:12, Nikolai Zhubr n-a-zh...@yandex.ru wrote:
> 2. At some point I get (on a serial link):
> [  368.948000] sched: RT throttling activated
> [  382.688000] Unhandled kernel unaligned access[#1]:
> trim
> [  382.932000] Kernel panic - not syncing: Fatal exception in interrupt
> [  382.94] Rebooting in 3 seconds..
> (Trimmed all in between because there is no debugging info for now)

Debugging info would be really helpful.


> I suppose this is something that should not normally happen, and I'd like to
> have it fixed somehow. I haven't tried trunk yet, but I will, if it could
> make some difference.

There are tons of updates in trunk; this bug may well have been fixed a
long time ago ;)