Re: Race with ip=dhcp bootparameter in ip_rcv_finish on am335x
On 09/23/2015 12:29 PM, Alexander Aring wrote: > Hi, > > I am using am335x (beaglebone white) with "ip=dhcp" inside my > bootparameters. It seems there exists some race, rarely it boots fine > and often it crashed with following messages: > > First (Without successful dhcp ip assign and is very rarely): > > [6.249083] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready > [6.398943] . > [7.400822] Unable to handle kernel paging request at virtual address > 7402a8c8 > [7.408419] pgd = c0004000 > [7.411351] [7402a8c8] *pgd= > [7.415121] Internal error: Oops: 5 [#1] SMP ARM > [7.419968] Modules linked in: > [7.423197] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW > 4.2.0-11195-ge81be90 #124 > [7.432118] Hardware name: Generic AM33XX (Flattened Device Tree) > [7.438509] task: c0a996c8 ti: c0a92000 task.ti: c0a92000 > [7.444189] PC is at ip_route_input_noref+0x620/0xeb4 > [7.449491] LR is at trace_hardirqs_on_caller+0xb0/0x1e8 <<>> > [7.759169] [] (ip_route_input_noref) from [] > (ip_rcv_finish+0xd4/0x67c) <<>> > > > > I tried to bisect this issue without success, sometimes it's booting fine > and sometimes I got the above messages. Could anybody help me here to > fixing this issue? > > - Alex Hi Alex, This should be fixed by commit bde6f9ded1bd ("net: Initialize table in fib result") in net-next. Cheers, Nik -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Race with ip=dhcp bootparameter in ip_rcv_finish on am335x
Hi, I am using am335x (beaglebone white) with "ip=dhcp" inside my bootparameters. It seems there exists some race, rarely it boots fine and often it crashed with following messages: First (Without successful dhcp ip assign and is very rarely): [6.249083] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [6.398943] . [7.400822] Unable to handle kernel paging request at virtual address 7402a8c8 [7.408419] pgd = c0004000 [7.411351] [7402a8c8] *pgd= [7.415121] Internal error: Oops: 5 [#1] SMP ARM [7.419968] Modules linked in: [7.423197] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 4.2.0-11195-ge81be90 #124 [7.432118] Hardware name: Generic AM33XX (Flattened Device Tree) [7.438509] task: c0a996c8 ti: c0a92000 task.ti: c0a92000 [7.444189] PC is at ip_route_input_noref+0x620/0xeb4 [7.449491] LR is at trace_hardirqs_on_caller+0xb0/0x1e8 [7.455055] pc : []lr : []psr: 2113 [7.455055] sp : c0a93c70 ip : 000f fp : ff8f [7.467072] r10: c0b35c80 r9 : r8 : ce5f9400 [7.472543] r7 : ce5ef000 r6 : r5 : ce615640 r4 : ce5ffd40 [7.479378] r3 : 7402a8c0 r2 : 0001 r1 : c0549f74 r0 : ce5ffd40 [7.486216] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [7.493686] Control: 10c5387d Table: 80004019 DAC: 0051 [7.499699] Process swapper/0 (pid: 0, stack limit = 0xc0a92218) [7.505989] Stack: (0xc0a93c70 to 0xc0a94000) [7.510561] 3c60: c054afa4 [7.519131] 3c80: c0b56080 00070044 0004 [7.527705] 3ca0: 7402a8c0 012d 0004 c0586298 [7.536278] 3cc0: 0102a8c0 7402a8c0 cfba6c34 c0a996c8 ce615640 [7.544852] 3ce0: c0551310 ce61b850 c07c6b10 c0b560b8 c0a95744 c0550668 [7.553422] 3d00: ce5ef000 c0a5ed00 6113 c0090d44 0001 ce615640 c04f85d0 0050 [7.561996] 3d20: ce61b800 ce5ef000 c0a95744 c0551310 0001 [7.570563] 3d40: c04f85d0 ce000140 ce638100 0050 ce61b800 cfba6b10 ce61b99a ce61b99a [7.579135] 3d60: c0a82730 c0a92000 ce61b850 c0a5ed94 0100 0102a8c0 c0a996c8 c0a9791c [7.587709] 3d80: ce5ef000 ce5ef048 0008 ce615640 ce5ef05c ce6156ec ce615640 c04f85d0 [7.596289] 3da0: c0ad3f0c c0a92000 ce5ef054 c04fb4e8 6113 ce615640 [7.604869] 3dc0: ce5ef05c c04fb4e8 c0b5697c ce6381c0 [7.613445] 3de0: 0020 ce615640 c0b55f7e ce615640 ce5ef5c0 0158 ce5f4550 8db6 [7.622027] 3e00: 0100 c04fb534 c04fb4e8 ce5ef000 4113 [7.630603] 3e20: ce64f900 c04ed184 ce64f942 ce5ef000 ce5ef000 ce5ef000 ce6381c0 c041fec4 [7.639172] 3e40: 0001 d0978000 ce5f2c50 0001 d0978000 ce5f2c70 0158 c041d35c [7.647750] 3e60: 0001 ce5f2c50 0040 cfb9f0c0 c0a93eb0 0040 c041d3d8 [7.656322] 3e80: ce5ef5e8 ce5ef5e8 0040 c0421f44 ce5ef5e8 0001 012c c04fcd3c [7.664892] 3ea0: 6193 0f10e000 cfb9f0c0 c0a910c0 c0a93eb0 c0a93eb0 c0a93eb8 c0a93eb8 [7.673466] 3ec0: c0b5697c c0a9408c 0008 0003 c0b565c4 0004 000c [7.682039] 3ee0: 0100 c0042614 c0a996c8 ce199900 000a 8db5 0020 0001 [7.690602] 3f00: 00aa c0a8e828 c0a95744 0001 ce017000 [7.699170] 3f20: c077471c c0042b3c c0a8e828 c009bb5c c0a93f50 c0010218 2113 [7.707733] 3f40: c0a93f84 c0b56e14 c076cef8 0001 0001 c0a996c8 [7.716309] 3f60: c0a92000 c0a94b2c c0a94acc c0b56e14 c077471c [7.724881] 3f80: c0a93fa0 c008e21c c0010218 2113 0051 c0010214 [7.733453] 3fa0: c0083cb4 c0a8c4e4 c0a8f8c8 c0b64000 c0a94a00 c0a19c1c [7.742025] 3fc0: c0a1967c c0a76a48 c0b64214 c0a94ab0 [7.750588] 3fe0: c0a76a44 c0a9b258 80004059 413fc082 8000807c [7.759169] [] (ip_route_input_noref) from [] (ip_rcv_finish+0xd4/0x67c) [7.768010] [] (ip_rcv_finish) from [] (ip_rcv+0x490/0x72c) [7.775672] [] (ip_rcv) from [] (__netif_receive_skb_core+0x240/0x97c) [7.784339] [] (__netif_receive_skb_core) from [] (netif_receive_skb_internal+0x5c/0x1d8) [7.794742] [] (netif_receive_skb_internal) from [] (cpsw_rx_handler+0xac/0x1e8) [7.804318] [] (cpsw_rx_handler) from [] (__cpdma_chan_process+0xe8/0x128) [7.813346] [] (__cpdma_chan_process) from [] (cpdma_chan_process+0x3c/0x5c) [7.822544] [] (cpdma_chan_process) from [] (cpsw_rx_poll+0x14/0x94) [7.831024] [] (cpsw_rx_poll) from [] (net_rx_action+0x1dc/0x33c) [7.839232] [] (net_rx_action) from [] (__do_softirq+0x138/0x340) [
Re: Race with ip=dhcp bootparameter in ip_rcv_finish on am335x
On Wed, 23 Sep 2015 14:27:15 +0200 Alexander Aringwrote: \ > Finally I figured out that commit > 4876cc779ff525b9c2376d8076edf47815e71f2c ("SUNRPC: Ensure we release the > TCP socket once it has been closed") occur this races. After reverting > this commit everything works fine. Yay! Everybody sing! "Fix one bug and replace it by another..." -- Steve -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Race with ip=dhcp bootparameter in ip_rcv_finish on am335x
+linux-nfs mailing list On Wed, Sep 23, 2015 at 8:27 AM, Alexander Aringwrote: > Hi, > > On Wed, Sep 23, 2015 at 01:57:57PM +0200, Alexander Aring wrote: > ... >> > >> >> Ok, I think I have two issues with two different races the first one was >> fixed by bde6f9ded1bd ("net: Initialize table in fib result"), but the >> second one is still there: >> >> [8.615806] [ cut here ] >> [8.620678] Kernel BUG at c016c3d0 [verbose debug info unavailable] >> [8.627229] Internal error: Oops - BUG: 0 [#1] SMP ARM >> [8.632611] Modules linked in: >> [8.635836] CPU: 0 PID: 766 Comm: kworker/0:1H Tainted: GW >> 4.2.0-11248-gfbd0351 #140 >> [8.645208] Hardware name: Generic AM33XX (Flattened Device Tree) >> [8.651616] Workqueue: rpciod xprt_autoclose >> [8.656091] task: ce3c52c0 ti: ce642000 task.ti: ce642000 >> [8.661744] PC is at iput+0x1a8/0x1f0 >> [8.665579] LR is at xprt_autoclose+0x2c/0x54 >> [8.670136] pc : []lr : []psr: 2113 >> [8.670136] sp : ce643e80 ip : fp : c0b56688 >> [8.682133] r10: 0001 r9 : ce643ec8 r8 : >> [8.687599] r7 : feff3000 r6 : ce615800 r5 : ce615bc0 r4 : ce615b54 >> [8.694421] r3 : 0060 r2 : 000f r1 : 0f10e000 r0 : cdbed720 >> [8.701254] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment >> none >> [8.708718] Control: 10c5387d Table: 80004019 DAC: 0051 >> [8.714732] Process kworker/0:1H (pid: 766, stack limit = 0xce642218) >> [8.721464] Stack: (0xce643e80 to 0xce644000) >> [8.726033] 3e80: c066f828 ce615b54 ce615bc0 ce615800 feff3000 >> ce643ec8 c066c884 >> [8.734596] 3ea0: ce615b54 ce5ff440 cfb9e340 c0057928 0001 >> c00578b4 cfb9e340 >> [8.743152] 3ec0: c0057cc8 c137972c c0cc1960 c09979f4 >> cfb9e340 cfb9e340 >> [8.751714] 3ee0: ce5ff458 cfb9e370 ce642000 0008 c0b55ba0 ce5ff440 >> cfb9e340 c0057c54 >> [8.760274] 3f00: ce659940 ce5ff440 c0057c18 ce659940 ce5ff440 >> c0057c18 >> [8.768834] 3f20: c005d918 c0b5697c >> ce5ff440 >> [8.777390] 3f40: dead4ead c0b65d60 >> >> [8.785951] 3f60: c0922088 ce643f64 ce643f64 dead4ead >> >> [8.794513] 3f80: c0b65d60 c0922088 ce643f90 ce643f90 >> ce643fac ce659940 >> [8.803069] 3fa0: c005d844 c000f770 >> >> [8.811628] 3fc0: >> >> [8.820185] 3fe0: 0013 >> 8fdf6861 8fdf6c61 >> [8.828741] [] (iput) from [] >> (xprt_autoclose+0x2c/0x54) >> [8.836133] [] (xprt_autoclose) from [] >> (process_one_work+0x19c/0x48c) >> [8.844784] [] (process_one_work) from [] >> (worker_thread+0x3c/0x4a0) >> [8.853256] [] (worker_thread) from [] >> (kthread+0xd4/0xf0) >> [8.860827] [] (kthread) from [] >> (ret_from_fork+0x14/0x24) >> [8.868387] Code: e59f0044 e59f1044 ebfb467a eac1 (e7f001f2) > > Additional missing information is that I am booting via nfsroot and > xprt_autoclose is something from sunrpc. > > Finally I figured out that commit > 4876cc779ff525b9c2376d8076edf47815e71f2c ("SUNRPC: Ensure we release the > TCP socket once it has been closed") occur this races. After reverting > this commit everything works fine. > > I added now: > > Steven Rostedt > Trond Myklebust > > to cc to report about this issue. > Is that happening when the transport is being torn down? If so, is it fixed by http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=79234c3db6842a3de03817211d891e0c2878f756 ? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Race with ip=dhcp bootparameter in ip_rcv_finish on am335x
Hi, On Wed, Sep 23, 2015 at 12:36:13PM +0200, Nikolay Aleksandrov wrote: > On 09/23/2015 12:29 PM, Alexander Aring wrote: > > Hi, > > > > I am using am335x (beaglebone white) with "ip=dhcp" inside my > > bootparameters. It seems there exists some race, rarely it boots fine > > and often it crashed with following messages: > > > > First (Without successful dhcp ip assign and is very rarely): > > > > [6.249083] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready > > [6.398943] . > > [7.400822] Unable to handle kernel paging request at virtual address > > 7402a8c8 > > [7.408419] pgd = c0004000 > > [7.411351] [7402a8c8] *pgd= > > [7.415121] Internal error: Oops: 5 [#1] SMP ARM > > [7.419968] Modules linked in: > > [7.423197] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW > > 4.2.0-11195-ge81be90 #124 > > [7.432118] Hardware name: Generic AM33XX (Flattened Device Tree) > > [7.438509] task: c0a996c8 ti: c0a92000 task.ti: c0a92000 > > [7.444189] PC is at ip_route_input_noref+0x620/0xeb4 > > [7.449491] LR is at trace_hardirqs_on_caller+0xb0/0x1e8 > <<>> > > [7.759169] [] (ip_route_input_noref) from [] > > (ip_rcv_finish+0xd4/0x67c) > <<>> > > > > > > > > I tried to bisect this issue without success, sometimes it's booting fine > > and sometimes I got the above messages. Could anybody help me here to > > fixing this issue? > > > > - Alex > > Hi Alex, > This should be fixed by commit bde6f9ded1bd > ("net: Initialize table in fib result") in net-next. > I tried to cherry-picked this commit and saw this is already in net-next. I also tried to boot current net, the race is still there. - Alex -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Race with ip=dhcp bootparameter in ip_rcv_finish on am335x
On Wed, Sep 23, 2015 at 12:36:13PM +0200, Nikolay Aleksandrov wrote: > On 09/23/2015 12:29 PM, Alexander Aring wrote: > > Hi, > > > > I am using am335x (beaglebone white) with "ip=dhcp" inside my > > bootparameters. It seems there exists some race, rarely it boots fine > > and often it crashed with following messages: > > > > First (Without successful dhcp ip assign and is very rarely): > > > > [6.249083] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready > > [6.398943] . > > [7.400822] Unable to handle kernel paging request at virtual address > > 7402a8c8 > > [7.408419] pgd = c0004000 > > [7.411351] [7402a8c8] *pgd= > > [7.415121] Internal error: Oops: 5 [#1] SMP ARM > > [7.419968] Modules linked in: > > [7.423197] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW > > 4.2.0-11195-ge81be90 #124 > > [7.432118] Hardware name: Generic AM33XX (Flattened Device Tree) > > [7.438509] task: c0a996c8 ti: c0a92000 task.ti: c0a92000 > > [7.444189] PC is at ip_route_input_noref+0x620/0xeb4 > > [7.449491] LR is at trace_hardirqs_on_caller+0xb0/0x1e8 > <<>> > > [7.759169] [] (ip_route_input_noref) from [] > > (ip_rcv_finish+0xd4/0x67c) > <<>> > > > > > > > > I tried to bisect this issue without success, sometimes it's booting fine > > and sometimes I got the above messages. Could anybody help me here to > > fixing this issue? > > > > - Alex > > Hi Alex, > This should be fixed by commit bde6f9ded1bd > ("net: Initialize table in fib result") in net-next. > Ok, I think I have two issues with two different races the first one was fixed by bde6f9ded1bd ("net: Initialize table in fib result"), but the second one is still there: [8.615806] [ cut here ] [8.620678] Kernel BUG at c016c3d0 [verbose debug info unavailable] [8.627229] Internal error: Oops - BUG: 0 [#1] SMP ARM [8.632611] Modules linked in: [8.635836] CPU: 0 PID: 766 Comm: kworker/0:1H Tainted: GW 4.2.0-11248-gfbd0351 #140 [8.645208] Hardware name: Generic AM33XX (Flattened Device Tree) [8.651616] Workqueue: rpciod xprt_autoclose [8.656091] task: ce3c52c0 ti: ce642000 task.ti: ce642000 [8.661744] PC is at iput+0x1a8/0x1f0 [8.665579] LR is at xprt_autoclose+0x2c/0x54 [8.670136] pc : []lr : []psr: 2113 [8.670136] sp : ce643e80 ip : fp : c0b56688 [8.682133] r10: 0001 r9 : ce643ec8 r8 : [8.687599] r7 : feff3000 r6 : ce615800 r5 : ce615bc0 r4 : ce615b54 [8.694421] r3 : 0060 r2 : 000f r1 : 0f10e000 r0 : cdbed720 [8.701254] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [8.708718] Control: 10c5387d Table: 80004019 DAC: 0051 [8.714732] Process kworker/0:1H (pid: 766, stack limit = 0xce642218) [8.721464] Stack: (0xce643e80 to 0xce644000) [8.726033] 3e80: c066f828 ce615b54 ce615bc0 ce615800 feff3000 ce643ec8 c066c884 [8.734596] 3ea0: ce615b54 ce5ff440 cfb9e340 c0057928 0001 c00578b4 cfb9e340 [8.743152] 3ec0: c0057cc8 c137972c c0cc1960 c09979f4 cfb9e340 cfb9e340 [8.751714] 3ee0: ce5ff458 cfb9e370 ce642000 0008 c0b55ba0 ce5ff440 cfb9e340 c0057c54 [8.760274] 3f00: ce659940 ce5ff440 c0057c18 ce659940 ce5ff440 c0057c18 [8.768834] 3f20: c005d918 c0b5697c ce5ff440 [8.777390] 3f40: dead4ead c0b65d60 [8.785951] 3f60: c0922088 ce643f64 ce643f64 dead4ead [8.794513] 3f80: c0b65d60 c0922088 ce643f90 ce643f90 ce643fac ce659940 [8.803069] 3fa0: c005d844 c000f770 [8.811628] 3fc0: [8.820185] 3fe0: 0013 8fdf6861 8fdf6c61 [8.828741] [] (iput) from [] (xprt_autoclose+0x2c/0x54) [8.836133] [] (xprt_autoclose) from [] (process_one_work+0x19c/0x48c) [8.844784] [] (process_one_work) from [] (worker_thread+0x3c/0x4a0) [8.853256] [] (worker_thread) from [] (kthread+0xd4/0xf0) [8.860827] [] (kthread) from [] (ret_from_fork+0x14/0x24) [8.868387] Code: e59f0044 e59f1044 ebfb467a eac1 (e7f001f2) [8.874771] ---[ end trace 6c5e5808fb5c8423 ]--- [8.880045] Unable to handle kernel paging request at virtual address ffd0 [8.887603] pgd = c0004000 [8.890438] [ffd0] *pgd=8fdf6861, *pte=, *ppte= [8.897034] Internal error: Oops: 37 [#2] SMP ARM [8.901950] Modules linked in: [8.905165] CPU: 0 PID: 766 Comm: kworker/0:1H Tainted: G D W 4.2.0-11248-gfbd0351 #140 [8.914530] Hardware name: Generic AM33XX (Flattened Device Tree) [8.920923] task: ce3c52c0 ti: ce642000 task.ti: ce642000 [
Re: Race with ip=dhcp bootparameter in ip_rcv_finish on am335x
Hi, On Wed, Sep 23, 2015 at 09:16:00AM -0400, Trond Myklebust wrote: ... > > Is that happening when the transport is being torn down? If so, is it > fixed by > http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=commitdiff;h=79234c3db6842a3de03817211d891e0c2878f756 > ? thanks. This patch fixed my issue. - Alex -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html