Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On 01/21/17 10:13, Nick Hudson wrote: On 01/20/17 03:34, Ryota Ozaki wrote: On Mon, Dec 12, 2016 at 5:24 PM, Nick Hudsonwrote: On 11/28/16 09:43, Ryota Ozaki wrote: On Mon, Nov 28, 2016 at 4:29 PM, Nick Hudson wrote: This is hurting me again. FYI https://nxr.netbsd.org/xref/src/sys/dev/usb/if_axe.c#527 I think we need NET_MPSAFE finished for netbsd-8 Further investigations show that even if we enable NET_MPSAFE by default, there are cases that calling driver Tx with holding softnet_lock. For example, packet transmission via a unconnected socket and TCP syn ack (tcp_input still needs softnet_lock). So I have brought another idea to address the issue: Tx softint; it defers if_start to softint, which is implemented with deferred if_start. By doing so, driver's if_start is called without holding softnet_lock. Note that of course the feature is only enabled for drivers that request it. This is a patch: http://www.netbsd.org/~ozaki-r/tx_softint.diff The patch includes changes to if_axe for an example. The drawback of the approach is of course performance degradation. I tried it with vioif and saw 3% down on iperf. I haven't tested it using USB devices, so I don't know the performance impact on USB Tx. Nick, how do you think the approach? And can you try the patch and benchmark if the approach satisfies you? I'll look into this, but... Thinking about this I don't think it's needed as I don't think usb drivers call if_start from hard interrupt context. Still not sure how this fixes the softnet_lock held for too long problem Nick
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On 01/20/17 03:34, Ryota Ozaki wrote: On Mon, Dec 12, 2016 at 5:24 PM, Nick Hudsonwrote: On 11/28/16 09:43, Ryota Ozaki wrote: On Mon, Nov 28, 2016 at 4:29 PM, Nick Hudson wrote: This is hurting me again. FYI https://nxr.netbsd.org/xref/src/sys/dev/usb/if_axe.c#527 I think we need NET_MPSAFE finished for netbsd-8 Further investigations show that even if we enable NET_MPSAFE by default, there are cases that calling driver Tx with holding softnet_lock. For example, packet transmission via a unconnected socket and TCP syn ack (tcp_input still needs softnet_lock). So I have brought another idea to address the issue: Tx softint; it defers if_start to softint, which is implemented with deferred if_start. By doing so, driver's if_start is called without holding softnet_lock. Note that of course the feature is only enabled for drivers that request it. This is a patch: http://www.netbsd.org/~ozaki-r/tx_softint.diff The patch includes changes to if_axe for an example. The drawback of the approach is of course performance degradation. I tried it with vioif and saw 3% down on iperf. I haven't tested it using USB devices, so I don't know the performance impact on USB Tx. Nick, how do you think the approach? And can you try the patch and benchmark if the approach satisfies you? I'll look into this, but... Note that this is a tentative solution for netbsd-8. The ideal solution is still to get rid of softnet_lock completely. How hard is it to remove the need to hold softnet_lock for the case you mention above? It would be a shame to rototill all the drivers if it isn't required. Thanks, Nick
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On Mon, Dec 12, 2016 at 5:24 PM, Nick Hudsonwrote: > On 11/28/16 09:43, Ryota Ozaki wrote: >> >> On Mon, Nov 28, 2016 at 4:29 PM, Nick Hudson wrote: > > >> This is hurting me again. > > > FYI > > > https://nxr.netbsd.org/xref/src/sys/dev/usb/if_axe.c#527 > > I think we need NET_MPSAFE finished for netbsd-8 Further investigations show that even if we enable NET_MPSAFE by default, there are cases that calling driver Tx with holding softnet_lock. For example, packet transmission via a unconnected socket and TCP syn ack (tcp_input still needs softnet_lock). So I have brought another idea to address the issue: Tx softint; it defers if_start to softint, which is implemented with deferred if_start. By doing so, driver's if_start is called without holding softnet_lock. Note that of course the feature is only enabled for drivers that request it. This is a patch: http://www.netbsd.org/~ozaki-r/tx_softint.diff The patch includes changes to if_axe for an example. The drawback of the approach is of course performance degradation. I tried it with vioif and saw 3% down on iperf. I haven't tested it using USB devices, so I don't know the performance impact on USB Tx. Nick, how do you think the approach? And can you try the patch and benchmark if the approach satisfies you? Note that this is a tentative solution for netbsd-8. The ideal solution is still to get rid of softnet_lock completely. ozaki-r
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On 2016-12-14 03:40, Ryota Ozaki wrote: That's one option though, migrating to dhcpcd is a pretty big task for us and we don't have time to do so :-/ Just using dhcpcd for IPv6 RS/RA handling should be straight-forward and that would be all that is required. I know you guys use something else for DHCP/DHCPv6 and I'm not suggeting you change that. Roy
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On Mon, Dec 12, 2016 at 10:41 PM, Roy Marpleswrote: > On 2016-12-12 13:26, Ryota Ozaki wrote: >> >> On Tue, Nov 29, 2016 at 12:37 AM, Roy Marples wrote: >>> >>> On 28/11/2016 09:43, Ryota Ozaki wrote: (2) MP-ification of IPv6-specific stuffs (nd_defrouter, nd_prefix, etc.) >>> >>> >>> Rip the ND RA stuff out, let userland handle it entirely. >>> Less work, less code, less bloat :) >> >> >> Nobody objects? :) >> >> ...but I think we (IIJ) still need the feature. > > > I'm sure someone would object once it's been ripped out ;) > I'm actualy surprised no-one has so far > > IIJ could use dhcpcd just to handle IPv6 RS/RA alongside their existing > offering I would imagine. That's one option though, migrating to dhcpcd is a pretty big task for us and we don't have time to do so :-/ ozaki-r
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On 2016-12-12 13:26, Ryota Ozaki wrote: On Tue, Nov 29, 2016 at 12:37 AM, Roy Marpleswrote: On 28/11/2016 09:43, Ryota Ozaki wrote: (2) MP-ification of IPv6-specific stuffs (nd_defrouter, nd_prefix, etc.) Rip the ND RA stuff out, let userland handle it entirely. Less work, less code, less bloat :) Nobody objects? :) ...but I think we (IIJ) still need the feature. I'm sure someone would object once it's been ripped out ;) I'm actualy surprised no-one has so far IIJ could use dhcpcd just to handle IPv6 RS/RA alongside their existing offering I would imagine. Roy
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On Mon, Dec 12, 2016 at 5:24 PM, Nick Hudsonwrote: > On 11/28/16 09:43, Ryota Ozaki wrote: >> >> On Mon, Nov 28, 2016 at 4:29 PM, Nick Hudson wrote: > > >> This is hurting me again. > > > FYI > > > https://nxr.netbsd.org/xref/src/sys/dev/usb/if_axe.c#527 > > I think we need NET_MPSAFE finished for netbsd-8 Please someone(tm) help on (8) in the todo list :) ozaki-r
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On Tue, Nov 29, 2016 at 12:37 AM, Roy Marpleswrote: > On 28/11/2016 09:43, Ryota Ozaki wrote: >> (2) MP-ification of IPv6-specific stuffs >> (nd_defrouter, nd_prefix, etc.) > > Rip the ND RA stuff out, let userland handle it entirely. > Less work, less code, less bloat :) Nobody objects? :) ...but I think we (IIJ) still need the feature. ozaki-r
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On 11/28/16 09:43, Ryota Ozaki wrote: On Mon, Nov 28, 2016 at 4:29 PM, Nick Hudsonwrote: This is hurting me again. FYI https://nxr.netbsd.org/xref/src/sys/dev/usb/if_axe.c#527 I think we need NET_MPSAFE finished for netbsd-8 Nick
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On 28/11/2016 09:43, Ryota Ozaki wrote: > (2) MP-ification of IPv6-specific stuffs > (nd_defrouter, nd_prefix, etc.) Rip the ND RA stuff out, let userland handle it entirely. Less work, less code, less bloat :) Roy
Re: NET_MPSAFE [was Re: CVS commit: src/sys]
On Mon, Nov 28, 2016 at 4:29 PM, Nick Hudsonwrote: > On 08/13/16 14:27, Ryota Ozaki wrote: >> >> On Fri, Aug 12, 2016 at 11:04 PM, Nick Hudson wrote: >>> >>> On 07/28/16 10:03, Ryota Ozaki wrote: Module Name:src Committed By: ozaki-r Date: Thu Jul 28 09:03:51 UTC 2016 Modified Files: src/sys/netinet: if_arp.c in.c src/sys/netinet6: in6.c nd6_nbr.c Log Message: Fix panic on adding/deleting IP addresses under network load Adding and deleting IP addresses aren't serialized with other network opeartions, e.g., forwarding packets. So if we add or delete an IP address under network load, a kernel panic may happen on manipulating network-related shared objects such as rtentry and rtcache. To avoid such panicks, we still need to hold softnet_lock in in_control and in6_control that are called via ioctl and do network-related operations including IP address additions/deletions. Fix PR kern/51356 >>> >>> >>> Hi, >>> >>> This is contributory to the problems in >>> >>> http://gnats.netbsd.org/49065 >>> >>> http://gnats.netbsd.org/50491 >>> >>> http://gnats.netbsd.org/51395 >>> >>> Where softnet_lock is held by something that sleeps, e.g. a usb transfer. >>> >>> http://mail-index.netbsd.org/tech-net/2015/12/06/msg005443.html >>> >>> This patch >>> >>> http://www.netbsd.org/~skrll/usb.softint.diff >>> >>> helps, but I'm not sure it deals with all the problems in the network >>> stack. >>> Is this something you intend to address? >> >> No. The commit prevents parallel accesses on shared data (rtentry, ifaddr, >> etc.). The issue of USB transfers seem to be a deadlock between softints >> of the network stack and USB interrupt processing. >> >> I think we can commit your patch if it fixes the PRs and doesn't break >> anything. >> >> Of course we should get rid of softnet_lock at some point. >> >> Thanks, >>ozaki-r >> > This is hurting me again. > > Can you, or someone else at iij, explain the plan to allow NET_MPSAFE to > be enabled by default. Perhaps others can help if there are clear steps. AFAIK, we need to do the following tasks: (1) MP-ification of the routing table (2) MP-ification of IPv6-specific stuffs (nd_defrouter, nd_prefix, etc.) (3) MP-ification of some components (needed by IIJ) (pppoe, bpf, vlan, ipsec, opencrypto and pfil.) (4) Restructuring for MP-ification of bpf - Move bpf_mtap in drivers to softint (percpu if_input) - Prevent bpf_mtap from being called in if_start that runs probably in interrupt context - Make the ieee80211 stack and wireless drivers run in softint - They call bpf_mtap variants (5) Adding KERNEL_LOCK (and/or softnet_lock) to pr_input (input routine from Layer 3 to Layer 4) if needed (6) Make statistical counters per-CPU (7) Make some print functions MP-safe - Stop using static local buffers (8) Adding KERNEL_LOCK (and/or softnet_lock) to non-MP-safe components (ALTQ, pf/ipf, tun/tap, and many others) (or MP-ifying them) I'm working on (1) and will propose a patch soon. We'll work on (2)-(7) in (half of) a year. We don't have time to do (8). (3) and (4) are not must for enabling NET_MPSAFE and instead adding KERNEL_LOCK is probably enough. And also (6) and (7) are not critical and we can do it later. So it's helpful for us (IIJ) that someone works on (8). (Of course, working on other tasks is also welcome.) Regards, ozaki-r