Re: VIMAGE + kldload wlan + kldload wtap panic
I added VNET_DEBUG and noticed this warning (original scan_task code):

CURVNET_SET() recursion in sosend() line 1350, prev in kern_kldload()
  0xfe0002202c40 -> 0xfe0002202c40
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
sosend() at sosend+0xbd
clnt_vc_call() at clnt_vc_call+0x3e6
clnt_reconnect_call() at clnt_reconnect_call+0xf5
newnfs_request() at newnfs_request+0x9fb
nfscl_request() at nfscl_request+0x72
nfsrpc_lookup() at nfsrpc_lookup+0x1be
nfs_lookup() at nfs_lookup+0x297
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x95
lookup() at lookup+0x3b8
namei() at namei+0x484
vn_open_cred() at vn_open_cred+0x1e2
link_elf_load_file() at link_elf_load_file+0xb3
linker_load_module() at linker_load_module+0x794
kern_kldload() at kern_kldload+0x145
sys_kldload() at sys_kldload+0x84
amd64_syscall() at amd64_syscall+0x39e
Xfast_syscall() at Xfast_syscall+0xf7

On Tue, Mar 6, 2012 at 7:01 PM, Adrian Chadd adrian.ch...@gmail.com wrote:
> Hi,
>
> What do we need to do to get VNET working in net80211 and the network
> drivers?
>
> Adrian
>
> On 6 March 2012 07:33, Marko Zec z...@fer.hr wrote:
>> On Tuesday 06 March 2012 10:53:18 Monthadar Al Jaberi wrote:
>>> On Tue, Mar 6, 2012 at 12:52 AM, Marko Zec z...@fer.hr wrote:
>>>> On Monday 05 March 2012 22:14:45 Monthadar Al Jaberi wrote:
>>>>> Hi,
>>>>>
>>>>> I am a very happy VIMAGE user. But lately I have been having
>>>>> problems using it, and it's too complicated for me to dig in so I
>>>>> hope you can help me (and help Adrian too).
>>>>>
>>>>> I am using FreeBSD Current with a kernel config without wlan module
>>>>> and wireless devices attach kernel config.
>>>>>
>>>>> uname -a shows:
>>>>> FreeBSD acke 10.0-CURRENT FreeBSD 10.0-CURRENT #2: Mon Mar 5
>>>>> 20:02:38 CET 2012 root@acke:/usr/obj/usr/src/sys/VNET_without_wlan
>>>>> amd64
>>>>>
>>>>> I run the following commands:
>>>>> cd /usr/sys/module/wlan
>>>>> make load
>>>>> cd /usr/sys/modules/wtap
>>>>> make load
>>>>>
>>>>> then:
>>>>> /usr/src/ools/tools/wtap/wtap/wtap c 0
>>>>> ifconfig wlan create wlandev wtap0 wlanmode mesh
>>>>> wlandebug -i wlan0 hwmp+mesh+output+input+inact
>>>>> ifconfig wlan0 meshid mymesh
>>>>> ifconfig wlan0 inet 192.168.2.1
>>>>>
>>>>> and freebsd panics with:
>>>>>
>>>>> Mon Mar 5 21:17:46 CET 2012
>>>>> Mar 5 21:59:23 acke login: ROOT LOGIN (root) ON ttyv0
>>>>> Using visibility wtap plugin...
>>>>> Loaded wtap wireless simulator
>>>>> wtap0: ieee80211_radiotap_attach: no tx channel, radiotap 0x0
>>>>> wtap0: ieee80211_radiotap_attach: no rx channel, radiotap 0x0
>>>>> wlan0: Ethernet address: 00:98:9a:98:96:97
>>>>> wlan0: ieee80211_start: ignore queue, in SCAN state
>>>>> wlan0: [00:98:9a:98:96:97] ieee80211_alloc_node: inact_reload 2
>>>>> Kernel page fault with the following non-sleepable locks held:
>>>>> exclusive sleep mutex wtap0_com_lock (wtap0_com_lock) r = 0
>>>>> (0xff8002395018) locked @
>>>>> /usr/src/sys/modules/wlan/../../net80211/ieee80211_proto.c:1937
>>>>> KDB: stack backtrace:
>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>>>>> kdb_backtrace() at kdb_backtrace+0x37
>>>>> _witness_debugger() at _witness_debugger+0x2c
>>>>> witness_warn() at witness_warn+0x2c4
>>>>> trap() at trap+0x2fe
>>>>> calltrap() at calltrap+0x8
>>>>> --- trap 0xc, rip = 0x80885d0c, rsp = 0xff80003e9a00, rbp = 0xff80003e9a20 ---
>>>>> rt_dispatch() at rt_dispatch+0x2c
>>>>> rt_ieee80211msg() at rt_ieee80211msg+0x7f
>>>>> scan_task() at scan_task+0x4cd
>>>>> taskqueue_run_locked() at taskqueue_run_locked+0x93
>>>>> taskqueue_thread_loop() at taskqueue_thread_loop+0x3e
>>>>
>>>> It may be that scan_task() calls further down into the network stack
>>>> without setting curvnet first.
>>>
>>> I added CURVNET_SET(TD_TO_VNET(curthread))/CURVNET_RESTORE() in
>>> scan_task but it didn't help
>>
>> scan_task() is called from a taskqueue trampoline which doesn't have a
>> valid TD_TO_VNET(curthread) context, so what you attempted to do can't
>> work. You need to set the curvnet context to the one to which the
>> network interface (or packet) you're working with belongs.
>>
>> Perhaps you could use
>>
>> (struct ieee80211_scan_state *) ss->ss_vap->iv_ifp->if_curvnet
>>
>> in scan_task() to set curvnet, but having no clue on how the 80211 code
>> works, I might be wrong...
>>
>> And maybe you could consider rebuilding your kernel with options
>> VNET_DEBUG turned on - that should more verbosely point to the problem
>> at hand, while not introducing too big a performance penalty at
>> runtime.
>>
>> Good luck,
>>
>> Marko

--
Monthadar Al Jaberi
___
freebsd-virtualization@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to freebsd-virtualization-unsubscr...@freebsd.org
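The point made above about the taskqueue context can be sketched in plain userland C. This is only an illustration, not kernel code: the struct layouts, the one-line macros and the helper names (stack_lookup, task_using_thread, task_using_ifnet) are hypothetical stand-ins, and the "no valid context" case is modelled simply as a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland stand-ins; NOT the real kernel definitions. */
struct vnet   { int id; };
struct ifnet  { struct vnet *if_vnet; };
struct thread { struct vnet *td_vnet; };

/* A taskqueue worker thread is not bound to any network stack instance
 * (modelled here as a NULL td_vnet, purely for illustration). */
static struct thread worker = { NULL };
static struct thread *curthread = &worker;

#define curvnet           (curthread->td_vnet)
#define CURVNET_SET(v)    struct vnet *saved_vnet = curvnet; curvnet = (v)
#define CURVNET_RESTORE() curvnet = saved_vnet

/* Stands for code deeper in the stack that dereferences curvnet.
 * Returns -1 where the real kernel would fault. */
static int stack_lookup(void)
{
    return curvnet != NULL ? curvnet->id : -1;
}

/* What was tried first: take the context from the thread itself.
 * In a taskqueue thread there is nothing useful to take. */
static int task_using_thread(void)
{
    CURVNET_SET(curthread->td_vnet);
    int r = stack_lookup();
    CURVNET_RESTORE();
    return r;
}

/* What was suggested: take the context from the interface the task
 * is working on. */
static int task_using_ifnet(struct ifnet *ifp)
{
    assert(ifp->if_vnet != NULL);
    CURVNET_SET(ifp->if_vnet);
    int r = stack_lookup();
    CURVNET_RESTORE();
    return r;
}
```

In this model task_using_thread() reports the missing context while task_using_ifnet() resolves it from the interface, and curvnet is back to its previous value after either call.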
Re: VIMAGE + kldload wlan + kldload wtap panic
On Tuesday 06 March 2012 19:01:11 Adrian Chadd wrote:
> Hi,
>
> What do we need to do to get VNET working in net80211 and the network
> drivers?

In principle one shouldn't have to touch anything in the network drivers.

In order for networking code to properly resolve virtualized state, curvnet (which is a synonym for curthread->td_vnet) must be set to a proper context when calling into networking code, and restored to its previous value on return. CURVNET_SET() and CURVNET_RESTORE() are the macros which should be used to do this.

As far as I know there are three ways for a function call to end up in the networking code:

A) system calls (socket related, or sysctls);
B) servicing inbound packets; and
C) timers / periodic tasks.

Cases A) and B) should be mostly covered in generic socket handlers and in netisr dispatchers, except sysctls which need to access any state you may have decided to virtualize - those should be converted to use SYSCTL_VNET_* macros.

Cases which fall into category C) are the ones which probably require most attention, and in certain cases may require a certain amount of code restructuring. Handlers which are dispatched via timers are not bound to any of the network stack instances, so in those you have to follow one of two possible strategies:

1) iterate over all existing VNETs and do your timer-driven stuff once in each of those - in that case take a look at how the VNET_LIST_RLOCK(), VNET_ITERATOR_DECL() and VNET_FOREACH() family of macros are used elsewhere, or

2) if the timer-driven event is bound to an object which is associated with an ifnet interface, you should explicitly set the curvnet context to match the one the ifnet is associated with, using CURVNET_SET(ifp->if_vnet), as I pointed out earlier in this thread.

Building a VNET_DEBUG kernel may help you a bit to find the critical spots when things go wrong, but don't expect miracles...

I think both Julian and Bjoern have already written pretty nice and extensive documents covering those topics, but I wouldn't know the current status or whereabouts of those...

Hope this helps,

Marko
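Strategy 1) from the message above (visit every VNET once from a timer-driven handler) might look roughly like the following. This is a userland sketch under loud assumptions: the macro names are the ones mentioned in the message, but their bodies here are simplified mocks (a plain linked list and a reader counter instead of the kernel's list and lock), and per_vnet_tick() and timer_handler() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland mock of a VNET list; not the kernel's types. */
struct vnet { int id; int ticks; struct vnet *next; };

static struct vnet *vnet_head;      /* list of all network stack instances */
static struct vnet *curvnet;        /* current context, NULL when unset */
static int vnet_list_readers;       /* stands in for the list read lock */

#define VNET_ITERATOR_DECL(it) struct vnet *it
#define VNET_LIST_RLOCK()      (vnet_list_readers++)
#define VNET_LIST_RUNLOCK()    (vnet_list_readers--)
#define VNET_FOREACH(it) \
    for ((it) = vnet_head; (it) != NULL; (it) = (it)->next)
#define CURVNET_SET(v)    struct vnet *saved_vnet = curvnet; curvnet = (v)
#define CURVNET_RESTORE() curvnet = saved_vnet

/* Per-instance work; relies on curvnet having been set by the caller. */
static void per_vnet_tick(void)
{
    assert(curvnet != NULL);
    curvnet->ticks++;
}

/* Timer-driven handler: not bound to any instance, so it visits each
 * one in turn and returns how many it visited. */
static int timer_handler(void)
{
    int visited = 0;
    VNET_ITERATOR_DECL(vnet_iter);

    VNET_LIST_RLOCK();
    VNET_FOREACH(vnet_iter) {
        CURVNET_SET(vnet_iter);
        per_vnet_tick();
        visited++;
        CURVNET_RESTORE();
    }
    VNET_LIST_RUNLOCK();
    return visited;
}
```

The set/restore pair sits inside the loop body so that each iteration starts from, and returns to, the context the handler was entered with.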
Re: VIMAGE + kldload wlan + kldload wtap panic
On Tue, Mar 6, 2012 at 9:57 PM, Marko Zec z...@fer.hr wrote:
> On Tuesday 06 March 2012 21:29:32 Monthadar Al Jaberi wrote:
>> On Tue, Mar 6, 2012 at 9:22 PM, Marko Zec z...@fer.hr wrote:
>>> On Tuesday 06 March 2012 21:13:00 Monthadar Al Jaberi wrote:
>>>> I am confused, so what's the difference between having wlan in
>>>> kernel config or not? Cause that seems the reason why we panic...
>>>> linker problems?
>>>
>>> It's not impossible. Have you tried to do
>>> CURVNET_SET(ss->ss_vap->iv_ifp->if_vnet) on entry to scan_task() as I
>>> suggested earlier in this thread?
>>
>> this is the code I added:
>>
>> diff --git a/sys/net80211/ieee80211_scan.c b/sys/net80211/ieee80211_scan.c
>> index 5c1e3d9..bd20653 100644
>> --- a/sys/net80211/ieee80211_scan.c
>> +++ b/sys/net80211/ieee80211_scan.c
>> @@ -850,6 +850,7 @@ scan_task(void *arg, int pending)
>>         int scandone = 0;
>>         IEEE80211_LOCK(ic);
>> +       CURVNET_SET((struct ieee80211_scan_state *) ss->ss_vap->iv_ifp->if_curvnet);
>
> ^^^ You couldn't have ever compiled this, so you must be booting an old
> kernel.

What's wrong with this line? I am running a new kernel; remember I compile wlan afterward and kldload it.

It seems to compile fine if I type wrong names inside CURVNET_SET, hmm...

I can't copy/paste db output from VBox but I am attaching two pictures.

> Pls. make sure you have actually rebuilt and rebooted a new kernel, and
> let us know the outcome.
>
> Thanks
>
> Marko
>
>>         if (vap == NULL || (ic->ic_flags & IEEE80211_F_SCAN) == 0 ||
>>             (SCAN_PRIVATE(ss)->ss_iflags & ISCAN_ABORT)) {
>>                 /* Cancelled before we started */
>> @@ -1004,6 +1005,7 @@ scan_task(void *arg, int pending)
>>                 ss->ss_ops->scan_restart(ss, vap); /* XXX? */
>>                 ieee80211_runtask(ic, &SCAN_PRIVATE(ss)->ss_scan_task);
>>                 IEEE80211_UNLOCK(ic);
>> +               CURVNET_RESTORE();
>>                 return;
>>         }
>> @@ -1043,6 +1045,7 @@ done:
>>         SCAN_PRIVATE(ss)->ss_iflags &= ~(ISCAN_CANCEL|ISCAN_ABORT);
>>         ss->ss_flags &= ~(IEEE80211_SCAN_ONCE | IEEE80211_SCAN_PICK1ST);
>>         IEEE80211_UNLOCK(ic);
>> +       CURVNET_RESTORE();
>>  #undef ISCAN_REP
>>  }
>>
>> same panic...
>>
>>> Cheers,
>>>
>>> Marko
>>>
>>>> On Tue, Mar 6, 2012 at 9:06 PM, Adrian Chadd adrian.ch...@gmail.com wrote:
>>>>> Hi,
>>>>>
>>>>> The trouble here is that net80211 has quite a few other contexts
>>>>> that things are called from:
>>>>>
>>>>> * driver taskqueue;
>>>>> * net80211 taskqueue;
>>>>> * driver callouts;
>>>>> * net80211 callouts;
>>>>> * ioctls via net80211.
>>>>>
>>>>> That's in parallel with frame tx/rx and device ioctls.
>>>>>
>>>>> I don't personally have the time to go through net80211 and
>>>>> driver(s) at the moment to figure out what's going on. Since ath(4)
>>>>> does a bunch of frame processing in taskqueue context (and I'm
>>>>> trying to eliminate frame processing in _callout_ context, ew..)
>>>>> things can potentially get a bit hairy.
>>>>>
>>>>> Adrian
>>>>>
>>>>> On 6 March 2012 11:59, Marko Zec z...@fer.hr wrote:
>>>>>> On Tuesday 06 March 2012 20:49:38 Monthadar Al Jaberi wrote:
>>>>>>> I added VNET_DEBUG and noticed this warning (original scan_task
>>>>>>> code):
>>>>>>>
>>>>>>> CURVNET_SET() recursion in sosend() line 1350, prev in kern_kldload()
>>>>>>>   0xfe0002202c40 -> 0xfe0002202c40
>>>>>>> KDB: stack backtrace:
>>>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>>>>>>> kdb_backtrace() at kdb_backtrace+0x37
>>>>>>> sosend() at sosend+0xbd
>>>>>>> clnt_vc_call() at clnt_vc_call+0x3e6
>>>>>>> clnt_reconnect_call() at clnt_reconnect_call+0xf5
>>>>>>> newnfs_request() at newnfs_request+0x9fb
>>>>>>> nfscl_request() at nfscl_request+0x72
>>>>>>> nfsrpc_lookup() at nfsrpc_lookup+0x1be
>>>>>>> nfs_lookup() at nfs_lookup+0x297
>>>>>>> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x95
>>>>>>> lookup() at lookup+0x3b8
>>>>>>> namei() at namei+0x484
>>>>>>> vn_open_cred() at vn_open_cred+0x1e2
>>>>>>> link_elf_load_file() at link_elf_load_file+0xb3
>>>>>>> linker_load_module() at linker_load_module+0x794
>>>>>>> kern_kldload() at kern_kldload+0x145
>>>>>>> sys_kldload() at sys_kldload+0x84
>>>>>>> amd64_syscall() at amd64_syscall+0x39e
>>>>>>> Xfast_syscall() at Xfast_syscall+0xf7
>>>>>>
>>>>>> You can safely ignore those. Recursing on curvnet is harmless, but
>>>>>> in certain cases can't be avoided. When injecting new CURVNET_SET()
>>>>>> / CURVNET_RESTORE() points in the existing code, those warnings are
>>>>>> here to help us become aware that we are setting curvnet in a
>>>>>> function which was invoked with an already valid curvnet context.
>>>>>>
>>>>>> Marko

--
Monthadar Al Jaberi
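The recursion warning discussed above can be modelled in a few lines of userland C. This is a sketch, not the kernel's VNET_DEBUG implementation: the bookkeeping (curvnet_setter, struct vnet_save, recursion_warnings) and the function names outer_load() and inner_send() are hypothetical, chosen only to mirror the kern_kldload()/sosend() pair in the report.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userland model of a debug-enabled CURVNET_SET(). */
struct vnet { int id; };

static struct vnet *curvnet;        /* current context, NULL when unset */
static const char *curvnet_setter;  /* function that set curvnet last */
static int recursion_warnings;      /* number of warnings emitted */

struct vnet_save { struct vnet *vnet; const char *setter; };

static struct vnet_save curvnet_set(struct vnet *v, const char *func, int line)
{
    struct vnet_save s = { curvnet, curvnet_setter };

    /* Setting curvnet while a context is already valid is a recursion:
     * harmless, but worth flagging while instrumenting code. */
    if (curvnet != NULL) {
        recursion_warnings++;
        printf("CURVNET_SET() recursion in %s() line %d, prev in %s()\n",
            func, line, curvnet_setter);
    }
    curvnet = v;
    curvnet_setter = func;
    return s;
}

static void curvnet_restore(struct vnet_save s)
{
    curvnet = s.vnet;
    curvnet_setter = s.setter;
}

#define CURVNET_SET(v) \
    struct vnet_save saved = curvnet_set((v), __func__, __LINE__)
#define CURVNET_RESTORE() curvnet_restore(saved)

/* Inner function that sets its own context, as sosend() does. */
static void inner_send(struct vnet *v)
{
    CURVNET_SET(v);
    assert(curvnet == v);
    CURVNET_RESTORE();
}

/* Outer function that already holds a context when calling inward. */
static void outer_load(struct vnet *v)
{
    CURVNET_SET(v);
    inner_send(v);
    CURVNET_RESTORE();
}
```

Calling outer_load() triggers one warning because inner_send() finds a context already in place; calling inner_send() directly triggers none. In both cases the saved value is put back on the way out, which is why the nested case does no damage.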
Re: VIMAGE + kldload wlan + kldload wtap panic
On Tuesday 06 March 2012 22:30:26 Monthadar Al Jaberi wrote:
> On Tue, Mar 6, 2012 at 9:57 PM, Marko Zec z...@fer.hr wrote:
>> On Tuesday 06 March 2012 21:29:32 Monthadar Al Jaberi wrote:
>>> +       CURVNET_SET((struct ieee80211_scan_state *) ss->ss_vap->iv_ifp->if_curvnet);
>>
>> ^^^ You couldn't have ever compiled this, so you must be booting an
>> old kernel.
>
> Whats wrong with this line, I am running new kernel remember I compile
> wlan afterward and kldload it

struct ifnet doesn't have a field named if_curvnet, but it does contain a field named if_vnet.

> it seems to compile fine if I type wrong names inside CURVNET_SET
> hmm...

No, it does not compile.

Marko
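The patch quoted in this thread adds a CURVNET_RESTORE() in front of each return path of scan_task(). An alternative way to keep set and restore paired, shown here only as a userland sketch and not as what net80211 actually does, is to funnel every exit through a single label. The macro bodies and task_sketch() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland stand-ins; not the kernel definitions. */
struct vnet { int id; };

static struct vnet *curvnet;        /* current context, NULL when unset */

#define CURVNET_SET(v)    struct vnet *saved_vnet = curvnet; curvnet = (v)
#define CURVNET_RESTORE() curvnet = saved_vnet

/* Returns 1 if the task bailed out early, 0 if it ran to completion.
 * Either way the context is restored exactly once. */
static int task_sketch(struct vnet *vnet, int restart)
{
    int rc = 0;

    CURVNET_SET(vnet);
    assert(curvnet == vnet);

    if (restart) {
        rc = 1;             /* early exit still goes through "out" */
        goto out;
    }

    /* ... normal work with a valid curvnet would go here ... */

out:
    CURVNET_RESTORE();
    return rc;
}
```

The design choice is simply that a new return path added later cannot forget the restore, at the cost of a goto.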
Re: VIMAGE + kldload wlan + kldload wtap panic
On Tuesday 06 March 2012 22:49:56 Adrian Chadd wrote:
> Wait a sec. Is it possible that the macro is a no-op when we're
> building modules w/ VNET?

Yes that's it, the VNET stuff resolves (mostly) to whitespace when options VIMAGE is not on!

Marko
Re: VIMAGE + kldload wlan + kldload wtap panic
On Tuesday 06 March 2012 22:52:43 Adrian Chadd wrote:
> On 6 March 2012 13:51, Marko Zec z...@fer.hr wrote:
>> On Tuesday 06 March 2012 22:49:56 Adrian Chadd wrote:
>>> Wait a sec. Is it possible that the macro is a no-op when we're
>>> building modules w/ VNET?
>>
>> Yes that's it, the VNET stuff resolves (mostly) to whitespace when
>> options VIMAGE is not on!
>
> So the question is - how the heck are the modules supposed to pull in
> the VNET config? is there a missing opt_x.h somewhere?

No, you don't need any extra explicit #includes, but you should see this in opt_global.h if you've really configured the kernel with options VIMAGE:

#define VIMAGE 1

Marko
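Why a misspelled field name can "compile fine" follows directly from the macros expanding to nothing. The sketch below reproduces the effect in userland C; the macro bodies are simplified stand-ins rather than the real header, and scan_task_sketch() is hypothetical. It is meant to be built without VIMAGE defined; adding -DVIMAGE should make the compiler reject the if_curvnet reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland stand-ins; not the kernel definitions. */
struct vnet  { int id; };
struct ifnet { struct vnet *if_vnet; };  /* note: no member if_curvnet */

static struct vnet *curvnet;        /* current context, NULL when unset */

#ifdef VIMAGE
#define CURVNET_SET(arg)  struct vnet *saved_vnet = curvnet; curvnet = (arg)
#define CURVNET_RESTORE() curvnet = saved_vnet
#else
/* Without VIMAGE the argument is discarded before the compiler ever
 * looks at it, so even a nonexistent member goes unnoticed. */
#define CURVNET_SET(arg)
#define CURVNET_RESTORE()
#endif

/* Returns 1 if a context was in place while the "task" ran, else 0. */
static int scan_task_sketch(struct ifnet *ifp)
{
    CURVNET_SET(ifp->if_curvnet);   /* wrong name, silently accepted */
    int had_context = (curvnet != NULL);
    CURVNET_RESTORE();

    (void)ifp;                      /* unused when the macros are empty */
    return had_context;
}
```

Built this way, scan_task_sketch() runs with no context at all, which matches the symptom in the thread: the patch was present in the source but had no effect in a module built without the VIMAGE define.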