Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Tue, Oct 11, 2016 at 2:28 AM, Krister Johansen wrote: > On Wed, Oct 05, 2016 at 11:01:38AM -0700, Cong Wang wrote: >> Does the attached patch make any sense now? Our pernet init doesn't >> rely on act_base, so even we have some race, the worst case is after >> we initialize the pernet netns for an action but its ops still not >> visible, which seems fine (at least no crash). > > I tried to reproduce the panic with this latest patch, but I am unable > to do so. The one difference I notice between this patch, and the one I Nice, so the crash is fixed. I will send out my patch formally. > sent to the list, is that with yours it takes much longer before we get > any output from the simultaneous launch of these containers. Presumably > that's the extra latency added by allowing many extra modprobe calls to > get spawned by request_module(). This is a different problem. When we register a pernet ops, the ops->init() will be called on each container to initialize its pernet data structures, this is why request_module() blocks on waiting for register_pernet_subsys() to finish. As we discussed earlier, this could be solved or workaround by loading modules prior to creating containers. Or if this really needs to be fixed, it should be in register_pernet_subsys(), since it is not specific to tc actions, we have so many places loading modules at run-time in networking subsystem. Thanks for testing!
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Sat, Oct 8, 2016 at 11:13 PM, Krister Johansen wrote: > Hi Cong, > Thanks for the follow-up. > > On Thu, Oct 06, 2016 at 12:01:15PM -0700, Cong Wang wrote: >> On Wed, Oct 5, 2016 at 11:11 PM, Krister Johansen >> > pernet_operations pointer. The code in register_pernet_subsys() makes >> > no attempt to check for duplicates. If we add a pointer that's already >> > in the list, and subsequently call unregister, the results seem >> > undefined. It looks like we'll remove the pernet_operations for the >> > existing action, assuming we don't corrupt the list in the process. >> > >> > Is this actually safe? If so, what corner case is the act->type / >> > act->kind protecting us from? >> >> ops->type and ops->kind should be unique too, user-space already >> relies on this (tc action ls action xxx). The code exists probably just >> for sanity check. > > With that in mind, would it make sense to change the check to a WARN/BUG > or some kind of assertion? I mistakenly inferred that it was possible to > legtimately end up in this scenario. Yes, it makes sense to me. >> > Part of the desire to inhibit extra modprobe calls is that if hundreds >> > of these all start at once on boot, it's really unnecessary to have all >> > of the rest of them wait while lots of extra modprobe calls are forked >> > by the kernel. >> >> You can tell systemd to load these modules before starting these >> containers to avoid blocking, no? > > That was exactly what I did to work around the panic until I was able to > get a patch together. The preload of the modules is still occurring, > but I was hoping to excise that workaround entirely. Or you can compile these modules into kernel, but I am not sure about the dependencies. :-D Thanks.
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Wed, Oct 05, 2016 at 11:01:38AM -0700, Cong Wang wrote: > Does the attached patch make any sense now? Our pernet init doesn't > rely on act_base, so even we have some race, the worst case is after > we initialize the pernet netns for an action but its ops still not > visible, which seems fine (at least no crash). I tried to reproduce the panic with this latest patch, but I am unable to do so. The one difference I notice between this patch, and the one I sent to the list, is that with yours it takes much longer before we get any output from the simultaneous launch of these containers. Presumably that's the extra latency added by allowing many extra modprobe calls to get spawned by request_module(). -K
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
Hi Cong, Thanks for the follow-up. On Thu, Oct 06, 2016 at 12:01:15PM -0700, Cong Wang wrote: > On Wed, Oct 5, 2016 at 11:11 PM, Krister Johansen > > pernet_operations pointer. The code in register_pernet_subsys() makes > > no attempt to check for duplicates. If we add a pointer that's already > > in the list, and subsequently call unregister, the results seem > > undefined. It looks like we'll remove the pernet_operations for the > > existing action, assuming we don't corrupt the list in the process. > > > > Is this actually safe? If so, what corner case is the act->type / > > act->kind protecting us from? > > ops->type and ops->kind should be unique too, user-space already > relies on this (tc action ls action xxx). The code exists probably just > for sanity check. With that in mind, would it make sense to change the check to a WARN/BUG or some kind of assertion? I mistakenly inferred that it was possible to legtimately end up in this scenario. > So please give that patch a try, let's see if we miss any other problem. Will do. I have not forgotten. I hope to have results for you in a few days. > > Part of the desire to inhibit extra modprobe calls is that if hundreds > > of these all start at once on boot, it's really unnecessary to have all > > of the rest of them wait while lots of extra modprobe calls are forked > > by the kernel. > > You can tell systemd to load these modules before starting these > containers to avoid blocking, no? That was exactly what I did to work around the panic until I was able to get a patch together. The preload of the modules is still occurring, but I was hoping to excise that workaround entirely. Thanks, -K
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Wed, Oct 5, 2016 at 11:11 PM, Krister Johansen wrote: > > I'm not sure. The reason I didn't take this approach from the outset is > that all of TC's callers of tcf_register_action pass a pointer to a > static structure as their *ops argument. The existence of code that > checks the action for uniqueness suggests that it's possible for > tcf_register_action to get passed two identical tc_action_ops. If that > happens in the current code base, we'll also get passed a duplicate Each tc action module has its own unique ops, and kernel doesn't allow one module to register twice (either in parallel or not, see add_unformed_module()), so we should not have a duplicated case. > pernet_operations pointer. The code in register_pernet_subsys() makes > no attempt to check for duplicates. If we add a pointer that's already > in the list, and subsequently call unregister, the results seem > undefined. It looks like we'll remove the pernet_operations for the > existing action, assuming we don't corrupt the list in the process. > > Is this actually safe? If so, what corner case is the act->type / > act->kind protecting us from? ops->type and ops->kind should be unique too, user-space already relies on this (tc action ls action xxx). The code exists probably just for sanity check. So please give that patch a try, let's see if we miss any other problem. > >> (Sorry that I don't have the environment to reproduce your bug) > > I'm sorry that I didn't do a good job of explaining how we end up in > this situation in the first place. I can give a few more details, > because it may explain some of my concern about the request_module() > call. > > The system that encounters this bug launches a bunch of containers from > systemd on boot. Each container creates a new user, net, pid, and mount > namespace and begins its setup. When the networking in all of these > containers, each in a new netns, try to configure TC and no modules are > loaded we end up with this race. > > I can also reproduce by unloading the modules, and then launching a > bunch of processes that configure tc in new namespaces. > > Part of the desire to inhibit extra modprobe calls is that if hundreds > of these all start at once on boot, it's really unnecessary to have all > of the rest of them wait while lots of extra modprobe calls are forked > by the kernel. You can tell systemd to load these modules before starting these containers to avoid blocking, no? Thanks.
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Wed, Oct 05, 2016 at 11:01:38AM -0700, Cong Wang wrote: > On Tue, Oct 4, 2016 at 11:52 PM, Krister Johansen > wrote: > > On Mon, Oct 03, 2016 at 11:22:33AM -0700, Cong Wang wrote: > >> Please try the attached patch. I also convert the read path to RCU > >> to avoid a possible deadlock. A quick test shows no lockdep splat. > > > > I tried this patch, but it doesn't solve the problem. I got a panic on > > my very first try: > > Thanks for testing it. Absolutely; thanks for helping to try to simplify this fix. > > The problem here is the same as before: by using RCU the race isn't > > fixed because the module is still discoverable from act_base before the > > pernet initialization is completed. > > > > You can see from the trap frame that the first two arguments to > > tcf_hash_check were 0. It couldn't look up the correct per-subsystem > > pointer because the id hadn't yet been registered. > > I thought the problem is that we don't do pernet ops registration and > action ops registration atomically therefore chose to use mutex+RCU, > but I was wrong, the problem here is just ordering, we need to finish > the pernet initialization before making action ops visible. > > If so, why not just reorder them? Does the attached patch make any > sense now? Our pernet init doesn't rely on act_base, so even we have > some race, the worst case is after we initialize the pernet netns for an > action but its ops still not visible, which seems fine (at least no crash). > > Or I still miss something here? I'm not sure. The reason I didn't take this approach from the outset is that all of TC's callers of tcf_register_action pass a pointer to a static structure as their *ops argument. The existence of code that checks the action for uniqueness suggests that it's possible for tcf_register_action to get passed two identical tc_action_ops. If that happens in the current code base, we'll also get passed a duplicate pernet_operations pointer. The code in register_pernet_subsys() makes no attempt to check for duplicates. If we add a pointer that's already in the list, and subsequently call unregister, the results seem undefined. It looks like we'll remove the pernet_operations for the existing action, assuming we don't corrupt the list in the process. Is this actually safe? If so, what corner case is the act->type / act->kind protecting us from? > (Sorry that I don't have the environment to reproduce your bug) I'm sorry that I didn't do a good job of explaining how we end up in this situation in the first place. I can give a few more details, because it may explain some of my concern about the request_module() call. The system that encounters this bug launches a bunch of containers from systemd on boot. Each container creates a new user, net, pid, and mount namespace and begins its setup. When the networking in all of these containers, each in a new netns, try to configure TC and no modules are loaded we end up with this race. I can also reproduce by unloading the modules, and then launching a bunch of processes that configure tc in new namespaces. Part of the desire to inhibit extra modprobe calls is that if hundreds of these all start at once on boot, it's really unnecessary to have all of the rest of them wait while lots of extra modprobe calls are forked by the kernel. > Thanks for your patience and testing! Thank you for taking the time to look through the fix and discuss alternatives. -K
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Wed, Oct 5, 2016 at 11:01 AM, Cong Wang wrote: > On Tue, Oct 4, 2016 at 11:52 PM, Krister Johansen > wrote: >> On Mon, Oct 03, 2016 at 11:22:33AM -0700, Cong Wang wrote: >>> Please try the attached patch. I also convert the read path to RCU >>> to avoid a possible deadlock. A quick test shows no lockdep splat. >> >> I tried this patch, but it doesn't solve the problem. I got a panic on >> my very first try: > > Thanks for testing it. > > >> The problem here is the same as before: by using RCU the race isn't >> fixed because the module is still discoverable from act_base before the >> pernet initialization is completed. >> >> You can see from the trap frame that the first two arguments to >> tcf_hash_check were 0. It couldn't look up the correct per-subsystem >> pointer because the id hadn't yet been registered. > > I thought the problem is that we don't do pernet ops registration and > action ops registration atomically therefore chose to use mutex+RCU, > but I was wrong, the problem here is just ordering, we need to finish > the pernet initialization before making action ops visible. > > If so, why not just reorder them? Does the attached patch make any > sense now? Our pernet init doesn't rely on act_base, so even we have > some race, the worst case is after we initialize the pernet netns for an > action but its ops still not visible, which seems fine (at least no crash). BTW, I should remove the 'if' check for unregister_pernet_subsys() in tcf_unregister_action()... Otherwise the error path doesn't work. ;-)
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Tue, Oct 4, 2016 at 11:52 PM, Krister Johansen wrote: > On Mon, Oct 03, 2016 at 11:22:33AM -0700, Cong Wang wrote: >> Please try the attached patch. I also convert the read path to RCU >> to avoid a possible deadlock. A quick test shows no lockdep splat. > > I tried this patch, but it doesn't solve the problem. I got a panic on > my very first try: Thanks for testing it. > The problem here is the same as before: by using RCU the race isn't > fixed because the module is still discoverable from act_base before the > pernet initialization is completed. > > You can see from the trap frame that the first two arguments to > tcf_hash_check were 0. It couldn't look up the correct per-subsystem > pointer because the id hadn't yet been registered. I thought the problem is that we don't do pernet ops registration and action ops registration atomically therefore chose to use mutex+RCU, but I was wrong, the problem here is just ordering, we need to finish the pernet initialization before making action ops visible. If so, why not just reorder them? Does the attached patch make any sense now? Our pernet init doesn't rely on act_base, so even we have some race, the worst case is after we initialize the pernet netns for an action but its ops still not visible, which seems fine (at least no crash). Or I still miss something here? (Sorry that I don't have the environment to reproduce your bug) Thanks for your patience and testing! diff --git a/net/sched/act_api.c b/net/sched/act_api.c index d09d068..6024920 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -341,22 +341,21 @@ int tcf_register_action(struct tc_action_ops *act, if (!act->act || !act->dump || !act->init || !act->walk || !act->lookup) return -EINVAL; + ret = register_pernet_subsys(ops); + if (ret) + return ret; + write_lock(&act_mod_lock); list_for_each_entry(a, &act_base, head) { if (act->type == a->type || (strcmp(act->kind, a->kind) == 0)) { write_unlock(&act_mod_lock); + unregister_pernet_subsys(ops); return -EEXIST; } } list_add_tail(&act->head, &act_base); write_unlock(&act_mod_lock); - ret = register_pernet_subsys(ops); - if (ret) { - tcf_unregister_action(act, ops); - return ret; - } - return 0; } EXPORT_SYMBOL(tcf_register_action); @@ -367,8 +366,6 @@ int tcf_unregister_action(struct tc_action_ops *act, struct tc_action_ops *a; int err = -ENOENT; - unregister_pernet_subsys(ops); - write_lock(&act_mod_lock); list_for_each_entry(a, &act_base, head) { if (a == act) { @@ -378,6 +375,8 @@ int tcf_unregister_action(struct tc_action_ops *act, } } write_unlock(&act_mod_lock); + if (!err) + unregister_pernet_subsys(ops); return err; } EXPORT_SYMBOL(tcf_unregister_action);
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Mon, Oct 03, 2016 at 11:22:33AM -0700, Cong Wang wrote: > Please try the attached patch. I also convert the read path to RCU > to avoid a possible deadlock. A quick test shows no lockdep splat. I tried this patch, but it doesn't solve the problem. I got a panic on my very first try: SYSTEM MAP: /boot/System.map-4.7.5-1.el7.x86_64 DEBUG KERNEL: /usr/lib/debug/lib/modules/4.7.5-1.el7.x86_64/vmlinux (4.7.5-1.el7.x86_64) DUMPFILE: ./vmcore [PARTIAL DUMP] CPUS: 4 DATE: Tue Oct 4 22:43:45 2016 UPTIME: 00:02:32 LOAD AVERAGE: 0.20, 0.14, 0.06 TASKS: 1637 RELEASE: 4.7.5-1.el7.x86_64 VERSION: #1 SMP Tue Oct 4 21:58:08 PDT 2016 MACHINE: x86_64 (3392 Mhz) MEMORY: 4 GB PANIC: "BUG: unable to handle kernel NULL pointer dereference at (null)" PID: 20151 COMMAND: "test" TASK: 880138f89480 [THREAD_INFO: 88009f45] CPU: 2 STATE: TASK_RUNNING (PANIC) PID: 20151 TASK: 880138f89480 CPU: 2 COMMAND: "test" #0 [88009f453368] machine_kexec at 8105c2db #1 [88009f4533c8] __crash_kexec at 8ca2 #2 [88009f453498] crash_kexec at 8d9c #3 [88009f4534b8] oops_end at 810309d1 #4 [88009f4534e0] no_context at 8106a5de #5 [88009f453540] __bad_area_nosemaphore at 8106a92e #6 [88009f453590] bad_area at 81081b45 #7 [88009f4535b8] __do_page_fault at 8106b42d #8 [88009f453620] trace_do_page_fault at 8106b5a3 #9 [88009f453658] do_async_page_fault at 81063dda #10 [88009f453670] async_page_fault at 81735af8 [exception RIP: tcf_hash_check+18] RIP: 81649772 RSP: 88009f453720 RFLAGS: 00010246 RAX: RBX: 0001 RCX: 0001 RDX: 8801297ee440 RSI: RDI: RBP: 88009f453738 R8: a04020f0 R9: 88009f453770 R10: 8164a6a1 R11: 880129621f00 R12: 8801297ee440 R13: 8800ba0f9940 R14: 880093cd3e6c R15: ORIG_RAX: CS: 0010 SS: 0018 #11 [88009f453740] tcf_mirred_init at a04011cc [act_mirred] #12 [88009f4537c8] tcf_action_init_1 at 8164a7c9 #13 [88009f453868] tcf_action_init at 8164a881 #14 [88009f4539c0] tcf_exts_validate at 8164868a #15 [88009f4539e8] u32_set_parms at a04153f2 [cls_u32] #16 [88009f453a60] u32_change at a0416bc4 [cls_u32] #17 [88009f453b50] tc_ctl_tfilter at 8164900d #18 [88009f453c48] rtnetlink_rcv_msg at 8162b8f4 #19 [88009f453cc0] netlink_rcv_skb at 81650b77 #20 [88009f453ce8] rtnetlink_rcv at 8162b848 #21 [88009f453d00] netlink_unicast at 81650531 #22 [88009f453d48] netlink_sendmsg at 8165091e #23 [88009f453dc8] sock_sendmsg at 815fcae8 #24 [88009f453de8] SYSC_sendto at 815fcfa2 #25 [88009f453f18] sys_sendto at 815fdb4e #26 [88009f453f28] do_syscall_64 at 81003b12 RIP: 004cbfba RSP: 00c8200a4eb8 RFLAGS: 0212 RAX: ffda RBX: 00c82180 RCX: 004cbfba RDX: 008c RSI: 00c82007a900 RDI: 0007 RBP: R8: 00c8200fd844 R9: 000c R10: R11: 0212 R12: 00c82007a900 R13: 0003 R14: 0012492492492492 R15: 0039 ORIG_RAX: 002c CS: 0033 SS: 002b The problem here is the same as before: by using RCU the race isn't fixed because the module is still discoverable from act_base before the pernet initialization is completed. You can see from the trap frame that the first two arguments to tcf_hash_check were 0. It couldn't look up the correct per-subsystem pointer because the id hadn't yet been registered. > diff --git a/net/sched/act_api.c b/net/sched/act_api.c > index d09d068..4aac846 100644 > --- a/net/sched/act_api.c > +++ b/net/sched/act_api.c > int tcf_register_action(struct tc_action_ops *act, > struct pernet_operations *ops) > @@ -341,22 +356,23 @@ int tcf_register_action(struct tc_action_ops *act, > if (!act->act || !act->dump || !act->init || !act->walk || !act->lookup) > return -EINVAL; > > - write_lock(&act_mod_lock); > + mutex_lock(&act_mod_lock); > list_for_each_entry(a, &act_base, head) { > if (act->type == a->type || (strcmp(act->kind, a->kind) == 0)) { > - write_unlock(&act_mod_lock); > + mutex_unlock(&act_mod_lock); > return -EEXIST; > } > } > - list_add_tail(&act->head, &act_base); > - write_unlock(&act_mod_lock); > + list_add_tail_rcu(&act->head, &act_base); > > ret = register_pernet_subsys(ops); >
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
Hi Cong, Thanks for the feedback. On Mon, Oct 03, 2016 at 11:22:33AM -0700, Cong Wang wrote: > On Sat, Oct 1, 2016 at 8:13 PM, Krister Johansen > wrote: > > A tc_action_ops structure is visibile as soon as it is placed in the > > act_base list. When tcf_regsiter_action adds an item to this list and > > drops act_mod_lock, registration is not complete until > > register_pernet_subsys() finishes. > > Hmm, good catch, but does the fix have to be so complicated? There were two reasons that the patch I submitted was more complicated than your proposal. The first is simply my own lack of knowledge. I didn't see many other net subsystems that held locks across the call to register_pernet_subsys(). I avoided doing so out of caution / paranoia. The other reason for blocking if a register_pernet_subsys() was already pending is the behavior of this code when the lookup fails. The code in tcf_action_init_1() calls request_module() when tc_lookup_action_n() fails. In the cases that I observed, this could lead to hundreds modprobe processes running for essentially the same few modules. Only one of these calls will succeed. Since the call to request_module() will sleep until the modprobe process exits, it didn't seem unreasonable to block other threads in the same code path. Instead of blocking on a redundant modprobe call, it blocks pending the completion of a modprobe that's already in progress. I admit that the patch I submitted didn't close this window entirely, but in the tests that I ran I was able to see the number of concurrent modprobe processes go from dozens down to just a few. > How about moving register_pernet_subsys() under act_mod_lock? > Similar is needed for unregister too of course. This also means > we need to convert act_mod_lock to a mutex which allows blocking. > Fortunately, we don't have to take act_mod_lock in any atomic context. If it's permissible to hold act_mod_lock across the call to register_pernet_subsys, then perhaps this could instead be simplified to use mutex_lock_interruptible() instead of RCU locking. The blocking lock would prevent other operations from triggering a modprobe until the outstanding load completes. However, the downside is that any request_module() would block all other lookups. My attempt to get around that problem was to record on the action ops whether a pernet operation was in progress. > Please try the attached patch. I also convert the read path to RCU > to avoid a possible deadlock. A quick test shows no lockdep splat. I'll give it a try, but it may take me a few days to report back with results. In the meantime, let's try to reach consensus on an acceptable solution. Thanks again, -K
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Sun, Oct 02, 2016 at 09:18:06PM -0400, Jamal Hadi Salim wrote: > On 16-10-01 11:13 PM, Krister Johansen wrote: > >A tc_action_ops structure is visibile as soon as it is placed in the > >act_base list. When tcf_regsiter_action adds an item to this list and > >drops act_mod_lock, registration is not complete until > >register_pernet_subsys() finishes. > > > >If two threads attempt to modify a tc action in a way that triggers a > >module load, the thread that wins the race ends up defeferencing a NULL > >pointer after tcf_action_init_1() invokes a_o->init(). In the > >particular case that this submitter encountered, the panic occurred in > >tcf_gact_init() when net_generic() returned a NULL tc_action_net > >pointer. The gact_net_id needed to fetch the correct pointer was not > >yet set, because the register_pernet_subsys() call was pending in > >another thread. > > > >Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc > >actions") > >Signed-off-by: Krister Johansen > > Looks reasonable to me but will let Cong a closer look since he added > that code. Thanks, I appreicate you taking a look. I'll follow up with Cong. -K
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On Sat, Oct 1, 2016 at 8:13 PM, Krister Johansen wrote: > A tc_action_ops structure is visibile as soon as it is placed in the > act_base list. When tcf_regsiter_action adds an item to this list and > drops act_mod_lock, registration is not complete until > register_pernet_subsys() finishes. Hmm, good catch, but does the fix have to be so complicated? How about moving register_pernet_subsys() under act_mod_lock? Similar is needed for unregister too of course. This also means we need to convert act_mod_lock to a mutex which allows blocking. Fortunately, we don't have to take act_mod_lock in any atomic context. Please try the attached patch. I also convert the read path to RCU to avoid a possible deadlock. A quick test shows no lockdep splat. Thanks! diff --git a/net/sched/act_api.c b/net/sched/act_api.c index d09d068..4aac846 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -330,7 +330,22 @@ void tcf_hashinfo_destroy(const struct tc_action_ops *ops, EXPORT_SYMBOL(tcf_hashinfo_destroy); static LIST_HEAD(act_base); -static DEFINE_RWLOCK(act_mod_lock); +static DEFINE_MUTEX(act_mod_lock); + +static int __tcf_unregister_action(struct tc_action_ops *act) +{ + struct tc_action_ops *a; + int err = -ENOENT; + + list_for_each_entry(a, &act_base, head) { + if (a == act) { + list_del_rcu(&act->head); + err = 0; + break; + } + } + return err; +} int tcf_register_action(struct tc_action_ops *act, struct pernet_operations *ops) @@ -341,22 +356,23 @@ int tcf_register_action(struct tc_action_ops *act, if (!act->act || !act->dump || !act->init || !act->walk || !act->lookup) return -EINVAL; - write_lock(&act_mod_lock); + mutex_lock(&act_mod_lock); list_for_each_entry(a, &act_base, head) { if (act->type == a->type || (strcmp(act->kind, a->kind) == 0)) { - write_unlock(&act_mod_lock); + mutex_unlock(&act_mod_lock); return -EEXIST; } } - list_add_tail(&act->head, &act_base); - write_unlock(&act_mod_lock); + list_add_tail_rcu(&act->head, &act_base); ret = register_pernet_subsys(ops); if (ret) { - tcf_unregister_action(act, ops); + __tcf_unregister_action(act); + mutex_unlock(&act_mod_lock); return ret; } + mutex_unlock(&act_mod_lock); return 0; } EXPORT_SYMBOL(tcf_register_action); @@ -364,20 +380,13 @@ EXPORT_SYMBOL(tcf_register_action); int tcf_unregister_action(struct tc_action_ops *act, struct pernet_operations *ops) { - struct tc_action_ops *a; int err = -ENOENT; + mutex_lock(&act_mod_lock); unregister_pernet_subsys(ops); + err = __tcf_unregister_action(act); + mutex_unlock(&act_mod_lock); - write_lock(&act_mod_lock); - list_for_each_entry(a, &act_base, head) { - if (a == act) { - list_del(&act->head); - err = 0; - break; - } - } - write_unlock(&act_mod_lock); return err; } EXPORT_SYMBOL(tcf_unregister_action); @@ -388,15 +397,15 @@ static struct tc_action_ops *tc_lookup_action_n(char *kind) struct tc_action_ops *a, *res = NULL; if (kind) { - read_lock(&act_mod_lock); - list_for_each_entry(a, &act_base, head) { + rcu_read_lock(); + list_for_each_entry_rcu(a, &act_base, head) { if (strcmp(kind, a->kind) == 0) { if (try_module_get(a->owner)) res = a; break; } } - read_unlock(&act_mod_lock); + rcu_read_unlock(); } return res; } @@ -407,15 +416,15 @@ static struct tc_action_ops *tc_lookup_action(struct nlattr *kind) struct tc_action_ops *a, *res = NULL; if (kind) { - read_lock(&act_mod_lock); - list_for_each_entry(a, &act_base, head) { + rcu_read_lock(); + list_for_each_entry_rcu(a, &act_base, head) { if (nla_strcmp(kind, a->kind) == 0) { if (try_module_get(a->owner)) res = a; break; } } - read_unlock(&act_mod_lock); + rcu_read_unlock(); } return res; }
Re: [PATCH net] Panic when tc_lookup_action_n finds a partially initialized action.
On 16-10-01 11:13 PM, Krister Johansen wrote: A tc_action_ops structure is visibile as soon as it is placed in the act_base list. When tcf_regsiter_action adds an item to this list and drops act_mod_lock, registration is not complete until register_pernet_subsys() finishes. If two threads attempt to modify a tc action in a way that triggers a module load, the thread that wins the race ends up defeferencing a NULL pointer after tcf_action_init_1() invokes a_o->init(). In the particular case that this submitter encountered, the panic occurred in tcf_gact_init() when net_generic() returned a NULL tc_action_net pointer. The gact_net_id needed to fetch the correct pointer was not yet set, because the register_pernet_subsys() call was pending in another thread. Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions") Signed-off-by: Krister Johansen Looks reasonable to me but will let Cong a closer look since he added that code. cheers, jamal