Re: [PATCH v2] net/sock: Update sk rcu iterator macro.
On Mon, Oct 23, 2017 at 11:39:22AM -0400, Tim Hansen wrote: >Mark hlist nodes in sk rcu iterator as protected by the rcu. >hlist_next_rcu accomplishes this and silences the warnings >sparse throws for net/ipv4/udp.c and net/ipv6/udp.c. > >Found with make C=1 net/ipv4/udp.o on linux-next tag >next-20171009. > >Signed-off-by: Tim Hansen>--- > include/net/sock.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > >diff --git a/include/net/sock.h b/include/net/sock.h >index 64e5ac41b9cf..96cb7b7e4924 100644 >--- a/include/net/sock.h >+++ b/include/net/sock.h >@@ -737,10 +737,10 @@ static inline void sk_add_bind_node(struct sock *sk, > * > */ > #define sk_for_each_entry_offset_rcu(tpos, pos, head, offset)\ >- for (pos = rcu_dereference((head)->first); \ >+ for (pos = rcu_dereference(hlist_next_rcu((head)->first)); \ See below. >pos != NULL &&\ > ({ tpos = (typeof(*tpos) *)((void *)pos - offset); 1;}); \ >- pos = rcu_dereference(pos->next)) >+ pos = rcu_dereference(hlist_next_rcu(pos->next))) This will return the next-next node instead of just the next one. You probably want just hlist_next_rcu(pos) here. -- Thanks, Sasha
Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
On Mon, Oct 09, 2017 at 04:23:57PM -0700, Alexei Starovoitov wrote: >On Mon, Oct 09, 2017 at 11:15:40PM +0000, Levin, Alexander (Sasha Levin) wrote: >> On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote: >> >On Mon, Oct 09, 2017 at 08:26:34PM +, Levin, Alexander (Sasha Levin) >> >wrote: >> >> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote: >> >> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote: >> >> >> Fix BUG() calls to use BUG_ON(conditional) macros. >> >> >> >> >> >> This was found using make coccicheck M=net/core on linux next >> >> >> tag next-2017092 >> >> >> >> >> >> Signed-off-by: Tim Hansen <devtimhan...@gmail.com> >> >> >> --- >> >> >> net/core/skbuff.c | 15 ++- >> >> >> 1 file changed, 6 insertions(+), 9 deletions(-) >> >> >> >> >> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >> >> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644 >> >> >> --- a/net/core/skbuff.c >> >> >> +++ b/net/core/skbuff.c >> >> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff >> >> >> *skb, gfp_t gfp_mask) >> >> >>/* Set the tail pointer and length */ >> >> >>skb_put(n, skb->len); >> >> >> >> >> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + >> >> >> skb->len)) >> >> >> - BUG(); >> >> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + >> >> >> skb->len)); >> >> > >> >> >I'm concerned with this change. >> >> >1. Calling non-trivial bit of code inside the macro is a poor coding >> >> >style (imo) >> >> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and >> >> >implementation >> >> >of BUG and BUG_ON look quite different. >> >> >> >> For these archs, wouldn't it then be more efficient to use BUG_ON rather >> >> than BUG()? >> > >> >why more efficient? any data to prove that? >> >> Just guessing. >> >> Either way, is there a particular reason for not using BUG_ON() here >> besides that it's implementation is "quite different"? >> >> >I'm pointing that the change is not equivalent and >> >this code has been around forever (pre-git days), so I see >> >no reason to risk changing it. >> >> Do you know that BUG_ON() is broken on any archs? >> >> If not, "this code has been around forever" is really not an excuse to >> not touch code. >> >> If BUG_ON() behavior is broken somewhere, then it needs to get fixed. > >no idea whether it's broken. My main objection is #1. >imo it's a very poor coding style to put functions with >side-effects into macros. Especially debug/bug/warn-like. This, however, seems to be an accepted coding style in the net/ subsys. Look a few lines lower in the very same file to find: BUG_ON(skb_copy_bits(from, 0, skb_put(to, len), len)); Side effects ahoy ;) >For example llvm has DEBUG() macro and everything inside >will disappear depending on compilation flags. >I wouldn't be surprised if somebody for the name >of security (to avoid crash on BUG_ON) will replace >BUG/BUG_ON with some other implementation or nop >and will have real bugs, since skb_copy_bits() is somehow >not called or called in different context. This was already discussed, with the conclusion that BUG() can never be disabled, since, as you described, it'll lead to very catastrophic results. See i.e.: commit b06dd879f5db33c1d7f5ab516ea671627f99c0c9 Author: Josh Triplett <j...@joshtriplett.org> Date: Mon Apr 7 15:39:14 2014 -0700 x86: always define BUG() and HAVE_ARCH_BUG, even with !CONFIG_BUG Anyway, as you said, this boils down to coding style nitpicking. I guess that my only comment here would be that it shpid go one way or the other and we document the decision somewhere (either per subsys, or for the entire tree). -- Thanks, Sasha
Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
On Mon, Oct 09, 2017 at 04:06:20PM -0700, Alexei Starovoitov wrote: >On Mon, Oct 09, 2017 at 08:26:34PM +0000, Levin, Alexander (Sasha Levin) wrote: >> On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote: >> >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote: >> >> Fix BUG() calls to use BUG_ON(conditional) macros. >> >> >> >> This was found using make coccicheck M=net/core on linux next >> >> tag next-2017092 >> >> >> >> Signed-off-by: Tim Hansen <devtimhan...@gmail.com> >> >> --- >> >> net/core/skbuff.c | 15 ++- >> >> 1 file changed, 6 insertions(+), 9 deletions(-) >> >> >> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >> >> index d98c2e3ce2bf..34ce4c1a0f3c 100644 >> >> --- a/net/core/skbuff.c >> >> +++ b/net/core/skbuff.c >> >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, >> >> gfp_t gfp_mask) >> >> /* Set the tail pointer and length */ >> >> skb_put(n, skb->len); >> >> >> >> - if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len)) >> >> - BUG(); >> >> + BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len)); >> > >> >I'm concerned with this change. >> >1. Calling non-trivial bit of code inside the macro is a poor coding style >> >(imo) >> >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and >> >implementation >> >of BUG and BUG_ON look quite different. >> >> For these archs, wouldn't it then be more efficient to use BUG_ON rather >> than BUG()? > >why more efficient? any data to prove that? Just guessing. Either way, is there a particular reason for not using BUG_ON() here besides that it's implementation is "quite different"? >I'm pointing that the change is not equivalent and >this code has been around forever (pre-git days), so I see >no reason to risk changing it. Do you know that BUG_ON() is broken on any archs? If not, "this code has been around forever" is really not an excuse to not touch code. If BUG_ON() behavior is broken somewhere, then it needs to get fixed. -- Thanks, Sasha
Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.
On Mon, Oct 09, 2017 at 10:15:42AM -0700, Alexei Starovoitov wrote: >On Mon, Oct 09, 2017 at 11:37:59AM -0400, Tim Hansen wrote: >> Fix BUG() calls to use BUG_ON(conditional) macros. >> >> This was found using make coccicheck M=net/core on linux next >> tag next-2017092 >> >> Signed-off-by: Tim Hansen>> --- >> net/core/skbuff.c | 15 ++- >> 1 file changed, 6 insertions(+), 9 deletions(-) >> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >> index d98c2e3ce2bf..34ce4c1a0f3c 100644 >> --- a/net/core/skbuff.c >> +++ b/net/core/skbuff.c >> @@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, >> gfp_t gfp_mask) >> /* Set the tail pointer and length */ >> skb_put(n, skb->len); >> >> -if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len)) >> -BUG(); >> +BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len)); > >I'm concerned with this change. >1. Calling non-trivial bit of code inside the macro is a poor coding style >(imo) >2. BUG_ON != BUG. Some archs like mips and ppc have HAVE_ARCH_BUG_ON and >implementation >of BUG and BUG_ON look quite different. For these archs, wouldn't it then be more efficient to use BUG_ON rather than BUG()? -- Thanks, Sasha
[PATCH v3] net: inet: diag: expose sockets cgroup classid
From: "Levin, Alexander (Sasha Levin)" <alexander.le...@one.verizon.com> This is useful for directly looking up a task based on class id rather than having to scan through all open file descriptors. Signed-off-by: Sasha Levin <alexander.le...@verizon.com> --- include/uapi/linux/inet_diag.h | 1 + net/ipv4/inet_diag.c | 11 +++ 2 files changed, 12 insertions(+) diff --git a/include/uapi/linux/inet_diag.h b/include/uapi/linux/inet_diag.h index bbe201047df6..678496897a68 100644 --- a/include/uapi/linux/inet_diag.h +++ b/include/uapi/linux/inet_diag.h @@ -142,6 +142,7 @@ enum { INET_DIAG_PAD, INET_DIAG_MARK, INET_DIAG_BBRINFO, + INET_DIAG_CLASS_ID, __INET_DIAG_MAX, }; diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index 3828b3a805cd..67325d5832d7 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -274,6 +274,17 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk, goto errout; } + if (ext & (1 << (INET_DIAG_CLASS_ID - 1))) { + u32 classid = 0; + +#ifdef CONFIG_SOCK_CGROUP_DATA + classid = sock_cgroup_classid(>sk_cgrp_data); +#endif + + if (nla_put_u32(skb, INET_DIAG_CLASS_ID, classid)) + goto errout; + } + out: nlmsg_end(skb, nlh); return 0; -- 2.11.0
Re: [PATCH v2] net: inet: diag: expose sockets cgroup classid
On Wed, Aug 16, 2017 at 01:15:57PM -0700, Cong Wang wrote: >On Wed, Aug 16, 2017 at 1:13 PM, Levin, Alexander (Sasha Levin) ><alexander.le...@verizon.com> wrote: >> Ping? > >I guess you missed the comment saying you need to check >the return value of nla_put_u32(). Indeed I did. Odd, I don't see the mail locally, but I can find it in patchwork I'll send a v3. -- Thanks, Sasha
Re: [PATCH v2] net: inet: diag: expose sockets cgroup classid
Ping? On Thu, Jul 27, 2017 at 02:13:52PM -0400, Sasha Levin wrote: >This is useful for directly looking up a task based on class id rather than >having to scan through all open file descriptors. > >Signed-off-by: Sasha Levin>--- > >Changes in V2: > - Addressed comments from Cong Wang (use nla_put_u32()) > > include/uapi/linux/inet_diag.h | 1 + > net/ipv4/inet_diag.c | 10 ++ > 2 files changed, 11 insertions(+) > >diff --git a/include/uapi/linux/inet_diag.h b/include/uapi/linux/inet_diag.h >index bbe201047df6..678496897a68 100644 >--- a/include/uapi/linux/inet_diag.h >+++ b/include/uapi/linux/inet_diag.h >@@ -142,6 +142,7 @@ enum { > INET_DIAG_PAD, > INET_DIAG_MARK, > INET_DIAG_BBRINFO, >+ INET_DIAG_CLASS_ID, > __INET_DIAG_MAX, > }; > >diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c >index 3828b3a805cd..2c2445d4bb58 100644 >--- a/net/ipv4/inet_diag.c >+++ b/net/ipv4/inet_diag.c >@@ -274,6 +274,16 @@ int inet_sk_diag_fill(struct sock *sk, struct >inet_connection_sock *icsk, > goto errout; > } > >+ if (ext & (1 << (INET_DIAG_CLASS_ID - 1))) { >+ u32 classid = 0; >+ >+#ifdef CONFIG_SOCK_CGROUP_DATA >+ classid = sock_cgroup_classid(>sk_cgrp_data); >+#endif >+ >+ nla_put_u32(skb, INET_DIAG_CLASS_ID, classid); >+ } >+ > out: > nlmsg_end(skb, nlh); > return 0; >-- >2.11.0 -- Thanks, Sasha
Re: [net-next PATCH 11/12] net: add notifier hooks for devmap bpf map
On Mon, Jul 17, 2017 at 09:30:02AM -0700, John Fastabend wrote: >@@ -341,9 +368,11 @@ static int dev_map_update_elem(struct bpf_map *map, void >*key, void *value, >* Remembering the driver side flush operation will happen before the >* net device is removed. >*/ >+ mutex_lock(_map_list_mutex); > old_dev = xchg(>netdev_map[i], dev); > if (old_dev) > call_rcu(_dev->rcu, __dev_map_entry_free); >+ mutex_unlock(_map_list_mutex); > > return 0; > } This function gets called under rcu critical section, where we can't grab mutexes: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747 in_atomic(): 1, irqs_disabled(): 0, pid: 16315, name: syz-executor1 1 lock held by syz-executor1/16315: #0: (rcu_read_lock){..}, at: [] map_delete_elem kernel/bpf/syscall.c:577 [inline] #0: (rcu_read_lock){..}, at: [] SYSC_bpf kernel/bpf/syscall.c:1427 [inline] #0: (rcu_read_lock){..}, at: [] SyS_bpf+0x1d32/0x4ba0 kernel/bpf/syscall.c:1388 Preemption disabled at: [] map_delete_elem kernel/bpf/syscall.c:582 [inline] [] SYSC_bpf kernel/bpf/syscall.c:1427 [inline] [] SyS_bpf+0x1d41/0x4ba0 kernel/bpf/syscall.c:1388 CPU: 2 PID: 16315 Comm: syz-executor1 Not tainted 4.13.0-rc2-next-20170727 #235 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x11d/0x1e5 lib/dump_stack.c:52 ___might_sleep+0x3cc/0x520 kernel/sched/core.c:6001 __might_sleep+0x95/0x190 kernel/sched/core.c:5954 __mutex_lock_common kernel/locking/mutex.c:747 [inline] __mutex_lock+0x146/0x19b0 kernel/locking/mutex.c:893 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 dev_map_delete_elem+0x82/0x110 kernel/bpf/devmap.c:325 map_delete_elem kernel/bpf/syscall.c:585 [inline] SYSC_bpf kernel/bpf/syscall.c:1427 [inline] SyS_bpf+0x1deb/0x4ba0 kernel/bpf/syscall.c:1388 do_syscall_64+0x26a/0x800 arch/x86/entry/common.c:287 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x452309 RSP: 002b:7f8d83d66c08 EFLAGS: 0216 ORIG_RAX: 0141 RAX: ffda RBX: 00718000 RCX: 00452309 RDX: 0010 RSI: 20007000 RDI: 0003 RBP: 0270 R08: R09: R10: R11: 0216 R12: 004b85e4 R13: R14: 0003 R15: 20007000 -- Thanks, Sasha
[PATCH v2] net: inet: diag: expose sockets cgroup classid
This is useful for directly looking up a task based on class id rather than having to scan through all open file descriptors. Signed-off-by: Sasha Levin--- Changes in V2: - Addressed comments from Cong Wang (use nla_put_u32()) include/uapi/linux/inet_diag.h | 1 + net/ipv4/inet_diag.c | 10 ++ 2 files changed, 11 insertions(+) diff --git a/include/uapi/linux/inet_diag.h b/include/uapi/linux/inet_diag.h index bbe201047df6..678496897a68 100644 --- a/include/uapi/linux/inet_diag.h +++ b/include/uapi/linux/inet_diag.h @@ -142,6 +142,7 @@ enum { INET_DIAG_PAD, INET_DIAG_MARK, INET_DIAG_BBRINFO, + INET_DIAG_CLASS_ID, __INET_DIAG_MAX, }; diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index 3828b3a805cd..2c2445d4bb58 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -274,6 +274,16 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk, goto errout; } + if (ext & (1 << (INET_DIAG_CLASS_ID - 1))) { + u32 classid = 0; + +#ifdef CONFIG_SOCK_CGROUP_DATA + classid = sock_cgroup_classid(>sk_cgrp_data); +#endif + + nla_put_u32(skb, INET_DIAG_CLASS_ID, classid); + } + out: nlmsg_end(skb, nlh); return 0; -- 2.11.0
Re: [PATCH] net: inet: diag: expose sockets cgroup classid
On Wed, Jul 26, 2017 at 03:01:32PM -0700, Cong Wang wrote: >On Wed, Jul 26, 2017 at 10:22 AM, Levin, Alexander (Sasha Levin) ><alexander.le...@verizon.com> wrote: >> + if (ext & (1 << (INET_DIAG_CLASS_ID - 1))) { >> + u32 classid = 0; >> + >> +#ifdef CONFIG_SOCK_CGROUP_DATA >> + classid = sock_cgroup_classid(>sk_cgrp_data); >> +#endif > > >If CONFIG_SOCK_CGROUP_DATA is not enabled, should we put 0 >or put nothing (that is, skipping this nla_put())? My logic was that if CONFIG_SOCK_CGROUP_DATA is disabled, then all sockets have the same default classid ==> 0, rather than having to deal with whether that config is enabled or not. >> + >> + nla_put(skb, INET_DIAG_CLASS_ID, sizeof(classid), ); > >nla_put_u32() Will fix, thanks! -- Thanks, Sasha
[PATCH] net: inet: diag: expose sockets cgroup classid
This is useful for directly looking up a task based on class id rather than having to scan through all open file descriptors. Signed-off-by: Sasha Levin--- include/uapi/linux/inet_diag.h | 1 + net/ipv4/inet_diag.c | 10 ++ 2 files changed, 11 insertions(+) diff --git a/include/uapi/linux/inet_diag.h b/include/uapi/linux/inet_diag.h index bbe201047df6..678496897a68 100644 --- a/include/uapi/linux/inet_diag.h +++ b/include/uapi/linux/inet_diag.h @@ -142,6 +142,7 @@ enum { INET_DIAG_PAD, INET_DIAG_MARK, INET_DIAG_BBRINFO, + INET_DIAG_CLASS_ID, __INET_DIAG_MAX, }; diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index 3828b3a805cd..ffc6dad9780a 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -274,6 +274,16 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk, goto errout; } + if (ext & (1 << (INET_DIAG_CLASS_ID - 1))) { + u32 classid = 0; + +#ifdef CONFIG_SOCK_CGROUP_DATA + classid = sock_cgroup_classid(>sk_cgrp_data); +#endif + + nla_put(skb, INET_DIAG_CLASS_ID, sizeof(classid), ); + } + out: nlmsg_end(skb, nlh); return 0; -- 2.11.0
[PATCH] wireless: wext: terminate ifr name coming from userspace
ifr name is assumed to be a valid string by the kernel, but nothing was forcing username to pass a valid string. In turn, this would cause panics as we tried to access the string past it's valid memory. Signed-off-by: Sasha Levin--- net/core/dev_ioctl.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) index 82fd4c9c4a1b..7657ad6bc13d 100644 --- a/net/core/dev_ioctl.c +++ b/net/core/dev_ioctl.c @@ -424,6 +424,8 @@ int dev_ioctl(struct net *net, unsigned int cmd, void __user *arg) if (copy_from_user(, arg, sizeof(iwr))) return -EFAULT; + iwr.ifr_name[sizeof(iwr.ifr_name) - 1] = 0; + return wext_handle_ioctl(net, , cmd, arg); } -- 2.11.0
Re: [PATCH 00/17] v3 net generic subsystem refcount conversions
On Mon, Jul 03, 2017 at 02:28:56AM -0700, Eric Dumazet wrote: >On Fri, 2017-06-30 at 13:07 +0300, Elena Reshetova wrote: >> Changes in v3: >> Rebased on top of the net-next tree. >> >> Changes in v2: >> No changes in patches apart from rebases, but now by >> default refcount_t = atomic_t (*) and uses all atomic standard operations >> unless CONFIG_REFCOUNT_FULL is enabled. This is a compromise for the >> systems that are critical on performance (such as net) and cannot accept even >> slight delay on the refcounter operations. >> >> This series, for core network subsystem components, replaces atomic_t >> reference >> counters with the new refcount_t type and API (see include/linux/refcount.h). >> By doing this we prevent intentional or accidental >> underflows or overflows that can led to use-after-free vulnerabilities. >> These patches contain only generic net pieces. Other changes will be sent >> separately. >> >> The patches are fully independent and can be cherry-picked separately. >> The big patches, such as conversions for sock structure, need a very detailed >> look from maintainers: refcount managing is quite complex in them and while >> it seems that they would benefit from the change, extra checking is needed. >> The biggest corner issue is the fact that refcount_inc() does not increment >> from zero. >> >> If there are no objections to the patches, please merge them via respective >> trees. >> >> * The respective change is currently merged into -next as >> "locking/refcount: Create unchecked atomic_t implementation". >> >> Elena Reshetova (17): >> net: convert inet_peer.refcnt from atomic_t to refcount_t >> net: convert neighbour.refcnt from atomic_t to refcount_t >> net: convert neigh_params.refcnt from atomic_t to refcount_t >> net: convert nf_bridge_info.use from atomic_t to refcount_t >> net: convert sk_buff.users from atomic_t to refcount_t >> net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t >> net: convert sock.sk_wmem_alloc from atomic_t to refcount_t >> net: convert sock.sk_refcnt from atomic_t to refcount_t >> net: convert ip_mc_list.refcnt from atomic_t to refcount_t >> net: convert in_device.refcnt from atomic_t to refcount_t >> net: convert netpoll_info.refcnt from atomic_t to refcount_t >> net: convert unix_address.refcnt from atomic_t to refcount_t >> net: convert fib_rule.refcnt from atomic_t to refcount_t >> net: convert inet_frag_queue.refcnt from atomic_t to refcount_t >> net: convert net.passive from atomic_t to refcount_t >> net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t >> net: convert packet_fanout.sk_ref from atomic_t to refcount_t > > >Can you take a look at this please ? > >[ 64.601749] [ cut here ] >[ 64.601757] WARNING: CPU: 0 PID: 6476 at lib/refcount.c:184 >refcount_sub_and_test+0x75/0xa0 >[ 64.601758] Modules linked in: w1_therm wire cdc_acm ehci_pci ehci_hcd >mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core >[ 64.601769] CPU: 0 PID: 6476 Comm: ip Tainted: GW >4.12.0-smp-DEV #274 >[ 64.601770] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016 >[ 64.601771] task: 8837bf482040 task.stack: 8837bdc08000 >[ 64.601773] RIP: 0010:refcount_sub_and_test+0x75/0xa0 >[ 64.601774] RSP: 0018:8837bdc0f5c0 EFLAGS: 00010286 >[ 64.601776] RAX: 0026 RBX: 0001 RCX: > >[ 64.601777] RDX: 0026 RSI: 0096 RDI: >ed06f7b81eae >[ 64.601778] RBP: 8837bdc0f5d0 R08: 0004 R09: >fbfff4a54c25 >[ 64.601779] R10: cbc500e5 R11: a52a6128 R12: >881febcf6f24 >[ 64.601779] R13: 881fbf4eaf00 R14: 881febcf6f80 R15: >8837d7a4ed00 >[ 64.601781] FS: 7ff5a2f6b700() GS:881fff80() >knlGS: >[ 64.601782] CS: 0010 DS: ES: CR0: 80050033 >[ 64.601783] CR2: 7ffcdc70d000 CR3: 001f9c91e000 CR4: >001406f0 >[ 64.601783] Call Trace: >[ 64.601786] refcount_dec_and_test+0x11/0x20 >[ 64.601790] fib_nl_delrule+0xc39/0x1630 [snip] I'm seeing a similar one coming from sctp: refcount_t: underflow; use-after-free. [ cut here ] WARNING: CPU: 3 PID: 15570 at lib/refcount.c:186 refcount_sub_and_test.cold.13+0x18/0x21 lib/refcount.c:186 Kernel panic - not syncing: panic_on_warn set ... CPU: 3 PID: 15570 Comm: syz-executor0 Not tainted 4.12.0-next-20170706+ #186 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x11d/0x1ef lib/dump_stack.c:52 panic+0x1bc/0x3ad kernel/panic.c:180 __warn.cold.6+0x2f/0x2f kernel/panic.c:541 report_bug+0x20d/0x2d0 lib/bug.c:183 fixup_bug+0x3f/0x90 arch/x86/kernel/traps.c:190 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline] do_trap+0x132/0x390 arch/x86/kernel/traps.c:273
Re: [PATCH v3 net-next 1/4] tcp: ULP infrastructure
On Mon, Jun 26, 2017 at 07:30:19AM -0700, Dave Watson wrote: >On 06/25/17 02:42 AM, Levin, Alexander (Sasha Levin) wrote: >> On Wed, Jun 14, 2017 at 11:37:14AM -0700, Dave Watson wrote: >> >Add the infrustructure for attaching Upper Layer Protocols (ULPs) over TCP >> >sockets. Based on a similar infrastructure in tcp_cong. The idea is that >> >any >> >ULP can add its own logic by changing the TCP proto_ops structure to its own >> >methods. >> > >> >Example usage: >> > >> >setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls")); >> > >> >modules will call: >> >tcp_register_ulp(_tls_ulp_ops); >> > >> >to register/unregister their ulp, with an init function and name. >> > >> >A list of registered ulps will be returned by tcp_get_available_ulp, which >> >is >> >hooked up to /proc. Example: >> > >> >$ cat /proc/sys/net/ipv4/tcp_available_ulp >> >tls >> > >> >There is currently no functionality to remove or chain ULPs, but >> >it should be possible to add these in the future if needed. >> > >> >Signed-off-by: Boris Pismenny <bor...@mellanox.com> >> >Signed-off-by: Dave Watson <davejwat...@fb.com> >> >> Hey Dave, >> >> I'm seeing the following while fuzzing, which was bisected to this commit: >> >> == >> BUG: KASAN: null-ptr-deref in copy_to_user include/linux/uaccess.h:168 >> [inline] >> BUG: KASAN: null-ptr-deref in do_tcp_getsockopt.isra.33+0x24f/0x1e30 >> net/ipv4/tcp.c:3057 >> Read of size 4 at addr 0020 by task syz-executor1/15452 > >At a glance, this looks like it was fixed already by > >https://www.mail-archive.com/netdev@vger.kernel.org/msg175226.html > >Can you recheck with that patch, or verify that you already have it? >Thanks. I've already tried this patch, it doesn't fix the issue I've reported. -- Thanks, Sasha
Re: [PATCH v3 net-next 1/4] tcp: ULP infrastructure
On Wed, Jun 14, 2017 at 11:37:14AM -0700, Dave Watson wrote: >Add the infrustructure for attaching Upper Layer Protocols (ULPs) over TCP >sockets. Based on a similar infrastructure in tcp_cong. The idea is that any >ULP can add its own logic by changing the TCP proto_ops structure to its own >methods. > >Example usage: > >setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls")); > >modules will call: >tcp_register_ulp(_tls_ulp_ops); > >to register/unregister their ulp, with an init function and name. > >A list of registered ulps will be returned by tcp_get_available_ulp, which is >hooked up to /proc. Example: > >$ cat /proc/sys/net/ipv4/tcp_available_ulp >tls > >There is currently no functionality to remove or chain ULPs, but >it should be possible to add these in the future if needed. > >Signed-off-by: Boris Pismenny>Signed-off-by: Dave Watson Hey Dave, I'm seeing the following while fuzzing, which was bisected to this commit: == BUG: KASAN: null-ptr-deref in copy_to_user include/linux/uaccess.h:168 [inline] BUG: KASAN: null-ptr-deref in do_tcp_getsockopt.isra.33+0x24f/0x1e30 net/ipv4/tcp.c:3057 Read of size 4 at addr 0020 by task syz-executor1/15452 CPU: 0 PID: 15452 Comm: syz-executor1 Not tainted 4.12.0-rc6-next-20170623+ #173 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x11d/0x1e5 lib/dump_stack.c:52 kasan_report_error mm/kasan/report.c:349 [inline] kasan_report+0x15e/0x370 mm/kasan/report.c:408 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x14b/0x1a0 mm/kasan/kasan.c:267 kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272 copy_to_user include/linux/uaccess.h:168 [inline] do_tcp_getsockopt.isra.33+0x24f/0x1e30 net/ipv4/tcp.c:3057 tcp_getsockopt+0xb0/0xd0 net/ipv4/tcp.c:3194 sock_common_getsockopt+0x95/0xd0 net/core/sock.c:2863 SYSC_getsockopt net/socket.c:1869 [inline] SyS_getsockopt+0x180/0x360 net/socket.c:1851 do_syscall_64+0x267/0x740 arch/x86/entry/common.c:284 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x451759 RSP: 002b:7f5dc2b1fc08 EFLAGS: 0216 ORIG_RAX: 0037 RAX: ffda RBX: 00718000 RCX: 00451759 RDX: 001f RSI: 0006 RDI: 0005 RBP: 0c30 R08: 207bf000 R09: R10: 2ffc R11: 0216 R12: 004b824b R13: R14: 0005 R15: 0006 == Disabling lock debugging due to kernel taint Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 15452 Comm: syz-executor1 Tainted: GB 4.12.0-rc6-next-20170623+ #173 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x11d/0x1e5 lib/dump_stack.c:52 panic+0x1bc/0x3ad kernel/panic.c:180 kasan_end_report+0x47/0x4f mm/kasan/report.c:176 kasan_report_error mm/kasan/report.c:356 [inline] kasan_report+0x167/0x370 mm/kasan/report.c:408 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x14b/0x1a0 mm/kasan/kasan.c:267 kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272 copy_to_user include/linux/uaccess.h:168 [inline] do_tcp_getsockopt.isra.33+0x24f/0x1e30 net/ipv4/tcp.c:3057 tcp_getsockopt+0xb0/0xd0 net/ipv4/tcp.c:3194 sock_common_getsockopt+0x95/0xd0 net/core/sock.c:2863 SYSC_getsockopt net/socket.c:1869 [inline] SyS_getsockopt+0x180/0x360 net/socket.c:1851 do_syscall_64+0x267/0x740 arch/x86/entry/common.c:284 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x451759 RSP: 002b:7f5dc2b1fc08 EFLAGS: 0216 ORIG_RAX: 0037 RAX: ffda RBX: 00718000 RCX: 00451759 RDX: 001f RSI: 0006 RDI: 0005 RBP: 0c30 R08: 207bf000 R09: R10: 2ffc R11: 0216 R12: 004b824b R13: R14: 0005 R15: 0006 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: 0x2480 from 0x8100 (relocation range: 0x8000-0xbfff) Rebooting in 86400 seconds.. -- Thanks, Sasha
net: icmp vs udp_poll race?
Hi all, On the latest linux-next I'm seeing issues that look like an icmp socket destruction racing with poll(). It manifests in two ways, first: BUG: KASAN: slab-out-of-bounds in skb_queue_empty include/linux/skbuff.h:1197 [inline] BUG: KASAN: slab-out-of-bounds in udp_poll+0x5fb/0x6f0 net/ipv4/udp.c:2443 Read of size 8 at addr 88006941a200 by task syz-executor5/9052 CPU: 2 PID: 9052 Comm: syz-executor5 Not tainted 4.12.0-rc3-next-20170601+ #47 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x115/0x1d1 lib/dump_stack.c:52 print_address_description+0xe7/0x370 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x1b0/0x450 mm/kasan/report.c:408 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:429 skb_queue_empty include/linux/skbuff.h:1197 [inline] udp_poll+0x5fb/0x6f0 net/ipv4/udp.c:2443 sock_poll+0x169/0x410 net/socket.c:1101 do_pollfd fs/select.c:825 [inline] do_poll fs/select.c:875 [inline] do_sys_poll+0x7a7/0x13b0 fs/select.c:969 SYSC_poll fs/select.c:1027 [inline] SyS_poll+0x106/0x460 fs/select.c:1015 do_syscall_64+0x275/0x810 arch/x86/entry/common.c:284 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x451429 RSP: 002b:7fee2df0dc08 EFLAGS: 0216 ORIG_RAX: 0007 RAX: ffda RBX: 2fb0 RCX: 00451429 RDX: 001f RSI: 000a RDI: 2fb0 RBP: 00718000 R08: R09: R10: R11: 0216 R12: R13: 000a R14: 03c4 R15: 7fee2df0e700 Allocated by task 9052: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:513 set_track mm/kasan/kasan.c:525 [inline] kasan_kmalloc+0xae/0xe0 mm/kasan/kasan.c:617 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:555 slab_post_alloc_hook mm/slab.h:456 [inline] slab_alloc_node mm/slub.c:2712 [inline] slab_alloc mm/slub.c:2720 [inline] kmem_cache_alloc+0x12f/0x610 mm/slub.c:2725 sk_prot_alloc+0x6e/0x300 net/core/sock.c:1422 sk_alloc+0x82/0x880 net/core/sock.c:1484 inet_create+0x519/0x11b0 net/ipv4/af_inet.c:318 __sock_create+0x52e/0xa50 net/socket.c:1249 sock_create net/socket.c:1289 [inline] SYSC_socket net/socket.c:1319 [inline] SyS_socket+0x105/0x260 net/socket.c:1299 do_syscall_64+0x275/0x810 arch/x86/entry/common.c:284 return_from_SYSCALL_64+0x0/0x7a Freed by task 8076: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:513 set_track mm/kasan/kasan.c:525 [inline] kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590 slab_free_hook mm/slub.c:1357 [inline] slab_free_freelist_hook mm/slub.c:1379 [inline] slab_free mm/slub.c:2955 [inline] kmem_cache_free+0xec/0x630 mm/slub.c:2977 sk_prot_free net/core/sock.c:1465 [inline] __sk_destruct+0x6a1/0xb40 net/core/sock.c:1546 sk_destruct+0x57/0xb0 net/core/sock.c:1554 __sk_free+0x62/0x260 net/core/sock.c:1562 sk_free+0x28/0x40 net/core/sock.c:1573 sock_put include/net/sock.h:1655 [inline] sk_common_release+0x241/0x3c0 net/core/sock.c:2902 ping_close+0x15/0x20 net/ipv4/ping.c:295 inet_release+0x108/0x240 net/ipv4/af_inet.c:425 sock_release+0x96/0x260 net/socket.c:597 SYSC_socketpair net/socket.c:1436 [inline] SyS_socketpair+0x522/0x710 net/socket.c:1340 do_syscall_64+0x275/0x810 arch/x86/entry/common.c:284 return_from_SYSCALL_64+0x0/0x7a The buggy address belongs to the object at 880069419c40 which belongs to the cache PING of size 1392 The buggy address is located 80 bytes to the right of 1392-byte region [880069419c40, 88006941a1b0) The buggy address belongs to the page: page:ea0001a50600 count:1 mapcount:0 mapping: (null) index:0x88006941d440 compound_mapcount: 0 flags: 0x5fffc008100(slab|head) raw: 05fffc008100 88006941d440 000100120005 raw: 88006c5ba490 88006c5ba490 88006b197c40 page dumped because: kasan: bad access detected Memory state around the buggy address: 88006941a100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 88006941a180: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc >88006941a200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ 88006941a280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc 88006941a300: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb And second: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 3 PID: 12664 Comm: syz-executor7 Not tainted 4.12.0-rc3-next-20170601+ #47 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x115/0x1d1 lib/dump_stack.c:52 register_lock_class+0x5a5/0x2ce0
Re: [PATCH] net: thunderx: correct bound check in nic_config_loopback
On 07/31/2016 12:41 PM, Sunil Kovvuri wrote: > Thanks for finding. > A much better fix would be, > > - if (lbk->vf_id > MAX_LMAC) > + if (lbk->vf_id >= nic->num_vf_en) > return -1; > > where 'num_vf_en' reflects the exact number of physical interfaces or > LMACs on the system. Right. I see quite a few more places that compare to MAX_LMAC vs num_vf_en. What was the reasoning behind it then? Thanks, Sasha
[PATCH] net: thunderx: correct bound check in nic_config_loopback
Off by one in nic_config_loopback would access an invalid arrat variable when vf id == MAX_LMAC. Signed-off-by: Sasha Levin--- drivers/net/ethernet/cavium/thunder/nic_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/cavium/thunder/nic_main.c b/drivers/net/ethernet/cavium/thunder/nic_main.c index 16ed203..a70f50d 100644 --- a/drivers/net/ethernet/cavium/thunder/nic_main.c +++ b/drivers/net/ethernet/cavium/thunder/nic_main.c @@ -615,7 +615,7 @@ static int nic_config_loopback(struct nicpf *nic, struct set_loopback *lbk) { int bgx_idx, lmac_idx; - if (lbk->vf_id > MAX_LMAC) + if (lbk->vf_id >= MAX_LMAC) return -1; bgx_idx = NIC_GET_BGX_FROM_VF_LMAC_MAP(nic->vf_lmac_map[lbk->vf_id]); -- 2.7.4