Re: [PATCH v2 bpf-next 2/9] bpf: Add bpf helper bpf_tcp_enter_cwr

2019-02-23 Thread Eric Dumazet
On 02/22/2019 05:06 PM, brakmo wrote: > From: Martin KaFai Lau > > This patch adds a new bpf helper BPF_FUNC_tcp_enter_cwr > "int bpf_tcp_enter_cwr(struct bpf_tcp_sock *tp)". > It is added to BPF_PROG_TYPE_CGROUP_SKB which can be attached > to the egress path where the bpf prog is called by >

Re: [PATCH v2 bpf-next 2/9] bpf: Add bpf helper bpf_tcp_enter_cwr

2019-02-24 Thread Eric Dumazet
On 02/23/2019 07:08 PM, Martin Lau wrote: > On Sat, Feb 23, 2019 at 05:32:14PM -0800, Eric Dumazet wrote: >> >> >> On 02/22/2019 05:06 PM, brakmo wrote: >>> From: Martin KaFai Lau >>> >>> This patch adds a new bpf helper BPF_FUNC_tcp_enter_cwr >

Re: [PATCH net] sit: use ipv6_mod_enabled to check if ipv6 is disabled

2019-02-24 Thread Eric Dumazet
On 02/24/2019 08:12 PM, Hangbin Liu wrote: > ipv6_mod_enabled() is more safe and gentle to check if ipv6 is disabled > at running time. > Why is it better exactly ? IPv6 can be enabled on the host, but disabled per device /proc/sys/net/ipv6/conf/{name}/disable_ipv6 > Fixes: 173656accaf5 (

Re: [PATCH bpf] bpf: properly check TCP_CONGESTION optlen

2019-02-25 Thread Eric Dumazet
On Mon, Feb 25, 2019 at 3:07 AM Daniel Borkmann wrote: > > On 02/24/2019 12:11 AM, Alexei Starovoitov wrote: > > On Sat, Feb 23, 2019 at 12:48:53PM -0800, Eric Dumazet wrote: > >> On 02/23/2019 12:38 PM, Alexei Starovoitov wrote: > >>> On Sat, Feb 23, 2019 at 11:

Re: [PATCH net-next] tcp: remove unused parameter of tcp_sacktag_bsearch()

2019-02-25 Thread Eric Dumazet
dering") since it removed fack_count Signed-off-by: Eric Dumazet Thanks.

Re: [PATCH] tun: fix blocking read

2019-02-25 Thread Eric Dumazet
On 02/24/2019 10:12 PM, David Miller wrote: > From: Timur Celik > Date: Sat, 23 Feb 2019 12:53:13 +0100 > >> This patch moves setting of the current state into the loop. Otherwise >> the task may end up in a busy wait loop if none of the break conditions >> are met. >> >> Signed-off-by: Timur

Re: [PATCH net] sit: use ipv6_mod_enabled to check if ipv6 is disabled

2019-02-25 Thread Eric Dumazet
On 02/25/2019 12:17 AM, Hangbin Liu wrote: > On Sun, Feb 24, 2019 at 08:24:51PM -0800, Eric Dumazet wrote: >> >> >> On 02/24/2019 08:12 PM, Hangbin Liu wrote: >>> ipv6_mod_enabled() is more safe and gentle to check if ipv6 is disabled >>> at running time

Re: [PATCH v2 bpf-next 4/9] bpf: add bpf helper bpf_skb_ecn_set_ce

2019-02-25 Thread Eric Dumazet
On 02/25/2019 02:10 AM, Daniel Borkmann wrote: > My understanding is that before doing any writes into skb, we should make > sure the data area is private to us (and offset in linear data). In tc BPF > (ingress, egress) we use bpf_try_make_writable() helper for this, others > like act_{pedit,sk

Re: [PATCH] tun: remove unnecessary memory barrier

2019-02-25 Thread Eric Dumazet
thanks. Reviewed-by: Eric Dumazet

Re: [PATCH] rtnetlink: Synchronze net in rtnl_unregister()

2019-02-25 Thread Eric Dumazet
On 02/25/2019 01:27 PM, Dmitry Safonov wrote: > rtnl_unregister() unsets handler from table, which is protected > by rtnl_lock or RCU. At this moment only dump handlers access the table > with rcu_lock(). Every other user accesses under rtnl. > > Callers may expect that rtnl_unregister() preven

Re: [PATCH] rtnetlink: Synchronze net in rtnl_unregister()

2019-02-25 Thread Eric Dumazet
On 02/25/2019 03:21 PM, Dmitry Safonov wrote: > Hi Eric, > > On 2/25/19 11:09 PM, Eric Dumazet wrote: >> On 02/25/2019 01:27 PM, Dmitry Safonov wrote: >>> While it's possible to document that rtnl_unregister() requires >>> synchronize_net() afterwards - u

Re: [PATCH net] sit: use ipv6_mod_enabled to check if ipv6 is disabled

2019-02-25 Thread Eric Dumazet
On 02/25/2019 08:08 PM, Hangbin Liu wrote: > Hi David, > On Mon, Feb 25, 2019 at 07:15:26PM -0700, David Ahern wrote: >> On 2/25/19 6:55 PM, Hangbin Liu wrote: >>> Just as I said, this issue only occurs when IPv6 is disabled at boot time >>> as there is no IPv6 route entry. Disable ipv6 on speci

[BUG] net/sched : qlen can not really be per cpu ?

2019-02-25 Thread Eric Dumazet
HTB + pfifo_fast as a leaf qdisc hits badly the following warning in htb_activate() : WARN_ON(cl->level || !cl->leaf.q || !cl->leaf.q->q.qlen); This is because pfifo_fast does not update sch->q.qlen, but per cpu counters. So cl->leaf.q->q.qlen is zero. HFSC, CBQ, DRR, QFQ have the same problem
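The mismatch reported above can be illustrated with a toy user-space model. This is only a sketch, not kernel code: `toy_qdisc`, `toy_enqueue_percpu()`, `toy_qlen_sum()`, `toy_parent_sees_empty()` and `TOY_NR_CPUS` are hypothetical names invented for the illustration. It assumes a child queue that bumps only per-cpu counters, and a parent that reads only the single shared counter.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4   /* hypothetical CPU count for the model */

/* Toy stand-in for a qdisc: one shared counter plus per-cpu counters. */
struct toy_qdisc {
    unsigned int qlen;                   /* models sch->q.qlen */
    unsigned int cpu_qlen[TOY_NR_CPUS];  /* models the per-cpu qlen counters */
};

/* Enqueue the way a per-cpu-counting qdisc would: only the counter of
 * the enqueuing CPU is incremented, the shared qlen is left untouched. */
void toy_enqueue_percpu(struct toy_qdisc *q, size_t cpu)
{
    q->cpu_qlen[cpu % TOY_NR_CPUS]++;
}

/* Real queue length: the shared counter plus every per-cpu counter. */
unsigned int toy_qlen_sum(const struct toy_qdisc *q)
{
    unsigned int sum = q->qlen;

    for (size_t i = 0; i < TOY_NR_CPUS; i++)
        sum += q->cpu_qlen[i];
    return sum;
}

/* What a parent that reads only the shared field would conclude. */
bool toy_parent_sees_empty(const struct toy_qdisc *q)
{
    return q->qlen == 0;
}
```

In this model, a packet enqueued through `toy_enqueue_percpu()` leaves `toy_parent_sees_empty()` true even though `toy_qlen_sum()` is non-zero, which is the same shape as the condition the WARN_ON() trips over.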

Re: [PATCH] tcp: fix __tcp_transmit_skb's comment text

2019-02-26 Thread Eric Dumazet
On 02/26/2019 12:41 AM, Geliang Tang wrote: > The function name tcp_do_sendmsg has been renamed. But it still > appears in __tcp_transmit_skb's comment text. This patch changes > it to tcp_sendmsg_locked. > > Signed-off-by: Geliang Tang > --- > net/ipv4/tcp_output.c | 2 +- > 1 file changed,

Re: [PATCH net] tcp: repaired skbs must init their tso_segs

2019-02-26 Thread Eric Dumazet
On 02/26/2019 01:23 AM, Andrei Vagin wrote: > > Thank you Eric. I saw a few test fails when tcp_peek_sndq() > returned more data than we expected. I have executed the test with this > fix in a loop and it works without any problem. Without this fix, it > fails after a few iteration. > > https:

Re: [PATCH RFC] net: Validate size of non-TSO packets in validate_xmit_skb().

2019-02-26 Thread Eric Dumazet
On 02/26/2019 02:56 AM, Michael Chan wrote: > There have been reports of oversize UDP packets being sent to the > driver to be transmitted, causing error conditions. The issue is > likely caused by the dst of the SKB switching between 'lo' with > 64K MTU and the hardware device with a smaller M

Re: [PATCH] net: netem: fix skb length BUG_ON in __skb_to_sgvec

2019-02-26 Thread Eric Dumazet
On 02/26/2019 05:02 AM, Sheng Lan wrote: > > > >> On Mon, 25 Feb 2019 22:49:39 +0800 >> Sheng Lan wrote: >> >>> From: Sheng Lan >>> Subject: [PATCH] net: netem: fix skb length BUG_ON in __skb_to_sgvec >>> >>> It can be reproduced by following steps: >>> 1. virtio_net NIC is configured with

Re: [PATCH] net: netem: fix skb length BUG_ON in __skb_to_sgvec

2019-02-26 Thread Eric Dumazet
On 02/26/2019 07:59 AM, Stephen Hemminger wrote: > > > Maybe the fix is to stop TSO fragment from overwriting by doing something > like: > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index 730bc44dbad9..5fe91d0224f6 100644 > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tc

[PATCH net-next 0/5] tcp: cleanups for linux-5.1

2019-02-26 Thread Eric Dumazet
This small patch series cleans up a few things and adds a small timewait optimization for hosts not using md5. Eric Dumazet (5): tcp: get rid of tcp_check_send_head() tcp: get rid of __tcp_add_write_queue_tail() tcp: convert tcp_md5_needed to static_branch API tcp: use tcp_md5_needed for

[PATCH net-next 2/5] tcp: get rid of __tcp_add_write_queue_tail()

2019-02-26 Thread Eric Dumazet
This helper is only used from tcp_add_write_queue_tail(), and does not make the code more readable. Signed-off-by: Eric Dumazet --- include/net/tcp.h | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index

[PATCH net-next 5/5] tcp: remove tcp_queue argument from tso_fragment()

2019-02-26 Thread Eric Dumazet
tso_fragment() is only called for packets still in write queue. Remove the tcp_queue parameter to make this more obvious, even if the comment clearly states this. Signed-off-by: Eric Dumazet --- net/ipv4/tcp_output.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff

[PATCH net-next 3/5] tcp: convert tcp_md5_needed to static_branch API

2019-02-26 Thread Eric Dumazet
We prefer static_branch_unlikely() over static_key_false() these days. Signed-off-by: Eric Dumazet --- include/net/tcp.h | 4 ++-- net/ipv4/tcp.c| 2 +- net/ipv4/tcp_ipv4.c | 2 +- net/ipv4/tcp_output.c | 4 ++-- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a

[PATCH net-next 1/5] tcp: get rid of tcp_check_send_head()

2019-02-26 Thread Eric Dumazet
This helper is used only once, and its name is no longer relevant. Signed-off-by: Eric Dumazet --- include/net/tcp.h | 6 -- net/ipv4/tcp.c| 3 ++- 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index

[PATCH net-next 4/5] tcp: use tcp_md5_needed for timewait sockets

2019-02-26 Thread Eric Dumazet
This might speedup tcp_twsk_destructor() a bit, avoiding a cache line miss. Signed-off-by: Eric Dumazet --- net/ipv4/tcp_minisocks.c | 21 + 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index

[PATCH v2] iov_iter: optimize page_copy_sane()

2019-02-26 Thread Eric Dumazet
copying the data is not free, since the freeing of the skb (and associated page frags put_page()) can happen after cache lines have been evicted. Signed-off-by: Eric Dumazet Cc: Al Viro --- lib/iov_iter.c | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/lib

Re: [BUG] net/sched : qlen can not really be per cpu ?

2019-02-26 Thread Eric Dumazet
On 02/25/2019 10:42 PM, Eric Dumazet wrote: > HTB + pfifo_fast as a leaf qdisc hits badly the following warning in > htb_activate() : > > WARN_ON(cl->level || !cl->leaf.q || !cl->leaf.q->q.qlen); > > This is because pfifo_fast does not update sch->q.ql

Re: [BUG] net/sched : qlen can not really be per cpu ?

2019-02-26 Thread Eric Dumazet
On 02/26/2019 03:51 PM, Cong Wang wrote: > On Tue, Feb 26, 2019 at 3:19 PM Eric Dumazet wrote: >> >> >> >> On 02/25/2019 10:42 PM, Eric Dumazet wrote: >>> HTB + pfifo_fast as a leaf qdisc hits badly the following warning in >>> htb_activate() : >

Re: [PATCH] net: netem: fix skb length BUG_ON in __skb_to_sgvec

2019-02-27 Thread Eric Dumazet
On 02/27/2019 03:26 AM, Sheng Lan wrote: > > I traced again and found that the skb was not sent, master skb was still in > write queue, > because the function tcp_transmit_skb() returns 1 (NET_XMIT_DROP), thus it > can be retransmit. > I found the error value NET_XMIT_DROP returns from netem

Re: [BUG] net/sched : qlen can not really be per cpu ?

2019-02-27 Thread Eric Dumazet
On 02/26/2019 04:56 PM, Eric Dumazet wrote: > > > On 02/26/2019 03:51 PM, Cong Wang wrote: >> On Tue, Feb 26, 2019 at 3:19 PM Eric Dumazet wrote: >>> >>> >>> >>> On 02/25/2019 10:42 PM, Eric Dumazet wrote: >>>> HTB +

Re: [BUG] net/sched : qlen can not really be per cpu ?

2019-02-27 Thread Eric Dumazet
On 02/27/2019 06:46 PM, Cong Wang wrote: > Hmm, looking into this, do we really need to check cl->leaf.q->q.qlen > in htb_activate() for pfifo_fast? htb_activate() is only called when > qdisc_enqueue() returns NET_XMIT_SUCCESS, so for pfifo_fast > that is always qlen!=0, right? > > So somethin

Re: [PATCH v2] net: netem: fix skb length BUG_ON in __skb_to_sgvec

2019-02-28 Thread Eric Dumazet
plicate packet. > > Fixes: 35d889d1 ("sch_netem: fix skb leak in netem_enqueue()") > Signed-off-by: Sheng Lan > Reported-by: Qin Ji > Suggested-by: Eric Dumazet > > --- Signed-off-by: Eric Dumazet Thanks.

[PATCH net] net: sched: put back q.qlen into a single location

2019-02-28 Thread Eric Dumazet
y is to have a legacy pfifo_fast version that would be used when used as a child qdisc, since the parent qdisc needs a spinlock anyway. But then, future lockless qdiscs would also have the same problem. Fixes: 7e66016f2c65 ("net: sched: helpers to sum qlen and qlen for per cpu logic") Signed-o

[PATCH net-next 2/2] net: support 64bit rates for getsockopt(SO_MAX_PACING_RATE)

2019-02-28 Thread Eric Dumazet
is 64bit. Signed-off-by: Eric Dumazet --- net/core/sock.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/net/core/sock.c b/net/core/sock.c index 23aa02ffd6b811a294e232af0bd25f84d3c848cc..782343bb925b643348cc906a70b97caa0388178d 100644 --- a/net/core/sock.c +++ b

[PATCH net-next 1/2] net: support 64bit values for setsockopt(SO_MAX_PACING_RATE)

2019-02-28 Thread Eric Dumazet
64bit kernels now support 64bit pacing rates. This commit changes setsockopt() to accept 64bit values provided by applications. Old applications providing 32bit value are still supported, but limited to the old 34Gbit limitation. Signed-off-by: Eric Dumazet --- net/core/sock.c | 18
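For readers wondering where the "34Gbit" figure comes from: SO_MAX_PACING_RATE is expressed in bytes per second, so a 32bit value tops out at roughly 4.29 GB/s, or about 34.36 Gbit/s. The user-space sketch below shows that arithmetic and how an application might pass a 64bit value. The helper names `max_pacing_bits_32()` and `set_max_pacing_rate64()` are hypothetical, and the fallback definition of `SO_MAX_PACING_RATE` (47, the asm-generic value) is an assumption for older headers. How a kernel without this change treats an 8-byte value is not covered here.

```c
#include <stdint.h>
#include <sys/socket.h>

#ifndef SO_MAX_PACING_RATE
#define SO_MAX_PACING_RATE 47   /* asm-generic value; assumed for older headers */
#endif

/* Largest rate, in bits per second, that a 32bit bytes-per-second
 * value can express: (2^32 - 1) * 8. */
uint64_t max_pacing_bits_32(void)
{
    return (uint64_t)UINT32_MAX * 8;
}

/* Request a pacing cap with a 64bit bytes-per-second value.
 * Returns whatever setsockopt() returns (0 on success, -1 on error). */
int set_max_pacing_rate64(int fd, uint64_t bytes_per_sec)
{
    return setsockopt(fd, SOL_SOCKET, SO_MAX_PACING_RATE,
                      &bytes_per_sec, sizeof(bytes_per_sec));
}
```

A caller would typically open a TCP socket, call `set_max_pacing_rate64(fd, rate)` and check the return value, falling back to a 32bit value if the call fails.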

[PATCH net-next 0/2] net: 64bit support for SO_MAX_PACING_RATE

2019-02-28 Thread Eric Dumazet
64bit kernels adopted 64bit type for sk_max_pacing_rate in linux-4.20 We can change how we implement SO_MAX_PACING_RATE socket option to support 64bit values to/from user space as well. Eric Dumazet (2): net: support 64bit values for setsockopt(SO_MAX_PACING_RATE) net: support 64bit rates

Re: [PATCH] bpf: enable program stats

2019-03-01 Thread Eric Dumazet
On 03/01/2019 02:03 PM, Guenter Roeck wrote: > Hi, > > On Mon, Feb 25, 2019 at 02:28:39PM -0800, Alexei Starovoitov wrote: >> JITed BPF programs are indistinguishable from kernel functions, but unlike >> kernel code BPF code can be changed often. >> Typical approach of "perf record" + "perf rep

Re: [PATCH] bpf: enable program stats

2019-03-01 Thread Eric Dumazet
On 03/01/2019 02:03 PM, Guenter Roeck wrote: > Hi, > > On Mon, Feb 25, 2019 at 02:28:39PM -0800, Alexei Starovoitov wrote: >> JITed BPF programs are indistinguishable from kernel functions, but unlike >> kernel code BPF code can be changed often. >> Typical approach of "perf record" + "perf rep

[PATCH net] bpf: fix u64_stats_init() usage in bpf_prog_alloc()

2019-03-01 Thread Eric Dumazet
We need to iterate through all possible cpus. Fixes: 492ecee892c2 ("bpf: enable program stats") Signed-off-by: Eric Dumazet Reported-by: Guenter Roeck Tested-by: Guenter Roeck --- kernel/bpf/core.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/kernel/bp

Re: [PATCH] aio: prevent the final fput() in the middle of vfs_poll() (Re: KASAN: use-after-free Read in unix_dgram_poll)

2019-03-03 Thread Eric Dumazet
ror)) { > fput(req->file); > return apt.error; > Very nice changelog Al, thanks for fixing this. Reviewed-by: Eric Dumazet

Re: [PATCH] tcp: detect use sendpage for slab-based objects

2019-03-04 Thread Eric Dumazet
On 03/04/2019 04:58 AM, Vasily Averin wrote: > On 2/21/19 7:00 PM, Eric Dumazet wrote: >> On Thu, Feb 21, 2019 at 7:30 AM Vasily Averin wrote: >>> >>> There was few incidents when XFS over network block device generates >>> IO requests with slab-based metad

Re: [PATCH bpf-next] bpf: fix memory leak in bpf_lwt_xmit_reroute

2019-03-04 Thread Eric Dumazet
On 03/04/2019 02:37 PM, Peter Oskolkov wrote: > On Mon, Mar 4, 2019 at 1:03 PM David Ahern wrote: >> >> On 3/4/19 1:39 PM, Peter Oskolkov wrote: >>> I found the problem: skb->inner_protocol was not set, so software GSO >>> fallback failed. I have a patch that fixes the issue: IPIP+GRE+TCP >>>

Re: [PATCH] tcp: detect use sendpage for slab-based objects

2019-03-05 Thread Eric Dumazet
Resent in plain text mode for the lists. On Tue, Mar 5, 2019 at 7:08 AM Eric Dumazet wrote: > > > > On Tue, Mar 5, 2019 at 6:24 AM Vasily Averin wrote: >> >> On 3/4/19 6:51 PM, Eric Dumazet wrote: >> > On 03/04/2019 04:58 AM, Vasily Averin wrote: >> >

Re: [PATCH] tcp: detect use sendpage for slab-based objects

2019-03-05 Thread Eric Dumazet
On Tue, Mar 5, 2019 at 7:11 AM Eric Dumazet wrote: > > > > > My original suggestion was to use VM_WARN_ONCE() so that the debug checks > > would > > be compiled out by the compiler, unless you compile a debug kernel. > > > > Something like : > > >

[PATCH net] xsk: fix potential crash in xsk_diag_put_umem()

2019-03-05 Thread Eric Dumazet
0080050033 CR2: 01d22000 CR3: 8fa13000 CR4: 001406f0 Fixes: a36b38aa2af6 ("xsk: add sock_diag interface for AF_XDP") Signed-off-by: Eric Dumazet Reported-by: syzbot Cc: Björn Töpel Cc: Daniel Borkmann Cc: Magnus Karlsson --- net/xdp/xsk_diag.c | 4 ++--

Re: kernel BUG at include/linux/mm.h:LINE! (5)

2019-03-05 Thread Eric Dumazet
On 03/04/2019 02:23 PM, syzbot wrote: > Hello, > > syzbot found the following crash on: > > HEAD commit:    9e9322e5d28e selftest/net: Remove duplicate header > git tree:   net-next > console output: https://syzkaller.appspot.com/x/log.txt?x=1351623320 > kernel config:  https://syzkall

Re: [PATCH] tcp: detecting the misuse of .sendpage for Slab objects

2019-03-06 Thread Eric Dumazet
ion is TCP Fast Open >* (passive side) where data is allowed to be sent before a connection >* is fully established. > SGTM David, this probably can be merged into net tree. Signed-off-by: Eric Dumazet

[PATCH net] fou, fou6: avoid uninit-value in gue_err() and gue6_err()

2019-03-06 Thread Eric Dumazet
+0x53f/0x93a kernel/softirq.c:293 Fixes: 26fc181e6cac ("fou, fou6: do not assume linear skbs") Signed-off-by: Eric Dumazet Reported-by: syzbot Cc: Stefano Brivio Cc: Sabrina Dubroca --- net/ipv4/fou.c | 4 ++-- net/ipv6/fou6.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletion

Re: [PATCH v2] xfrm: Reset secpath in xfrm failure

2019-03-06 Thread Eric Dumazet
On 03/06/2019 01:55 PM, Myungho Jung wrote: > In esp4_gro_receive() and esp6_gro_receive(), secpath can be allocated > without adding xfrm state to xvec. Then, sp->xvec[sp->len - 1] would > fail and result in dereferencing invalid pointer in esp4_gso_segment() > and esp6_gso_segment(). Reset sec

Re: TAHI testing fails for IPv6 Fragments in Kernel 4.9

2019-03-06 Thread Eric Dumazet
On 03/06/2019 02:28 PM, David Miller wrote: > From: Captain Wiggum > Date: Wed, 6 Mar 2019 15:26:43 -0700 > >> We are using the TAHI Self-test tools from IPv6 Ready Logo Program: >> https://www.ipv6ready.org/?page=documents&tag=ipv6-core-protocols >> >> The test passed up to 4.9.133, then fail

[PATCH net] net/hsr: fix possible crash in add_timer()

2019-03-07 Thread Eric Dumazet
RDI: 0003 RBP: 0073bf00 R08: R09: R10: R11: 0246 R12: 7fc2019bf6d4 R13: 004c4a60 R14: 004dd218 R15: Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamles

Re: [PATCH net] vxlan: Fix GRO cells race condition between receive and link delete

2019-03-08 Thread Eric Dumazet
> This is now done in the same way as commit 8e816df87997 ("geneve: Use GRO > cells infrastructure.") originally implemented for GENEVE. > > Reported-by: Jianlin Shi > Fixes: 58ce31cca1ff ("vxlan: GRO support at tunnel layer") > Signed-off-by: Stefano Brivio > Reviewed-by: Sabrina Dubroca Nice catch, thanks. Reviewed-by: Eric Dumazet

Re: [PATCH net] pptp: dst_release sk_dst_cache in pptp_sock_destruct

2019-03-08 Thread Eric Dumazet
On 03/07/2019 11:25 PM, Xin Long wrote: > sk_setup_caps() is called to set sk->sk_dst_cache in pptp_connect, > so we have to dst_release(sk->sk_dst_cache) in pptp_sock_destruct, > otherwise, the dst refcnt will leak. > > It can be reproduced by this syz log: > > r1 = socket$pptp(0x18, 0x1, 0

Re: [PATCH net] pptp: dst_release sk_dst_cache in pptp_sock_destruct

2019-03-08 Thread Eric Dumazet
On 03/08/2019 09:17 AM, Eric Dumazet wrote: > > > On 03/07/2019 11:25 PM, Xin Long wrote: >> sk_setup_caps() is called to set sk->sk_dst_cache in pptp_connect, >> so we have to dst_release(sk->sk_dst_cache) in pptp_sock_destruct, >> otherwise, the dst

[BUG] BPF splat on latest kernels

2019-03-08 Thread Eric Dumazet
Running test_progs on a LOCKDEP enabled kernel (latest David Miller net tree) I got the following splat. It is not immediately obvious to me. Any idea ? [ 4169.908826] == [ 4169.914996] WARNING: possible circular locking dependency detected [ 4

Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures

2019-03-08 Thread Eric Dumazet
On 03/08/2019 01:09 PM, Guillaume Nault wrote: > Commit 7716682cc58e ("tcp/dccp: fix another race at listener > dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted > {tcp,dccp}_check_req() accordingly. However, TFO and syncookies > weren't modified, thus leaking allocated resources on

Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures

2019-03-08 Thread Eric Dumazet
On 03/08/2019 02:22 PM, Guillaume Nault wrote: > On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote: >> >> >> On 03/08/2019 01:09 PM, Guillaume Nault wrote: >>> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock

Re: [PATCH net] tcp: handle inet_csk_reqsk_queue_add() failures

2019-03-08 Thread Eric Dumazet
On 03/08/2019 02:40 PM, Guillaume Nault wrote: > On Fri, Mar 08, 2019 at 02:34:07PM -0800, Eric Dumazet wrote: >> >> >> On 03/08/2019 02:22 PM, Guillaume Nault wrote: >>> On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote: >>>> >>>>

Re: [BUG] BPF splat on latest kernels

2019-03-08 Thread Eric Dumazet
On 03/08/2019 04:29 PM, Alexei Starovoitov wrote: > On Fri, Mar 8, 2019 at 12:33 PM Eric Dumazet wrote: >> >> Running test_progs on a LOCKDEP enabled kernel (latest David Miller net tree) >> >> I got the following splat. >> >> It is not immediately obvio

[PATCH net] ip: fix ip_mc_may_pull() return value

2019-03-09 Thread Eric Dumazet
ned-off-by: Eric Dumazet Reported-by: syzbot Cc: Linus Lüssing --- include/linux/igmp.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/igmp.h b/include/linux/igmp.h index cc85f4524dbfab28d03723c2fcb65c23730dee54..9c94b2ea789ceb9a06d9da0d8b07d28801732930 10

[PATCH net] net/x25: fix use-after-free in x25_device_event()

2019-03-10 Thread Eric Dumazet
ex:0x0 flags: 0x1fffc000200(slab) raw: 01fffc000200 ea0002806788 ea00027f0188 88812c3f07c0 raw: 8880a030e000 0001000c page dumped because: kasan: bad access detected Signed-off-by: Eric Dumazet Reported-by: syzbot+04bab

[PATCH net] vxlan: test dev->flags & IFF_UP before calling gro_cells_receive()

2019-03-10 Thread Eric Dumazet
fter-free and/or crashes. Fixes: d342894c5d2f ("vxlan: virtual extensible lan") Signed-off-by: Eric Dumazet --- drivers/net/vxlan.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index a3c46d78d216bf088545d7550170545c1771ab79..3d7167

[PATCH net] gro_cells: make sure device is up in gro_cells_receive()

2019-03-10 Thread Eric Dumazet
0 R14: e8c64b80 R15: e8c64b75 FS: () GS:8880ae80() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: f4ca0b9e CR3: 94941000 CR4: 001406f0 Fixes: c9e6bc644e55 ("net: add gro_cells infrastructure"

Re: [PATCH net-next] sched: add dualpi2 scheduler module

2019-03-11 Thread Eric Dumazet
FYI, net-next tree is currently closed. On 03/11/2019 08:14 AM, Olga Albisser wrote: > DUALPI2 provides extremely low latency & loss to traffic that uses a > scalable congestion controller (e.g. L4S, DCTCP) without degrading the > performance of 'classic' traffic (e.g. Reno, Cubic etc.). It is int

Re: [PATCH net-next] sched: add dualpi2 scheduler module

2019-03-11 Thread Eric Dumazet
On 03/11/2019 08:14 AM, Olga Albisser wrote: > + > +static u32 get_ecn_field(struct sk_buff *skb) > +{ > + switch (skb->protocol) { tc_skb_protocol(skb) > + case cpu_to_be16(ETH_P_IP): Theoretically you have to use pskb_may_pull() before assuming network header is in the linear part
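The review comment about pskb_may_pull() boils down to one rule: confirm that the header bytes are really available before reading them. The sketch below is a user-space analogy only, assuming a flat byte buffer rather than an sk_buff; `toy_ipv4_ecn()` and the `TOY_*` constants are hypothetical names, and the length check merely stands in for what pskb_may_pull() guarantees in the kernel. The ECN field is the low two bits of the second byte of the IPv4 header.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_IPV4_HDR_MIN 20   /* minimum IPv4 header length in bytes */
#define TOY_ECN_MASK     0x3  /* ECN is the low two bits of the TOS byte */

/* Return the 2-bit ECN field of an IPv4 header, or -1 when fewer than
 * TOY_IPV4_HDR_MIN bytes are available or the version nibble is not 4. */
int toy_ipv4_ecn(const uint8_t *buf, size_t len)
{
    /* Bounds check first: never read a header that may not be there. */
    if (buf == NULL || len < TOY_IPV4_HDR_MIN)
        return -1;
    /* Byte 0 carries version (high nibble) and header length. */
    if ((buf[0] >> 4) != 4)
        return -1;
    /* Byte 1 is the TOS byte: DSCP in the high 6 bits, ECN in the low 2. */
    return buf[1] & TOY_ECN_MASK;
}
```

Returning -1 for a short buffer mirrors the usual kernel pattern of bailing out when the pull fails, instead of reading whatever happens to follow the buffer.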

Re: [PATCH net-next] sched: add dualpi2 scheduler module

2019-03-11 Thread Eric Dumazet
tdev mailing list. > > Regards, > Olga > > On Mon, Mar 11, 2019 at 4:41 PM Eric Dumazet wrote: > > FYI, net-next tree is currently closed. > > On 03/11/2019 08:14 AM, Olga Albisser wrote: > > DUALPI2 p

Re: [PATCH net-next] sched: add dualpi2 scheduler module

2019-03-11 Thread Eric Dumazet
On 03/11/2019 09:03 AM, Eric Dumazet wrote: > > > On 03/11/2019 08:14 AM, Olga Albisser wrote: > >> + >> +static u32 get_ecn_field(struct sk_buff *skb) >> +{ >> +switch (skb->protocol) { > > tc_skb_protocol(skb) > >> +case cpu

[PATCH net] net/x25: reset state in x25_connect()

2019-03-11 Thread Eric Dumazet
95d6ebd53c79 ("net/x25: fix use-after-free in x25_device_event()") Signed-off-by: Eric Dumazet Cc: andrew hendry Reported-by: syzbot --- net/x25/af_x25.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c index 27171ac6fe3b3be975dbca831f2453f637aa8e63..20a5113

Re: [PATCH] tcp: Don't access TCP_SKB_CB before initializing it

2019-03-11 Thread Eric Dumazet
] Read of size 1 at addr 88006adbc208 by task > test_ip6_datagr/1799 > > Setting end_seq is actually no more necessary in tcp_filter as it gets > initialized later on in tcp_vX_fill_cb. > > Cc: Eric Dumazet > Fixes: eeea10b83a13 ("tcp: add tcp_v4_fill_cb()/tcp_v4_res

Re: TAHI testing fails for IPv6 Fragments in Kernel 4.9

2019-03-11 Thread Eric Dumazet
On 03/11/2019 09:12 PM, Captain Wiggum wrote: > Hi All, > > To summarize this thread, we test for IPv6 Ready Logo using Self-test > Tools (TAHI Project) here: > https://www.ipv6ready.org/?page=documents&tag=ipv6-core-protocols > > 4.9.133 and previous passed 100%. Beginning with 4.9.134 it fai

[PATCH net] l2tp: fix infoleak in l2tp_ip6_recvmsg()

2019-03-12 Thread Eric Dumazet
ize 32 starts at 8880ae62fbb0 Data copied to user address 2000 Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Eric Dumazet Reported-by: syzbot --- net/l2tp/l2tp_ip6.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions

Re: [PATCH net] net: enforce xmit_recursion for devices with a queue

2019-03-14 Thread Eric Dumazet
On 03/14/2019 03:15 AM, Sabrina Dubroca wrote: > Commit 745e20f1b626 ("net: add a recursion limit in xmit path") > introduced a recursion limit, but it only applies to devices without a > queue. Virtual devices with a queue (either because they don't have > the IFF_NO_QUEUE flag, or because the

Re: [PATCH net] net: enforce xmit_recursion for devices with a queue

2019-03-14 Thread Eric Dumazet
On 03/14/2019 07:15 AM, Sabrina Dubroca wrote: > 2019-03-14, 05:58:03 -0700, Eric Dumazet wrote: >> >> >> On 03/14/2019 03:15 AM, Sabrina Dubroca wrote: >>> Commit 745e20f1b626 ("net: add a recursion limit in xmit path") >>> introduced a recursio

Re: [PATCH net] net: enforce xmit_recursion for devices with a queue

2019-03-14 Thread Eric Dumazet
On 03/14/2019 10:40 AM, Sabrina Dubroca wrote: > 2019-03-14, 07:56:10 -0700, Eric Dumazet wrote: >> >> >> On 03/14/2019 07:15 AM, Sabrina Dubroca wrote: >>> 2019-03-14, 05:58:03 -0700, Eric Dumazet wrote: >>>> >>>> >>>> On 03/

[PATCH net] tun: properly test for IFF_UP

2019-03-14 Thread Eric Dumazet
does not have to check dev->flags & IFF_UP Virtual drivers do not have this guarantee, and must therefore make the check themselves. Fixes: 1bd4978a88ac ("tun: honor IFF_UP in tun_get_user()") Signed-off-by: Eric Dumazet Reported-by: syzbot --- drivers/net/tun.c | 15 +++-

Re: [PATCH] vxlan: remove the redundant gro_cells_destroy() calling.

2019-03-15 Thread Eric Dumazet
On 03/15/2019 03:06 AM, Zhiqiang Liu wrote: > From: "Suanming.Mou" > > With ad6c9986bcb6, GRO cells will be destroyed in vxlan_uninit. > > Fixes: ad6c9986bcb6 ("vxlan: Fix GRO cells race condition between receive and > link delete") > This is a net-next candidate . The Fixes: tag is not n

Re: [PATCH v2] vxlan: remove the redundant gro_cells_destroy() calling.

2019-03-15 Thread Eric Dumazet
On 03/15/2019 08:28 AM, Stefano Brivio wrote: > On Fri, 15 Mar 2019 23:18:52 +0800 > Zhiqiang Liu wrote: > >> In vxlan_destroy_tunnels func, unregister_netdevice_queue is called after >> gro_cells_destroy func. However, in unregister_netdevice_queue func, the >> gro_cells_destroy func will als

[PATCH net] net: rose: fix a possible stack overflow

2019-03-15 Thread Eric Dumazet
00 00 00 00 00 00 00 00 00 00 00 88808b1ffc80: 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 01 Signed-off-by: Eric Dumazet Reported-by: syzbot --- net/rose/rose_subr.c | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/net/rose/rose_subr.c b

Re: [PATCH v2] vxlan: remove the redundant gro_cells_destroy() calling.

2019-03-15 Thread Eric Dumazet
On 03/15/2019 11:02 AM, David Miller wrote: > From: Eric Dumazet > Date: Fri, 15 Mar 2019 09:06:25 -0700 > >> >> >> On 03/15/2019 08:28 AM, Stefano Brivio wrote: >>> On Fri, 15 Mar 2019 23:18:52 +0800 >>> Zhiqiang Liu wrote: >>> >>

Re: [PATCH v2] vxlan: remove the redundant gro_cells_destroy() calling.

2019-03-15 Thread Eric Dumazet
On 03/15/2019 02:08 PM, Stefano Brivio wrote: > On Fri, 15 Mar 2019 11:56:01 -0700 > Eric Dumazet wrote: > >> On 03/15/2019 11:02 AM, David Miller wrote: >>> From: Eric Dumazet >>> Date: Fri, 15 Mar 2019 09:06:25 -0700 >>> >>>>

Re: [PATCH net] tun: properly test for IFF_UP

2019-03-16 Thread Eric Dumazet
On 03/14/2019 08:19 PM, Eric Dumazet wrote: > Same reasons than the ones explained in commit 4179cb5a4c92 > ("vxlan: test dev->flags & IFF_UP before calling netif_rx()") > > netif_rx_ni() or napi_gro_frags() must be called under a strict contract. > >

[PATCH net] tun: add a missing rcu_read_unlock() in error path

2019-03-16 Thread Eric Dumazet
In my latest patch I missed one rcu_read_unlock(), in case device is down. Fixes: 4477138fa0ae ("tun: properly test for IFF_UP") Signed-off-by: Eric Dumazet Reported-by: syzbot --- drivers/net/tun.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/tun.c b/drivers/net/t

[PATCH net] tcp: do not use ipv6 header for ipv4 flow

2019-03-19 Thread Eric Dumazet
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet --- net/ipv6/tcp_ipv6.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 57ef69a1088908fc624ecfca99a728fa296ae0bf..44d431849d391d6903d263ae547f

[PATCH net] dccp: do not use ipv6 header for ipv4 flow

2019-03-19 Thread Eric Dumazet
When a dual stack dccp listener accepts an ipv4 flow, it should not attempt to use an ipv6 header or inet6_iif() helper. Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6") Signed-off-by: Eric Dumazet --- net/dccp/ipv6.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff -

Re: [PATCH net-next] ipv6: Add icmp_echo_ignore_multicast support for ICMPv6

2019-03-19 Thread Eric Dumazet
On 03/19/2019 05:45 AM, Stephen Suryaputra wrote: > IPv4 has icmp_echo_ignore_broadcast to prevent responding to broadcast pings. > IPv6 needs a similar mechanism. > ... > diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h > index 87aa2a6d9125..bd83ddedc014 100644 > --- a

Re: [PATCH net-next 1/2] packet: rework packet_pick_tx_queue() to use common code selection

2019-03-19 Thread Eric Dumazet
On 03/19/2019 06:25 AM, Paolo Abeni wrote: > +u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb, > + struct net_device *sb_dev) > { > struct sock *sk = skb->sk; > int queue_index = sk_tx_queue_get(sk); > @@ -3729,6 +3729,7 @@ static u16 __netdev_pick

[PATCH net-next] tcp: add tcp_inet6_sk() helper

2019-03-19 Thread Eric Dumazet
TCP ipv6 fast path dereferences a pointer to get to the inet6 part of a tcp socket, but given the fixed memory placement, we can do better and avoid a possible cache line miss. This also reduces register pressure, since we let the compiler know about this memory placement. Signed-off-by: Eric

Re: [PATCH net-next] ipv6: Add icmp_echo_ignore_multicast support for ICMPv6

2019-03-19 Thread Eric Dumazet
ic const struct bin_table bin_net_ipv6_icmp_table[] = { > { CTL_INT, NET_IPV6_ICMP_RATELIMIT,"ratelimit" }, > {} > }; > > I will fix that as well. > No you do not want to 'fix' this. We no longer add binary syctls (in kernel/sysctl_binary.c) , they are deprecated.

Re: [PATCH net-next] tcp: free request sock directly upon TFO or syncookies error

2019-03-19 Thread Eric Dumazet
> Define __reqsk_free() for these situations where we know nobody's > referencing the socket, even though ->rsk_refcnt might be non-null. > Now we can consolidate the error path of tcp_get_cookie_sock() and > tcp_conn_request(). > > Signed-off-by: Guillaume Nault SGTM thanks Signed-off-by: Eric Dumazet

Re: [PATCH net-next] net/tls: Replace kfree_skb() with consume_skb()

2019-03-20 Thread Eric Dumazet
On 03/20/2019 06:51 PM, Vakul Garg wrote: > To free the skb in normal course of processing, consume_skb() should be > used. Only for failure paths, skb_free() is intended to be used. > > https://www.kernel.org/doc/htmldocs/networking/API-consume-skb.html > > Signed-off-by: Vakul Garg > --- .

Re: [RFC bpf-next v2 1/9] net: introduce __init_skb{,_data,_shinfo} helpers

2019-03-20 Thread Eric Dumazet
On 03/20/2019 08:39 PM, Alexei Starovoitov wrote: > I think you need to convince Dave and Eric that > above surgery is necessary to do the hack in patch 6 with > +static DEFINE_PER_CPU(struct sk_buff, bpf_flow_skb); > Yes, this is a huge code churn. Honestly I believe we are going too far in

Re: [BUG] BPF splat on latest kernels

2019-03-20 Thread Eric Dumazet
On 03/08/2019 04:29 PM, Alexei Starovoitov wrote: > On Fri, Mar 8, 2019 at 12:33 PM Eric Dumazet wrote: >> >> Running test_progs on a LOCKDEP enabled kernel (latest David Miller net tree) >> >> I got the following splat. >> >> It is not immediately obvio

Re: [PATCH net-next 1/2] net: sched: add empty status flag for NOLOCK qdisc

2019-03-21 Thread Eric Dumazet
On 03/21/2019 03:14 AM, Paolo Abeni wrote: > The queue is marked not empty after acquiring the seqlock, > and it's up to the NOLOCK qdisc clearing such flag on dequeue. > Since the empty status lays on the same cache-line of the > seqlock, it's always hot on cache during the updates. > > This m

[PATCH net-next 3/3] tcp: add one skb cache for rx

2019-03-21 Thread Eric Dumazet
back to this cpu. Signed-off-by: Eric Dumazet --- include/net/sock.h | 6 ++ net/ipv4/af_inet.c | 4 net/ipv4/tcp.c | 4 net/ipv4/tcp_ipv4.c | 11 +-- net/ipv6/tcp_ipv6.c | 12 +--- 5 files changed, 32 insertions(+), 5 deletions(-) diff --git a/include/net

[PATCH net-next 0/3] tcp: add rx/tx cache to reduce lock contention

2019-03-21 Thread Eric Dumazet
msg() time, do not free the skb but put it in a tcp socket cache so that it can be freed by the cpu feeding the incoming packets in BH. This increased the performance of small RPC benchmark by about 10 % on a host with 112 hyperthreads. Eric Dumazet (3): net: convert rps_needed and rfs_nee

[PATCH net-next 2/3] tcp: add one skb cache for tx

2019-03-21 Thread Eric Dumazet
. This patch uses an extra pointer in socket structure, so that we try to reuse the same skb and avoid these expensive costs. We cache at most one skb per socket so this should be safe as far as memory pressure is concerned. Signed-off-by: Eric Dumazet --- include/net/sock.h | 5 + net/ipv4
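The one-entry cache described in this snippet can be illustrated outside the kernel. What follows is only a user-space sketch of the general idea (keep at most one freed buffer per connection and hand it back on the next allocation), not the code from the patch. The names struct buf, struct conn, conn_alloc_buf(), conn_free_buf() and conn_destroy() are hypothetical and do not appear in the original message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb: a fixed-size buffer with a length. */
struct buf {
	size_t len;
	unsigned char data[256];
};

/* Hypothetical stand-in for a socket: holds at most one cached buffer. */
struct conn {
	struct buf *tx_cache;
};

/* Reuse the cached buffer when there is one, otherwise allocate. */
static struct buf *conn_alloc_buf(struct conn *c)
{
	struct buf *b;

	assert(c != NULL);
	b = c->tx_cache;
	if (b) {
		c->tx_cache = NULL;	/* the single slot is now empty */
		b->len = 0;		/* reset state before reuse */
		return b;
	}
	return calloc(1, sizeof(*b));
}

/* Park the buffer in the empty slot, or really free it if the slot is taken. */
static void conn_free_buf(struct conn *c, struct buf *b)
{
	assert(c != NULL);
	if (!b)
		return;
	if (!c->tx_cache) {
		c->tx_cache = b;
		return;
	}
	free(b);
}

/* Release whatever is still cached when the connection goes away. */
static void conn_destroy(struct conn *c)
{
	assert(c != NULL);
	free(c->tx_cache);
	c->tx_cache = NULL;
}
```

Because the slot holds a single buffer, the extra memory kept per connection is bounded by one allocation, which matches the memory-pressure argument made in the snippet.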

[PATCH net-next 1/3] net: convert rps_needed and rfs_needed to new static branch api

2019-03-21 Thread Eric Dumazet
We prefer static_branch_unlikely() over static_key_false() these days. Signed-off-by: Eric Dumazet --- include/linux/netdevice.h | 4 ++-- include/net/sock.h | 2 +- net/core/dev.c | 10 +- net/core/net-sysfs.c | 4 ++-- net/core/sysctl_net_core.c | 8

Re: [PATCH net-next 3/3] tcp: add one skb cache for rx

2019-03-21 Thread Eric Dumazet
On 03/21/2019 03:17 PM, Eric Dumazet wrote: > Often times, recvmsg() system calls and BH handling for a particular > TCP socket are done on different cpus. ... > Note that if rps/rfs is used, we do not enable this feature, because > there is high chance that the same cpu is handl

[PATCH v2 net-next 1/3] net: convert rps_needed and rfs_needed to new static branch api

2019-03-21 Thread Eric Dumazet
We prefer static_branch_unlikely() over static_key_false() these days. Signed-off-by: Eric Dumazet --- drivers/net/tun.c | 2 +- include/linux/netdevice.h | 4 ++-- include/net/sock.h | 2 +- net/core/dev.c | 10 +- net/core/net-sysfs.c | 4

[PATCH v2 net-next 0/3] tcp: add rx/tx cache to reduce lock contention

2019-03-21 Thread Eric Dumazet
make sure the prior clone has been freed. - Really test rps_needed in sk_eat_skb() as claimed. - Fixed rps_needed use in drivers/net/tun.c Eric Dumazet (3): net: convert rps_needed and rfs_needed to new static branch api tcp: add one skb cache for tx tcp: add one skb ca

[PATCH v2 net-next 3/3] tcp: add one skb cache for rx

2019-03-21 Thread Eric Dumazet
back to this cpu. Signed-off-by: Eric Dumazet --- include/net/sock.h | 6 ++ net/ipv4/af_inet.c | 4 net/ipv4/tcp.c | 4 net/ipv4/tcp_ipv4.c | 11 +-- net/ipv6/tcp_ipv6.c | 12 +--- 5 files changed, 32 insertions(+), 5 deletions(-) diff --git a/include/net
