On 02/22/2019 05:06 PM, brakmo wrote:
> From: Martin KaFai Lau
>
> This patch adds a new bpf helper BPF_FUNC_tcp_enter_cwr
> "int bpf_tcp_enter_cwr(struct bpf_tcp_sock *tp)".
> It is added to BPF_PROG_TYPE_CGROUP_SKB which can be attached
> to the egress path where the bpf prog is called by
>
On 02/23/2019 07:08 PM, Martin Lau wrote:
> On Sat, Feb 23, 2019 at 05:32:14PM -0800, Eric Dumazet wrote:
>>
>>
>> On 02/22/2019 05:06 PM, brakmo wrote:
>>> From: Martin KaFai Lau
>>>
>>> This patch adds a new bpf helper BPF_FUNC_tcp_enter_cwr
>
On 02/24/2019 08:12 PM, Hangbin Liu wrote:
> ipv6_mod_enabled() is more safe and gentle to check if ipv6 is disabled
> at running time.
>
Why is it better exactly ?
IPv6 can be enabled on the host, but disabled per device
/proc/sys/net/ipv6/conf/{name}/disable_ipv6
> Fixes: 173656accaf5 (
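The per-device knob mentioned in the reply above can be inspected from user space. The following is a minimal sketch in C, assuming a Linux procfs layout; the helper names read_sysctl_int() and ipv6_disabled_on() are hypothetical and are not part of any kernel or libc API.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a single integer from a sysctl-style file.
 * Returns the parsed value, or -1 if the file cannot be opened or parsed.
 */
static int read_sysctl_int(const char *path)
{
    FILE *f;
    int value;

    assert(path != NULL);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/*
 * Hypothetical helper: report the disable_ipv6 setting of one device.
 * Returns 1 if IPv6 is disabled on ifname, 0 if enabled, -1 if unknown.
 */
static int ipv6_disabled_on(const char *ifname)
{
    char path[256];
    int n;

    n = snprintf(path, sizeof(path),
                 "/proc/sys/net/ipv6/conf/%s/disable_ipv6", ifname);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return read_sysctl_int(path);
}
```

Calling ipv6_disabled_on("eth0") on a host where IPv6 is loaded but switched off for that device would be expected to return 1, which is the case a module-level check cannot see.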
On Mon, Feb 25, 2019 at 3:07 AM Daniel Borkmann wrote:
>
> On 02/24/2019 12:11 AM, Alexei Starovoitov wrote:
> > On Sat, Feb 23, 2019 at 12:48:53PM -0800, Eric Dumazet wrote:
> >> On 02/23/2019 12:38 PM, Alexei Starovoitov wrote:
> >>> On Sat, Feb 23, 2019 at 11:
dering")
since it removed fack_count
Signed-off-by: Eric Dumazet
Thanks.
On 02/24/2019 10:12 PM, David Miller wrote:
> From: Timur Celik
> Date: Sat, 23 Feb 2019 12:53:13 +0100
>
>> This patch moves setting of the current state into the loop. Otherwise
>> the task may end up in a busy wait loop if none of the break conditions
>> are met.
>>
>> Signed-off-by: Timur
On 02/25/2019 12:17 AM, Hangbin Liu wrote:
> On Sun, Feb 24, 2019 at 08:24:51PM -0800, Eric Dumazet wrote:
>>
>>
>> On 02/24/2019 08:12 PM, Hangbin Liu wrote:
>>> ipv6_mod_enabled() is more safe and gentle to check if ipv6 is disabled
>>> at running time
On 02/25/2019 02:10 AM, Daniel Borkmann wrote:
> My understanding is that before doing any writes into skb, we should make
> sure the data area is private to us (and offset in linear data). In tc BPF
> (ingress, egress) we use bpf_try_make_writable() helper for this, others
> like act_{pedit,sk
thanks.
Reviewed-by: Eric Dumazet
On 02/25/2019 01:27 PM, Dmitry Safonov wrote:
> rtnl_unregister() unsets handler from table, which is protected
> by rtnl_lock or RCU. At this moment only dump handlers access the table
> with rcu_lock(). Every other user accesses under rtnl.
>
> Callers may expect that rtnl_unregister() preven
On 02/25/2019 03:21 PM, Dmitry Safonov wrote:
> Hi Eric,
>
> On 2/25/19 11:09 PM, Eric Dumazet wrote:
>> On 02/25/2019 01:27 PM, Dmitry Safonov wrote:
>>> While it's possible to document that rtnl_unregister() requires
>>> synchronize_net() afterwards - u
On 02/25/2019 08:08 PM, Hangbin Liu wrote:
> Hi David,
> On Mon, Feb 25, 2019 at 07:15:26PM -0700, David Ahern wrote:
>> On 2/25/19 6:55 PM, Hangbin Liu wrote:
>>> Just as I said, this issue only occurs when IPv6 is disabled at boot time
>>> as there is no IPv6 route entry. Disable ipv6 on speci
HTB + pfifo_fast as a leaf qdisc badly hits the following warning in
htb_activate():
WARN_ON(cl->level || !cl->leaf.q || !cl->leaf.q->q.qlen);
This is because pfifo_fast updates per-cpu counters instead of sch->q.qlen.
So cl->leaf.q->q.qlen is zero.
HFSC, CBQ, DRR, QFQ have the same problem
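The mismatch described above can be modelled in user space. This is only an illustrative sketch, not kernel code: struct model_qdisc, model_enqueue() and model_qlen_sum() are hypothetical names, and the fixed CPU count is an assumption made for the model.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4 /* assumption: small fixed CPU count for the model */

struct model_qdisc {
    unsigned int qlen;                    /* shared counter, in the spirit of sch->q.qlen */
    unsigned int cpu_qlen[MODEL_NR_CPUS]; /* per-cpu counters of a lockless qdisc */
    int percpu;                           /* non-zero: enqueue touches cpu_qlen[] only */
};

/* Account one enqueued packet on the given cpu. */
static void model_enqueue(struct model_qdisc *q, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    if (q->percpu)
        q->cpu_qlen[cpu]++;
    else
        q->qlen++;
}

/* Total backlog: the shared counter plus every per-cpu counter. */
static unsigned int model_qlen_sum(const struct model_qdisc *q)
{
    unsigned int total = q->qlen;
    int cpu;

    for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        total += q->cpu_qlen[cpu];
    return total;
}
```

In this model a parent that reads only the shared qlen field of a per-cpu leaf sees zero even after a successful enqueue, while the summed value is non-zero; that is the shape of the condition that trips the WARN_ON() quoted above.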
On 02/26/2019 12:41 AM, Geliang Tang wrote:
> The function name tcp_do_sendmsg has been renamed. But it still
> appears in __tcp_transmit_skb's comment text. This patch changes
> it to tcp_sendmsg_locked.
>
> Signed-off-by: Geliang Tang
> ---
> net/ipv4/tcp_output.c | 2 +-
> 1 file changed,
On 02/26/2019 01:23 AM, Andrei Vagin wrote:
>
> Thank you Eric. I saw a few test fails when tcp_peek_sndq()
> returned more data than we expected. I have executed the test with this
> fix in a loop and it works without any problem. Without this fix, it
> fails after a few iteration.
>
> https:
On 02/26/2019 02:56 AM, Michael Chan wrote:
> There have been reports of oversize UDP packets being sent to the
> driver to be transmitted, causing error conditions. The issue is
> likely caused by the dst of the SKB switching between 'lo' with
> 64K MTU and the hardware device with a smaller M
On 02/26/2019 05:02 AM, Sheng Lan wrote:
>
>
>
>> On Mon, 25 Feb 2019 22:49:39 +0800
>> Sheng Lan wrote:
>>
>>> From: Sheng Lan
>>> Subject: [PATCH] net: netem: fix skb length BUG_ON in __skb_to_sgvec
>>>
>>> It can be reproduced by following steps:
>>> 1. virtio_net NIC is configured with
On 02/26/2019 07:59 AM, Stephen Hemminger wrote:
>
>
> Maybe the fix is to stop TSO fragment from overwriting by doing something
> like:
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 730bc44dbad9..5fe91d0224f6 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tc
This small patch series cleans up a few things and adds a small
timewait optimization for hosts not using md5.
Eric Dumazet (5):
tcp: get rid of tcp_check_send_head()
tcp: get rid of __tcp_add_write_queue_tail()
tcp: convert tcp_md5_needed to static_branch API
tcp: use tcp_md5_needed for
This helper is only used from tcp_add_write_queue_tail(), and does
not make the code more readable.
Signed-off-by: Eric Dumazet
---
include/net/tcp.h | 7 +--
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index
tso_fragment() is only called for packets still in write queue.
Remove the tcp_queue parameter to make this more obvious,
even if the comment clearly states this.
Signed-off-by: Eric Dumazet
---
net/ipv4/tcp_output.c | 13 ++---
1 file changed, 6 insertions(+), 7 deletions(-)
diff
We prefer static_branch_unlikely() over static_key_false() these days.
Signed-off-by: Eric Dumazet
---
include/net/tcp.h | 4 ++--
net/ipv4/tcp.c| 2 +-
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv4/tcp_output.c | 4 ++--
4 files changed, 6 insertions(+), 6 deletions(-)
diff --git a
This helper is used only once, and its name is no longer relevant.
Signed-off-by: Eric Dumazet
---
include/net/tcp.h | 6 --
net/ipv4/tcp.c| 3 ++-
2 files changed, 2 insertions(+), 7 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index
This might speed up tcp_twsk_destructor() a bit,
avoiding a cache line miss.
Signed-off-by: Eric Dumazet
---
net/ipv4/tcp_minisocks.c | 21 +
1 file changed, 13 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index
copying the data
is not free, since the freeing of the skb (and associated
page frags put_page()) can happen after cache lines have been evicted.
Signed-off-by: Eric Dumazet
Cc: Al Viro
---
lib/iov_iter.c | 17 +++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/lib
On 02/25/2019 10:42 PM, Eric Dumazet wrote:
> HTB + pfifo_fast as a leaf qdisc hits badly the following warning in
> htb_activate() :
>
> WARN_ON(cl->level || !cl->leaf.q || !cl->leaf.q->q.qlen);
>
> This is because pfifo_fast does not update sch->q.ql
On 02/26/2019 03:51 PM, Cong Wang wrote:
> On Tue, Feb 26, 2019 at 3:19 PM Eric Dumazet wrote:
>>
>>
>>
>> On 02/25/2019 10:42 PM, Eric Dumazet wrote:
>>> HTB + pfifo_fast as a leaf qdisc hits badly the following warning in
>>> htb_activate() :
>
On 02/27/2019 03:26 AM, Sheng Lan wrote:
>
> I traced again and found that the skb was not sent, master skb was still in
> write queue,
> because the function tcp_transmit_skb() returns 1 (NET_XMIT_DROP), thus it
> can be retransmit.
> I found the error value NET_XMIT_DROP returns from netem
On 02/26/2019 04:56 PM, Eric Dumazet wrote:
>
>
> On 02/26/2019 03:51 PM, Cong Wang wrote:
>> On Tue, Feb 26, 2019 at 3:19 PM Eric Dumazet wrote:
>>>
>>>
>>>
>>> On 02/25/2019 10:42 PM, Eric Dumazet wrote:
>>>> HTB +
On 02/27/2019 06:46 PM, Cong Wang wrote:
> Hmm, looking into this, do we really need to check cl->leaf.q->q.qlen
> in htb_activate() for pfifo_fast? htb_activate() is only called when
> qdisc_enqueue() returns NET_XMIT_SUCCESS, so for pfifo_fast
> that is always qlen!=0, right?
>
> So somethin
plicate packet.
>
> Fixes: 35d889d1 ("sch_netem: fix skb leak in netem_enqueue()")
> Signed-off-by: Sheng Lan
> Reported-by: Qin Ji
> Suggested-by: Eric Dumazet
>
> ---
Signed-off-by: Eric Dumazet
Thanks.
y is to have a legacy pfifo_fast version that would
be used as a child qdisc, since the parent qdisc needs
a spinlock anyway. But then, future lockless qdiscs would also
have the same problem.
Fixes: 7e66016f2c65 ("net: sched: helpers to sum qlen and qlen for per cpu logic")
Signed-o
is 64bit.
Signed-off-by: Eric Dumazet
---
net/core/sock.c | 10 --
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 23aa02ffd6b811a294e232af0bd25f84d3c848cc..782343bb925b643348cc906a70b97caa0388178d 100644
--- a/net/core/sock.c
+++ b
64bit kernels now support 64bit pacing rates.
This commit changes setsockopt() to accept 64bit
values provided by applications.
Old applications providing a 32bit value are still supported,
but remain capped at the old 34Gbit limit.
Signed-off-by: Eric Dumazet
---
net/core/sock.c | 18
64bit kernels adopted a 64bit type for sk_max_pacing_rate in linux-4.20.
We can change how we implement the SO_MAX_PACING_RATE socket option
to support 64bit values to/from user space as well.
Eric Dumazet (2):
net: support 64bit values for setsockopt(SO_MAX_PACING_RATE)
net: support 64bit rates
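For illustration, the user-space side of this series might look like the sketch below. It assumes a Linux system whose headers define SO_MAX_PACING_RATE and a kernel that accepts an 8-byte option value; the helper names u32_pacing_limit_gbit() and set_max_pacing_rate64() are hypothetical. The first helper simply reproduces the arithmetic behind the 34Gbit figure: the option is expressed in bytes per second, so a 32bit value tops out at UINT32_MAX * 8 bits per second.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>

/* Largest rate a 32bit option value can express, in Gbit/s (about 34.36). */
static double u32_pacing_limit_gbit(void)
{
    return (double)UINT32_MAX * 8.0 / 1e9;
}

/*
 * Hypothetical wrapper: pass a 64bit rate (bytes per second) to the socket.
 * Returns 0 on success and -1 on failure, like setsockopt().
 */
static int set_max_pacing_rate64(int fd, uint64_t bytes_per_sec)
{
#ifdef SO_MAX_PACING_RATE
    return setsockopt(fd, SOL_SOCKET, SO_MAX_PACING_RATE,
                      &bytes_per_sec, sizeof(bytes_per_sec));
#else
    /* Option not available in these headers: report failure. */
    (void)fd;
    (void)bytes_per_sec;
    return -1;
#endif
}
```

An application that wants, say, 100 Gbit/s would pass 12500000000 bytes per second, a value that does not fit in 32 bits and therefore needs the wider interface described in the cover letter.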
On 03/01/2019 02:03 PM, Guenter Roeck wrote:
> Hi,
>
> On Mon, Feb 25, 2019 at 02:28:39PM -0800, Alexei Starovoitov wrote:
>> JITed BPF programs are indistinguishable from kernel functions, but unlike
>> kernel code BPF code can be changed often.
>> Typical approach of "perf record" + "perf rep
We need to iterate through all possible cpus.
Fixes: 492ecee892c2 ("bpf: enable program stats")
Signed-off-by: Eric Dumazet
Reported-by: Guenter Roeck
Tested-by: Guenter Roeck
---
kernel/bpf/core.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/bp
ror)) {
> fput(req->file);
> return apt.error;
>
Very nice changelog Al, thanks for fixing this.
Reviewed-by: Eric Dumazet
On 03/04/2019 04:58 AM, Vasily Averin wrote:
> On 2/21/19 7:00 PM, Eric Dumazet wrote:
>> On Thu, Feb 21, 2019 at 7:30 AM Vasily Averin wrote:
>>>
>>> There was few incidents when XFS over network block device generates
>>> IO requests with slab-based metad
On 03/04/2019 02:37 PM, Peter Oskolkov wrote:
> On Mon, Mar 4, 2019 at 1:03 PM David Ahern wrote:
>>
>> On 3/4/19 1:39 PM, Peter Oskolkov wrote:
>>> I found the problem: skb->inner_protocol was not set, so software GSO
>>> fallback failed. I have a patch that fixes the issue: IPIP+GRE+TCP
>>>
Resent in plain text mode for the lists.
On Tue, Mar 5, 2019 at 7:08 AM Eric Dumazet wrote:
>
>
>
> On Tue, Mar 5, 2019 at 6:24 AM Vasily Averin wrote:
>>
>> On 3/4/19 6:51 PM, Eric Dumazet wrote:
>> > On 03/04/2019 04:58 AM, Vasily Averin wrote:
>> >
On Tue, Mar 5, 2019 at 7:11 AM Eric Dumazet wrote:
>
> >
> > My original suggestion was to use VM_WARN_ONCE() so that the debug checks
> > would
> > be compiled out by the compiler, unless you compile a debug kernel.
> >
> > Something like :
> >
>
0080050033
CR2: 01d22000 CR3: 8fa13000 CR4: 001406f0
Fixes: a36b38aa2af6 ("xsk: add sock_diag interface for AF_XDP")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
Cc: Björn Töpel
Cc: Daniel Borkmann
Cc: Magnus Karlsson
---
net/xdp/xsk_diag.c | 4 ++--
On 03/04/2019 02:23 PM, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: 9e9322e5d28e selftest/net: Remove duplicate header
> git tree: net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1351623320
> kernel config: https://syzkall
ion is TCP Fast Open
>* (passive side) where data is allowed to be sent before a connection
>* is fully established.
>
SGTM
David, this probably can be merged into net tree.
Signed-off-by: Eric Dumazet
+0x53f/0x93a kernel/softirq.c:293
Fixes: 26fc181e6cac ("fou, fou6: do not assume linear skbs")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
Cc: Stefano Brivio
Cc: Sabrina Dubroca
---
net/ipv4/fou.c | 4 ++--
net/ipv6/fou6.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletion
On 03/06/2019 01:55 PM, Myungho Jung wrote:
> In esp4_gro_receive() and esp6_gro_receive(), secpath can be allocated
> without adding xfrm state to xvec. Then, sp->xvec[sp->len - 1] would
> fail and result in dereferencing invalid pointer in esp4_gso_segment()
> and esp6_gso_segment(). Reset sec
On 03/06/2019 02:28 PM, David Miller wrote:
> From: Captain Wiggum
> Date: Wed, 6 Mar 2019 15:26:43 -0700
>
>> We are using the TAHI Self-test tools from IPv6 Ready Logo Program:
>> https://www.ipv6ready.org/?page=documents&tag=ipv6-core-protocols
>>
>> The test passed up to 4.9.133, then fail
RDI: 0003
RBP: 0073bf00 R08: R09:
R10: R11: 0246 R12: 7fc2019bf6d4
R13: 004c4a60 R14: 004dd218 R15:
Fixes: f421436a591d ("net/hsr: Add support for the High-availability Seamles
> This is now done in the same way as commit 8e816df87997 ("geneve: Use GRO
> cells infrastructure.") originally implemented for GENEVE.
>
> Reported-by: Jianlin Shi
> Fixes: 58ce31cca1ff ("vxlan: GRO support at tunnel layer")
> Signed-off-by: Stefano Brivio
> Reviewed-by: Sabrina Dubroca
Nice catch, thanks.
Reviewed-by: Eric Dumazet
On 03/07/2019 11:25 PM, Xin Long wrote:
> sk_setup_caps() is called to set sk->sk_dst_cache in pptp_connect,
> so we have to dst_release(sk->sk_dst_cache) in pptp_sock_destruct,
> otherwise, the dst refcnt will leak.
>
> It can be reproduced by this syz log:
>
> r1 = socket$pptp(0x18, 0x1, 0
On 03/08/2019 09:17 AM, Eric Dumazet wrote:
>
>
> On 03/07/2019 11:25 PM, Xin Long wrote:
>> sk_setup_caps() is called to set sk->sk_dst_cache in pptp_connect,
>> so we have to dst_release(sk->sk_dst_cache) in pptp_sock_destruct,
>> otherwise, the dst
Running test_progs on a LOCKDEP enabled kernel (latest David Miller net tree)
I got the following splat.
It is not immediately obvious to me. Any idea ?
[ 4169.908826] ==
[ 4169.914996] WARNING: possible circular locking dependency detected
[ 4
On 03/08/2019 01:09 PM, Guillaume Nault wrote:
> Commit 7716682cc58e ("tcp/dccp: fix another race at listener
> dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted
> {tcp,dccp}_check_req() accordingly. However, TFO and syncookies
> weren't modified, thus leaking allocated resources on
On 03/08/2019 02:22 PM, Guillaume Nault wrote:
> On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
>>
>>
>> On 03/08/2019 01:09 PM, Guillaume Nault wrote:
>>> @@ -216,7 +216,12 @@ struct sock *tcp_get_cookie_sock(struct sock
On 03/08/2019 02:40 PM, Guillaume Nault wrote:
> On Fri, Mar 08, 2019 at 02:34:07PM -0800, Eric Dumazet wrote:
>>
>>
>> On 03/08/2019 02:22 PM, Guillaume Nault wrote:
>>> On Fri, Mar 08, 2019 at 01:33:02PM -0800, Eric Dumazet wrote:
>>>>
>>>>
On 03/08/2019 04:29 PM, Alexei Starovoitov wrote:
> On Fri, Mar 8, 2019 at 12:33 PM Eric Dumazet wrote:
>>
>> Running test_progs on a LOCKDEP enabled kernel (latest David Miller net tree)
>>
>> I got the following splat.
>>
>> It is not immediately obvio
ned-off-by: Eric Dumazet
Reported-by: syzbot
Cc: Linus Lüssing
---
include/linux/igmp.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index cc85f4524dbfab28d03723c2fcb65c23730dee54..9c94b2ea789ceb9a06d9da0d8b07d28801732930 10
ex:0x0
flags: 0x1fffc000200(slab)
raw: 01fffc000200 ea0002806788 ea00027f0188 88812c3f07c0
raw: 8880a030e000 0001000c
page dumped because: kasan: bad access detected
Signed-off-by: Eric Dumazet
Reported-by: syzbot+04bab
fter-free and/or crashes.
Fixes: d342894c5d2f ("vxlan: virtual extensible lan")
Signed-off-by: Eric Dumazet
---
drivers/net/vxlan.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index a3c46d78d216bf088545d7550170545c1771ab79..3d7167
0 R14: e8c64b80 R15: e8c64b75
FS: () GS:8880ae80() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: f4ca0b9e CR3: 94941000 CR4: 001406f0
Fixes: c9e6bc644e55 ("net: add gro_cells infrastructure"
FYI, net-next tree is currently closed.
On 03/11/2019 08:14 AM, Olga Albisser wrote:
> DUALPI2 provides extremely low latency & loss to traffic that uses a
> scalable congestion controller (e.g. L4S, DCTCP) without degrading the
> performance of 'classic' traffic (e.g. Reno, Cubic etc.). It is int
On 03/11/2019 08:14 AM, Olga Albisser wrote:
> +
> +static u32 get_ecn_field(struct sk_buff *skb)
> +{
> + switch (skb->protocol) {
tc_skb_protocol(skb)
> + case cpu_to_be16(ETH_P_IP):
Theoretically you have to use pskb_may_pull() before assuming the network
header is in the linear part
tdev mailing list.
>
> Regards,
> Olga
>
> On Mon, Mar 11, 2019 at 4:41 PM Eric Dumazet wrote:
>
> FYI, net-next tree is currently closed.
>
> On 03/11/2019 08:14 AM, Olga Albisser wrote:
> > DUALPI2 p
On 03/11/2019 09:03 AM, Eric Dumazet wrote:
>
>
> On 03/11/2019 08:14 AM, Olga Albisser wrote:
>
>> +
>> +static u32 get_ecn_field(struct sk_buff *skb)
>> +{
>> +switch (skb->protocol) {
>
> tc_skb_protocol(skb)
>
>> +case cpu
95d6ebd53c79 ("net/x25: fix use-after-free in x25_device_event()")
Signed-off-by: Eric Dumazet
Cc: andrew hendry
Reported-by: syzbot
---
net/x25/af_x25.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 27171ac6fe3b3be975dbca831f2453f637aa8e63..20a5113
] Read of size 1 at addr 88006adbc208 by task
> test_ip6_datagr/1799
>
> Setting end_seq is actually no more necessary in tcp_filter as it gets
> initialized later on in tcp_vX_fill_cb.
>
> Cc: Eric Dumazet
> Fixes: eeea10b83a13 ("tcp: add tcp_v4_fill_cb()/tcp_v4_res
On 03/11/2019 09:12 PM, Captain Wiggum wrote:
> Hi All,
>
> To summarize this thread, we test for IPv6 Ready Logo using Self-test
> Tools (TAHI Project) here:
> https://www.ipv6ready.org/?page=documents&tag=ipv6-core-protocols
>
> 4.9.133 and previous passed 100%. Beginning with 4.9.134 it fai
ize 32 starts at 8880ae62fbb0
Data copied to user address 2000
Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
---
net/l2tp/l2tp_ip6.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions
On 03/14/2019 03:15 AM, Sabrina Dubroca wrote:
> Commit 745e20f1b626 ("net: add a recursion limit in xmit path")
> introduced a recursion limit, but it only applies to devices without a
> queue. Virtual devices with a queue (either because they don't have
> the IFF_NO_QUEUE flag, or because the
On 03/14/2019 07:15 AM, Sabrina Dubroca wrote:
> 2019-03-14, 05:58:03 -0700, Eric Dumazet wrote:
>>
>>
>> On 03/14/2019 03:15 AM, Sabrina Dubroca wrote:
>>> Commit 745e20f1b626 ("net: add a recursion limit in xmit path")
>>> introduced a recursio
On 03/14/2019 10:40 AM, Sabrina Dubroca wrote:
> 2019-03-14, 07:56:10 -0700, Eric Dumazet wrote:
>>
>>
>> On 03/14/2019 07:15 AM, Sabrina Dubroca wrote:
>>> 2019-03-14, 05:58:03 -0700, Eric Dumazet wrote:
>>>>
>>>>
>>>> On 03/
does not have to check dev->flags & IFF_UP
Virtual drivers do not have this guarantee, and must
therefore make the check themselves.
Fixes: 1bd4978a88ac ("tun: honor IFF_UP in tun_get_user()")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
---
drivers/net/tun.c | 15 +++-
On 03/15/2019 03:06 AM, Zhiqiang Liu wrote:
> From: "Suanming.Mou"
>
> With ad6c9986bcb6, GRO cells will be destroyed in vxlan_uninit.
>
> Fixes: ad6c9986bcb6 ("vxlan: Fix GRO cells race condition between receive and
> link delete")
>
This is a net-next candidate.
The Fixes: tag is not n
On 03/15/2019 08:28 AM, Stefano Brivio wrote:
> On Fri, 15 Mar 2019 23:18:52 +0800
> Zhiqiang Liu wrote:
>
>> In vxlan_destroy_tunnels func, unregister_netdevice_queue is called after
>> gro_cells_destroy func. However, in unregister_netdevice_queue func, the
>> gro_cells_destroy func will als
00 00 00 00 00 00 00 00 00 00 00
88808b1ffc80: 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 01
Signed-off-by: Eric Dumazet
Reported-by: syzbot
---
net/rose/rose_subr.c | 21 -
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/net/rose/rose_subr.c b
On 03/15/2019 11:02 AM, David Miller wrote:
> From: Eric Dumazet
> Date: Fri, 15 Mar 2019 09:06:25 -0700
>
>>
>>
>> On 03/15/2019 08:28 AM, Stefano Brivio wrote:
>>> On Fri, 15 Mar 2019 23:18:52 +0800
>>> Zhiqiang Liu wrote:
>>>
>>
On 03/15/2019 02:08 PM, Stefano Brivio wrote:
> On Fri, 15 Mar 2019 11:56:01 -0700
> Eric Dumazet wrote:
>
>> On 03/15/2019 11:02 AM, David Miller wrote:
>>> From: Eric Dumazet
>>> Date: Fri, 15 Mar 2019 09:06:25 -0700
>>>
>>>>
On 03/14/2019 08:19 PM, Eric Dumazet wrote:
> Same reasons as the ones explained in commit 4179cb5a4c92
> ("vxlan: test dev->flags & IFF_UP before calling netif_rx()")
>
> netif_rx_ni() or napi_gro_frags() must be called under a strict contract.
>
>
In my latest patch I missed one rcu_read_unlock() in the case where the
device is down.
Fixes: 4477138fa0ae ("tun: properly test for IFF_UP")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
---
drivers/net/tun.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/tun.c b/drivers/net/t
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet
---
net/ipv6/tcp_ipv6.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 57ef69a1088908fc624ecfca99a728fa296ae0bf..44d431849d391d6903d263ae547f
When a dual stack dccp listener accepts an ipv4 flow,
it should not attempt to use an ipv6 header or
inet6_iif() helper.
Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
Signed-off-by: Eric Dumazet
---
net/dccp/ipv6.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff -
On 03/19/2019 05:45 AM, Stephen Suryaputra wrote:
> IPv4 has icmp_echo_ignore_broadcast to prevent responding to broadcast pings.
> IPv6 needs a similar mechanism.
>
...
> diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
> index 87aa2a6d9125..bd83ddedc014 100644
> --- a
On 03/19/2019 06:25 AM, Paolo Abeni wrote:
> +u16 __netdev_pick_tx(struct net_device *dev, struct sk_buff *skb,
> + struct net_device *sb_dev)
> {
> struct sock *sk = skb->sk;
> int queue_index = sk_tx_queue_get(sk);
> @@ -3729,6 +3729,7 @@ static u16 __netdev_pick
TCP ipv6 fast path dereferences a pointer to get to the inet6
part of a tcp socket, but given the fixed memory placement,
we can do better and avoid a possible cache line miss.
This also reduces register pressure, since we let the compiler
know about this memory placement.
Signed-off-by: Eric
ic const struct bin_table bin_net_ipv6_icmp_table[] = {
> { CTL_INT, NET_IPV6_ICMP_RATELIMIT,"ratelimit" },
> {}
> };
>
> I will fix that as well.
>
No, you do not want to 'fix' this.
We no longer add binary sysctls (in kernel/sysctl_binary.c); they are
deprecated.
> Define __reqsk_free() for these situations where we know nobody's
> referencing the socket, even though ->rsk_refcnt might be non-null.
> Now we can consolidate the error path of tcp_get_cookie_sock() and
> tcp_conn_request().
>
> Signed-off-by: Guillaume Nault
SGTM thanks
Signed-off-by: Eric Dumazet
On 03/20/2019 06:51 PM, Vakul Garg wrote:
> To free the skb in normal course of processing, consume_skb() should be
> used. Only for failure paths, skb_free() is intended to be used.
>
> https://www.kernel.org/doc/htmldocs/networking/API-consume-skb.html
>
> Signed-off-by: Vakul Garg
> ---
.
On 03/20/2019 08:39 PM, Alexei Starovoitov wrote:
> I think you need to convince Dave and Eric that
> above surgery is necessary to do the hack in patch 6 with
> +static DEFINE_PER_CPU(struct sk_buff, bpf_flow_skb);
>
Yes, this is a huge code churn.
Honestly I believe we are going too far in
On 03/21/2019 03:14 AM, Paolo Abeni wrote:
> The queue is marked not empty after acquiring the seqlock,
> and it's up to the NOLOCK qdisc clearing such flag on dequeue.
> Since the empty status lays on the same cache-line of the
> seqlock, it's always hot on cache during the updates.
>
> This m
back to this cpu.
Signed-off-by: Eric Dumazet
---
include/net/sock.h | 6 ++
net/ipv4/af_inet.c | 4
net/ipv4/tcp.c | 4
net/ipv4/tcp_ipv4.c | 11 +--
net/ipv6/tcp_ipv6.c | 12 +---
5 files changed, 32 insertions(+), 5 deletions(-)
diff --git a/include/net
msg() time, do not free the skb but put it in a tcp socket cache
so that it can be freed by the cpu feeding the incoming packets in BH.
This increased the performance of a small RPC benchmark by about 10% on a host
with 112 hyperthreads.
Eric Dumazet (3):
net: convert rps_needed and rfs_nee
.
This patch uses an extra pointer in socket structure, so that we try to reuse
the same skb and avoid these expensive costs.
We cache at most one skb per socket so this should be safe as far as
memory pressure is concerned.
Signed-off-by: Eric Dumazet
---
include/net/sock.h | 5 +
net/ipv4
We prefer static_branch_unlikely() over static_key_false() these days.
Signed-off-by: Eric Dumazet
---
include/linux/netdevice.h | 4 ++--
include/net/sock.h | 2 +-
net/core/dev.c | 10 +-
net/core/net-sysfs.c | 4 ++--
net/core/sysctl_net_core.c | 8
On 03/21/2019 03:17 PM, Eric Dumazet wrote:
> Often times, recvmsg() system calls and BH handling for a particular
> TCP socket are done on different cpus.
...
> Note that if rps/rfs is used, we do not enable this feature, because
> there is high chance that the same cpu is handl
We prefer static_branch_unlikely() over static_key_false() these days.
Signed-off-by: Eric Dumazet
---
drivers/net/tun.c | 2 +-
include/linux/netdevice.h | 4 ++--
include/net/sock.h | 2 +-
net/core/dev.c | 10 +-
net/core/net-sysfs.c | 4
make sure the prior
clone has been freed.
- Really test rps_needed in sk_eat_skb() as claimed.
- Fixed rps_needed use in drivers/net/tun.c
Eric Dumazet (3):
net: convert rps_needed and rfs_needed to new static branch api
tcp: add one skb cache for tx
tcp: add one skb ca