Re: [PATCH v2 1/3] unix: fix use-after-free in unix_dgram_poll()

2015-10-02 Thread Mathias Krause
On 2 October 2015 at 22:43, Jason Baron wrote: > The unix_dgram_poll() routine calls sock_poll_wait() not only for the wait > queue associated with the socket s that we are poll'ing against, but also > calls > sock_poll_wait() for a remote peer socket p, if it is connected. Thus, > if we call pol

Re: [PATCH] ceph:Remove unused goto labels in decode crush map functions

2015-10-02 Thread Ilya Dryomov
On Fri, Oct 2, 2015 at 9:48 PM, Nicholas Krause wrote: > This removes unused goto labels in decode crush map functions related > to error paths due to them never being used on any error path for these > particular functions in the file, osdmap.c. > > Signed-off-by: Nicholas Krause > --- > net/ce

Re: [PATCH net-next v2 4/4] openvswitch: IPv6 support for ovs_tunnel_get_egress_info

2015-10-02 Thread Jesse Gross
On Fri, Oct 2, 2015 at 12:32 PM, Pravin Shelar wrote: > On Thu, Oct 1, 2015 at 11:00 PM, Jiri Benc wrote: >> On Thu, 1 Oct 2015 17:11:56 -0700, Pravin Shelar wrote: >>> I dont see point of adding this code when IPv6 sampling not support by >>> the patch series. >> >> It was requested by Jesse: >>

Re: Slow ramp-up for single-stream TCP throughput on 4.2 kernel.

2015-10-02 Thread Ben Greear
Gah, seems 'cubic' related. That is the default tcp cong ctrl I was using (same in 3.17, for that matter). Most other rate-ctrls vastly out-perform it. On 10/02/2015 04:42 PM, Ben Greear wrote: I'm seeing something that looks more dodgy than normal.

Re: DSA driver - how to glue to a PCI based NIC's mdio?

2015-10-02 Thread Tim Harvey
On Wed, Sep 30, 2015 at 2:40 PM, Andrew Lunn wrote: >> > information to the NIC's device driver. Better would be to have a >> > small shim driver which is loaded on your PCI_ID/DEVICE_ID. That would >> > instantiate the NIC driver, and insert a DSA platform device. >> >> I was thinking of this as

Slow ramp-up for single-stream TCP throughput on 4.2 kernel.

2015-10-02 Thread Ben Greear
I'm seeing something that looks more dodgy than normal. Test case is ath10k station uploading to ath10k AP. AP is always running 4.2 kernel in this case, and both systems are using the same ath10k firmware. I have tuned the stack: echo 400 > /proc/sys/net/core/wmem_max echo 4096 87380 5000

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Kees Cook
On Fri, Oct 2, 2015 at 3:57 PM, Daniel Borkmann wrote: > On 10/03/2015 12:44 AM, Tycho Andersen wrote: >> >> On Fri, Oct 02, 2015 at 02:10:24PM -0700, Kees Cook wrote: > > ... >> >> Ok, how about, >> >> struct sock_filter insns[BPF_MAXINSNS]; >> insn_cnt = ptrace(PTRACE_SECCOMP_GET_FILTER, pid, in

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Daniel Borkmann
On 10/03/2015 12:44 AM, Tycho Andersen wrote: On Fri, Oct 02, 2015 at 02:10:24PM -0700, Kees Cook wrote: ... Ok, how about, struct sock_filter insns[BPF_MAXINSNS]; insn_cnt = ptrace(PTRACE_SECCOMP_GET_FILTER, pid, insns, i); Would also be good that when the storage buffer (insns) is NULL, it

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Tycho Andersen
On Sat, Oct 03, 2015 at 12:57:49AM +0200, Daniel Borkmann wrote: > On 10/03/2015 12:44 AM, Tycho Andersen wrote: > >On Fri, Oct 02, 2015 at 02:10:24PM -0700, Kees Cook wrote: > ... > >Ok, how about, > > > >struct sock_filter insns[BPF_MAXINSNS]; > >insn_cnt = ptrace(PTRACE_SECCOMP_GET_FILTER, pid,

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Tycho Andersen
On Fri, Oct 02, 2015 at 03:52:03PM -0700, Andy Lutomirski wrote: > On Fri, Oct 2, 2015 at 3:44 PM, Tycho Andersen > wrote: > > On Fri, Oct 02, 2015 at 02:10:24PM -0700, Kees Cook wrote: > >> On Fri, Oct 2, 2015 at 9:27 AM, Tycho Andersen > >> wrote: > >> > Hi all, > >> > > >> > Here's v5 of the s

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Andy Lutomirski
On Fri, Oct 2, 2015 at 3:44 PM, Tycho Andersen wrote: > On Fri, Oct 02, 2015 at 02:10:24PM -0700, Kees Cook wrote: >> On Fri, Oct 2, 2015 at 9:27 AM, Tycho Andersen >> wrote: >> > Hi all, >> > >> > Here's v5 of the seccomp filter c/r set. The individual patch notes have >> > changes, but two high

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Tycho Andersen
On Fri, Oct 02, 2015 at 02:10:24PM -0700, Kees Cook wrote: > On Fri, Oct 2, 2015 at 9:27 AM, Tycho Andersen > wrote: > > Hi all, > > > > Here's v5 of the seccomp filter c/r set. The individual patch notes have > > changes, but two highlights are: > > > > * This series is now based on http://patchw

Re: [PATCH] ovs: do not allocate memory from offline numa node

2015-10-02 Thread Pravin Shelar
On Fri, Oct 2, 2015 at 3:18 AM, Konstantin Khlebnikov wrote: > When openvswitch tries allocate memory from offline numa node 0: > stats = kmem_cache_alloc_node(flow_stats_cache, GFP_KERNEL | __GFP_ZERO, 0) > It catches VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid)) > [ replaced wit

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Andy Lutomirski
On Fri, Oct 2, 2015 at 3:06 PM, Kees Cook wrote: > On Fri, Oct 2, 2015 at 3:04 PM, Andy Lutomirski wrote: >> On Fri, Oct 2, 2015 at 3:02 PM, Kees Cook wrote: >>> On Fri, Oct 2, 2015 at 2:29 PM, Andy Lutomirski wrote: On Fri, Oct 2, 2015 at 2:10 PM, Kees Cook wrote: > On Fri, Oct 2, 20

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Kees Cook
On Fri, Oct 2, 2015 at 3:04 PM, Andy Lutomirski wrote: > On Fri, Oct 2, 2015 at 3:02 PM, Kees Cook wrote: >> On Fri, Oct 2, 2015 at 2:29 PM, Andy Lutomirski wrote: >>> On Fri, Oct 2, 2015 at 2:10 PM, Kees Cook wrote: On Fri, Oct 2, 2015 at 9:27 AM, Tycho Andersen wrote: > Hi all,

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Andy Lutomirski
On Fri, Oct 2, 2015 at 3:02 PM, Kees Cook wrote: > On Fri, Oct 2, 2015 at 2:29 PM, Andy Lutomirski wrote: >> On Fri, Oct 2, 2015 at 2:10 PM, Kees Cook wrote: >>> On Fri, Oct 2, 2015 at 9:27 AM, Tycho Andersen >>> wrote: Hi all, Here's v5 of the seccomp filter c/r set. The individ

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Kees Cook
On Fri, Oct 2, 2015 at 2:29 PM, Andy Lutomirski wrote: > On Fri, Oct 2, 2015 at 2:10 PM, Kees Cook wrote: >> On Fri, Oct 2, 2015 at 9:27 AM, Tycho Andersen >> wrote: >>> Hi all, >>> >>> Here's v5 of the seccomp filter c/r set. The individual patch notes have >>> changes, but two highlights are:

Re: [PATCH net-next V14 3/3] openvswitch: 802.1ad: Flow handling, actions, vlan parsing and netlink attributes

2015-10-02 Thread Pravin Shelar
On Fri, Oct 2, 2015 at 2:48 PM, Thomas F Herbert wrote: > On 9/30/15 11:33 PM, Thomas F Herbert wrote: >> >> Add support for 802.1ad including the ability to push and pop double >> tagged vlans. Add support for 802.1ad to netlink parsing and flow >> conversion. Uses double nested encap attributes

[PATCH net] openvswitch: Fix ovs_vport_get_stats()

2015-10-02 Thread Pravin B Shelar
Not every device has dev->tstats set. So when OVS tries to calculate vport stats it causes kernel panic. Following patch fixes it by using standard API to get net-device stats. ---8<--- Unable to handle kernel paging request at virtual address 766b4008 Internal error: Oops: 9605 [#1] PREEMPT S

Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB freelists

2015-10-02 Thread Andrew Morton
On Fri, 2 Oct 2015 15:40:39 +0200 Jesper Dangaard Brouer wrote: > > Thus, I need introducing new code like this patch and at the same time > > have to reduce the number of instruction-cache misses/usage. In this > > case we solve the problem by kmem_cache_free_bulk() not getting called > > too

Re: [PATCH net-next V14 3/3] openvswitch: 802.1ad: Flow handling, actions, vlan parsing and netlink attributes

2015-10-02 Thread Thomas F Herbert
On 9/30/15 11:33 PM, Thomas F Herbert wrote: Add support for 802.1ad including the ability to push and pop double tagged vlans. Add support for 802.1ad to netlink parsing and flow conversion. Uses double nested encap attributes to represent double tagged vlan. Inner TPID encoded along with ctci i

Re: Soft lockup issue in Linux 4.1.9

2015-10-02 Thread Eric Dumazet
On Fri, 2015-10-02 at 23:04 +0200, Thomas Gleixner wrote: > On Fri, 2 Oct 2015, Eric Dumazet wrote: > > On Fri, 2015-10-02 at 22:04 +0200, Thomas Gleixner wrote: > > > > > What makes sure, that the timer cannot be readded while that timer > > > callback is running? > > > > What is exactly your qu

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Andy Lutomirski
On Fri, Oct 2, 2015 at 2:10 PM, Kees Cook wrote: > On Fri, Oct 2, 2015 at 9:27 AM, Tycho Andersen > wrote: >> Hi all, >> >> Here's v5 of the seccomp filter c/r set. The individual patch notes have >> changes, but two highlights are: >> >> * This series is now based on http://patchwork.ozlabs.org/

Re: [PATCH 08/12] nfnetlink: use y2038 safe timestamp

2015-10-02 Thread Arnd Bergmann
On Friday 02 October 2015 14:53:55 Pablo Neira Ayuso wrote: > On Wed, Sep 30, 2015 at 01:26:38PM +0200, Arnd Bergmann wrote: > > The __build_packet_message function fills a nfulnl_msg_packet_timestamp > > structure that uses 64-bit seconds and is therefore y2038 safe, but > > it uses an intermediat

Re: v5 of seccomp filter c/r patches

2015-10-02 Thread Kees Cook
On Fri, Oct 2, 2015 at 9:27 AM, Tycho Andersen wrote: > Hi all, > > Here's v5 of the seccomp filter c/r set. The individual patch notes have > changes, but two highlights are: > > * This series is now based on http://patchwork.ozlabs.org/patch/525492/ and > will need to be built with that patch

Re: Soft lockup issue in Linux 4.1.9

2015-10-02 Thread Thomas Gleixner
On Fri, 2 Oct 2015, Eric Dumazet wrote: > On Fri, 2015-10-02 at 22:04 +0200, Thomas Gleixner wrote: > > > What makes sure, that the timer cannot be readded while that timer > > callback is running? > > What is exactly your question ? CPU0 CPU1 timer expires callback

Re: Soft lockup issue in Linux 4.1.9

2015-10-02 Thread Eric Dumazet
On Fri, 2015-10-02 at 22:04 +0200, Thomas Gleixner wrote: > What makes sure, that the timer cannot be readded while that timer > callback is running? What is exactly your question ? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.

[PATCH v2 0/3] af_unix: fix use-after-free

2015-10-02 Thread Jason Baron
Hi, These patches are against mainline, I can re-base to net-next, just let me know. They have been tested against: https://lkml.org/lkml/2015/9/13/195, which causes the use-after-free quite quickly and here: https://lkml.org/lkml/2015/10/2/693. Thanks, -Jason Jason Baron (3): unix: fix use

[PATCH v2 2/3] af_unix: Convert gc_flags to flags

2015-10-02 Thread Jason Baron
Convert gc_flags to flags in preparation for the subsequent patch, which will make use of a flag bit for a non-gc purpose. Signed-off-by: Jason Baron --- include/net/af_unix.h | 2 +- net/unix/garbage.c| 12 ++-- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/include

[PATCH v2 1/3] unix: fix use-after-free in unix_dgram_poll()

2015-10-02 Thread Jason Baron
The unix_dgram_poll() routine calls sock_poll_wait() not only for the wait queue associated with the socket s that we are poll'ing against, but also calls sock_poll_wait() for a remote peer socket p, if it is connected. Thus, if we call poll()/select()/epoll() for the socket s, there are then a cou

[PATCH v2 3/3] af_unix: optimize the unix_dgram_recvmsg()

2015-10-02 Thread Jason Baron
Now that connect() permanently registers a callback routine, we can induce extra overhead in unix_dgram_recvmsg(), which unconditionally wakes up its peer_wait queue on every receive. This patch makes the wakeup there conditional on there being waiters interested in wait events. Signed-off-by: Jas

Re: [PATCH] unix: fix use-after-free with unix_dgram_poll()

2015-10-02 Thread Rainer Weikusat
Jason Baron writes: > On 10/02/2015 03:30 PM, Rainer Weikusat wrote: >> Jason Baron writes: >>> From: Jason Baron >>> >>> The unix_dgram_poll() routine calls sock_poll_wait() not only for the wait >>> queue associated with the socket s that we've called poll() on, but it also >>> calls sock_poll

Re: Soft lockup issue in Linux 4.1.9

2015-10-02 Thread Thomas Gleixner
On Thu, 1 Oct 2015, Eric Dumazet wrote: > On Thu, Oct 1, 2015 at 4:43 AM, Holger Hoffstätte > wrote: > > On 10/01/15 13:29, Eric Dumazet wrote: > > >> commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af > >> Author: Eric Dumazet > >> Date: Thu Aug 13 15:44:51 2015 -0700 > >> > >> inet: fix pot

Re: [PATCH net-next v2] net: Add support for filtering neigh dump by master device

2015-10-02 Thread David Ahern
On 10/2/15 11:18 AM, Eric W. Biederman wrote: What is the thinking here because it sure looks like you are busily adding layer two functionality you swore you did not want. Interfaces are enslaved to a VRF device, but neighbor entries are installed with a reference to the actual interface not

Re: [PATCH] unix: fix use-after-free with unix_dgram_poll()

2015-10-02 Thread Rainer Weikusat
Rainer Weikusat writes: > Jason Baron writes: >> From: Jason Baron >> >> The unix_dgram_poll() routine calls sock_poll_wait() not only for the wait >> queue associated with the socket s that we've called poll() on, but it also >> calls sock_poll_wait() for a remote peer socket's wait queue, if i

Re: [PATCH] unix: fix use-after-free with unix_dgram_poll()

2015-10-02 Thread Jason Baron
On 10/02/2015 03:30 PM, Rainer Weikusat wrote: > Jason Baron writes: >> From: Jason Baron >> >> The unix_dgram_poll() routine calls sock_poll_wait() not only for the wait >> queue associated with the socket s that we've called poll() on, but it also >> calls sock_poll_wait() for a remote peer soc

Re: [PATCH] unix: fix use-after-free with unix_dgram_poll()

2015-10-02 Thread Rainer Weikusat
Jason Baron writes: > From: Jason Baron > > The unix_dgram_poll() routine calls sock_poll_wait() not only for the wait > queue associated with the socket s that we've called poll() on, but it also > calls sock_poll_wait() for a remote peer socket's wait queue, if it's > connected. > Thus, if we

Re: [PATCH net-next v2 4/4] openvswitch: IPv6 support for ovs_tunnel_get_egress_info

2015-10-02 Thread Pravin Shelar
On Thu, Oct 1, 2015 at 11:00 PM, Jiri Benc wrote: > On Thu, 1 Oct 2015 17:11:56 -0700, Pravin Shelar wrote: >> I dont see point of adding this code when IPv6 sampling not support by >> the patch series. > > It was requested by Jesse: > http://article.gmane.org/gmane.linux.network/380348 > I don't

Re: Soft lockup issue in Linux 4.1.9

2015-10-02 Thread Wolfgang Walter
Am Freitag, 2. Oktober 2015, 09:17:16 schrieb Holger Hoffstätte: > On 10/02/15 08:52, Andre Tomt wrote: > > On 01. okt. 2015 13:52, Eric Dumazet wrote: > >> On Thu, Oct 1, 2015 at 4:43 AM, Holger Hoffstätte > >> > >> wrote: > >>> On 10/01/15 13:29, Eric Dumazet wrote: > commit 83fccfc3940c4a

[PATCH] unix: fix use-after-free with unix_dgram_poll()

2015-10-02 Thread Jason Baron
From: Jason Baron The unix_dgram_poll() routine calls sock_poll_wait() not only for the wait queue associated with the socket s that we've called poll() on, but it also calls sock_poll_wait() for a remote peer socket's wait queue, if it's connected. Thus, if we call poll()/select()/epoll() for th

[PATCH net-next 04/17] tcp: call sk_mark_napi_id() on the child, not the listener

2015-10-02 Thread Eric Dumazet
This fixes a typo : We want to store the NAPI id on child socket. Presumably nobody really uses busy polling, on short lived flows. Fixes: 3d97379a67486 ("tcp: move sk_mark_napi_id() at the right place") Signed-off-by: Eric Dumazet --- net/ipv4/tcp_ipv4.c | 2 +- net/ipv6/tcp_ipv6.c | 2 +- 2 fi

[PATCH net-next 05/17] tcp/dccp: init sk_prot and call sk_node_init() in reqsk_alloc()

2015-10-02 Thread Eric Dumazet
We plan to use generic functions to insert request sockets into ehash table. sk_prot needs to be set (to retrieve sk_prot->h.hashinfo) sk_node needs to be cleared. Signed-off-by: Eric Dumazet --- include/net/request_sock.h | 22 -- 1 file changed, 12 insertions(+), 10 deleti

[PATCH net-next 07/17] tcp: remove BUG_ON() in tcp_check_req()

2015-10-02 Thread Eric Dumazet
Once listener is lockless, its sk_state can change anytime. Signed-off-by: Eric Dumazet --- net/ipv4/tcp_minisocks.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 897e34273ba3..9adf1e2c3170 100644 --- a/net/ipv4/tcp_minisocks.c +++

[PATCH net-next 02/17] tcp: move qlen/young out of struct listen_sock

2015-10-02 Thread Eric Dumazet
qlen_inc & young_inc were protected by listener lock, while qlen_dec & young_dec were atomic fields. Everything needs to be atomic for upcoming lockless listener. Also move qlen/young in request_sock_queue as we'll get rid of struct listen_sock eventually. Signed-off-by: Eric Dumazet --- inclu

[PATCH net-next 08/17] tcp: get_openreq[46]() changes

2015-10-02 Thread Eric Dumazet
When request sockets are no longer in a per listener hash table but on regular TCP ehash, we need to access listener uid through req->rsk_listener get_openreq6() also gets a const for its request socket argument. Signed-off-by: Eric Dumazet --- include/net/tcp.h | 1 - net/ipv4/tcp_ipv4.c | 8

[PATCH net-next 06/17] tcp: cleanup tcp_v[46]_inbound_md5_hash()

2015-10-02 Thread Eric Dumazet
We'll soon have to call tcp_v[46]_inbound_md5_hash() twice. Also add const attribute to the socket, as it might be the unlocked listener for SYN packets. Signed-off-by: Eric Dumazet --- net/ipv4/tcp_ipv4.c | 16 ++-- net/ipv6/tcp_ipv6.c | 10 ++ 2 files changed, 12 insertions

[PATCH net-next 10/17] tcp/dccp: install syn_recv requests into ehash table

2015-10-02 Thread Eric Dumazet
In this patch, we insert request sockets into TCP/DCCP regular ehash table (where ESTABLISHED and TIMEWAIT sockets are) instead of using the per listener hash table. ACK packets find SYN_RECV pseudo sockets without having to find and lock the listener. In nominal conditions, this halves pressure

[PATCH net-next 09/17] tcp/dccp: remove inet_csk_reqsk_queue_added() timeout argument

2015-10-02 Thread Eric Dumazet
This is no longer used. Signed-off-by: Eric Dumazet --- include/net/inet_connection_sock.h | 3 +-- net/ipv4/inet_connection_sock.c| 2 +- net/ipv6/inet6_connection_sock.c | 2 +- 3 files changed, 3 insertions(+), 4 deletions(-) diff --git a/include/net/inet_connection_sock.h b/include/n

[PATCH net-next 11/17] tcp/dccp: shrink struct listen_sock

2015-10-02 Thread Eric Dumazet
We no longer use hash_rnd, nr_table_entries and syn_table[] For a listener with a backlog of 10 millions sockets, this saves 80 MBytes of vmalloced memory. Signed-off-by: Eric Dumazet --- include/net/request_sock.h | 3 --- net/core/request_sock.c| 14 +++--- 2 files changed, 3 ins

[PATCH net-next 12/17] ipv6: remove obsolete inet6 functions

2015-10-02 Thread Eric Dumazet
inet6_csk_search_req() and inet6_csk_reqsk_queue_hash_add() no longer exist. Signed-off-by: Eric Dumazet --- include/net/inet6_connection_sock.h | 9 - 1 file changed, 9 deletions(-) diff --git a/include/net/inet6_connection_sock.h b/include/net/inet6_connection_sock.h index 79b2a4c09c

[PATCH net-next 13/17] tcp: attach SYNACK messages to request sockets instead of listener

2015-10-02 Thread Eric Dumazet
If a listen backlog is very big (to avoid syncookies), then the listener sk->sk_wmem_alloc is the main source of false sharing, as we need to touch it twice per SYNACK re-transmit and TX completion. (One SYN packet takes listener lock once, but up to 6 SYNACK are generated) By attaching the skb t

[PATCH net-next 14/17] tcp/dccp: remove struct listen_sock

2015-10-02 Thread Eric Dumazet
It is enough to check listener sk_state, no need for an extra condition. max_qlen_log can be moved into struct request_sock_queue We can remove syn_wait_lock and the alignment it enforced. Signed-off-by: Eric Dumazet --- include/net/request_sock.h | 26 --- net/core/re

[PATCH net-next 00/17] tcp/dccp: lockless listener

2015-10-02 Thread Eric Dumazet
TCP listener refactoring : this is becoming interesting ! This patch series takes the steps to use normal TCP/DCCP ehash table to store SYN_RECV requests, instead of the private per-listener hash table we had until now. SYNACK skb are now attached to their syn_recv request socket, so that we no l

[PATCH net-next 03/17] tcp: move synflood_warned into struct request_sock_queue

2015-10-02 Thread Eric Dumazet
long term plan is to remove struct listen_sock when its hash table is no longer there. Signed-off-by: Eric Dumazet --- include/net/request_sock.h | 2 +- net/ipv4/tcp_input.c | 7 +++ 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/include/net/request_sock.h b/include/ne

[PATCH net-next 15/17] tcp: remove max_qlen_log

2015-10-02 Thread Eric Dumazet
This control variable was set at first listen(fd, backlog) call, but not updated if application tried to increase or decrease backlog. It made sense at the time listener had a non resizeable hash table. Also rounding to powers of two was not very friendly. Signed-off-by: Eric Dumazet --- includ

[PATCH net-next 01/17] tcp: add a spinlock to protect struct request_sock_queue

2015-10-02 Thread Eric Dumazet
struct request_sock_queue fields are currently protected by the listener 'lock' (not a real spinlock) We need to add a private spinlock instead, so that softirq handlers creating children do not have to worry with backlog notion that the listener 'lock' carries. Signed-off-by: Eric Dumazet ---

[PATCH net-next 17/17] tcp: do not lock listener to process SYN packets

2015-10-02 Thread Eric Dumazet
Everything should now be ready to finally allow SYN packets processing without holding listener lock. Tested: 3.5 Mpps SYNFLOOD. Plenty of cpu cycles available. Next bottleneck is the refcount taken on listener, that could be avoided if we remove SLAB_DESTROY_BY_RCU strict semantic for listeners

[PATCH net-next 16/17] tcp/dccp: add a reschedule point in inet_csk_listen_stop()

2015-10-02 Thread Eric Dumazet
If a listener with thousands of children in accept queue is dismantled, it can take a while to close all of them. Signed-off-by: Eric Dumazet --- net/ipv4/inet_connection_sock.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c

Re: [PATCH net] ppp: don't override sk->sk_state in pppoe_flush_dev()

2015-10-02 Thread Guillaume Nault
On Fri, Oct 02, 2015 at 11:01:45AM +0300, Denys Fedoryshchenko wrote: > Here is similar panic after patch applied (it might be different bug), got > over netconsole: > > [126348.617115] CPU: 0 PID: 5254 Comm: accel-pppd Not tainted > 4.2.2-build-0087 #2 > [126348.617632] Hardware name: Intel Cor

Re: [RFC PATCH 3/3] net: dsa: exit probe if no switch were found

2015-10-02 Thread Florian Fainelli
On 02/10/15 05:10, Neil Armstrong wrote: > On 10/01/2015 06:32 PM, Andrew Lunn wrote: >> On Thu, Oct 01, 2015 at 05:27:32PM +0200, Neil Armstrong wrote: >>> On 09/30/2015 10:21 AM, Neil Armstrong wrote: If no switch were found in dsa_setup_dst, return -ENODEV and exit the dsa_probe cleanl

Re: [PATCH net] fib_rules: fix fib rule dumps across multiple skbs

2015-10-02 Thread roopa
On 10/2/15, 10:18 AM, Roland Dreier wrote: > On Tue, Sep 22, 2015 at 9:40 PM, Roopa Prabhu > wrote: >> + err = fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).portid, >> + cb->nlh->nlmsg_seq, RTM_NEWRULE, >> +

Re: [PATCH net-next V14 0/3] openvswitch: Add support for 802.1ad

2015-10-02 Thread Pravin Shelar
On Wed, Sep 30, 2015 at 8:32 PM, Thomas F Herbert wrote: > Although the Open Flow specification specified support for 802.1AD (qinq) > as well as push and pop vlan headers, So far Open vSwitch has only > supported a single tag header. This patch implements 802.1AD in the kernel > module. > > This

Re: [PATCH net-next v2] net: Add support for filtering neigh dump by master device

2015-10-02 Thread Eric W. Biederman
David Ahern writes: > Add support for filtering neighbor dumps by master device by adding > the NDA_MASTER attribute to the dump request. A new netlink flag, > NLM_F_DUMP_FILTERED, is added to indicate the kernel supports the > request and output is filtered as requested. *Scratches my head* I

Re: [PATCH 1/2] regmap: Allow installing custom reg_update_bits function

2015-10-02 Thread Mark Brown
On Thu, Oct 01, 2015 at 08:29:19AM -0400, Jon Ringle wrote: > On Thu, 1 Oct 2015, Mark Brown wrote: > > This completely bypasses and therefore breaks the cache infrastructure. > Right after sending the v2 patch, I realized that calling the > custom reg_update_bits would only be applicable for re

Re: [PATCH net] fib_rules: fix fib rule dumps across multiple skbs

2015-10-02 Thread Roland Dreier
On Tue, Sep 22, 2015 at 9:40 PM, Roopa Prabhu wrote: > + err = fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).portid, > + cb->nlh->nlmsg_seq, RTM_NEWRULE, > + NLM_F_MULTI, ops); > + if (err) FWI

Re: [PATCH net-next] ebpf: include perf_event only where really needed

2015-10-02 Thread Alexei Starovoitov
On 10/2/15 9:42 AM, Daniel Borkmann wrote: Commit ea317b267e9d ("bpf: Add new bpf map type to store the pointer to struct perf_event") added perf_event.h to the main eBPF header, so it gets included for all users. perf_event.h is actually only needed from array map side, so lets sanitize this a b

[PATCH] ip neigh: Add support for filtering dumps by master device

2015-10-02 Thread David Ahern
Add support for filtering neighbor dumps by master device. Kernel side support provided by commit 21fdd092acc7. Since the feature is not available in older kernels the user is given a warning message if the kernel does not support the request. Signed-off-by: David Ahern --- include/libnetlink.h

[PATCH net-next] ebpf: include perf_event only where really needed

2015-10-02 Thread Daniel Borkmann
Commit ea317b267e9d ("bpf: Add new bpf map type to store the pointer to struct perf_event") added perf_event.h to the main eBPF header, so it gets included for all users. perf_event.h is actually only needed from array map side, so lets sanitize this a bit. Signed-off-by: Daniel Borkmann Cc: Kaix

[PATCH v5 2/3] seccomp: add a ptrace command to get seccomp filter fds

2015-10-02 Thread Tycho Andersen
I just picked 40 for the constant out of thin air, but there may be a more appropriate value for this. Also, we return EINVAL when there is no filter for the index the user requested, but ptrace also returns EINVAL for invalid commands, making it slightly awkward to test whether or not the kernel s

[PATCH v5 1/3] seccomp: add the concept of a seccomp filter FD

2015-10-02 Thread Tycho Andersen
This patch introduces the concept of a seccomp fd, with a similar interface and usage to ebpf fds. Initially, one is allowed to create, install, and dump these fds. Any manipulation of seccomp fds requires users to be root in their own user namespace, matching the checks done for SECCOMP_SET_MODE_F

v5 of seccomp filter c/r patches

2015-10-02 Thread Tycho Andersen
Hi all, Here's v5 of the seccomp filter c/r set. The individual patch notes have changes, but two highlights are: * This series is now based on http://patchwork.ozlabs.org/patch/525492/ and will need to be built with that patch applied. This gets rid of two incorrect patches in the previous s

[PATCH v5 3/3] kcmp: add KCMP_SECCOMP_FD

2015-10-02 Thread Tycho Andersen
This command allows for comparing the filters pointed to by two seccomp fds. This is useful e.g. to find out if a seccomp filter is inherited, since struct seccomp_filter are unique across tasks and are the private_data seccomp fds. v2: switch to KCMP_SECCOMP_FD instead of KCMP_FILE_PRIVATE_DATA

Re: [PATCH net-next V2] ARM: net: support BPF_ALU | BPF_MOD instructions in the BPF JIT.

2015-10-02 Thread Alexei Starovoitov
On Fri, Oct 02, 2015 at 05:06:47PM +0200, Nicolas Schichan wrote: > For ARMv7 with UDIV instruction support, generate an UDIV instruction > followed by an MLS instruction. > > For other ARM variants, generate code calling a C wrapper similar to > the jit_udiv() function used for BPF_ALU | BPF_DIV

Re: [PATCH net-next] bpf, seccomp: prepare for upcoming criu support

2015-10-02 Thread Daniel Borkmann
On 10/02/2015 05:09 PM, Alexei Starovoitov wrote: ... I agree that adding flag to bpf_prog_create_from_user() is cleaner than exposing static bpf_prog_store_orig_filter(), so There's also another reason as mentioned, i.e. that the progs are ro-locked, so doing bpf_prog_store_orig_filter() after

Re: [PATCH net-next] bpf, seccomp: prepare for upcoming criu support

2015-10-02 Thread Daniel Borkmann
On 10/02/2015 05:06 PM, Tycho Andersen wrote: ... Cc: Pavel Emelyanov Cc: Kees Cook Cc: Andy Lutomirski Cc: Alexei Starovoitov --- This is in relation to Tycho's latest patch set under [1]. The BPF handling is unfortunately not correct (triggering a crash on kernels that can set pages a

[PATCH net-next V2] ARM: net: support BPF_ALU | BPF_MOD instructions in the BPF JIT.

2015-10-02 Thread Nicolas Schichan
For ARMv7 with UDIV instruction support, generate an UDIV instruction followed by an MLS instruction. For other ARM variants, generate code calling a C wrapper similar to the jit_udiv() function used for BPF_ALU | BPF_DIV instructions. Some performance numbers reported by the test_bpf module (the

Re: [PATCH net-next] bpf, seccomp: prepare for upcoming criu support

2015-10-02 Thread Tycho Andersen
Hi Daniel, On Fri, Oct 02, 2015 at 03:17:33PM +0200, Daniel Borkmann wrote: > The current ongoing effort to dump existing cBPF seccomp filters back > to user space requires to hold the pre-transformed instructions like > we do in case of socket filters from sk_attach_filter() side, so they > can b

Re: [PATCH net-next] bpf, seccomp: prepare for upcoming criu support

2015-10-02 Thread Alexei Starovoitov
On 10/2/15 6:17 AM, Daniel Borkmann wrote: The current ongoing effort to dump existing cBPF seccomp filters back to user space requires to hold the pre-transformed instructions like we do in case of socket filters from sk_attach_filter() side, so they can be reloaded in original form at a later p

Re: [PATCH net] bpf: fix panic in SO_GET_FILTER with native ebpf programs

2015-10-02 Thread Alexei Starovoitov
On 10/2/15 3:06 AM, Daniel Borkmann wrote: However, sk_get_filter() wasn't updated to test for this at the time when eBPF could be attached. Just throw an error to the user to indicate that eBPF cannot be dumped over this interface. That way, it can also be known that a program_is_ attached (as

Re: [PATCH net-next] ARM: net: support BPF_ALU | BPF_MOD instructions in the BPF JIT.

2015-10-02 Thread Russell King - ARM Linux
On Fri, Oct 02, 2015 at 04:37:51PM +0200, Nicolas Schichan wrote: > @@ -125,7 +125,7 @@ static u64 jit_get_skb_w(struct sk_buff *skb, int offset) > } > > /* > - * Wrapper that handles both OABI and EABI and assures Thumb2 interworking > + * Wrappers that handles both OABI and EABI and assures T

[PATCH net-next] ARM: net: support BPF_ALU | BPF_MOD instructions in the BPF JIT.

2015-10-02 Thread Nicolas Schichan
For ARMv7 with UDIV instruction support, generate an UDIV instruction followed by an MLS instruction. For other ARM variants, generate code calling a C wrapper similar to the jit_udiv() function used for BPF_ALU | BPF_DIV instructions. Some performance numbers reported by the test_bpf module (the

Re: [PATCH RFC 3/7] netfilter: add NF_INET_LOCAL_SOCKET_IN chain type

2015-10-02 Thread Daniel Mack
On 10/02/2015 01:07 PM, Pablo Neira Ayuso wrote: > On Thu, Oct 01, 2015 at 11:07:30PM +0200, Daniel Mack wrote: > [...] >> That, however, got rejected because it doesn't work for multicast. This >> patch set implements one of the things Pablo suggested in his reply. > > People are rising valid con

Re: [MM PATCH V4.1 5/6] slub: support for bulk free with SLUB freelists

2015-10-02 Thread Jesper Dangaard Brouer
On Fri, 2 Oct 2015 11:41:18 +0200 Jesper Dangaard Brouer wrote: > On Thu, 1 Oct 2015 15:10:15 -0700 > Andrew Morton wrote: > > > On Wed, 30 Sep 2015 13:44:19 +0200 Jesper Dangaard Brouer > > wrote: > > > > > Make it possible to free a freelist with several objects by adjusting > > > API of s

[PATCH v1 0/5] Improve ASIX RX memory allocation error handling

2015-10-02 Thread Dean Jenkins
From: Mark Craske Please ignore the cover letter PATCH v2, as it was sent in error. The patches are all v1 (there are no v2 patches yet). The ASIX RX handler algorithm is weak on error handling. There is a design flaw in the ASIX RX handler algorithm because the implementation for handling RX Ethernet frame

Re: [PATCH 1/3] net: dsa: Use devm_ prefixed allocations

2015-10-02 Thread Sergei Shtylyov
On 10/2/2015 4:30 PM, Neil Armstrong wrote: To simplify and prevent memory leakage when unbinding, use the devm_ memory allocation calls. Tested-by: Andrew Lunn Tested-by: Florian Fainelli Signed-off-by: Neil Armstrong --- net/dsa/dsa.c | 6 +++--- 1 file changed, 3 insertions(+), 3 del

[PATCH net] ARM: net: make BPF_LD | BPF_IND instruction trigger r_X initialisation to 0.

2015-10-02 Thread Nicolas Schichan
Without this patch, if the only instructions using r_X are of the BPF_LD | BPF_IND type, r_X would not be reset to 0, using whatever value was there when entering the jited code. With this patch, r_X will be correctly marked as used so it will be reset to 0 in the prologue code. This fix also make

[PATCH v1 4/5] asix: On RX avoid creating bad Ethernet frames

2015-10-02 Thread Dean Jenkins
When RX Ethernet frames span multiple URB socket buffers, the data stream may suffer a discontinuity which will cause the current Ethernet frame in the netdev socket buffer to be incomplete. This frame needs to be discarded instead of appending unrelated data from the current URB socket buffer to t

[PATCH v2 0/5] Improve ASIX RX memory allocation error handling

2015-10-02 Thread Dean Jenkins
From: Mark Craske The ASIX RX handler algorithm is weak on error handling. There is a design flaw in the ASIX RX handler algorithm because the implementation for handling RX Ethernet frames for the DUB-E100 C1 can have Ethernet frames spanning multiple URBs. This means that payload data from more

[PATCH v1 2/5] asix: Tidy-up 32-bit header word synchronisation

2015-10-02 Thread Dean Jenkins
Tidy-up the Data header 32-bit word synchronisation logic in asix_rx_fixup_internal() by removing redundant logic tests. The code is looking at the following cases of the Data header 32-bit word that is present before each Ethernet frame: a) all 32 bits of the Data header word are in the URB sock

[PATCH v1 5/5] asix: Continue processing URB if no RX netdev buffer

2015-10-02 Thread Dean Jenkins
Avoid a loss of synchronisation of the Ethernet Data header 32-bit word due to a failure to get a netdev socket buffer. The ASIX RX handling algorithm returned 0 upon a failure to get an allocation of a netdev socket buffer. This causes the URB processing to stop which potentially causes a loss of

[PATCH v1 3/5] asix: Simplify asix_rx_fixup_internal() netdev alloc

2015-10-02 Thread Dean Jenkins
The code is checking that the Ethernet frame will fit into a netdev allocated socket buffer within the constraints of MTU size, Ethernet header length plus VLAN header length. The original code was checking rx->remaining each loop of the while loop that processes multiple Ethernet frames per URB a
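The size constraint described above can be illustrated roughly as follows. This is a sketch only, not the code from asix_rx_fixup_internal(): the function name frame_fits() and the local constant names are hypothetical, with the header lengths set to the usual 14-byte Ethernet header and 4-byte VLAN tag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the usual header lengths (assumed values). */
#define SKETCH_ETH_HLEN  14u	/* Ethernet header length in bytes */
#define SKETCH_VLAN_HLEN  4u	/* VLAN tag length in bytes */

/*
 * Hypothetical helper: true when a received frame of frame_len bytes fits
 * within the MTU plus the Ethernet header and VLAN header lengths.
 */
static bool frame_fits(uint32_t frame_len, uint32_t mtu)
{
	return frame_len <= mtu + SKETCH_ETH_HLEN + SKETCH_VLAN_HLEN;
}
```

With an MTU of 1500 the limit works out to 1518 bytes, so frame_fits(1518, 1500) holds while frame_fits(1519, 1500) does not.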

[PATCH v1 1/5] asix: Rename remaining and size for clarity

2015-10-02 Thread Dean Jenkins
The Data header synchronisation is easier to understand if the variables "remaining" and "size" are renamed. Therefore, the lifetime of the "remaining" variable exists outside of asix_rx_fixup_internal() and is used to indicate any remaining pending bytes of the Ethernet frame that need to be obta

Re: [PATCH 1/3] net: dsa: Use devm_ prefixed allocations

2015-10-02 Thread Neil Armstrong
On 10/02/2015 03:29 PM, Sergei Shtylyov wrote: > On 10/2/2015 1:48 PM, Neil Armstrong wrote: > >> To simplify and prevent memory leakage when unbinding, use >> the devm_ memory allocation calls. >> >> Tested-by: Andrew Lunn >> Tested-by: Florian Fainelli >> Signed-off-by: Neil Armstrong >> ---

Re: [PATCH 1/3] net: dsa: Use devm_ prefixed allocations

2015-10-02 Thread Sergei Shtylyov
On 10/2/2015 1:48 PM, Neil Armstrong wrote: To simplify and prevent memory leakage when unbinding, use the devm_ memory allocation calls. Tested-by: Andrew Lunn Tested-by: Florian Fainelli Signed-off-by: Neil Armstrong --- net/dsa/dsa.c | 6 +++--- 1 file changed, 3 insertions(+), 3 delet

Re: [PATCH 1/3] net: dsa: Use devm_ prefixed allocations

2015-10-02 Thread Felix Fietkau
On 2015-10-02 12:48, Neil Armstrong wrote: > To simplify and prevent memory leakage when unbinding, use > the devm_ memory allocation calls. > > Tested-by: Andrew Lunn > Tested-by: Florian Fainelli > Signed-off-by: Neil Armstrong I think you also need to get rid of the corresponding free calls

[PATCH net-next] bpf, seccomp: prepare for upcoming criu support

2015-10-02 Thread Daniel Borkmann
The current ongoing effort to dump existing cBPF seccomp filters back to user space requires to hold the pre-transformed instructions like we do in case of socket filters from sk_attach_filter() side, so they can be reloaded in original form at a later point in time by utilities such as criu. To p

Re: netpoll_send_skb_on_dev warning with bnx2

2015-10-02 Thread Neil Horman
On Thu, Oct 01, 2015 at 08:25:46PM -0700, Vinson Lee wrote: > Hi. > > I am seeing a netpoll_send_skb_on_dev warning with bnx2. It happens on > Linux 4.1 and I am able to reproduce the warning with Linux 4.3-rc3. > > [ cut here ] > WARNING: CPU: 11 PID: 3110 at net/core/net

[PATCH net-next 1/4] bridge: vlan: use rcu list for the ordered vlan list

2015-10-02 Thread Nikolay Aleksandrov
From: Nikolay Aleksandrov When I did the conversion to rhashtable I missed the required locking of one important user of the vlan list - br_get_link_af_size_filtered() which is called: br_ifinfo_notify() -> br_nlmsg_size() -> br_get_link_af_size_filtered() and the notifications can be sent withou

[PATCH net-next 3/4] bridge: vlan: drop master_flags from __vlan_add

2015-10-02 Thread Nikolay Aleksandrov
From: Nikolay Aleksandrov There's only one user now and we can include the flag directly. Signed-off-by: Nikolay Aleksandrov --- net/bridge/br_vlan.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c index 8481d2567513..1f6f9f
