Re: [PATCH] mt7601u: phy: mark expected switch fall-through

2018-03-30 Thread Jakub Kicinski
On Fri, 30 Mar 2018 16:12:23 -0500, Gustavo A. R. Silva wrote: > In preparation to enabling -Wimplicit-fallthrough, mark switch cases > where we are expecting to fall through. > > Signed-off-by: Gustavo A. R. Silva Acked-by: Jakub Kicinski

Re: [PATCH net-next] netdevsim: Change nsim_devlink_setup to return error to caller

2018-03-30 Thread Jakub Kicinski
On Fri, 30 Mar 2018 09:28:51 -0700, David Ahern wrote: > Change nsim_devlink_setup to return any error back to the caller and > update nsim_init to handle it. > > Requested-by: Jakub Kicinski > Signed-off-by: David Ahern Acked-by: Jakub Kicinski Thank you!

Re: [PATCH net-next 0/9] devlink: Add support for region access

2018-03-30 Thread Alex Vesker
On 3/31/2018 1:26 AM, David Ahern wrote: On 3/30/18 1:39 PM, Alex Vesker wrote: On 3/30/2018 7:57 PM, David Ahern wrote: On 3/30/18 8:34 AM, Andrew Lunn wrote: And it seems to want contiguous pages. How well does that work after the system has been running for a while and memory is fragment

[PATCH net-next] vlan: vlan_hw_filter_capable() can be static

2018-03-30 Thread Wei Yongjun
Fixes the following sparse warning: net/8021q/vlan_core.c:168:6: warning: symbol 'vlan_hw_filter_capable' was not declared. Should it be static? Signed-off-by: Wei Yongjun --- net/8021q/vlan_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/8021q/vlan_core.c b/net/

Re: [PATCH v3 net-next 07/12] rhashtable: add schedule points

2018-03-30 Thread Herbert Xu
On Fri, Mar 30, 2018 at 05:53:04PM -0700, Eric Dumazet wrote: > Rehashing and destroying large hash table takes a lot of time, > and happens in process context. It is safe to add cond_resched() > in rhashtable_rehash_table() and rhashtable_free_and_destroy() > > Signed-off-by: Eric Dumazet Acked

[PATCH iproute2-next 1/1] tc: jsonify sample action

2018-03-30 Thread Roman Mashak
Signed-off-by: Roman Mashak --- tc/m_sample.c | 22 +- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/tc/m_sample.c b/tc/m_sample.c index 1e18c5154fe6..39a99246a8ea 100644 --- a/tc/m_sample.c +++ b/tc/m_sample.c @@ -149,23 +149,27 @@ static int print_sample(str

Re: [PATCH v2 net-next 07/12] rhashtable: add schedule points

2018-03-30 Thread Herbert Xu
On Fri, Mar 30, 2018 at 01:42:31PM -0700, Eric Dumazet wrote: > Rehashing and destroying large hash table takes a lot of time, > and happens in process context. It is safe to add cond_resched() > in rhashtable_rehash_table() and rhashtable_free_and_destroy() > > Signed-off-by: Eric Dumazet Acked

[PATCH iproute2-next 1/1] tc: support oneline mode in action generic printer functions

2018-03-30 Thread Roman Mashak
Signed-off-by: Roman Mashak --- tc/m_action.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/tc/m_action.c b/tc/m_action.c index 8891659ae15a..2f85d353279a 100644 --- a/tc/m_action.c +++ b/tc/m_action.c @@ -301,19 +301,21 @@ static int tc_print_one_action(FILE *f

[PATCH v3 net-next 02/12] inet: frags: change inet_frags_init_net() return value

2018-03-30 Thread Eric Dumazet
We will soon initialize one rhashtable per struct netns_frags in inet_frags_init_net(). This patch changes the return value to eventually propagate an error. Signed-off-by: Eric Dumazet --- include/net/inet_frag.h | 3 ++- net/ieee802154/6lowpan/reassembly.c | 11 --
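
A minimal sketch of the pattern this patch describes, written as user-space C rather than kernel code: an init routine that previously returned nothing now returns an int so that its caller can propagate a failure. The names frags_ns, frags_ns_init(), frags_ns_exit() and subsystem_init() are hypothetical and are not taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-namespace state; stands in for something like a hash table. */
struct frags_ns {
	void **table;
	size_t buckets;
};

/* Returns 0 on success or a negative errno value, instead of returning void. */
static int frags_ns_init(struct frags_ns *ns, size_t buckets)
{
	if (buckets == 0)
		return -EINVAL;

	ns->table = calloc(buckets, sizeof(*ns->table));
	if (!ns->table)
		return -ENOMEM;

	ns->buckets = buckets;
	return 0;
}

/* Releases what frags_ns_init() allocated. */
static void frags_ns_exit(struct frags_ns *ns)
{
	free(ns->table);
	ns->table = NULL;
	ns->buckets = 0;
}

/* The caller checks the result and passes any error up to its own caller. */
static int subsystem_init(struct frags_ns *ns)
{
	int err = frags_ns_init(ns, 1024);

	if (err)
		return err;
	return 0;
}
```

Changing the signature first, while the function still cannot fail, keeps the later patch that introduces a real failure path small.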

[PATCH v3 net-next 08/12] inet: frags: use rhashtables for reassembly units

2018-03-30 Thread Eric Dumazet
Some applications still rely on IP fragmentation, and to be fair linux reassembly unit is not working under any serious load. It uses static hash tables of 1024 buckets, and up to 128 items per bucket (!!!) A work queue is supposed to garbage collect items when host is under memory pressure, and

[PATCH v3 net-next 09/12] inet: frags: remove some helpers

2018-03-30 Thread Eric Dumazet
Remove sum_frag_mem_limit(), ip_frag_mem() & ip6_frag_mem() Also since we use rhashtable we can bring back the number of fragments in "grep FRAG /proc/net/sockstat /proc/net/sockstat6" that was removed in commit 434d305405ab ("inet: frag: don't account number of fragment queues") Signed-off-by: E

[PATCH v3 net-next 04/12] inet: frags: refactor ipv6_frag_init()

2018-03-30 Thread Eric Dumazet
We want to call inet_frags_init() earlier. This is a prereq to "inet: frags: use rhashtables for reassembly units" Signed-off-by: Eric Dumazet --- net/ipv6/reassembly.c | 27 +++ 1 file changed, 15 insertions(+), 12 deletions(-) diff --git a/net/ipv6/reassembly.c b/net/

[PATCH v3 net-next 05/12] inet: frags: refactor lowpan_net_frag_init()

2018-03-30 Thread Eric Dumazet
We want to call lowpan_net_frag_init() earlier. Similar to commit "inet: frags: refactor ipv6_frag_init()" This is a prereq to "inet: frags: use rhashtables for reassembly units" Signed-off-by: Eric Dumazet --- net/ieee802154/6lowpan/reassembly.c | 20 +++- 1 file changed, 11 in

[PATCH v3 net-next 00/12] inet: frags: bring rhashtables to IP defrag

2018-03-30 Thread Eric Dumazet
IP defrag processing is one of the remaining problematic layers in linux. It uses static hash tables of 1024 buckets, and up to 128 items per bucket. A work queue is supposed to garbage collect items when host is under memory pressure, and doing a hash rebuild, changing seed used in hash computati

[PATCH v3 net-next 10/12] inet: frags: get rif of inet_frag_evicting()

2018-03-30 Thread Eric Dumazet
This refactors ip_expire() since one indentation level is removed. Note: in the future, we should try hard to avoid the skb_clone() since this is a serious performance cost. Under DDOS, the ICMP message won't be sent because of rate limits. Fact that ip6_expire_frag_queue() does not use skb_clone(

[PATCH v3 net-next 11/12] inet: frags: remove inet_frag_maybe_warn_overflow()

2018-03-30 Thread Eric Dumazet
This function is obsolete, after rhashtable addition to inet defrag. Signed-off-by: Eric Dumazet --- include/net/inet_frag.h | 2 -- net/ieee802154/6lowpan/reassembly.c | 5 ++--- net/ipv4/inet_fragment.c| 11 --- net/ipv4/ip_fragment.c

[PATCH v3 net-next 06/12] inet: frags: refactor ipfrag_init()

2018-03-30 Thread Eric Dumazet
We need to call inet_frags_init() before register_pernet_subsys(), as a prereq for following patch ("inet: frags: use rhashtables for reassembly units") Signed-off-by: Eric Dumazet --- net/ipv4/ip_fragment.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv4/ip_fra

[PATCH v3 net-next 12/12] inet: frags: break the 2GB limit for frags storage

2018-03-30 Thread Eric Dumazet
Some users are willing to provision huge amounts of memory to be able to perform reassembly reasonably well under pressure. Current memory tracking is using one atomic_t and integers. Switch to atomic_long_t so that 64bit arches can use more than 2GB, without any cost for 32bit arches. Note tha
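
The 2GB ceiling mentioned above can be illustrated with a user-space sketch that uses C11 atomics in place of the kernel's atomic_long_t. The struct frag_mem and its helpers are hypothetical names chosen for this example, and the 64-bit behaviour assumes a platform where long is 64 bits wide (LP64).

```c
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical memory accounting for reassembly queues. */
struct frag_mem {
	atomic_long bytes;	/* 64 bits wide where long is 64 bits */
};

static void frag_mem_init(struct frag_mem *m)
{
	atomic_init(&m->bytes, 0);
}

static void frag_mem_add(struct frag_mem *m, long n)
{
	atomic_fetch_add(&m->bytes, n);
}

static long frag_mem_read(struct frag_mem *m)
{
	return atomic_load(&m->bytes);
}

/* Nonzero if the value could be tracked in a plain int counter. */
static int fits_in_int(long long v)
{
	return v >= INT_MIN && v <= INT_MAX;
}
```

Adding three 1GB chunks gives 3221225472 bytes, which fits_in_int() rejects on platforms with a 32-bit int, while the long-based counter holds it on LP64 systems. On 32-bit platforms long is the same width as int, which is consistent with the message's remark that the switch costs those architectures nothing.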

[PATCH v3 net-next 03/12] inet: frags: add a pointer to struct netns_frags

2018-03-30 Thread Eric Dumazet
In order to simplify the API, add a pointer to struct inet_frags. This will allow us to make things less complex. These functions no longer have a struct inet_frags parameter : inet_frag_destroy(struct inet_frag_queue *q /*, struct inet_frags *f */) inet_frag_put(struct inet_frag_queue *q /*, st

[PATCH v3 net-next 07/12] rhashtable: add schedule points

2018-03-30 Thread Eric Dumazet
Rehashing and destroying large hash table takes a lot of time, and happens in process context. It is safe to add cond_resched() in rhashtable_rehash_table() and rhashtable_free_and_destroy() Signed-off-by: Eric Dumazet --- lib/rhashtable.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/li

[PATCH v3 net-next 01/12] ipv6: frag: remove unused field

2018-03-30 Thread Eric Dumazet
csum field in struct frag_queue is not used, remove it. Signed-off-by: Eric Dumazet --- include/net/ipv6.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/net/ipv6.h b/include/net/ipv6.h index 50a6f0ddb8780f6c9169f4ae0b3b35af2d66cd4b..5c18836672e9d1c560cdce15f5b34928c337abfd 100644

[net-next 02/15] net/mlx5: Eliminate query xsrq dead code

2018-03-30 Thread Saeed Mahameed
1. This function is not used anywhere in mlx5 driver 2. It has a memcpy statement that makes no sense and produces build warning with gcc8 drivers/net/ethernet/mellanox/mlx5/core/transobj.c: In function 'mlx5_core_query_xsrq': drivers/net/ethernet/mellanox/mlx5/core/transobj.c:347:3: error: 'memc

[net-next 06/15] net/mlx5e: Derive Striding RQ size from MTU

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan In Striding RQ, each WQE serves multiple packets (hence called Multi-Packet WQE, MPWQE). The size of a MPWQE is constant (currently 256KB). Upon a ringparam set operation, we calculate the number of MPWQEs per RQ. For this, first it is needed to determine the number of packets

[net-next 10/15] net/mlx5e: Use linear SKB in Striding RQ

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan Current Striding RQ HW feature utilizes the RX buffers so that there is no wasted room between the strides. This maximises the memory utilization. This prevents the use of build_skb() (which requires headroom and tailroom), and demands to memcpy the packets headers into the skb

[net-next 12/15] net/mlx5e: Support XDP over Striding RQ

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan Add XDP support over Striding RQ. Now that linear SKB is supported over Striding RQ, we can support XDP by setting stride size to PAGE_SIZE and headroom to XDP_PACKET_HEADROOM. Upon a MPWQE free, do not release pages that are being XDP xmit, they will be released upon completi

[net-next 09/15] net/mlx5e: Use inline MTTs in UMR WQEs

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan When modifying the page mapping of a HW memory region (via a UMR post), post the new values inlined in WQE, instead of using a data pointer. This is a micro-optimization, inline UMR WQEs of different rings scale better in HW. In addition, this obsoletes a few control flows an

[pull request][net-next 00/15] Mellanox, mlx5 updates 2018-03-30

2018-03-30 Thread Saeed Mahameed
Hi Dave, This series contains updates to mlx5 core and mlx5e netdev drivers. The main highlight of this series is the RX optimizations for striding RQ path, introduced by Tariq. For more information please see tag log below. Please pull and let me know if there's any problem. Thanks, Saeed. --

[net-next 04/15] net/mlx5e: IPoIB, Fix spelling mistake

2018-03-30 Thread Saeed Mahameed
From: Talat Batheesh Fix spelling mistake in debug message text. "dettaching" -> "detaching" Signed-off-by: Talat Batheesh Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/et

[net-next 01/15] net/mlx5e: Use eq ptr from cq

2018-03-30 Thread Saeed Mahameed
Instead of looking for the EQ of the CQ, remove that redundant code and use the eq pointer stored in the cq struct. Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 14 ++ 1 file changed, 2 insertions(+), 12 deletions(-) diff --git a/drivers/net/

[net-next 03/15] net/mlx5: Change teardown with force mode failure message to warning

2018-03-30 Thread Saeed Mahameed
From: Alaa Hleihel With ConnectX-4, we expect the force teardown to fail in case that DC was enabled, therefore change the message from error to warning. Signed-off-by: Alaa Hleihel Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/fw.c | 2 +- 1 file changed, 1 insert

[net-next 07/15] net/mlx5e: Code movements in RX UMR WQE post

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan Gets the process of a UMR WQE post in one function, in preparation for a downstream patch that inlines the WQE data. No functional change here. Signed-off-by: Tariq Toukan Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 107 ++

[net-next 08/15] net/mlx5e: Do not busy-wait for UMR completion in Striding RQ

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan Do not busy-wait a pending UMR completion. Under high HW load, busy-waiting a delayed completion would fully utilize the CPU core and mistakenly indicate a SW bottleneck. Signed-off-by: Tariq Toukan Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/e

[net-next 15/15] net/mlx5e: RX, Recycle buffer of UMR WQEs

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan Upon a new UMR post, check if the WQE buffer contains a previous UMR WQE. If so, modify the dynamic fields instead of a whole WQE overwrite. This saves a memcpy. In current setting, after 2 WQ cycles (12 UMR posts), this will always be the case. No degradation sensed. Signed

[net-next 14/15] net/mlx5e: Keep single pre-initialized UMR WQE per RQ

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan All UMR WQEs of an RQ share many common fields. We use pre-initialized structures to save calculations in datapath. One field (xlt_offset) was the only reason we saved a pre-initialized copy per WQE index. Here we remove its initialization (move its calculation to datapath), an

[net-next 11/15] net/mlx5e: Refactor RQ XDP_TX indication

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan Make the xdp_xmit indication available for Striding RQ by taking it out of the type-specific union. This refactor is a preparation for a downstream patch that adds XDP support over Striding RQ. In addition, use a bitmap instead of a boolean for possible future flags. Signed-of

[net-next 05/15] net/mlx5e: Save MTU in channels params

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan Knowing the MTU is required for RQ creation flow. By our design, channels creation flow is totally isolated from priv/netdev, and can be completed with access to channels params and mdev. Adding the MTU to the channels params helps preserving that. In addition, we save it in RQ

[net-next 13/15] net/mlx5e: Remove page_ref bulking in Striding RQ

2018-03-30 Thread Saeed Mahameed
From: Tariq Toukan When many packets reside on the same page, the bulking of page_ref modifications reduces the total number of atomic operations executed. Besides the necessary 2 operations on page alloc/free, we have the following extra ops per page: - one on WQE allocation (bump refcnt to max

Re: [net-next V7 PATCH 14/16] mlx5: use page_pool for xdp_return_frame call

2018-03-30 Thread Saeed Mahameed
On Thu, 2018-03-29 at 19:02 +0200, Jesper Dangaard Brouer wrote: > This patch shows how it is possible to have both the driver local > page > cache, which uses elevated refcnt for "catching"/avoiding SKB > put_page returns the page through the page allocator. And at the > same time, have pages get

[next-queue PATCH] igb: Fix the transmission mode of queue 0 for Qav mode

2018-03-30 Thread Vinicius Costa Gomes
When Qav mode is enabled, queue 0 should be kept on Stream Reservation mode. From the i210 datasheet, section 8.12.19: "Note: Queue0 QueueMode must be set to 1b when TransmitMode is set to Qav." ("QueueMode 1b" represents the Stream Reservation mode) The solution is to give queue 0 all the cr

Re: [net-next V7 PATCH 10/16] mlx5: register a memory model when XDP is enabled

2018-03-30 Thread Saeed Mahameed
On Thu, 2018-03-29 at 19:01 +0200, Jesper Dangaard Brouer wrote: > Now all the users of ndo_xdp_xmit have been converted to use > xdp_return_frame. > This enables a different memory model, thus activating another code > path > in the xdp_return_frame API. > > V2: Fixed issues pointed out by Tariq.

Re: [net-next V7 PATCH 01/16] mlx5: basic XDP_REDIRECT forward support

2018-03-30 Thread Saeed Mahameed
On Thu, 2018-03-29 at 19:01 +0200, Jesper Dangaard Brouer wrote: > This implements basic XDP redirect support in mlx5 driver. > > Notice that the ndo_xdp_xmit() is NOT implemented, because that API > needs some changes that this patchset is working towards. > > The main purpose of this patch is ha

Re: [PATCH bpf-next 01/10] bpf: btf: Introduce BPF Type Format (BTF)

2018-03-30 Thread Martin KaFai Lau
On Sat, Mar 31, 2018 at 01:22:53AM +0200, Daniel Borkmann wrote: > On 03/30/2018 08:26 PM, Martin KaFai Lau wrote: > [...] > > +static int btf_add_type(struct btf_verifier_env *env, struct btf_type *t) > > +{ > > + struct btf *btf = env->btf; > > + > > + /* < 2 because +1 for btf_void which is

Re: [PATCH bpf-next 01/10] bpf: btf: Introduce BPF Type Format (BTF)

2018-03-30 Thread Daniel Borkmann
On 03/30/2018 08:26 PM, Martin KaFai Lau wrote: [...] > +static int btf_add_type(struct btf_verifier_env *env, struct btf_type *t) > +{ > + struct btf *btf = env->btf; > + > + /* < 2 because +1 for btf_void which is always in btf->types[0]. > + * btf_void is not accounted in btf->nr_ty

Re: [PATCH net-next RFC 0/5] ipv6: sr: introduce seg6local End.BPF action

2018-03-30 Thread Alexei Starovoitov
On Fri, Mar 23, 2018 at 10:15:59AM +, Mathieu Xhonneux wrote: > As of Linux 4.14, it is possible to define advanced local processing for > IPv6 packets with a Segment Routing Header through the seg6local LWT > infrastructure. This LWT implements the network programming principles > defined in t

Re: [PATCH v2 net-next 08/12] inet: frags: use rhashtables for reassembly units

2018-03-30 Thread Eric Dumazet
On 03/30/2018 03:44 PM, Kirill Tkhai wrote: > Hi, Eric, > > thanks for more small patches in v2. One comment below. > >> - >> -struct inet_frag_bucket { >> -struct hlist_head chain; >> -spinlock_t chain_lock; >> +struct netns_frags *net; >> +struct rcu_h

[PATCH net 1/1] net/mlx5e: Set EQE based as default TX interrupt moderation mode

2018-03-30 Thread Saeed Mahameed
From: Tal Gilboa The default TX moderation mode was mistakenly set to CQE based. The intention was to add a control ability in order to improve some specific use-cases. In general, we prefer to use EQE based moderation as it gives much better numbers for the common cases. CQE based causes a degr

Re: [PATCH v2 net-next 08/12] inet: frags: use rhashtables for reassembly units

2018-03-30 Thread Kirill Tkhai
Hi, Eric, thanks for more small patches in v2. One comment below. On 30.03.2018 23:42, Eric Dumazet wrote: > Some applications still rely on IP fragmentation, and to be fair linux > reassembly unit is not working under any serious load. > > It uses static hash tables of 1024 buckets, and up to 1

Re: [PATCH v2 net-next 08/12] inet: frags: use rhashtables for reassembly units

2018-03-30 Thread Eric Dumazet
On 03/30/2018 01:42 PM, Eric Dumazet wrote: > Some applications still rely on IP fragmentation, and to be fair linux > reassembly unit is not working under any serious load. ... > - > static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf, >

Re: [PATCH net-next 0/9] devlink: Add support for region access

2018-03-30 Thread David Ahern
On 3/30/18 1:39 PM, Alex Vesker wrote: > > > On 3/30/2018 7:57 PM, David Ahern wrote: >> On 3/30/18 8:34 AM, Andrew Lunn wrote: > And it seems to want contiguous pages. How well does that work after > the system has been running for a while and memory is fragmented? The allocation ca

Re: [PATCH] usb: plusb: Add support for PL-27A1

2018-03-30 Thread Daniel Kučera
Hello Roman, it would be at least polite to mention where you got the code from: https://lkml.org/lkml/2016/2/21/14 -- S pozdravom / Best regards Daniel Kucera.

[PATCH v3 bpf-next 0/9] bpf: introduce cgroup-bpf bind, connect, post-bind hooks

2018-03-30 Thread Alexei Starovoitov
v2->v3: - rebase due to conflicts - fix ipv6=m build v1->v2: - support expected_attach_type at prog load time so that prog (incl. context accesses and calls to helpers) can be validated with regard to specific attach point it is supposed to be attached to. Later, at attach time, attach type

[PATCH v3 bpf-next 4/9] selftests/bpf: Selftest for sys_bind hooks

2018-03-30 Thread Alexei Starovoitov
From: Andrey Ignatov Add selftest to work with bpf_sock_addr context from `BPF_PROG_TYPE_CGROUP_SOCK_ADDR` programs. Try to bind(2) on IP:port and apply: * loads to make sure context can be read correctly, including narrow loads (byte, half) for IP and full-size loads (word) for all fields; *

[PATCH v3 bpf-next 9/9] selftests/bpf: Selftest for sys_bind post-hooks.

2018-03-30 Thread Alexei Starovoitov
From: Andrey Ignatov Add selftest for attach types `BPF_CGROUP_INET4_POST_BIND` and `BPF_CGROUP_INET6_POST_BIND`. The main things tested are: * prog load behaves as expected (valid/invalid accesses in prog); * prog attach behaves as expected (load- vs attach-time attach types); * `BPF_CGROUP_INE

[PATCH v3 bpf-next 3/9] bpf: Hooks for sys_bind

2018-03-30 Thread Alexei Starovoitov
From: Andrey Ignatov == The problem == There is a use-case when all processes inside a cgroup should use one single IP address on a host that has multiple IP configured. Those processes should use the IP for both ingress and egress, for TCP and UDP traffic. So TCP/UDP servers should be bound to

[PATCH v3 bpf-next 1/9] bpf: Check attach type at prog load time

2018-03-30 Thread Alexei Starovoitov
From: Andrey Ignatov == The problem == There are use-cases when a program of some type can be attached to multiple attach points and those attach points must have different permissions to access context or to call helpers. E.g. context structure may have fields for both IPv4 and IPv6 but it doe

[PATCH v3 bpf-next 2/9] libbpf: Support expected_attach_type at prog load

2018-03-30 Thread Alexei Starovoitov
From: Andrey Ignatov Support setting `expected_attach_type` at prog load time in both `bpf/bpf.h` and `bpf/libbpf.h`. Since both headers already have API to load programs, new functions are added not to break backward compatibility for existing ones: * `bpf_load_program_xattr()` is added to `bpf

[PATCH v3 bpf-next 5/9] net: Introduce __inet_bind() and __inet6_bind

2018-03-30 Thread Alexei Starovoitov
From: Andrey Ignatov Refactor `bind()` code to make it ready to be called from BPF helper function `bpf_bind()` (will be added soon). Implementation of `inet_bind()` and `inet6_bind()` is separated into `__inet_bind()` and `__inet6_bind()` correspondingly. These functions can be used from both `sk

[PATCH v3 bpf-next 7/9] selftests/bpf: Selftest for sys_connect hooks

2018-03-30 Thread Alexei Starovoitov
From: Andrey Ignatov Add selftest for BPF_CGROUP_INET4_CONNECT and BPF_CGROUP_INET6_CONNECT attach types. Try to connect(2) to specified IP:port and test that: * remote IP:port pair is overridden; * local end of connection is bound to specified IP. All combinations of IPv4/IPv6 and TCP/UDP are

[PATCH v3 bpf-next 8/9] bpf: Post-hooks for sys_bind

2018-03-30 Thread Alexei Starovoitov
From: Andrey Ignatov "Post-hooks" are hooks that are called right before returning from sys_bind. At this time IP and port are already allocated and no further changes to `struct sock` can happen before returning from sys_bind but BPF program has a chance to inspect the socket and change sys_bind

[PATCH v3 bpf-next 6/9] bpf: Hooks for sys_connect

2018-03-30 Thread Alexei Starovoitov
From: Andrey Ignatov == The problem == See description of the problem in the initial patch of this patch set. == The solution == The patch provides a much more reliable in-kernel solution for the 2nd part of the problem: making outgoing connection from desired IP. It adds new attach types `BPF

Re: [PATCH net v4 0/3] ipv6: udp6: set dst cache for a connected sk if current not valid

2018-03-30 Thread Martin KaFai Lau
On Fri, Mar 30, 2018 at 08:53:06PM +0300, Alexey Kodanev wrote: > A new RTF_CACHE route can be created with the socket's dst cache > update between the below calls in udpv6_sendmsg(), when datagram > sending results to ICMPV6_PKT_TOOBIG error: > >dst = ip6_sk_dst_lookup_flow(...) >... > re

[PATCH net-next 00/12] rxrpc: Fixes and more traces

2018-03-30 Thread David Howells
s are tagged here: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git rxrpc-next-20180330 and can also be found on this branch: http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-next David --- David Howells (10): rxrpc

[PATCH net-next 01/12] rxrpc: Fix firewall route keepalive

2018-03-30 Thread David Howells
Fix the firewall route keepalive part of AF_RXRPC which is currently functioning incorrectly by replying to VERSION REPLY packets from the server with VERSION REQUEST packets. Instead, send VERSION REPLY packets to the peers of service connections to act as keep-alives 20s after the latest packet was

[PATCH net-next 02/12] rxrpc: Fix a bit of time confusion

2018-03-30 Thread David Howells
The rxrpc_reduce_call_timer() function should be passed the 'current time' in jiffies, not the current ktime time. It's confusing in rxrpc_resend because that has to deal with both. Pass the correct current time in. Note that this only affects the trace produced and not the functioning of the co

[PATCH net-next 03/12] rxrpc: Fix Tx ring annotation after initial Tx failure

2018-03-30 Thread David Howells
rxrpc calls have a ring of packets that are awaiting ACK or retransmission and a parallel ring of annotations that tracks the state of those packets. If the initial transmission of a packet on the underlying UDP socket fails then the packet annotation is marked for resend - but the setting of this

[PATCH net-next 04/12] rxrpc: Don't treat call aborts as conn aborts

2018-03-30 Thread David Howells
If a call-level abort is received for the previous call to complete on a connection channel, then that abort is queued for the connection processor to handle. Unfortunately, the connection processor then assumes without checking that the abort is connection-level (ie. callNumber is 0) and distribu

[PATCH net-next 07/12] rxrpc: Fix checker warnings and errors

2018-03-30 Thread David Howells
Fix various issues detected by checker. Errors: (*) rxrpc_discard_prealloc() should be using rcu_assign_pointer to set call->socket. Warnings: (*) rxrpc_service_connection_reaper() should be passing NULL rather than 0 to trace_rxrpc_conn() as the where argument. (*) rxrpc_disconne

[PATCH net-next 08/12] rxrpc: Fix potential call vs socket/net destruction race

2018-03-30 Thread David Howells
rxrpc_call structs don't pin sockets or network namespaces, but may attempt to access both after their refcount reaches 0 so that they can detach themselves from the network namespace. However, there's no guarantee that the socket still exists at this point (so sock_net(&call->socket->sk) may be i

[PATCH net-next 05/12] rxrpc: Fix resend event time calculation

2018-03-30 Thread David Howells
From: Marc Dionne Commit a158bdd3 ("rxrpc: Fix call timeouts") reworked the time calculation for the next resend event. For this calculation, "oldest" will be before "now", so ktime_sub(oldest, now) will yield a negative value. When passed to nsecs_to_jiffies which expects an unsigned value, th
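
The signedness problem described above can be reproduced in a few lines of user-space C. This is only an illustration under stated assumptions: ns_to_ticks() is a hypothetical stand-in for a conversion helper that takes an unsigned nanosecond count (here one tick is assumed to be one millisecond), and the clamp in resend_delay_clamped() is a generic guard, not the change made by the patch, whose details are cut off in this listing.

```c
#include <stdint.h>

/* Hypothetical stand-in for a helper that expects an unsigned nanosecond count. */
static uint64_t ns_to_ticks(uint64_t ns)
{
	return ns / 1000000u;	/* assumption: one tick is one millisecond */
}

/* Mirrors the problematic ordering: oldest minus now is negative. */
static uint64_t resend_delay_buggy(int64_t oldest_ns, int64_t now_ns)
{
	int64_t delta = oldest_ns - now_ns;

	/* The negative value wraps to a huge unsigned number on conversion. */
	return ns_to_ticks((uint64_t)delta);
}

/* Generic guard: never hand a negative interval to the unsigned helper. */
static uint64_t resend_delay_clamped(int64_t oldest_ns, int64_t now_ns)
{
	int64_t delta = oldest_ns - now_ns;

	if (delta < 0)
		delta = 0;
	return ns_to_ticks((uint64_t)delta);
}
```

With oldest one microsecond before now, the buggy variant yields a delay on the order of 10^13 ticks, while the clamped variant yields zero.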

[PATCH net-next 06/12] rxrpc: remove unused static variables

2018-03-30 Thread David Howells
From: Sebastian Andrzej Siewior The rxrpc_security_methods and rxrpc_security_sem user has been removed in 648af7fca159 ("rxrpc: Absorb the rxkad security module"). This was noticed by kbuild test robot for the -RT tree but is also true for !RT. Reported-by: kbuild test robot Signed-off-by: Seb

[PATCH net-next 10/12] rxrpc: Fix apparent leak of rxrpc_local objects

2018-03-30 Thread David Howells
rxrpc_local objects cannot be disposed of until all the connections that point to them have been RCU'd as a connection object holds refcount on the local endpoint it is communicating through. Currently, this can cause an assertion failure to occur when a network namespace is destroyed as there's n

[PATCH net-next 11/12] rxrpc: Add a tracepoint to track rxrpc_peer refcounting

2018-03-30 Thread David Howells
Add a tracepoint to track reference counting on the rxrpc_peer struct. Signed-off-by: David Howells --- include/trace/events/rxrpc.h | 42 +++ net/rxrpc/ar-internal.h | 23 +++ net/rxrpc/peer_event.c |2 + net/rxrpc/peer_object.c | 6

[PATCH net-next 09/12] rxrpc: Add a tracepoint to track rxrpc_local refcounting

2018-03-30 Thread David Howells
Add a tracepoint to track reference counting on the rxrpc_local struct. Signed-off-by: David Howells --- include/trace/events/rxrpc.h | 43 net/rxrpc/ar-internal.h | 27 +++-- net/rxrpc/call_accept.c |3 +- net/rxrpc/local_object.c

[PATCH net-next 12/12] rxrpc: Fix leak of rxrpc_peer objects

2018-03-30 Thread David Howells
When a new client call is requested, an rxrpc_conn_parameters struct object is passed in with a bunch of parameters set, such as the local endpoint to use. A pointer to the target peer record is also placed in there by rxrpc_get_client_conn() - and this is removed if and only if a new connection o

[PATCH] mt7601u: phy: mark expected switch fall-through

2018-03-30 Thread Gustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva --- drivers/net/wireless/mediatek/mt7601u/phy.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/wireless/mediatek/mt7601u/phy.c b/drive
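
For readers unfamiliar with the annotation, here is a generic sketch of a marked fall-through. It is not the mt7601u code; the enum and function names are hypothetical. GCC's -Wimplicit-fallthrough, at its default level, accepts a comment such as the one below as a statement that the missing break is intentional.

```c
/* Hypothetical power levels, for illustration only. */
enum power_mode { MODE_LOW, MODE_MID, MODE_HIGH };

/* Returns a bitmask of enabled stages; higher modes include the lower ones. */
static unsigned int enabled_stages(enum power_mode mode)
{
	unsigned int stages = 0;

	switch (mode) {
	case MODE_HIGH:
		stages |= 0x4;
		/* fall through */
	case MODE_MID:
		stages |= 0x2;
		/* fall through */
	case MODE_LOW:
		stages |= 0x1;
		break;
	}
	return stages;
}
```

Without the comments, each case that runs into the next one would be reported once the warning is enabled.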

[PATCH v5 00/14] Report PCI device link status

2018-03-30 Thread Bjorn Helgaas
This is mostly Tal's work to reduce code duplication in drivers and unify the approach for reporting PCIe link speed/width and whether the device is being limited by a slower upstream link. This v5 series is based on Tal's v4 [1]. Changes since v4: - Added patches to replace uses of pcie_get_mi

[PATCH v5 01/14] PCI: Add pcie_get_speed_cap() to find max supported link speed

2018-03-30 Thread Bjorn Helgaas
From: Tal Gilboa Add pcie_get_speed_cap() to find the max link speed supported by a device. Change max_link_speed_show() to use pcie_get_speed_cap(). Signed-off-by: Tal Gilboa [bhelgaas: return speed directly instead of error and *speed, don't export outside drivers/pci] Signed-off-by: Bjorn He

[PATCH v5 03/14] PCI: Add pcie_bandwidth_capable() to compute max supported link bandwidth

2018-03-30 Thread Bjorn Helgaas
From: Tal Gilboa Add pcie_bandwidth_capable() to compute the max link bandwidth supported by a device, based on the max link speed and width, adjusted by the encoding overhead. The maximum bandwidth of the link is computed as: max_link_speed * max_link_width * (1 - encoding_overhead) The enc
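
The formula above can be worked through with integer arithmetic. The sketch below is not the kernel function; max_bandwidth_mbps() and encoding_for() are hypothetical names, speeds are given in MT/s, and the encoding table covers only 2.5 and 5 GT/s (8b/10b) and 8 and 16 GT/s (128b/130b).

```c
/* Encoding expressed as payload bits per total bits on the wire. */
struct encoding {
	unsigned int payload_bits;
	unsigned int total_bits;
};

/* Assumption: only 2.5, 5, 8 and 16 GT/s links are modelled here. */
static struct encoding encoding_for(unsigned int speed_mtps)
{
	if (speed_mtps < 8000)
		return (struct encoding){ 8, 10 };	/* 8b/10b, 20% overhead */
	return (struct encoding){ 128, 130 };		/* 128b/130b, about 1.5% overhead */
}

/* speed * width * (1 - encoding_overhead), in Mb/s, rounded down. */
static unsigned long long max_bandwidth_mbps(unsigned int speed_mtps,
					     unsigned int width)
{
	struct encoding e = encoding_for(speed_mtps);

	return (unsigned long long)speed_mtps * width * e.payload_bits /
	       e.total_bits;
}
```

For example, a 2.5 GT/s x8 link gives 2500 * 8 * 8 / 10 = 16000 Mb/s, and an 8 GT/s x16 link gives 8000 * 16 * 128 / 130, which rounds down to 126030 Mb/s.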

[PATCH v5 02/14] PCI: Add pcie_get_width_cap() to find max supported link width

2018-03-30 Thread Bjorn Helgaas
From: Tal Gilboa Add pcie_get_width_cap() to find the max link width supported by a device. Change max_link_width_show() to use pcie_get_width_cap(). Signed-off-by: Tal Gilboa [bhelgaas: return width directly instead of error and *width, don't export outside drivers/pci] Signed-off-by: Bjorn He

[PATCH v5 04/14] PCI: Add pcie_bandwidth_available() to compute bandwidth available to device

2018-03-30 Thread Bjorn Helgaas
From: Tal Gilboa Add pcie_bandwidth_available() to compute the bandwidth available to a device. This may be limited by the device itself or by a slower upstream link leading to the device. The available bandwidth at each link along the path is computed as: link_speed * link_width * (1 - enco
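
The idea of a device being limited by a slower upstream link amounts to taking the minimum per-link bandwidth along the path. The following is a sketch under assumptions, not the kernel implementation: the path is modelled as a plain array with the device's own link first, and struct link, link_bandwidth_mbps() and available_bandwidth_mbps() are hypothetical names.

```c
#include <stddef.h>

/* One link on the path; the encoding is given as payload bits per total bits. */
struct link {
	unsigned int speed_mtps;
	unsigned int width;
	unsigned int payload_bits;
	unsigned int total_bits;
};

static unsigned long long link_bandwidth_mbps(const struct link *l)
{
	return (unsigned long long)l->speed_mtps * l->width * l->payload_bits /
	       l->total_bits;
}

/*
 * Returns the smallest bandwidth on the path and stores the index of the
 * limiting link in *limiting. Returns 0 for an empty path.
 */
static unsigned long long available_bandwidth_mbps(const struct link *path,
						    size_t n, size_t *limiting)
{
	unsigned long long min_bw = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long long bw = link_bandwidth_mbps(&path[i]);

		if (i == 0 || bw < min_bw) {
			min_bw = bw;
			*limiting = i;
		}
	}
	return min_bw;
}
```

Returning the index of the limiting link alongside the bandwidth lets a caller report which hop is the bottleneck, not just that one exists.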

[PATCH v5 07/14] net/mlx5: Report PCIe link properties with pcie_print_link_status()

2018-03-30 Thread Bjorn Helgaas
From: Tal Gilboa Use pcie_print_link_status() to report PCIe link speed and possible limitations. Signed-off-by: Tal Gilboa [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas Reviewed-by: Tariq Toukan --- drivers/net/ethernet/mellanox/mlx5/core/main.c |4 1 file changed, 4 insertion

[PATCH v5 08/14] net/mlx5e: Use pcie_bandwidth_available() to compute bandwidth

2018-03-30 Thread Bjorn Helgaas
From: Tal Gilboa Use the new pcie_bandwidth_available() function to calculate maximum available bandwidth through the PCI chain instead of computing it ourselves with mlx5e_get_pci_bw(). This is used to detect when the device is capable of more bandwidth than is available in the current slot. Th

[PATCH v5 06/14] net/mlx4_core: Report PCIe link properties with pcie_print_link_status()

2018-03-30 Thread Bjorn Helgaas
From: Tal Gilboa Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. Signed-off-by: Tal Gilboa Signed-off-by: Tariq Toukan [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas --- drivers/net/ethernet/mellanox/mlx4/

[PATCH v5 09/14] bnx2x: Report PCIe link properties with pcie_print_link_status()

2018-03-30 Thread Bjorn Helgaas
From: Bjorn Helgaas Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. Note that pcie_get_minimum_link() can return misleading information because it finds the slowest link and the narrowest link without considering

[PATCH v5 11/14] cxgb4: Report PCIe link properties with pcie_print_link_status()

2018-03-30 Thread Bjorn Helgaas
From: Bjorn Helgaas Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. Note that pcie_get_minimum_link() can return misleading information because it finds the slowest link and the narrowest link without considering

[PATCH v5 13/14] ixgbe: Report PCIe link properties with pcie_print_link_status()

2018-03-30 Thread Bjorn Helgaas
From: Bjorn Helgaas Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. Note that pcie_get_minimum_link() can return misleading information because it finds the slowest link and the narrowest link without considering

[PATCH v5 10/14] bnxt_en: Report PCIe link properties with pcie_print_link_status()

2018-03-30 Thread Bjorn Helgaas
From: Bjorn Helgaas Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. Note that pcie_get_minimum_link() can return misleading information because it finds the slowest link and the narrowest link without considering

[PATCH v5 12/14] fm10k: Report PCIe link properties with pcie_print_link_status()

2018-03-30 Thread Bjorn Helgaas
From: Bjorn Helgaas Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. Note that pcie_get_minimum_link() can return misleading information because it finds the slowest link and the narrowest link without considering

[PATCH v5 14/14] PCI: Remove unused pcie_get_minimum_link()

2018-03-30 Thread Bjorn Helgaas
From: Bjorn Helgaas In some cases pcie_get_minimum_link() returned misleading information because it found the slowest link and the narrowest link without considering the total bandwidth of the link. For example, if the path contained a 16 GT/s x1 link and a 2.5 GT/s x16 link, pcie_get_minimum_l
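
The arithmetic behind this example can be worked through in a short sketch. The code below is illustrative only: struct hop and the three functions are hypothetical names, and independent_min_mbps() merely imitates the "slowest speed, narrowest width" combination the changelog describes; it is not the removed kernel function.

```c
#include <stdint.h>

/* Hypothetical description of one link: raw MT/s per lane and lane count. */
struct hop {
	uint32_t raw_mtps;
	uint32_t width;
};

/* Usable Mb/s of one link: 8b/10b up to 5 GT/s, 128b/130b above that. */
static uint32_t hop_mbps(struct hop h)
{
	uint32_t per_lane = h.raw_mtps <= 5000 ? h.raw_mtps * 8 / 10
					       : h.raw_mtps * 128 / 130;

	return per_lane * h.width;
}

/* Misleading approach: take the slowest speed and narrowest width separately. */
static uint32_t independent_min_mbps(struct hop a, struct hop b)
{
	struct hop worst = {
		a.raw_mtps < b.raw_mtps ? a.raw_mtps : b.raw_mtps,
		a.width < b.width ? a.width : b.width,
	};

	return hop_mbps(worst);
}

/* Per-link approach: compute each link's bandwidth, then take the minimum. */
static uint32_t per_link_min_mbps(struct hop a, struct hop b)
{
	uint32_t x = hop_mbps(a);
	uint32_t y = hop_mbps(b);

	return x < y ? x : y;
}
```

For a 16 GT/s x1 link and a 2.5 GT/s x16 link, independent_min_mbps() combines 2.5 GT/s with x1 and yields 2000 Mb/s, while per_link_min_mbps() yields 15753 Mb/s, the bandwidth of the 16 GT/s x1 link that actually limits the path.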

[PATCH v5 05/14] PCI: Add pcie_print_link_status() to log link speed and whether it's limited

2018-03-30 Thread Bjorn Helgaas
From: Tal Gilboa Add pcie_print_link_status(). This logs the current settings of the link (speed, width, and total available bandwidth). If the device is capable of more bandwidth but is limited by a slower upstream link, we include information about the link that limits the device's performanc

[PATCH] Bluetooth: Mark expected switch fall-throughs

2018-03-30 Thread Gustavo A. R. Silva
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva --- net/bluetooth/mgmt.c| 1 + net/bluetooth/rfcomm/sock.c | 1 + 2 files changed, 2 insertions(+) diff --git a/net/bluetooth/mgmt.c b/net/blue
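
For readers unfamiliar with the annotation, here is a minimal sketch of what marking an expected fall-through looks like. The enum and function are invented for illustration and are not taken from net/bluetooth; the assumption is a GCC 7 or later build with -Wimplicit-fallthrough, where a "fall through" comment placed directly before the next case label suppresses the warning.

```c
/* Hypothetical example; not code from the Bluetooth patch. */
enum level { LEVEL_LOW, LEVEL_MEDIUM, LEVEL_HIGH };

static int feature_flags(enum level lvl)
{
	int flags = 0;

	switch (lvl) {
	case LEVEL_HIGH:
		flags |= 0x4;
		/* fall through */
	case LEVEL_MEDIUM:
		flags |= 0x2;
		/* fall through */
	case LEVEL_LOW:
		flags |= 0x1;
		break;
	}
	return flags;
}
```

Each higher level deliberately accumulates the flags of the levels below it, so the missing break statements are intentional and the comments record that intent for both the compiler and the reader.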

[PATCH net-next] hv_netvsc: Clean up extra parameter from rndis_filter_receive_data()

2018-03-30 Thread Haiyang Zhang
From: Haiyang Zhang The variables, msg and data, have the same value. This patch removes the extra one. Signed-off-by: Haiyang Zhang --- drivers/net/hyperv/rndis_filter.c | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/net/hyperv/rndis_filter.c b/d

Re: [bpf-next PATCH v3 0/4] bpf, sockmap BPF_F_INGRESS support

2018-03-30 Thread Daniel Borkmann
On 03/28/2018 09:49 PM, John Fastabend wrote: > This series adds the BPF_F_INGRESS flag support to the redirect APIs. > Bringing the sockmap API in-line with the cls_bpf redirect APIs. > > We add it to both variants of sockmap programs, the first patch adds > support for tx ulp hooks and the third

Re: [PATCH v2 bpf-next 0/2] sockmap: fix sg api usage

2018-03-30 Thread Daniel Borkmann
On 03/30/2018 05:39 AM, John Fastabend wrote: > On 03/29/2018 05:20 PM, Prashant Bhole wrote: >> These patches fix sg api usage in sockmap. Previously sockmap didn't >> use sg_init_table(), which caused hitting BUG_ON in sg api, when >> CONFIG_DEBUG_SG is enabled >> >> v1: added sg_init_table() cal

[PATCH v2 net-next 12/12] inet: frags: break the 2GB limit for frags storage

2018-03-30 Thread Eric Dumazet
Some users are willing to provision huge amounts of memory to be able to perform reassembly reasonably well under pressure. Current memory tracking is using one atomic_t and integers. Switch to atomic_long_t so that 64bit arches can use more than 2GB, without any cost for 32bit arches. Note tha

[PATCH v2 net-next 10/12] inet: frags: get rif of inet_frag_evicting()

2018-03-30 Thread Eric Dumazet
This refactors ip_expire() since one indentation level is removed. Note: in the future, we should try hard to avoid the skb_clone() since this is a serious performance cost. Under DDOS, the ICMP message won't be sent because of rate limits. Fact that ip6_expire_frag_queue() does not use skb_clone(

[PATCH v2 net-next 11/12] inet: frags: remove inet_frag_maybe_warn_overflow()

2018-03-30 Thread Eric Dumazet
This function is obsolete, after rhashtable addition to inet defrag. Signed-off-by: Eric Dumazet --- include/net/inet_frag.h | 2 -- net/ieee802154/6lowpan/reassembly.c | 5 ++--- net/ipv4/inet_fragment.c| 11 --- net/ipv4/ip_fragment.c

[PATCH v2 net-next 08/12] inet: frags: use rhashtables for reassembly units

2018-03-30 Thread Eric Dumazet
Some applications still rely on IP fragmentation, and to be fair the Linux reassembly unit is not working under any serious load. It uses static hash tables of 1024 buckets, and up to 128 items per bucket (!!!) A work queue is supposed to garbage collect items when host is under memory pressure, and

[PATCH v2 net-next 09/12] inet: frags: remove some helpers

2018-03-30 Thread Eric Dumazet
Remove sum_frag_mem_limit(), ip_frag_mem() & ip6_frag_mem() Also since we use rhashtable we can bring back the number of fragments in "grep FRAG /proc/net/sockstat /proc/net/sockstat6" that was removed in commit 434d305405ab ("inet: frag: don't account number of fragment queues") Signed-off-by: E

[PATCH v2 net-next 04/12] inet: frags: refactor ipv6_frag_init()

2018-03-30 Thread Eric Dumazet
We want to call inet_frags_init() earlier. This is a prereq to "inet: frags: use rhashtables for reassembly units" Signed-off-by: Eric Dumazet --- net/ipv6/reassembly.c | 27 +++ 1 file changed, 15 insertions(+), 12 deletions(-) diff --git a/net/ipv6/reassembly.c b/net/
