[PATCH net-next 10/11] net: Add warning if any lower device is still in adjacency list

2016-10-12 Thread David Ahern
Lower list should be empty just like upper. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/core/dev.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/net/core/dev.c b/net/core/dev.c index 0f9b0985a84c..52e70a3d61a4 100644 --- a/net/core/dev.c +++ b/ne

[PATCH net-next 04/11] IB/core: Flip to the new dev walk API

2016-10-12 Thread David Ahern
Convert rdma_is_upper_dev_rcu, handle_netdev_upper and ipoib_get_net_dev_match_addr to the new upper device walk API. This is just a move to the new API; no functional change is intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/infiniband/core/core_priv.h

[PATCH net-next 01/11] net: Remove refnr arg when inserting link adjacencies

2016-10-12 Thread David Ahern
clear. ie., the refnr arg in 93409033ae65 was only needed for the remove path. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/core/dev.c | 27 --- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index f1fe26f664

Re: [PATCH v2] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-14 Thread David Ahern
On 10/14/16 12:33 AM, Eric Dumazet wrote: > There is a catch here. > TCP moves IP6CB() in a different location. > > Reference : > > 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses") thanks for the reference. > Problem is that the lookup can happen from IP early demux,

Re: [PATCH v2] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-14 Thread David Ahern
On 10/14/16 6:21 AM, David Ahern wrote: >> So you might need to let the caller pass IP6CB(skb)->flags (or >> TCP_SKB_CB(skb)->header.h6.flags ) instead of skb since >> inet6_exact_dif_match() does not know where to fetch the flags. >> >> Same issue for IPv4.

[PATCH net-next 01/11] net: Remove refnr arg when inserting link adjacencies

2016-10-14 Thread David Ahern
clear. ie., the refnr arg in 93409033ae65 was only needed for the remove path. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/core/dev.c | 27 --- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 6498cc2ba8

[PATCH net-next 11/11] net: dev: Improve debug statements for adjacency tracking

2016-10-14 Thread David Ahern
. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/core/dev.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 99a1cb432945..10fd42a833e6 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5567,6 +

[PATCH net-next 02/11] net: Introduce new api for walking upper and lower devices

2016-10-14 Thread David Ahern
. If the callback returns non-0, the walk is terminated and the functions return that code back to callers. v2 - fixed definition of netdev_next_lower_dev_rcu to mirror the upper_dev version. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- include/linux/netdevice.h | 17 + ne
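The callback contract described in this snippet (a non-zero return stops the walk and is handed back to the caller) can be sketched in plain userspace C. This is a toy model, not the kernel implementation: struct toy_dev, toy_walk_all_upper and the two callbacks are hypothetical names invented for illustration, and the sketch ignores RCU and locking entirely.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_UPPER 4

/* Toy stand-in for a net device: only direct upper neighbours are stored. */
struct toy_dev {
	const char *name;
	struct toy_dev *upper[TOY_MAX_UPPER];
	int n_upper;
};

/* Callback type: return 0 to keep walking, non-zero to stop. */
typedef int (*toy_walk_fn)(struct toy_dev *upper, void *data);

/*
 * Depth-first walk over every device above dev. The first non-zero
 * callback result ends the walk and is returned to the caller.
 */
static int toy_walk_all_upper(struct toy_dev *dev, toy_walk_fn fn, void *data)
{
	int i, ret;

	assert(fn != NULL);

	for (i = 0; i < dev->n_upper; i++) {
		struct toy_dev *udev = dev->upper[i];

		/* Visit the direct neighbour first ... */
		ret = fn(udev, data);
		if (ret)
			return ret;

		/* ... then everything stacked above it. */
		ret = toy_walk_all_upper(udev, fn, data);
		if (ret)
			return ret;
	}

	return 0;
}

/* Example callback: stop with 1 once the device passed in data is seen. */
static int toy_match_dev(struct toy_dev *upper, void *data)
{
	return upper == data;
}

/* Example callback: count every visited device and never stop early. */
static int toy_count_dev(struct toy_dev *upper, void *data)
{
	(void)upper;
	(*(int *)data)++;
	return 0;
}
```

With a stack such as eth0 under bond0 under br0, walking from eth0 with toy_match_dev and br0 as data returns 1 as soon as br0 is reached, while toy_count_dev visits both upper devices and the walk returns 0.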

[PATCH net-next 05/11] IB/ipoib: Flip to new dev walk API

2016-10-14 Thread David Ahern
Convert ipoib_get_net_dev_match_addr to the new upper device walk API. This is just a code conversion; no functional change is intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/infiniband/ulp/ipoib/ipoib_main.c | 37 +-- 1 file chang

[PATCH v2 net-next 00/11] net: Fix netdev adjacency tracking

2016-10-14 Thread David Ahern
of netdev_next_lower_dev_rcu to mirror the upper_dev version. David Ahern (11): net: Remove refnr arg when inserting link adjacencies net: Introduce new api for walking upper and lower devices net: bonding: Flip to the new dev walk API IB/core: Flip to the new dev walk API IB/ipoib: Flip to new dev walk

[PATCH net-next 03/11] net: bonding: Flip to the new dev walk API

2016-10-14 Thread David Ahern
Convert alb_send_learning_packets and bond_has_this_ip to use the new netdev_walk_all_upper_dev_rcu API. In both cases this is just a code conversion; no functional change is intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/net/bonding/bond_alb.c

[PATCH net-next 04/11] IB/core: Flip to the new dev walk API

2016-10-14 Thread David Ahern
Convert rdma_is_upper_dev_rcu, handle_netdev_upper and ipoib_get_net_dev_match_addr to the new upper device walk API. This is just a code conversion; no functional change is intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/infiniband/core/core_priv.h

[PATCH net-next 10/11] net: Add warning if any lower device is still in adjacency list

2016-10-14 Thread David Ahern
Lower list should be empty just like upper. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/core/dev.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/net/core/dev.c b/net/core/dev.c index a012c7266230..99a1cb432945 100644 --- a/net/core/dev.c +++ b/ne

[PATCH net-next 09/11] net: Remove all_adj_list and its references

2016-10-14 Thread David Ahern
Only direct adjacencies are maintained. All upper or lower devices can be learned via the new walk API which recursively walks the adj_list for upper devices or lower devices. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- include/linux/netdevice.h | 25 - net/core

[PATCH net-next 07/11] mlxsw: Flip to the new dev walk API

2016-10-14 Thread David Ahern
Convert mlxsw users to new dev walk API. This is just a code conversion; no functional change is intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 37 -- 1 file changed, 23 insertions(+), 14 del

[PATCH net-next 06/11] ixgbe: Flip to the new dev walk API

2016-10-14 Thread David Ahern
Convert ixgbe users to new dev walk API. This is just a code conversion; no functional change is intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 132 -- 1 file changed, 82 insertions(+), 50 del

[PATCH net-next 08/11] rocker: Flip to the new dev walk API

2016-10-14 Thread David Ahern
Convert rocker to the new dev walk API. This is just a code conversion; no functional change is intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/net/ethernet/rocker/rocker_main.c | 31 --- 1 file changed, 24 insertions(+), 7 del

Re: [PATCH net-next 02/11] net: Introduce new api for walking upper and lower devices

2016-10-17 Thread David Ahern
On 10/17/16 6:21 AM, Stephen Hemminger wrote: > > No if/else needed. No cast of void * ptr need. Use const if possible? > so much of the stack does not use const and trying to add it for this API does not work -- the upper or lower device is passed to the callbacks and those callbacks invoke

Re: [PATCH v3 net-next 00/11] net: Fix netdev adjacency tracking

2016-10-18 Thread David Ahern
On 10/18/16 9:46 AM, David Miller wrote: > Series applied, but the recursion is disappointing. > > If we run into problems due to kernel stack depth because of this with > some configurations (reasonable or not, if we allow it then it can't > crash the kernel), we will either need to find a way

[PATCH net-next 09/11] net: Remove all_adj_list and its references

2016-10-17 Thread David Ahern
Only direct adjacencies are maintained. All upper or lower devices can be learned via the new walk API which recursively walks the adj_list for upper devices or lower devices. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- include/linux/netdevice.h | 25 -- net/core

[PATCH v3 net-next 00/11] net: Fix netdev adjacency tracking

2016-10-17 Thread David Ahern
; no functional change is intended. v3 - address Stephen's comment to simplify logic and remove typecasts v2 - fixed bond0 references in cover-letter - fixed definition of netdev_next_lower_dev_rcu to mirror the upper_dev version. David Ahern (11): net: Remove refnr arg when inserting link adjacencies

[PATCH net-next 05/11] IB/ipoib: Flip to new dev walk API

2016-10-17 Thread David Ahern
Convert ipoib_get_net_dev_match_addr to the new upper device walk API. This is just a code conversion; no functional change is intended. v2 - removed typecast of data Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/infiniband/ulp/ipoib/ipoib_main.

[PATCH net-next 07/11] mlxsw: Flip to the new dev walk API

2016-10-17 Thread David Ahern
Convert mlxsw users to new dev walk API. This is just a code conversion; no functional change is intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 37 -- 1 file changed, 23 insertions(+), 14 del

[PATCH net-next 08/11] rocker: Flip to the new dev walk API

2016-10-17 Thread David Ahern
Convert rocker to the new dev walk API. This is just a code conversion; no functional change is intended. v2 - removed typecast of data Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/net/ethernet/rocker/rocker_main.c | 31 --- 1 file chang

[PATCH net-next 11/11] net: dev: Improve debug statements for adjacency tracking

2016-10-17 Thread David Ahern
. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/core/dev.c | 22 +++--- 1 file changed, 15 insertions(+), 7 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index c6bbf310d407..f55fb4536016 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5561,6 +

[PATCH net-next 04/11] IB/core: Flip to the new dev walk API

2016-10-17 Thread David Ahern
Convert rdma_is_upper_dev_rcu, handle_netdev_upper and ipoib_get_net_dev_match_addr to the new upper device walk API. This is just a code conversion; no functional change is intended. v2 - removed typecast of data Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/infiniban

[PATCH net-next 02/11] net: Introduce new api for walking upper and lower devices

2016-10-17 Thread David Ahern
the upper_dev version. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- include/linux/netdevice.h | 17 + net/core/dev.c| 155 ++ 2 files changed, 172 insertions(+) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h

[PATCH net-next 03/11] net: bonding: Flip to the new dev walk API

2016-10-17 Thread David Ahern
Convert alb_send_learning_packets and bond_has_this_ip to use the new netdev_walk_all_upper_dev_rcu API. In both cases this is just a code conversion; no functional change is intended. v2 - removed typecast of data and simplified bond_upper_dev_walk Signed-off-by: David Ahern &l

[PATCH net-next 01/11] net: Remove refnr arg when inserting link adjacencies

2016-10-17 Thread David Ahern
clear. ie., the refnr arg in 93409033ae65 was only needed for the remove path. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/core/dev.c | 27 --- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 352e981296

[PATCH net-next 06/11] ixgbe: Flip to the new dev walk API

2016-10-17 Thread David Ahern
Convert ixgbe users to new dev walk API. This is just a code conversion; no functional change is intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 132 -- 1 file changed, 82 insertions(+), 50 del

[PATCH net-next 10/11] net: Add warning if any lower device is still in adjacency list

2016-10-17 Thread David Ahern
Lower list should be empty just like upper. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/core/dev.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/net/core/dev.c b/net/core/dev.c index a9fe14908b44..c6bbf310d407 100644 --- a/net/core/dev.c +++ b/ne

[PATCH v2] net: ipv6: Fix processing of RAs in presence of VRF

2016-10-24 Thread David Ahern
denoting if it has a default route via RA. Fixes: ca254490c8dfd ("net: Add VRF support to IPv6 stack") Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- v2 - added Fixes to commit message include/net/ip6_fib.h | 2 ++ net/ipv6/r

[PATCH] net: ipv6: Fix processing of RAs in presence of VRF

2016-10-24 Thread David Ahern
denoting if it has a default route via RA. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- include/net/ip6_fib.h | 2 ++ net/ipv6/route.c | 68 --- 2 files changed, 50 insertions(+), 20 deletions(-) diff --git a/include/net/ip6

[PATCH] net: ipv6: Do not consider link state for nexthop validation

2016-10-24 Thread David Ahern
"net: ipv6: Use passed in table for nexthop lookups") Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- include/net/ip6_route.h | 1 + net/ipv6/route.c| 6 -- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/include/net/ip6_route.h b/include/net/i

Re: [PATCH v6 0/6] Add eBPF hooks for cgroups

2016-10-20 Thread David Ahern
On 9/19/16 10:43 AM, Daniel Mack wrote: > This is v6 of the patch set to allow eBPF programs for network > filtering and accounting to be attached to cgroups, so that they apply > to all sockets of all tasks placed in that cgroup. The logic also > allows to be extended for other cgroup based

Re: [PATCH net] net: core: Correctly iterate over lower adjacency list

2016-10-19 Thread David Ahern
;dev; > } > EXPORT_SYMBOL(netdev_all_lower_get_next_rcu); When I converted this function in my series I wondered how the current code worked at all. I guess I didn't. This is inline with what I did and matches the form used for the all_upper variant, so for 4.9 and 4.8.x Acked-by: David Ahern <d...@cumulusnetworks.com> I would like to see my series applied to 4.9 at some point.

Re: Source address fib invalidation on IPv6

2016-11-12 Thread David Ahern
On 11/12/16 8:40 AM, Jason A. Donenfeld wrote: > Hi again, > > I've done some pretty in depth debugging now to determine exactly what > the behavior of ipv6_stub->ipv6_dst_lookup is. First I'll start with > ip_route_output_flow, which I believe to be well behaved, and then > I'll show

Re: [PATCH v3] ip6_output: ensure flow saddr actually belongs to device

2016-11-14 Thread David Ahern
easy to use the same error handlers for both cases. >> >> Signed-off-by: Jason A. Donenfeld <ja...@zx2c4.com> >> Cc: David Ahern <d...@cumulusnetworks.com> >> --- >> Changes from v2: >> It turns out ipv6_chk_addr already has the device e

Re: [PATCH v3] ip6_output: ensure flow saddr actually belongs to device

2016-11-14 Thread David Ahern
feld <ja...@zx2c4.com> > Cc: David Ahern <d...@cumulusnetworks.com> > --- > Changes from v2: > It turns out ipv6_chk_addr already has the device enumeration > logic that we need by simply passing NULL. > > net/ipv6/ip6_output.c | 4 > 1 file changed

Re: [PATCH v3] ip6_output: ensure flow saddr actually belongs to device

2016-11-14 Thread David Ahern
On 11/14/16 10:33 AM, Hannes Frederic Sowa wrote: > I just also quickly read up on the history (sorry was travelling last > week) and wonder if you ever saw a user space facing bug or if this is > basically some difference you saw while writing out of tree code? I checked the

Re: [PATCH v3] ip6_output: ensure flow saddr actually belongs to device

2016-11-14 Thread David Ahern
On 11/14/16 10:04 AM, Hannes Frederic Sowa wrote: > On 14.11.2016 17:55, David Ahern wrote: >> On 11/14/16 9:44 AM, Hannes Frederic Sowa wrote: >>> On Mon, Nov 14, 2016, at 00:28, Jason A. Donenfeld wrote: >>>> This puts the IPv6 routing functions in parity with t

Re: net/icmp: null-ptr-deref in icmp6_send

2016-11-22 Thread David Ahern
On 11/22/16 1:11 PM, Cong Wang wrote: > I have no idea what commit 5d41ce29e tried to fix, but we already > use skb->dev a few lines before l3mdev_master_ifindex(), so I don't > understand why skb->dev could be NULL, maybe just for vrf dev? skb->dev can be null depending on when icmp6_send /

Re: [net,v2] neigh: fix the loop index error in neigh dump

2016-11-27 Thread David Ahern
On 11/27/16 6:32 PM, Zhang Shengju wrote: > Loop index in neigh dump function is not updated correctly under some > circumstances, this patch will fix it. What's an example? > > Fixes: 16660f0bd9 ("net: Add support for filtering neigh dump by device > index") > Fixes: 21fdd092ac ("net: Add

[PATCH net-next v3 2/3] bpf: Add new cgroup attach type to enable sock modifications

2016-11-28 Thread David Ahern
program. This allows a cgroup to be configured such that AF_INET{6} sockets opened by processes are automatically bound to a specific device. In turn, this enables the running of programs that do not support SO_BINDTODEVICE in a specific VRF context / L3 domain. Signed-off-by: David Ahern &l

[PATCH net-next v3 1/3] bpf: Refactor cgroups code in prep for new type

2016-11-28 Thread David Ahern
Code move only; no functional change intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- v3 - dropped the rename v2 - fix bpf_prog_run_clear_cb to bpf_prog_run_save_cb as caught by Daniel - rename BPF_PROG_TYPE_CGROUP_SKB and its cg_skb functions to BPF_PROG_TYPE_

[PATCH net-next v3 3/3] samples: bpf: add userspace example for modifying sk_bound_dev_if

2016-11-28 Thread David Ahern
Add a simple program to demonstrate the ability to attach a bpf program to a cgroup that sets sk_bound_dev_if for AF_INET{6} sockets when they are created. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- v3 - revert to BPF_PROG_TYPE_CGROUP_SOCK prog type v2 - removed bpf_sock_sto

[PATCH net-next v3 0/3] net: Add bpf support to set sk_bound_dev_if

2016-11-28 Thread David Ahern
p 'dev == '), including processes not running as root. This capability enables running any program in a VRF context and is key to deploying Management VRF, a fundamental configuration for networking gear, with any Linux OS installation. David Ahern (3): bpf: Refactor cgroups code in prep for new type

Re: [PATCH net-next v3 3/3] samples: bpf: add userspace example for modifying sk_bound_dev_if

2016-11-28 Thread David Ahern
On 11/28/16 1:37 PM, Alexei Starovoitov wrote: > On Mon, Nov 28, 2016 at 07:48:50AM -0800, David Ahern wrote: >> Add a simple program to demonstrate the ability to attach a bpf program >> to a cgroup that sets sk_bound_dev_if for AF_INET{6} sockets when they >> are crea

Re: [PATCH net-next v3 1/3] bpf: Refactor cgroups code in prep for new type

2016-11-28 Thread David Ahern
On 11/28/16 1:06 PM, Alexei Starovoitov wrote: > On Mon, Nov 28, 2016 at 07:48:48AM -0800, David Ahern wrote: >> Code move only; no functional change intended. > > not quite... > >> Signed-off-by: David Ahern <d...@cumulusnetworks.com> > ... >> * @sk:

Re: [PATCH net-next v3 2/3] bpf: Add new cgroup attach type to enable sock modifications

2016-11-28 Thread David Ahern
On 11/28/16 1:32 PM, Alexei Starovoitov wrote: > On Mon, Nov 28, 2016 at 07:48:49AM -0800, David Ahern wrote: >> Add new cgroup based program type, BPF_PROG_TYPE_CGROUP_SOCK. Similar to >> BPF_PROG_TYPE_CGROUP_SKB programs can be attached to a cgroup and run >> any time a

[PATCH net-next v4 1/3] bpf: Refactor cgroups code in prep for new type

2016-11-28 Thread David Ahern
Code move and rename only; no functional change intended. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- v4 - dropped refactor of __cgroup_bpf_run_filter and renamed it to __cgroup_bpf_run_filter_skb v3 - dropped the rename v2 - fix bpf_prog_run_clear_cb to bpf_prog_run_s

[PATCH net-next v4 0/3] net: Add bpf support to set sk_bound_dev_if

2016-11-28 Thread David Ahern
p 'dev == '), including processes not running as root. This capability enables running any program in a VRF context and is key to deploying Management VRF, a fundamental configuration for networking gear, with any Linux OS installation. David Ahern (3): bpf: Refactor cgroups code in prep for new type

[PATCH net-next v4 2/3] bpf: Add new cgroup attach type to enable sock modifications

2016-11-28 Thread David Ahern
program. This allows a cgroup to be configured such that AF_INET{6} sockets opened by processes are automatically bound to a specific device. In turn, this enables the running of programs that do not support SO_BINDTODEVICE in a specific VRF context / L3 domain. Signed-off-by: David Ahern &l

[PATCH net-next v4 3/3] samples: bpf: add userspace example for modifying sk_bound_dev_if

2016-11-28 Thread David Ahern
Add a simple program to demonstrate the ability to attach a bpf program to a cgroup that sets sk_bound_dev_if for AF_INET{6} sockets when they are created. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- v4 - added test_cgrp2_sock.sh for an automated test v3 -

[PATCH] net: handle no dst on skb in icmp6_send

2016-11-27 Thread David Ahern
ers the case reported here where icmp6_send is invoked on Rx before the route lookup. Fixes: 5d41ce29e ("net: icmp6_send should use dst dev to determine L3 domain") Reported-by: Andrey Konovalov <andreyk...@google.com> Signed-off-by: David Ahern <d...@cumulusnetworks.com>

Re: [net,v2] neigh: fix the loop index error in neigh dump

2016-11-27 Thread David Ahern
On 11/27/16 7:56 PM, David Ahern wrote: > On 11/27/16 7:53 PM, 张胜举 wrote: >> >> >>> -Original Message----- >>> From: David Ahern [mailto:d...@cumulusnetworks.com] >>> Sent: Monday, November 28, 2016 10:39 AM >>> To: 张胜举 <zhangshe

[PATCH net-next] bpf: samples: Fix compile of test_lru_dist.c

2016-11-27 Thread David Ahern
.git/samples/bpf/test_lru_dist.c:490:16: error: storage size of ‘r’ isn’t known struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY}; Add sys/resource.h to the include list Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab") Signed-off-by: David Ahern <d...@cumulusnetworks.co

Re: [net,v2] neigh: fix the loop index error in neigh dump

2016-11-27 Thread David Ahern
On 11/27/16 7:53 PM, 张胜举 wrote: > > >> -Original Message----- >> From: David Ahern [mailto:d...@cumulusnetworks.com] >> Sent: Monday, November 28, 2016 10:39 AM >> To: 张胜举 <zhangshen...@cmss.chinamobile.com>; >> netdev@vger.kernel.org >> Subje

Re: [net,v2] neigh: fix the loop index error in neigh dump

2016-11-27 Thread David Ahern
On 11/27/16 9:50 PM, 张胜举 wrote: > No, when dump request must be processed by multiple 'recv/recvmsg' system > calls, > idx stores which dev/neigh the previous call have processed, so that next > call will scan > from the right place. I have tested multiple calls and I do not see redundant
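The resume mechanism described in the quoted text (a saved index tells the next call where to pick up) can be illustrated with a small userspace sketch. It is an assumption-laden toy, not the neigh dump code: toy_dump, struct toy_dump_state and the integer entries are hypothetical, and the real dump keeps its position in netlink callback state rather than a struct like this.

```c
#include <assert.h>
#include <stddef.h>

/* Position saved between calls, analogous to the start index of a dump. */
struct toy_dump_state {
	int s_idx;
};

/*
 * Copy entries into out[] until room is exhausted, skipping everything
 * an earlier call already delivered. Returns the number copied; 0 means
 * the dump is complete.
 */
static int toy_dump(const int *entries, int n, int *out, int room,
		    struct toy_dump_state *st)
{
	int idx, copied = 0;

	assert(st != NULL);

	for (idx = 0; idx < n; idx++) {
		if (idx < st->s_idx)
			continue;	/* sent by a previous call */
		if (copied == room)
			break;		/* buffer full: resume here next time */
		out[copied++] = entries[idx];
	}

	/* Record the first index that has not been delivered yet. */
	st->s_idx = idx;
	return copied;
}
```

The point of the sketch is that the saved index must name the first undelivered entry. If it were left one short, the next call would repeat an entry; if it ran one ahead, an entry would be skipped.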

Re: [net,v2] neigh: fix the loop index error in neigh dump

2016-11-27 Thread David Ahern
On 11/27/16 7:34 PM, 张胜举 wrote: >> -Original Message- >> From: David Ahern [mailto:d...@cumulusnetworks.com] >> Sent: Monday, November 28, 2016 10:10 AM >> To: Zhang Shengju <zhangshen...@cmss.chinamobile.com>; >> netdev@vger.kernel.org >> Subje

Re: net/icmp: null-ptr-deref in icmp6_send

2016-11-22 Thread David Ahern
Sent from my iPhone > On Nov 22, 2016, at 1:11 PM, Cong Wang wrote: > >> On Tue, Nov 22, 2016 at 2:23 AM, Andrey Konovalov >> wrote: >> Hi, >> >> I've got the following error report while fuzzing the kernel with syzkaller. >> >> It seems

Re: [PATCH v3] ip6_output: ensure flow saddr actually belongs to device

2016-11-15 Thread David Ahern
On 11/15/16 7:45 AM, Hannes Frederic Sowa wrote: >> @@ -1012,6 +1013,16 @@ static int ip6_dst_lookup_tail(struct net *net, >> const struct sock *sk, >> } >> #endif >> >> +addr_type = ipv6_addr_type(>saddr); >> +if (addr_type == IPv6_ADDR_ANY) >> +return

Re: [PATCH] ip6_output: ensure flow saddr actually belongs to device

2016-11-13 Thread David Ahern
feld <ja...@zx2c4.com> > Cc: David Ahern <d...@cumulusnetworks.com> > --- > net/ipv6/ip6_output.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c > index 6001e78..a834129 100644 > --- a/net/ipv6/i

Re: Source address fib invalidation on IPv6

2016-11-11 Thread David Ahern
On 11/11/16 12:29 PM, Jason A. Donenfeld wrote: > Hi folks, > > If I'm replying to a UDP packet, I generally want to use a source > address that's the same as the destination address of the packet to > which I'm replying. For example: > > Peer A sends packet: src = 10.0.0.1, dst = 10.0.0.3 >

Re: [PATCH] ip6_output: ensure flow saddr actually belongs to device

2016-11-13 Thread David Ahern
On 11/13/16 1:19 PM, Jason A. Donenfeld wrote: > I gave v2 my best shot. Hopefully it's adequate, but I have a feeling > it might be best for you to just code up what you have in mind. nah, you are doing fine. one more comment on v2.

Re: [PATCH v2] ip6_output: ensure flow saddr actually belongs to device

2016-11-13 Thread David Ahern
Donenfeld <ja...@zx2c4.com> > Cc: David Ahern <d...@cumulusnetworks.com> > --- > Changes from v1: >This moves the check to the top and now sees if it's a valid address >on _any_ device, not just the one in dst. > > include/net/ipv6.h| 2 ++ > net

Re: [PATCH v2 net-next 1/5] bpf: Refactor cgroups code in prep for new type

2016-11-13 Thread David Ahern
On 10/31/16 11:49 AM, Thomas Graf wrote: > On 10/31/16 at 06:16pm, Daniel Mack wrote: >> On 10/31/2016 06:05 PM, David Ahern wrote: >>> On 10/31/16 11:00 AM, Daniel Mack wrote: >>>> Yeah, I'm confused too. I changed that name in my v7 from

Re: [net-next] rtnl: fix the loop index update error in rtnl_dump_ifinfo()

2016-11-19 Thread David Ahern
goto cont; > if (idx < s_idx) > goto cont; > err = rtnl_fill_ifinfo(skb, dev, RTM_NEWLINK, > Fixes: dc599f76c22b0 ("net: Add support for filtering link dump by master device and kind") Acked-by: David Ahern <d...@cumulusnetworks.com>

[PATCH net-next] net: Enable support for VRF with ipv4 multicast

2016-10-31 Thread David Ahern
pmr forwarding and local rx/tx work. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- drivers/net/vrf.c | 23 ++- net/ipv4/ipmr.c | 13 - net/ipv4/route.c | 41 ++--- 3 files changed, 56 insertions(+), 21

[PATCH] net: tcp: check skb is non-NULL for exact match on lookups

2016-11-02 Thread David Ahern
orted-by: Andrey Konovalov <andreyk...@google.com> Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- Dave: commit a04a480d4392 was queued for stable, so this needs to follow it. include/linux/ipv6.h | 2 +- include/net/tcp.h| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)

Re: [PATCH net-next 0/3] tools lib bpf: Synchronize implementations

2016-11-01 Thread David Ahern
On 10/31/16 12:39 PM, Joe Stringer wrote: > Update tools/lib/bpf to provide more functionality and improve interoperation > with other tools that generate and use eBPF code. > > The kernel uapi headers are a bit newer than the version in the tools/ > directory; synchronize those. > >

Re: [PATCH net-next v2 3/5] bpf: BPF for lightweight tunnel encapsulation

2016-11-01 Thread David Ahern
On 10/31/16 6:37 PM, Thomas Graf wrote: > Register two new BPF prog types BPF_PROG_TYPE_LWT_IN and > BPF_PROG_TYPE_LWT_OUT which are invoked if a route contains a > LWT redirection of type LWTUNNEL_ENCAP_BPF. > > The separate program types are required because manipulation of > packet data is

Re: [PATCH v2 net-next 1/5] bpf: Refactor cgroups code in prep for new type

2016-10-31 Thread David Ahern
On 10/31/16 11:00 AM, Daniel Mack wrote: > On 10/31/2016 05:58 PM, David Miller wrote: >> From: David Ahern <d...@cumulusnetworks.com> >> Date: Wed, 26 Oct 2016 17:58:38 -0700 >> >>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h >>

Re: [PATCH v2 net-next 0/5] Add bpf support to set sk_bound_dev_if

2016-10-31 Thread David Ahern
On 10/31/16 11:01 AM, David Miller wrote: > From: David Ahern <d...@cumulusnetworks.com> > Date: Wed, 26 Oct 2016 17:58:37 -0700 > >> The recently added VRF support in Linux leverages the bind-to-device >> API for programs to specify an L3 domain for a socket. While

Re: [PATCH v7 0/6] Add eBPF hooks for cgroups

2016-10-28 Thread David Ahern
On 10/28/16 5:28 AM, Pablo Neira Ayuso wrote: > I saw those, I would really like to have a closer look at David > Ahern's usecase since that skb iif mangling looks kludgy to me, and > given this is exposing a new helper for general use, not only vrf, it > would be good to make sure helpers provide

[PATCH net-next] net: Update raw socket bind to consider l3 domain

2016-11-03 Thread David Ahern
to lookup the address. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/ipv4/raw.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index 6a0bd68a565b..9ef2a602f052 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4

[PATCH] net: icmp_route_lookup should use rt dev to determine L3 domain

2016-11-03 Thread David Ahern
") Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/ipv4/icmp.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 38abe70e595f..774a15e9f041 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -477,7 +477,7 @@ s

[PATCH net-next] netfilter: Update ip_route_me_harder to consider L3 domain

2016-11-03 Thread David Ahern
to pull the L3 domain from the dst currently attached to the skb directs both lookups to the correct table. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- Pablo: from a code review it seems ip_route_me_harder is only called in the output path and after skb_dst is set. ne

Re: [PATCH net-next] netfilter: Update ip_route_me_harder to consider L3 domain

2016-11-03 Thread David Ahern
On 11/3/16 11:43 AM, David Ahern wrote: > ip_route_me_harder is not considering the L3 domain and sending lookups > to the wrong table. For example consider the following output rule: > > iptables -I OUTPUT -p tcp --dport 12345 -j REJECT --reject-with tcp-reset > > using perf

[PATCH] net: icmp6_send should use dst dev to determine L3 domain

2016-11-03 Thread David Ahern
icmp6_send is called in response to some event. The skb may not have the device set (skb->dev is NULL), but it is expected to have a dst set. Update icmp6_send to use the dst on the skb to determine L3 domain. Fixes: ca254490c8dfd ("net: Add VRF support to IPv6 stack") Signed-off-by

[PATCH net-next] netfilter: Update nf_send_reset6 to consider L3 domain

2016-11-03 Thread David Ahern
([kernel.kallsyms]) 528 nf_send_reset6 ([nf_reject_ipv6]) Update nf_send_reset6 to pull the L3 domain from the dst currently attached to the skb. Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- net/ipv6/netfilter/nf_reject_ipv6.c | 1 + 1 file changed, 1 ins

Re: [PATCH] net: tcp: check skb is non-NULL for exact match on lookups

2016-11-02 Thread David Ahern
On 11/2/16 2:13 PM, Andrey Konovalov wrote: > I can confirm that this fixes the null-ptr-deref I've been getting. > Thanks, Andrey.

Re: [PATCH net-next iproute2 PATCH 2/2 v2] ss: Add inet raw sockets information gathering via netlink diag interface

2016-11-02 Thread David Ahern
lude/linux/inet_diag.h | 15 +++ > misc/ss.c | 20 ++-- > 2 files changed, 33 insertions(+), 2 deletions(-) worked for me. Acked-by: David Ahern <d...@cumulusnetworks.com>

Re: [patch net-next 2/2] [PATCH] net: ip, raw_diag -- Use jump for exiting from nested loop

2016-11-02 Thread David Ahern
On 11/2/16 6:36 AM, Cyrill Gorcunov wrote: > I managed to miss that sk_for_each is called under "for" > cycle so need to use goto here to return matching socket. > > CC: David S. Miller <da...@davemloft.net> > CC: Eric Dumazet <eric.duma...@gmail.com> > C

Re: [patch net-next 1/2] [PATCH] net: ip, raw_diag -- Fix socket leaking for destroy request

2016-11-02 Thread David Ahern
On 11/2/16 6:36 AM, Cyrill Gorcunov wrote: > In raw_diag_destroy the helper raw_sock_get returns > with sock_hold call, so we have to put it then. > > CC: David S. Miller <da...@davemloft.net> > CC: Eric Dumazet <eric.duma...@gmail.com> > CC: David Ahern <d...@

Re: [patch net-next 0/2] Fixes for raw diag sockets handling

2016-11-02 Thread David Ahern
On 11/2/16 9:29 AM, Cyrill Gorcunov wrote: > On Wed, Nov 02, 2016 at 09:10:32AM -0600, David Ahern wrote: >>> @__dif != 0 the match may return socket where sk_bound_dev_if = 0 >>> instead of completely matching one. Isn't it? >> >> yes. I recently added an exa

Re: [patch net-next 0/2] Fixes for raw diag sockets handling

2016-11-02 Thread David Ahern
On 11/2/16 6:36 AM, Cyrill Gorcunov wrote: > Also I have a question about sockets lookup not for raw diag only > (though I didn't modify lookup procedure) but in general: the structure > inet_diag_req_v2 has inet_diag_sockid::idiag_if member which supposed to > carry interface index from userspace

Re: [PATCH net-next 2/3] bpf: Add new cgroups prog type to enable sock modifications

2016-10-26 Thread David Ahern
On 10/26/16 2:31 PM, Mahesh Bandewar (महेश बंडेवार) wrote: > The hook insertion in sk_alloc() may not solve all control-path checks as not > much can be done (probably apart for changing sk_bound_dev_if) during > allocation but hooks in bind(), listen(), setsockopt() etc. (not a complete >

Re: RFH: problems with adjacency graph

2016-10-11 Thread David Ahern
On 10/11/16 12:54 AM, Jiri Pirko wrote: >> >> It seems like the complete mesh is not really needed, but cscope shows >> spectrum, ixgbe and bonding all using the for_each upper and lower device >> macros. >> >> Suggestions? > > Well other possibility is to traverse the tree recursively. But
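The recursive alternative mentioned here can be sketched as follows, assuming each device stores only its immediate upper neighbours rather than the complete mesh. `struct node`, `walk_upper` and `count_cb` are hypothetical names for illustration and do not reflect the kernel's adjacency structures or walk API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UPPER 4

/* Hypothetical device node that records only its direct upper devices. */
struct node {
	const char *name;
	struct node *upper[MAX_UPPER];
	size_t n_upper;
};

/* Visit every device above dev, depth first. The callback can stop the
 * walk early by returning non-zero, which is passed back to the caller. */
static int walk_upper(struct node *dev,
		      int (*fn)(struct node *upper, void *data), void *data)
{
	size_t i;
	int ret;

	assert(dev != NULL);

	for (i = 0; i < dev->n_upper; i++) {
		ret = fn(dev->upper[i], data);
		if (ret)
			return ret;
		ret = walk_upper(dev->upper[i], fn, data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count the devices visited. */
static int count_cb(struct node *upper, void *data)
{
	(void)upper;
	(*(int *)data)++;
	return 0;
}
```

With a chain such as eth0 under bond0 under br0, walking from eth0 visits bond0 and then br0, without any device needing to list its indirect neighbours.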

Re: [PATCH net-next 00/11] net: Fix netdev adjacency tracking

2016-10-13 Thread David Ahern
On 10/13/16 1:34 AM, Jiri Pirko wrote: > > Although I didn't like the "all-list" idea when Veaceslav pushed it > because it looked to me like a big hammer, it turned out to be very handy > and quick for traversing neighbours. Why it cannot be fixed? > > The walks with possibly hundreds of

Re: [PATCH v6] net: ip, diag -- Add diag interface for raw sockets

2016-10-13 Thread David Ahern
On 10/13/16 1:16 AM, Cyrill Gorcunov wrote: > On Wed, Oct 12, 2016 at 07:55:04PM -0400, David Miller wrote: >> From: Cyrill Gorcunov >> Date: Wed, 12 Oct 2016 09:53:29 +0300 >> >>> I can't rename the field, nor can I use a union. >> >> Remind me again what is wrong with

[PATCH] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-13 Thread David Ahern
t;net: Introduce VRF device driver") Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- include/net/inet_sock.h | 10 ++ net/ipv4/inet_hashtables.c | 7 --- net/ipv6/inet6_hashtables.c | 7 --- 3 files changed, 18 insertions(+), 6 deletions(-) diff --git a/include/n

Re: [PATCH] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-13 Thread David Ahern
On 10/13/16 3:29 PM, Eric Dumazet wrote: > Since netif_index_is_l3_master() is not cheap, can you reorder the > test ? > > if (!net->ipv4.sysctl_tcp_l3mdev_accept) > return netif_index_is_l3_master(net, dif); sure. Since this use case is called under rcu_read_lock I can make a
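The reordering Eric suggests puts the cheap flag test first so that the costly lookup runs only when its result can matter. The sketch below models that idea outside the kernel. `struct cfg`, `index_is_master` and `exact_match_required` are hypothetical stand-ins for the sysctl, `netif_index_is_l3_master()` and the match helper, and the fall-through `return false` is an assumption, since the quoted snippet does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-namespace sysctl. */
struct cfg {
	bool accept_any;
};

/* Counts how often the expensive lookup actually runs. */
static int expensive_calls;

/* Stand-in for the costly device lookup; ifindex 10 plays the
 * role of an l3 master device in this sketch. */
static bool index_is_master(int ifindex)
{
	expensive_calls++;
	return ifindex == 10;
}

/* Cheap test first: when the flag is set, the lookup is skipped. */
static bool exact_match_required(const struct cfg *cfg, int dif)
{
	assert(cfg != NULL);

	if (!cfg->accept_any)
		return index_is_master(dif);
	return false; /* assumed fall-through, not shown in the quoted mail */
}
```

With `accept_any` set, the counter stays at zero, which is the saving the reordering is after.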

[PATCH v2] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-13 Thread David Ahern
is expanded to u16 without increasing the size of the struct. This is needed to add another flag. Fixes: 193125dbd8eb ("net: Introduce VRF device driver") Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- v2 - reordered the checks in inet_exact_dif_match per Eric's comment - ch

[PATCH] net: ipv4: Do not drop to make_route if oif is l3mdev

2016-10-12 Thread David Ahern
er. Fixes: e0d56fdd7342 ("net: l3mdev: remove redundant calls") Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- include/net/l3mdev.h | 24 net/ipv4/route.c | 3 ++- 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/include/

Re: [PATCH v3] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-15 Thread David Ahern
On 10/15/16 3:46 PM, David Miller wrote: > From: David Ahern <d...@cumulusnetworks.com> > Date: Fri, 14 Oct 2016 12:29:19 -0700 > >> +/* can not be used in TCP layer after tcp_v6_fill_cb */ >> +static inline bool inet6_exact_dif_match(struct net *n

[PATCH] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-16 Thread David Ahern
following the flags, so it can be expanded to u16 without increasing the size of the struct. Fixes: 193125dbd8eb ("net: Introduce VRF device driver") Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- v4 - renamed existing skb_l3mdev_slave to ipv6_l3mdev_skb - renamed ipv4 version

[PATCH v3] net: Require exact match for TCP socket lookups if dif is l3mdev

2016-10-14 Thread David Ahern
following the flags, so it can be expanded to u16 without increasing the size of the struct. Fixes: 193125dbd8eb ("net: Introduce VRF device driver") Signed-off-by: David Ahern <d...@cumulusnetworks.com> --- v3 - changed the match functions to pull the skb flag from TCP_SKB_CB

Re: [PATCH net-next v5 2/3] bpf: Add new cgroup attach type to enable sock modifications

2016-11-29 Thread David Ahern
On 11/29/16 1:01 PM, Alexei Starovoitov wrote: > Could you also expose sk_protocol and sk_type as read only fields? Those are bitfields in struct sock, so can't use offsetof or sizeof. Any existing use cases that try to load a bitfield in a bpf that I can look at?
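The limitation raised in this reply is a property of C itself: `offsetof` and `sizeof` cannot be applied to a bit-field member. One possible workaround, shown here only as a sketch under stated assumptions, is to overlay the bit-fields with a plain word, take the offset of that word, and extract the field with a mask and shift. `union flags`, `struct sample_sock`, `field_mask_type` and `load_type` are hypothetical; this is not the layout of `struct sock` and is not presented as the approach the thread settled on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit-fields overlaid with a plain 32-bit word (assumes a 32-bit
 * unsigned int, which holds on common platforms). */
union flags {
	uint32_t word;
	struct {
		unsigned int type : 16;
		unsigned int protocol : 8;
		unsigned int pad : 8;
	} bits;
};

/* Hypothetical structure; not the real struct sock. */
struct sample_sock {
	int bound_dev_if;
	union flags f;
};

/* offsetof(struct sample_sock, f.bits.type) would be rejected by the
 * compiler, but offsetof(struct sample_sock, f) is an ordinary member. */

/* Discover which bits of the word hold "type" on this compiler by
 * setting every bit of the field in an otherwise zeroed word. */
static uint32_t field_mask_type(void)
{
	union flags f = { .word = 0 };

	f.bits.type = 0xffff;
	return f.word;
}

/* Load "type" through an offset, a word-sized read, a mask and a shift. */
static unsigned int load_type(const void *base)
{
	uint32_t word;
	uint32_t mask = field_mask_type();
	unsigned int shift = 0;

	assert(mask != 0);
	memcpy(&word, (const char *)base + offsetof(struct sample_sock, f),
	       sizeof(word));
	while (!((mask >> shift) & 1))
		shift++;
	return (word & mask) >> shift;
}
```

Because bit-field placement is implementation-defined, the mask is computed at run time rather than hard-coded.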

Re: [PATCH net-next v5 2/3] bpf: Add new cgroup attach type to enable sock modifications

2016-11-29 Thread David Ahern
On 11/29/16 1:01 PM, Alexei Starovoitov wrote: > Could you also expose sk_protocol and sk_type as read only fields? > They have user space visible values already and will make this new > BPF_PROG_TYPE_CGROUP_SOCK program type much more useful beyond vrf > use case. Like we'll be able to write a
