Re: [PATCH] net: make tc-police action MTU behavior match documentation

2018-02-28 Thread Andrew Collins
From: Cong Wang 
> If this is just an iproute2 issue, please fix it there rather in kernel.

I'm fine with fixing this as a documentation issue (the fix being to change
the man page's MTU description from "Defaults to unlimited" to "Defaults to
2047").

Note, however, that this means tc-police will keep its (IMO) rather
surprising default behavior of dropping coalesced packets on a GRO-enabled
interface.

As mentioned in the commit message, segmenting GRO-ed skbs is a better
long-term solution. Do you have any suggestions for how an action might
segment an skb the way tc-tbf does? I didn't see an obvious method, since
qdiscs have the advantage of an underlying queue whereas actions don't, but
perhaps I missed something obvious.

Re: [PATCH] net: make tc-police action MTU behavior match documentation

2018-02-27 Thread Andrew Collins
> I don't find such statement from the man page:
> http://man7.org/linux/man-pages/man8/tc-police.8.html


> What am I missing?
Under MTU the man page states:

mtu BYTES[/BYTES]
       This is the maximum packet size handled by the policer (larger
       ones will be handled like they exceeded the configured rate).
       Setting this value correctly will improve the scheduler's
       precision. Value formatting is identical to burst above.
       ->Defaults to unlimited<-.

Peakrate requiring MTU isn't mentioned directly in the man page, but if you
provide peakrate without MTU, tc complains:

"mtu" is required, if "peakrate" is requested.

The idea here is just to make the actual implementation match these two
statements: MTU is unlimited unless you use peakrate, in which case you have
to provide it (although if you craft netlink messages without tc, you can set
peakrate with no MTU and the action will still default to a reasonable MTU
rather than falling over).
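For reference, the intended defaulting rule can be sketched as a small model.
This is not kernel code: police_default_mtu() is a hypothetical helper that
condenses the semantics described above (explicit mtu wins; otherwise
unlimited unless peakrate is configured, in which case the rate-table-derived
fallback 255 << cell_log from the patch is used).

```c
#include <assert.h>
#include <stdint.h>

/* Model of the documented tc-police MTU defaulting (not kernel code).
 * mtu_param:     mtu from netlink (0 means "not set")
 * have_rate:     a rate table (R_tab) was supplied
 * have_peakrate: a peakrate table (P_tab) was supplied
 * cell_log:      rate table cell size exponent */
static uint32_t police_default_mtu(uint32_t mtu_param, int have_rate,
                                   int have_peakrate, int cell_log)
{
    if (mtu_param != 0)
        return mtu_param;                 /* explicit mtu always wins */
    if (have_rate && have_peakrate)
        return (uint32_t)255 << cell_log; /* legacy fallback, e.g. 2040 for cell_log = 3 */
    return ~(uint32_t)0;                  /* unlimited */
}
```

With the pre-patch code the fallback applied whenever a rate table was
present, which is what silently capped GRO-coalesced packets.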

Andrew Collins

[PATCH] net: make tc-police action MTU behavior match documentation

2018-02-26 Thread Andrew Collins
The man page for tc-police states that the MTU defaults to
unlimited if peakrate is not specified, but it actually defaults
to 2047.

This causes issues with GRO-enabled interfaces, as coalesced packets larger
than 2047 bytes get dropped and the resulting rate is wildly inaccurate.

Fix by only setting the default MTU when peakrate is specified.

Longer term act_police should likely segment GRO packets like
sch_tbf does, but I see no clear way to accomplish this within
a tc action.

Signed-off-by: Andrew Collins 
---
 net/sched/act_police.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 95d3c9097b25..36b8c92f644c 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -149,7 +149,7 @@ static int tcf_act_police_init(struct net *net, struct nlattr *nla,
 	police->tcfp_mtu = parm->mtu;
 	if (police->tcfp_mtu == 0) {
 		police->tcfp_mtu = ~0;
-		if (R_tab)
+		if (R_tab && P_tab)
 			police->tcfp_mtu = 255 << R_tab->rate.cell_log;
 	}
 	if (R_tab) {
if (R_tab) {
-- 
2.14.3



Re: fq_codel and skb->hash

2017-01-18 Thread Andrew Collins

On 01/17/2017 08:14 AM, Eric Dumazet wrote:

Right, but that might add overhead in cases we do not need skb->hash
after IPsec. I've heard IPsec is already quite slow :/


I've been running with the following change locally with good results:

--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -57,7 +57,6 @@ struct fq_codel_sched_data {
struct fq_codel_flow *flows;/* Flows table [flows_cnt] */
u32 *backlogs;  /* backlog table [flows_cnt] */
u32 flows_cnt;  /* number of flows */
-   u32 perturbation;   /* hash perturbation */
u32 quantum;/* psched_mtu(qdisc_dev(sch)); */
u32 drop_batch_size;
u32 memory_limit;
@@ -75,9 +74,7 @@ struct fq_codel_sched_data {
 static unsigned int fq_codel_hash(const struct fq_codel_sched_data *q,
  struct sk_buff *skb)
 {
-   u32 hash = skb_get_hash_perturb(skb, q->perturbation);
-
-   return reciprocal_scale(hash, q->flows_cnt);
+   return reciprocal_scale(skb_get_hash(skb), q->flows_cnt);
 }

 static unsigned int fq_codel_classify(struct sk_buff *skb, struct Qdisc *sch,
@@ -482,7 +479,6 @@ static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt)
q->memory_limit = 32 << 20; /* 32 MBytes */
q->drop_batch_size = 64;
q->quantum = psched_mtu(qdisc_dev(sch));
-   q->perturbation = prandom_u32();
INIT_LIST_HEAD(&q->new_flows);
INIT_LIST_HEAD(&q->old_flows);
codel_params_init(&q->cparams);

Any interest in me spinning a real patch for this?  I agree that it'd be better
if we were guaranteed to get a pre-encryption flow hash for any IPsec traffic,
but in my particular case I don't care, as I control the HW and can make it give
me a hash. :)

Thanks,
Andrew Collins


[PATCH net-next] fq_codel: Avoid regenerating skb flow hash unless necessary

2017-01-18 Thread Andrew Collins
The fq_codel qdisc currently always regenerates the skb flow hash.
This wastes some cycles and prevents flow separation in cases where
the traffic has been encrypted and can no longer be understood by the
flow dissector.

Change it to use the preexisting flow hash if one exists, and only
regenerate if necessary.

Signed-off-by: Andrew Collins 
---
 net/sched/sch_fq_codel.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index a5ea0e9..2f50e4c 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -57,7 +57,6 @@ struct fq_codel_sched_data {
struct fq_codel_flow *flows;/* Flows table [flows_cnt] */
u32 *backlogs;  /* backlog table [flows_cnt] */
u32 flows_cnt;  /* number of flows */
-   u32 perturbation;   /* hash perturbation */
u32 quantum;/* psched_mtu(qdisc_dev(sch)); */
u32 drop_batch_size;
u32 memory_limit;
@@ -75,9 +74,7 @@ struct fq_codel_sched_data {
 static unsigned int fq_codel_hash(const struct fq_codel_sched_data *q,
  struct sk_buff *skb)
 {
-   u32 hash = skb_get_hash_perturb(skb, q->perturbation);
-
-   return reciprocal_scale(hash, q->flows_cnt);
+   return reciprocal_scale(skb_get_hash(skb), q->flows_cnt);
 }
 
 static unsigned int fq_codel_classify(struct sk_buff *skb, struct Qdisc *sch,
@@ -482,7 +479,6 @@ static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt)
q->memory_limit = 32 << 20; /* 32 MBytes */
q->drop_batch_size = 64;
q->quantum = psched_mtu(qdisc_dev(sch));
-   q->perturbation = prandom_u32();
INIT_LIST_HEAD(&q->new_flows);
INIT_LIST_HEAD(&q->old_flows);
codel_params_init(&q->cparams);
-- 
2.9.3



fq_codel and skb->hash

2017-01-16 Thread Andrew Collins

The fq_codel packet scheduler always regenerates the skb flow hash. Is there
any reason to do this other than the recent hash perturbation additions?

I ask because I have a case where an incoming set of TCP flows is
encapsulated in a single IPsec tunnel, which is then classified on egress
into a single flow by fq_codel, resulting in poor behavior.

Reusing the skb hash set on receive results in much better behavior, as the
value is determined pre-encapsulation and flows end up being distributed more
naturally across codel queues.

Is opportunistically using the pre-existing skb hash (if available)
problematic? Is there a better way to manage flow separation in
routed+encapsulated traffic?
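For reference, the hash-to-queue mapping at issue can be sketched as follows.
This is a model, not the kernel code: reciprocal_scale() mirrors the kernel
helper of the same name, while flow_index() is a hypothetical function
illustrating the opportunistic reuse being proposed (fq_codel's default
flows_cnt is 1024).

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the kernel's reciprocal_scale(): maps a 32-bit hash uniformly
 * into [0, ep_ro) without a modulo. */
static uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
    return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

/* Hypothetical model of the proposed behavior: reuse a pre-set skb->hash
 * (e.g. computed from inner headers on receive), and only fall back to a
 * freshly dissected hash when none is set.  Post-encryption, the dissected
 * hash is the same for every flow in the tunnel, so reuse is what keeps
 * the flows in separate codel queues. */
static uint32_t flow_index(uint32_t skb_hash, uint32_t dissected_hash,
                           uint32_t flows_cnt)
{
    uint32_t hash = skb_hash ? skb_hash : dissected_hash;
    return reciprocal_scale(hash, flows_cnt);
}
```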

Thanks,
Andrew Collins


[PATCH net] Add netdev all_adj_list refcnt propagation to fix panic

2016-10-03 Thread Andrew Collins
This is a respin of a patch to fix a relatively easily reproducible kernel
panic related to the all_adj_list handling for netdevs in recent kernels.

The following sequence of commands will reproduce the issue:

ip link add link eth0 name eth0.100 type vlan id 100
ip link add link eth0 name eth0.200 type vlan id 200
ip link add name testbr type bridge
ip link set eth0.100 master testbr
ip link set eth0.200 master testbr
ip link add link testbr mac0 type macvlan
ip link delete dev testbr

This creates an upper/lower tree of (excuse the poor ASCII art):

            /---eth0.100-eth0
mac0-testbr-
            \---eth0.200-eth0

When testbr is deleted, the all_adj_lists are walked, and eth0 is deleted
twice from the mac0 list. Unfortunately, during setup in
__netdev_upper_dev_link, only one reference to eth0 is added, so this results
in a panic.

This change adds reference count propagation so things are handled properly.

Matthias Schiffer reported a similar crash in batman-adv:

https://github.com/freifunk-gluon/gluon/issues/680
https://www.open-mesh.org/issues/247

which this patch also seems to resolve.

Signed-off-by: Andrew Collins 
---
 net/core/dev.c | 68 --
 1 file changed, 37 insertions(+), 31 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index ea63120..1da79ef 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5578,6 +5578,7 @@ static inline bool netdev_adjacent_is_neigh_list(struct net_device *dev,
 
 static int __netdev_adjacent_dev_insert(struct net_device *dev,
struct net_device *adj_dev,
+   u16 ref_nr,
struct list_head *dev_list,
void *private, bool master)
 {
@@ -5587,7 +5588,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
adj = __netdev_find_adj(adj_dev, dev_list);
 
if (adj) {
-   adj->ref_nr++;
+   adj->ref_nr += ref_nr;
return 0;
}
 
@@ -5597,7 +5598,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 
adj->dev = adj_dev;
adj->master = master;
-   adj->ref_nr = 1;
+   adj->ref_nr = ref_nr;
adj->private = private;
dev_hold(adj_dev);
 
@@ -5636,6 +5637,7 @@ free_adj:
 
 static void __netdev_adjacent_dev_remove(struct net_device *dev,
 struct net_device *adj_dev,
+u16 ref_nr,
 struct list_head *dev_list)
 {
struct netdev_adjacent *adj;
@@ -5648,10 +5650,10 @@ static void __netdev_adjacent_dev_remove(struct net_device *dev,
BUG();
}
 
-   if (adj->ref_nr > 1) {
-   pr_debug("%s to %s ref_nr-- = %d\n", dev->name, adj_dev->name,
-adj->ref_nr-1);
-   adj->ref_nr--;
+   if (adj->ref_nr > ref_nr) {
+   pr_debug("%s to %s ref_nr-%d = %d\n", dev->name, adj_dev->name,
+ref_nr, adj->ref_nr-ref_nr);
+   adj->ref_nr -= ref_nr;
return;
}
 
@@ -5670,21 +5672,22 @@ static void __netdev_adjacent_dev_remove(struct net_device *dev,
 
 static int __netdev_adjacent_dev_link_lists(struct net_device *dev,
struct net_device *upper_dev,
+   u16 ref_nr,
struct list_head *up_list,
struct list_head *down_list,
void *private, bool master)
 {
int ret;
 
-   ret = __netdev_adjacent_dev_insert(dev, upper_dev, up_list, private,
-  master);
+   ret = __netdev_adjacent_dev_insert(dev, upper_dev, ref_nr, up_list,
+  private, master);
if (ret)
return ret;
 
-   ret = __netdev_adjacent_dev_insert(upper_dev, dev, down_list, private,
-  false);
+   ret = __netdev_adjacent_dev_insert(upper_dev, dev, ref_nr, down_list,
+  private, false);
if (ret) {
-   __netdev_adjacent_dev_remove(dev, upper_dev, up_list);
+   __netdev_adjacent_dev_remove(dev, upper_dev, ref_nr, up_list);
return ret;
}
 
@@ -5692,9 +5695,10 @@ static int __netdev_adjacent_dev_link_lists(struct net_device *dev,
 }
 
 static int __netdev_adjacent_dev_link(struct net_device *dev,
- struct net_device *upper_dev)
+ struct net_device *upper_dev,
+ u16 ref_nr)
 {
-

Re: [PATCH] Add netdev all_adj_list refcnt propagation to fix panic

2016-09-28 Thread Andrew Collins

On 09/28/2016 12:06 PM, David Ahern wrote:

Andrew Collins posted this patch as RFC in March:
http://patchwork.ozlabs.org/patch/603101/

It has apparently fallen through the cracks and never applied.

It solves a refcnt problem (thanks Nik for pointing out this patch)


I have been running the patch in production for the past few months without
issue, in spite of my concerns about the potential side effects of the
change. That said, I suspect my use cases are a subset of yours.

If nobody has a better fix yet, I'm happy to respin this patch as non-RFC
with proper signoff.

Regards,
Andrew Collins


Re: [RFC] Add netdev all_adj_list refcnt propagation to fix panic

2016-03-30 Thread Andrew Collins

From: Andrew Collins 
Date: Tue, 29 Mar 2016 11:25:03 -0600


This is an RFC patch to fix a relatively easily reproducible kernel
panic related to the all_adj_list handling for netdevs in recent kernels.

This is more to generate discussion than anything else.  I don't
particularly like this approach, I'm hoping someone has a better idea.

The following sequence of commands will reproduce the issue:

ip link add link eth0 name eth0.100 type vlan id 100
ip link add link eth0 name eth0.200 type vlan id 200
ip link add name testbr type bridge
ip link set eth0.100 master testbr
ip link set eth0.200 master testbr
ip link add link testbr mac0 type macvlan
ip link delete dev testbr

This creates an upper/lower tree of (excuse the poor ASCII art):

            /---eth0.100-eth0
mac0-testbr-
            \---eth0.200-eth0

When testbr is deleted, the all_adj_lists are walked, and eth0 is deleted
twice from the mac0 list. Unfortunately, during setup in
__netdev_upper_dev_link, only one reference to eth0 is added, so this results
in a panic.

This change adds reference count propagation so things are handled properly.

Matthias Schiffer reported a similar crash in batman-adv:

https://github.com/freifunk-gluon/gluon/issues/680
https://www.open-mesh.org/issues/247

which this patch also seems to resolve.


Veaceslav, please look into this.

Thanks.




+vfalico's new address picked up from git logs



[RFC] Add netdev all_adj_list refcnt propagation to fix panic

2016-03-29 Thread Andrew Collins
This is an RFC patch to fix a relatively easily reproducible kernel
panic related to the all_adj_list handling for netdevs in recent kernels.

This is more to generate discussion than anything else.  I don't
particularly like this approach, I'm hoping someone has a better idea.

The following sequence of commands will reproduce the issue:

ip link add link eth0 name eth0.100 type vlan id 100
ip link add link eth0 name eth0.200 type vlan id 200
ip link add name testbr type bridge
ip link set eth0.100 master testbr
ip link set eth0.200 master testbr
ip link add link testbr mac0 type macvlan
ip link delete dev testbr

This creates an upper/lower tree of (excuse the poor ASCII art):

            /---eth0.100-eth0
mac0-testbr-
            \---eth0.200-eth0

When testbr is deleted, the all_adj_lists are walked, and eth0 is deleted
twice from the mac0 list. Unfortunately, during setup in
__netdev_upper_dev_link, only one reference to eth0 is added, so this results
in a panic.

This change adds reference count propagation so things are handled properly.

Matthias Schiffer reported a similar crash in batman-adv:

https://github.com/freifunk-gluon/gluon/issues/680
https://www.open-mesh.org/issues/247

which this patch also seems to resolve.
---
 net/core/dev.c | 68 --
 1 file changed, 37 insertions(+), 31 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index b9bcbe7..4b4ef6b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5471,6 +5471,7 @@ static inline bool netdev_adjacent_is_neigh_list(struct net_device *dev,
 
 static int __netdev_adjacent_dev_insert(struct net_device *dev,
struct net_device *adj_dev,
+   u16 ref_nr,
struct list_head *dev_list,
void *private, bool master)
 {
@@ -5480,7 +5481,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
adj = __netdev_find_adj(adj_dev, dev_list);
 
if (adj) {
-   adj->ref_nr++;
+   adj->ref_nr += ref_nr;
return 0;
}
 
@@ -5490,7 +5491,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 
adj->dev = adj_dev;
adj->master = master;
-   adj->ref_nr = 1;
+   adj->ref_nr = ref_nr;
adj->private = private;
dev_hold(adj_dev);
 
@@ -5529,6 +5530,7 @@ free_adj:
 
 static void __netdev_adjacent_dev_remove(struct net_device *dev,
 struct net_device *adj_dev,
+u16 ref_nr,
 struct list_head *dev_list)
 {
struct netdev_adjacent *adj;
@@ -5541,10 +5543,10 @@ static void __netdev_adjacent_dev_remove(struct net_device *dev,
BUG();
}
 
-   if (adj->ref_nr > 1) {
-   pr_debug("%s to %s ref_nr-- = %d\n", dev->name, adj_dev->name,
-adj->ref_nr-1);
-   adj->ref_nr--;
+   if (adj->ref_nr > ref_nr) {
+   pr_debug("%s to %s ref_nr-%d = %d\n", dev->name, adj_dev->name,
+ref_nr, adj->ref_nr-ref_nr);
+   adj->ref_nr -= ref_nr;
return;
}
 
@@ -5563,21 +5565,22 @@ static void __netdev_adjacent_dev_remove(struct net_device *dev,
 
 static int __netdev_adjacent_dev_link_lists(struct net_device *dev,
struct net_device *upper_dev,
+   u16 ref_nr,
struct list_head *up_list,
struct list_head *down_list,
void *private, bool master)
 {
int ret;
 
-   ret = __netdev_adjacent_dev_insert(dev, upper_dev, up_list, private,
-  master);
+   ret = __netdev_adjacent_dev_insert(dev, upper_dev, ref_nr, up_list,
+  private, master);
if (ret)
return ret;
 
-   ret = __netdev_adjacent_dev_insert(upper_dev, dev, down_list, private,
-  false);
+   ret = __netdev_adjacent_dev_insert(upper_dev, dev, ref_nr, down_list,
+  private, false);
if (ret) {
-   __netdev_adjacent_dev_remove(dev, upper_dev, up_list);
+   __netdev_adjacent_dev_remove(dev, upper_dev, ref_nr, up_list);
return ret;
}
 
@@ -5585,9 +5588,10 @@ static int __netdev_adjacent_dev_link_lists(struct net_device *dev,
 }
 
 static int __netdev_adjacent_dev_link(struct net_device *dev,
- struct net_device *upper_dev)
+ struct net_device *upper_dev,
+ u16 r

Re: RESEND: Easily reproducible kernel panic due to netdev all_adj_list refcnt handling

2016-03-25 Thread Andrew Collins

On 03/25/2016 02:43 PM, Matthias Schiffer wrote:

We've tried your patch, and it changes the symptoms a bit, but doesn't fix
the panic. I've attached kernel logs of the crash both before and after
applying the patch.

One note: I did not reproduce this issue myself, it was first reported in
[1], and then forwarded to the batman-adv issue tracker [2] by me.

Regards,
Matthias


[1] https://github.com/freifunk-gluon/gluon/issues/680
[2] https://www.open-mesh.org/issues/247


On the off chance it helps, the version of the patch I integrated locally
takes a somewhat different approach than the one I sent to the mailing list
(it propagates adj_list refcnts). I've attached it in case it's useful.

I haven't submitted this upstream yet as it's still rather ugly. I'm of the
opinion that the whole "every device knows every upperdev and lowerdev in its
tree" model is rather broken, and the patch is just working around a design
that needs some rework.
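The refcount bug the attached patch addresses can be reduced to a toy model
(not kernel code; struct adj and the two helpers below are hypothetical
stand-ins for netdev_adjacent handling). Inserting the same neighbour with
ref_nr references must survive that many removals; the old code always
inserted with a count of 1, so the second walk-driven removal of eth0 from
mac0's list hit the BUG() path.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one adjacency-list entry. */
struct adj {
    int present;     /* entry exists in the list */
    uint16_t ref_nr; /* how many paths reference it */
};

/* Insert with an explicit reference count (the fix); the old behavior is
 * equivalent to always calling this with ref_nr == 1. */
static void adj_insert(struct adj *a, uint16_t ref_nr)
{
    if (a->present) {
        a->ref_nr += ref_nr;
        return;
    }
    a->present = 1;
    a->ref_nr = ref_nr;
}

/* Returns 1 if the entry was fully removed (the kernel would dev_put()
 * and free it here). Removing a missing entry models the BUG(). */
static int adj_remove(struct adj *a, uint16_t ref_nr)
{
    assert(a->present); /* BUG() in the kernel */
    if (a->ref_nr > ref_nr) {
        a->ref_nr -= ref_nr;
        return 0;
    }
    a->present = 0;
    return 1;
}
```

In the reproducer, eth0 sits below testbr twice (via eth0.100 and eth0.200),
so mac0's list must record eth0 with two references; two removals during the
teardown walk then balance out instead of underflowing.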

Thanks,
Andrew Collins
commit df318544e282c6ab5bdc4595658fc1cf8739d091
Author: Andrew Collins 
Date:   Fri Mar 25 16:04:59 2016 -0600

This fixes a relatively easily reproducible kernel panic related to the
all_adj_list handling for netdevs in recent kernels.

The following sequence of commands will reproduce the issue:

ip link add link eth0 name eth0.100 type vlan id 100
ip link add link eth0 name eth0.200 type vlan id 200
ip link add name testbr type bridge
ip link set eth0.100 master testbr
ip link set eth0.200 master testbr
ip link add link testbr mac0 type macvlan
ip link delete dev testbr

This creates an upper/lower tree of (excuse the poor ASCII art):

            /---eth0.100-eth0
mac0-testbr-
            \---eth0.200-eth0

When testbr is deleted, the all_adj_lists are walked, and eth0 is deleted twice from
the mac0 list. Unfortunately, during setup in __netdev_upper_dev_link, only one
reference to eth0 is added, so this results in a panic.

This change adds reference count propagation so things are handled properly.

diff --git a/net/core/dev.c b/net/core/dev.c
index b9bcbe7..4b4ef6b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5471,6 +5471,7 @@ static inline bool netdev_adjacent_is_neigh_list(struct net_device *dev,
 
 static int __netdev_adjacent_dev_insert(struct net_device *dev,
 	struct net_device *adj_dev,
+	u16 ref_nr,
 	struct list_head *dev_list,
 	void *private, bool master)
 {
@@ -5480,7 +5481,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 	adj = __netdev_find_adj(adj_dev, dev_list);
 
 	if (adj) {
-		adj->ref_nr++;
+		adj->ref_nr += ref_nr;
 		return 0;
 	}
 
@@ -5490,7 +5491,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 
 	adj->dev = adj_dev;
 	adj->master = master;
-	adj->ref_nr = 1;
+	adj->ref_nr = ref_nr;
 	adj->private = private;
 	dev_hold(adj_dev);
 
@@ -5529,6 +5530,7 @@ free_adj:
 
 static void __netdev_adjacent_dev_remove(struct net_device *dev,
 	 struct net_device *adj_dev,
+	 u16 ref_nr,
 	 struct list_head *dev_list)
 {
 	struct netdev_adjacent *adj;
@@ -5541,10 +5543,10 @@ static void __netdev_adjacent_dev_remove(struct net_device *dev,
 		BUG();
 	}
 
-	if (adj->ref_nr > 1) {
-		pr_debug("%s to %s ref_nr-- = %d\n", dev->name, adj_dev->name,
-			 adj->ref_nr-1);
-		adj->ref_nr--;
+	if (adj->ref_nr > ref_nr) {
+		pr_debug("%s to %s ref_nr-%d = %d\n", dev->name, adj_dev->name,
+			 ref_nr, adj->ref_nr-ref_nr);
+		adj->ref_nr -= ref_nr;
 		return;
 	}
 
@@ -5563,21 +5565,22 @@ static void __netdev_adjacent_dev_remove(struct net_device *dev,
 
 static int __netdev_adjacent_dev_link_lists(struct net_device *dev,
 	struct net_device *upper_dev,
+	u16 ref_nr,
 	struct list_head *up_list,
 	struct list_head *down_list,
 	void *private, bool master)
 {
 	int ret;
 
-	ret = __netdev_adjacent_dev_insert(dev, upper_dev, up_list, private,
-	   master);
+	ret = __netdev_adjacent_dev_insert(dev, upper_dev, ref_nr, up_list,
+	   private, master);
 	if (ret)
 		return ret;
 
-	ret = __netdev_adjacent_dev_insert(upper_dev, dev, down_list, private,
-	   false);
+	ret = __netdev_adjacent_dev_insert(upper_dev, dev, ref_nr, down_list,
+	   private, false);
 	if (ret) {
-		__netdev_adjacent_dev_remove(dev, upper_dev, up_list);
+		__netdev_adjacent_dev_remove(dev, upper_dev, ref_nr, up_list);
 		return ret;
 	}
 
@@ -5585,9 +5588,10 @@ static int __netdev_adjacent_dev_link_lists(struct net_device *dev,
 }
 
 static int __netdev_adjacent_dev_link(struct net_device *dev,
-  struct net_device *upper_dev)
+  struct net_device *upper_dev,
+  u16 ref_nr)
 {
-	return __netdev_adjacent_dev_link_lists(dev, upper_dev,
+	return __netdev_adjacent_dev_link_lists(dev, upper_dev, ref_

RESEND: Easily reproducible kernel panic due to netdev all_adj_list refcnt handling

2016-02-23 Thread Andrew Collins
*dev,
list_for_each_entry(i, &upper_dev->all_adj_list.upper, list) {
pr_debug("linking %s's upper device %s with %s\n",
 upper_dev->name, i->dev->name, dev->name);
-   ret = __netdev_adjacent_dev_link(dev, i->dev);
-   if (ret)
-   goto rollback_upper_mesh;
+   for (refs = 0; refs < i->ref_nr; refs++) {
+   ret = __netdev_adjacent_dev_link(dev, i->dev);
+   if (ret)
+   goto rollback_upper_mesh;
+   }
}

/* add upper_dev to every dev's lower device */
list_for_each_entry(i, &dev->all_adj_list.lower, list) {
pr_debug("linking %s's lower device %s with %s\n", dev->name,
 i->dev->name, upper_dev->name);
-   ret = __netdev_adjacent_dev_link(i->dev, upper_dev);
-   if (ret)
-   goto rollback_lower_mesh;
+   for (refs = 0; refs < i->ref_nr; refs++) {
+   ret = __netdev_adjacent_dev_link(i->dev, upper_dev);
+   if (ret)
+   goto rollback_lower_mesh;
+   }
    }

Has anyone else encountered this before?  Any ideas on a cleaner solution?

Thanks,
Andrew Collins


Kernel panic due to netdev all_adj_list refcnt handling

2016-01-26 Thread Andrew Collins
*dev,
list_for_each_entry(i, &upper_dev->all_adj_list.upper, list) {
pr_debug("linking %s's upper device %s with %s\n",
 upper_dev->name, i->dev->name, dev->name);
-   ret = __netdev_adjacent_dev_link(dev, i->dev);
-   if (ret)
-   goto rollback_upper_mesh;
+   for (refs = 0; refs < i->ref_nr; refs++) {
+   ret = __netdev_adjacent_dev_link(dev, i->dev);
+   if (ret)
+   goto rollback_upper_mesh;
+   }
}

/* add upper_dev to every dev's lower device */
list_for_each_entry(i, &dev->all_adj_list.lower, list) {
pr_debug("linking %s's lower device %s with %s\n", dev->name,
 i->dev->name, upper_dev->name);
-   ret = __netdev_adjacent_dev_link(i->dev, upper_dev);
-   if (ret)
-   goto rollback_lower_mesh;
+   for (refs = 0; refs < i->ref_nr; refs++) {
+   ret = __netdev_adjacent_dev_link(i->dev, upper_dev);
+   if (ret)
+   goto rollback_lower_mesh;
+   }
    }

Has anyone else encountered this before?  Any ideas on a cleaner solution?

Thanks,
Andrew Collins