date:20170120

[PATCH net-next 2/2] vxlan: do not age static remote mac entries

2017-01-20 Thread Roopa Prabhu

From: Balakrishnan Raman 

Mac aging is applicable only for dynamically learnt remote mac
entries. Check for user configured static remote mac entries
and skip aging.

Signed-off-by: Balakrishnan Raman 
Signed-off-by: Roopa Prabhu 
---
 drivers/net/vxlan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 269e515..2c5bb0a 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2268,7 +2268,7 @@ static void vxlan_cleanup(unsigned long arg)
= container_of(p, struct vxlan_fdb, hlist);
unsigned long timeout;
 
-   if (f->state & NUD_PERMANENT)
+   if (f->state & (NUD_PERMANENT | NUD_NOARP))
continue;
 
timeout = f->used + vxlan->cfg.age_interval * HZ;
-- 
1.9.1
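
The aging check changed above can be read as a single predicate on the entry state. The following is only a minimal user-space sketch of that predicate, not the driver code: the helper name fdb_entry_may_age is hypothetical, and the NUD_* values are copied from include/uapi/linux/neighbour.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Neighbour state bits, values as in include/uapi/linux/neighbour.h. */
#define NUD_REACHABLE 0x02
#define NUD_NOARP     0x40
#define NUD_PERMANENT 0x80

/* Hypothetical helper: true if the aging timer may expire this entry. */
static bool fdb_entry_may_age(uint16_t state)
{
	/* Static entries (permanent or noarp) are skipped by aging. */
	return !(state & (NUD_PERMANENT | NUD_NOARP));
}
```

With this predicate a learnt entry (NUD_REACHABLE) remains subject to aging, while an entry carrying NUD_NOARP or NUD_PERMANENT is left alone.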

[PATCH net-next 0/2] vxlan: misc fdb fixes

2017-01-20 Thread Roopa Prabhu

From: Roopa Prabhu 

Balakrishnan Raman (1):
  vxlan: do not age static remote mac entries

Roopa Prabhu (1):
  vxlan: don't flush static fdb entries on admin down

 drivers/net/vxlan.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

-- 
1.9.1

[PATCH net-next 1/2] vxlan: don't flush static fdb entries on admin down

2017-01-20 Thread Roopa Prabhu

From: Roopa Prabhu 

This patch skips flushing static fdb entries in
ndo_stop, but flushes all fdb entries during vxlan
device delete. This is consistent with how the bridge
driver handles its fdb.

Signed-off-by: Roopa Prabhu 
---
 drivers/net/vxlan.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 19b1653..269e515 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2354,7 +2354,7 @@ static int vxlan_open(struct net_device *dev)
 }
 
 /* Purge the forwarding table */
-static void vxlan_flush(struct vxlan_dev *vxlan)
+static void vxlan_flush(struct vxlan_dev *vxlan, int do_all)
 {
unsigned int h;
 
@@ -2364,6 +2364,8 @@ static void vxlan_flush(struct vxlan_dev *vxlan)
hlist_for_each_safe(p, n, &vxlan->fdb_head[h]) {
struct vxlan_fdb *f
= container_of(p, struct vxlan_fdb, hlist);
+   if (!do_all && (f->state & (NUD_PERMANENT | NUD_NOARP)))
+   continue;
/* the all_zeros_mac entry is deleted at vxlan_uninit */
if (!is_zero_ether_addr(f->eth_addr))
vxlan_fdb_destroy(vxlan, f);
@@ -2385,7 +2387,7 @@ static int vxlan_stop(struct net_device *dev)
 
del_timer_sync(&vxlan->age_timer);
 
-   vxlan_flush(vxlan);
+   vxlan_flush(vxlan, 0);
vxlan_sock_release(vxlan);
 
return ret;
@@ -3058,6 +3060,8 @@ static void vxlan_dellink(struct net_device *dev, struct list_head *head)
struct vxlan_dev *vxlan = netdev_priv(dev);
struct vxlan_net *vn = net_generic(vxlan->net, vxlan_net_id);
 
+   vxlan_flush(vxlan, 1);
+
spin_lock(&vn->sock_lock);
if (!hlist_unhashed(&vxlan->hlist))
hlist_del_rcu(&vxlan->hlist);
-- 
1.9.1
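
The effect of the new do_all argument can be illustrated outside the kernel. The sketch below is an assumption-laden toy, not the driver: struct toy_fdb and toy_flush are hypothetical names, and a flat array stands in for the hash table of struct vxlan_fdb entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUD_REACHABLE 0x02
#define NUD_NOARP     0x40
#define NUD_PERMANENT 0x80

/* Hypothetical, simplified stand-in for struct vxlan_fdb. */
struct toy_fdb {
	uint16_t state;   /* NUD_* bits */
	bool     present; /* false once the entry has been flushed */
};

/* Flush a flat table; returns the number of entries removed. */
static size_t toy_flush(struct toy_fdb *tbl, size_t n, bool do_all)
{
	size_t removed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!tbl[i].present)
			continue;
		/* Admin down (do_all == false): keep static entries. */
		if (!do_all && (tbl[i].state & (NUD_PERMANENT | NUD_NOARP)))
			continue;
		tbl[i].present = false;
		removed++;
	}
	return removed;
}
```

Calling toy_flush(tbl, n, false) models the ndo_stop path and toy_flush(tbl, n, true) models device delete, where every remaining entry goes away.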

Re: [PATCH cumulus-4.1.y 1/5] vxlan: flush fdb entries on oper down

2017-01-20 Thread Roopa Prabhu

On 1/20/17, 11:40 PM, Roopa Prabhu wrote:
> From: Balakrishnan Raman 
>
> Flush fdb entries of a vxlan device when its state
> changes to oper down. vxlan_stop handles flush on
> admin down.
>
> Signed-off-by: Balakrishnan Raman 
> Signed-off-by: Roopa Prabhu 
> ---
>  

pls ignore this series. Accidentally hit send in the wrong folder :(

[PATCH cumulus-4.1.y 2/5] vxlan: don't replace fdb entry if nothing changed

2017-01-20 Thread Roopa Prabhu

From: Balakrishnan Raman 

This will avoid unnecessary notifications to userspace.

Signed-off-by: Balakrishnan Raman 
Signed-off-by: Roopa Prabhu 
---
 drivers/net/vxlan.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 15b1c23..72b99ff 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -467,12 +467,19 @@ static int vxlan_fdb_replace(struct vxlan_fdb *f,
if (!rd)
return 0;
 
-   dst_cache_reset(&rd->dst_cache);
-   rd->remote_ip = *ip;
-   rd->remote_port = port;
-   rd->remote_vni = vni;
-   rd->remote_ifindex = ifindex;
-   return 1;
+   if (!vxlan_addr_equal(&rd->remote_ip, ip) ||
+   rd->remote_port != port ||
+   rd->remote_vni != vni ||
+   rd->remote_ifindex != ifindex) {
+   dst_cache_reset(&rd->dst_cache);
+   rd->remote_ip = *ip;
+   rd->remote_port = port;
+   rd->remote_vni = vni;
+   rd->remote_ifindex = ifindex;
+   return 1;
+   }
+
+   return 0;
 }
 
 /* Add/update destinations for multicast */
-- 
1.9.1
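
The "replace only if something changed" logic above can be sketched in plain C. This is an illustration under simplifying assumptions: struct toy_rdst and toy_rdst_replace are hypothetical, the remote address is reduced to an IPv4 integer, and the dst cache reset is left out.

```c
#include <stdint.h>

/* Hypothetical, simplified remote destination (IPv4 only). */
struct toy_rdst {
	uint32_t remote_ip;
	uint16_t remote_port;
	uint32_t remote_vni;
	uint32_t remote_ifindex;
};

/* Returns 1 if *rd was updated (a notification is due), 0 if unchanged. */
static int toy_rdst_replace(struct toy_rdst *rd, const struct toy_rdst *want)
{
	if (rd->remote_ip == want->remote_ip &&
	    rd->remote_port == want->remote_port &&
	    rd->remote_vni == want->remote_vni &&
	    rd->remote_ifindex == want->remote_ifindex)
		return 0; /* nothing changed, so no notification */

	*rd = *want;
	return 1;
}
```

A caller would send the netlink notification only when the return value is 1, which is what avoids the redundant messages to userspace.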

[PATCH cumulus-4.1.y 1/5] vxlan: flush fdb entries on oper down

2017-01-20 Thread Roopa Prabhu

From: Balakrishnan Raman 

Flush fdb entries of a vxlan device when its state
changes to oper down. vxlan_stop handles flush on
admin down.

Signed-off-by: Balakrishnan Raman 
Signed-off-by: Roopa Prabhu 
---
 drivers/net/vxlan.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 19b1653..15b1c23 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -3276,6 +3276,12 @@ static int vxlan_netdevice_event(struct notifier_block *unused,
vxlan_handle_lowerdev_unregister(vn, dev);
else if (event == NETDEV_UDP_TUNNEL_PUSH_INFO)
vxlan_push_rx_ports(dev);
+   else if (event == NETDEV_CHANGE) {
+   if (dev->netdev_ops == &vxlan_netdev_ops) {
+   if (netif_running(dev) && !netif_oper_up(dev))
+   vxlan_flush(netdev_priv(dev));
+   }
+   }
 
return NOTIFY_DONE;
 }
-- 
1.9.1

[PATCH cumulus-4.1.y 3/5] vxlan: enforce precedence for static over dynamic fdb entry

2017-01-20 Thread Roopa Prabhu

From: Wilson Kok 

This patch enforces fdb state correctly when deciding
to add or update an existing fdb. It makes sure static fdb
entries are not replaced by dynamic fdb entries.

Signed-off-by: Wilson Kok 
Signed-off-by: Roopa Prabhu 
---
 drivers/net/vxlan.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 72b99ff..7300586 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -628,6 +628,10 @@ static int vxlan_fdb_create(struct vxlan_dev *vxlan,
return -EEXIST;
}
if (f->state != state) {
+   if ((f->state & NUD_PERMANENT) &&
+   !(state & NUD_PERMANENT))
+   return -EINVAL;
+
f->state = state;
f->updated = jiffies;
notify = 1;
-- 
1.9.1
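
The precedence rule added above can be expressed as a small state-update function. The sketch below is a toy under stated assumptions: toy_fdb_update_state is a hypothetical name, and the return convention (1 for changed, 0 for unchanged) is chosen for the example rather than taken from the driver. Like the hunk, it checks only NUD_PERMANENT.

```c
#include <errno.h>
#include <stdint.h>

#define NUD_REACHABLE 0x02
#define NUD_PERMANENT 0x80

/*
 * Hypothetical helper: apply a requested state to an existing entry.
 * Returns 1 if the state changed, 0 if it was already set, and
 * -EINVAL if a non-permanent state would replace a permanent one.
 */
static int toy_fdb_update_state(uint16_t *cur, uint16_t new_state)
{
	if (*cur == new_state)
		return 0;
	if ((*cur & NUD_PERMANENT) && !(new_state & NUD_PERMANENT))
		return -EINVAL;
	*cur = new_state;
	return 1;
}
```

A dynamic entry can still be promoted to permanent; only the downgrade from permanent to a non-permanent state is refused.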

[PATCH cumulus-4.1.y 5/5] vxlan: do not age static remote mac entries

2017-01-20 Thread Roopa Prabhu

From: Balakrishnan Raman 

Mac aging is applicable only for dynamically learnt remote mac
entries. Check for user configured static remote mac entries
and skip aging.

Signed-off-by: Balakrishnan Raman 
Signed-off-by: Roopa Prabhu 
---
 drivers/net/vxlan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 3314090..312240c 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2279,7 +2279,7 @@ static void vxlan_cleanup(unsigned long arg)
= container_of(p, struct vxlan_fdb, hlist);
unsigned long timeout;
 
-   if (f->state & NUD_PERMANENT)
+   if (f->state & (NUD_PERMANENT | NUD_NOARP))
continue;
 
timeout = f->used + vxlan->cfg.age_interval * HZ;
-- 
1.9.1

[PATCH cumulus-4.1.y 4/5] vxlan: don't flush static fdb entries on admin down

2017-01-20 Thread Roopa Prabhu

From: Roopa Prabhu 

This patch skips flushing static fdb entries in
ndo_stop, but flushes all fdb entries during vxlan
device delete. This is consistent with how the bridge
driver handles its fdb.

Signed-off-by: Roopa Prabhu 
---
 drivers/net/vxlan.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 7300586..3314090 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2365,7 +2365,7 @@ static int vxlan_open(struct net_device *dev)
 }
 
 /* Purge the forwarding table */
-static void vxlan_flush(struct vxlan_dev *vxlan)
+static void vxlan_flush(struct vxlan_dev *vxlan, int do_all)
 {
unsigned int h;
 
@@ -2375,6 +2375,8 @@ static void vxlan_flush(struct vxlan_dev *vxlan)
hlist_for_each_safe(p, n, &vxlan->fdb_head[h]) {
struct vxlan_fdb *f
= container_of(p, struct vxlan_fdb, hlist);
+   if (!do_all && (f->state & (NUD_PERMANENT | NUD_NOARP)))
+   continue;
/* the all_zeros_mac entry is deleted at vxlan_uninit */
if (!is_zero_ether_addr(f->eth_addr))
vxlan_fdb_destroy(vxlan, f);
@@ -2396,7 +2398,7 @@ static int vxlan_stop(struct net_device *dev)
 
del_timer_sync(&vxlan->age_timer);
 
-   vxlan_flush(vxlan);
+   vxlan_flush(vxlan, 0);
vxlan_sock_release(vxlan);
 
return ret;
@@ -3069,6 +3071,8 @@ static void vxlan_dellink(struct net_device *dev, struct list_head *head)
struct vxlan_dev *vxlan = netdev_priv(dev);
struct vxlan_net *vn = net_generic(vxlan->net, vxlan_net_id);
 
+   vxlan_flush(vxlan, 1);
+
spin_lock(&vn->sock_lock);
if (!hlist_unhashed(&vxlan->hlist))
hlist_del_rcu(&vxlan->hlist);
-- 
1.9.1

[PATCH v2] net: xilinx: constify net_device_ops structure

2017-01-20 Thread Bhumika Goyal

Declare net_device_ops structure as const as it is only stored in
the netdev_ops field of a net_device structure. This field is of type
const, so net_device_ops structures having same properties can be made
const too.
Done using Coccinelle:

@r1 disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p={...};

@ok1@
identifier r1.i;
position p;
struct net_device ndev;
@@
ndev.netdev_ops=&i@p

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct net_device_ops i;

File size before:
   text    data     bss     dec     hex filename
   6201     744       0    6945    1b21 ethernet/xilinx/xilinx_emaclite.o

File size after:
   text    data     bss     dec     hex filename
   6745     192       0    6937    1b19 ethernet/xilinx/xilinx_emaclite.o

Signed-off-by: Bhumika Goyal 
---
Changes in v2:
* Corrected the commit message.

 drivers/net/ethernet/xilinx/xilinx_emaclite.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 93dc10b..546f569 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -1065,7 +1065,7 @@ static bool get_bool(struct platform_device *ofdev, const char *s)
}
 }
 
-static struct net_device_ops xemaclite_netdev_ops;
+static const struct net_device_ops xemaclite_netdev_ops;
 
 /**
  * xemaclite_of_probe - Probe method for the Emaclite device.
@@ -1219,7 +1219,7 @@ static int xemaclite_of_remove(struct platform_device *of_dev)
 }
 #endif
 
-static struct net_device_ops xemaclite_netdev_ops = {
+static const struct net_device_ops xemaclite_netdev_ops = {
.ndo_open   = xemaclite_open,
.ndo_stop   = xemaclite_close,
.ndo_start_xmit = xemaclite_send,
-- 
1.9.1
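
The pattern behind this kind of constification can be shown in miniature. The sketch below is a user-space toy, assuming hypothetical names (struct toy_dev, struct toy_dev_ops, toy_ops); it only mirrors the idea that the device holds a pointer to const, so the ops table itself can be declared const.

```c
#include <stddef.h>

struct toy_dev;

/* Hypothetical miniature of struct net_device_ops. */
struct toy_dev_ops {
	int (*open)(struct toy_dev *dev);
	int (*stop)(struct toy_dev *dev);
};

struct toy_dev {
	const struct toy_dev_ops *ops; /* pointer to const, like netdev_ops */
	int running;
};

static int toy_open(struct toy_dev *dev)
{
	dev->running = 1;
	return 0;
}

static int toy_stop(struct toy_dev *dev)
{
	dev->running = 0;
	return 0;
}

/* const allows the toolchain to place the table in read-only data. */
static const struct toy_dev_ops toy_ops = {
	.open = toy_open,
	.stop = toy_stop,
};
```

Because nothing writes through the ops pointer, the table does not need to live in writable data. That is probably why the size output above shows data shrinking while text grows by a similar amount, since size typically counts read-only data under text.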

[PATCH v2] net: moxa: constify net_device_ops structures

2017-01-20 Thread Bhumika Goyal

Declare net_device_ops structure as const as it is only stored in
the netdev_ops field of a net_device structure. This field is of type
const, so net_device_ops structures having same properties can be made
const too.
Done using Coccinelle:

@r1 disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p={...};

@ok1@
identifier r1.i;
position p;
struct net_device ndev;
@@
ndev.netdev_ops=&i@p

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct net_device_ops i;

File size before:
   text    data     bss     dec     hex filename
   4821     744       0    5565    15bd ethernet/moxa/moxart_ether.o

File size after:
   text    data     bss     dec     hex filename
   5373     192       0    5565    15bd ethernet/moxa/moxart_ether.o

Signed-off-by: Bhumika Goyal 
---
Changes in v2:
* Corrected the commit message.

 drivers/net/ethernet/moxa/moxart_ether.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
index 9774b50..6a6525f 100644
--- a/drivers/net/ethernet/moxa/moxart_ether.c
+++ b/drivers/net/ethernet/moxa/moxart_ether.c
@@ -436,7 +436,7 @@ static void moxart_mac_set_rx_mode(struct net_device *ndev)
spin_unlock_irq(&priv->txlock);
 }
 
-static struct net_device_ops moxart_netdev_ops = {
+static const struct net_device_ops moxart_netdev_ops = {
.ndo_open   = moxart_mac_open,
.ndo_stop   = moxart_mac_stop,
.ndo_start_xmit = moxart_mac_start_xmit,
-- 
1.9.1

[PATCH] net: moxa: constify net_device_ops structures

2017-01-20 Thread Bhumika Goyal

Declare net_device_ops structures as const as they are only stored in
the netdev_ops field of a net_device structure. This field is of type
const, so net_device_ops structures having same properties can be made
const too.
Done using Coccinelle:

@r1 disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p={...};

@ok1@
identifier r1.i;
position p;
struct net_device ndev;
@@
ndev.netdev_ops=&i@p

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct net_device_ops i;

File size before:
   text    data     bss     dec     hex filename
   4821     744       0    5565    15bd ethernet/moxa/moxart_ether.o

File size after:
   text    data     bss     dec     hex filename
   5373     192       0    5565    15bd ethernet/moxa/moxart_ether.o

Signed-off-by: Bhumika Goyal 
---
 drivers/net/ethernet/moxa/moxart_ether.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
index 9774b50..6a6525f 100644
--- a/drivers/net/ethernet/moxa/moxart_ether.c
+++ b/drivers/net/ethernet/moxa/moxart_ether.c
@@ -436,7 +436,7 @@ static void moxart_mac_set_rx_mode(struct net_device *ndev)
spin_unlock_irq(&priv->txlock);
 }
 
-static struct net_device_ops moxart_netdev_ops = {
+static const struct net_device_ops moxart_netdev_ops = {
.ndo_open   = moxart_mac_open,
.ndo_stop   = moxart_mac_stop,
.ndo_start_xmit = moxart_mac_start_xmit,
-- 
1.9.1

[PATCH] net: xilinx: constify net_device_ops structures

2017-01-20 Thread Bhumika Goyal

Declare net_device_ops structures as const as they are only stored in
the netdev_ops field of a net_device structure. This field is of type
const, so net_device_ops structures having same properties can be made
const too.
Done using Coccinelle:

@r1 disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p={...};

@ok1@
identifier r1.i;
position p;
struct net_device ndev;
@@
ndev.netdev_ops=&i@p

@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct net_device_ops i;

File size before:
   text    data     bss     dec     hex filename
   6201     744       0    6945    1b21 ethernet/xilinx/xilinx_emaclite.o

File size after:
   text    data     bss     dec     hex filename
   6745     192       0    6937    1b19 ethernet/xilinx/xilinx_emaclite.o

Signed-off-by: Bhumika Goyal 
---
 drivers/net/ethernet/xilinx/xilinx_emaclite.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 93dc10b..546f569 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -1065,7 +1065,7 @@ static bool get_bool(struct platform_device *ofdev, const char *s)
}
 }
 
-static struct net_device_ops xemaclite_netdev_ops;
+static const struct net_device_ops xemaclite_netdev_ops;
 
 /**
  * xemaclite_of_probe - Probe method for the Emaclite device.
@@ -1219,7 +1219,7 @@ static int xemaclite_of_remove(struct platform_device *of_dev)
 }
 #endif
 
-static struct net_device_ops xemaclite_netdev_ops = {
+static const struct net_device_ops xemaclite_netdev_ops = {
.ndo_open   = xemaclite_open,
.ndo_stop   = xemaclite_close,
.ndo_start_xmit = xemaclite_send,
-- 
1.9.1

[RFC PATCH net-next 5/5] bridge: vlan lwt dst_metadata hooks in ingress and egress paths

2017-01-20 Thread Roopa Prabhu

From: Roopa Prabhu 

- ingress hook:
- if port is a lwt tunnel port, use tunnel info in
  attached dst_metadata to map it to a local vlan
- egress hook:
- if port is a lwt tunnel port, use tunnel info attached to
  vlan to set dst_metadata on the skb

CC: Nikolay Aleksandrov 
Signed-off-by: Roopa Prabhu 
---
CC'ing Nikolay for some more eyes as he has been trying to keep the
bridge driver fast path lite.

 net/bridge/br_input.c   |4 
 net/bridge/br_private.h |4 
 net/bridge/br_vlan.c|   55 +++
 3 files changed, 63 insertions(+)

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 83f356f..96602a1 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -262,6 +262,10 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
return RX_HANDLER_CONSUMED;
 
p = br_port_get_rcu(skb->dev);
+   if (p->flags & BR_LWT_VLAN) {
+   if (br_handle_ingress_vlan_tunnel(skb, p, nbp_vlan_group_rcu(p)))
+   goto drop;
+   }
 
if (unlikely(is_link_local_ether_addr(dest))) {
u16 fwd_mask = p->br->group_fwd_mask_required;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index f68e360..68a23c5 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -804,6 +804,10 @@ int __vlan_tunnel_info_del(struct net_bridge_vlan_group *vg,
 int nbp_vlan_tunnel_info_add(struct net_bridge_port *port, u16 vid, u32 tun_id);
 bool vlan_tunnel_id_isrange(struct net_bridge_vlan *v_end,
struct net_bridge_vlan *v);
+int br_handle_ingress_vlan_tunnel(struct sk_buff *skb, struct net_bridge_port *p,
+ struct net_bridge_vlan_group *vg);
+int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
+struct net_bridge_vlan *vlan);
 
 static inline struct net_bridge_vlan_group *br_vlan_group(
const struct net_bridge *br)
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 2040f08..6cf2344 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -405,6 +405,11 @@ struct sk_buff *br_handle_vlan(struct net_bridge *br,
 
if (v->flags & BRIDGE_VLAN_INFO_UNTAGGED)
skb->vlan_tci = 0;
+
+   if (br_handle_egress_vlan_tunnel(skb, v)) {
+   kfree_skb(skb);
+   return NULL;
+   }
 out:
return skb;
 }
@@ -1213,3 +1218,53 @@ int nbp_vlan_tunnel_info_delete(struct net_bridge_port *port, u16 vid)
 
return 0;
 }
+
+int br_handle_ingress_vlan_tunnel(struct sk_buff *skb,
+ struct net_bridge_port *p,
+ struct net_bridge_vlan_group *vg)
+{
+   struct ip_tunnel_info *tinfo = skb_tunnel_info(skb);
+   struct net_bridge_vlan *vlan;
+
+   if (!vg || !tinfo)
+   return 0;
+
+   /* if already tagged, ignore */
+   if (skb_vlan_tagged(skb))
+   return 0;
+
+   /* lookup vid, given tunnel id */
+   vlan = br_vlan_tunnel_lookup(&vg->tunnel_hash, tinfo->key.tun_id);
+   if (!vlan)
+   return 0;
+
+   skb_dst_drop(skb);
+
+   __vlan_hwaccel_put_tag(skb, p->br->vlan_proto, vlan->vid);
+
+   return 0;
+}
+
+int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
+struct net_bridge_vlan *vlan)
+{
+   __be32 tun_id;
+   int err;
+
+   if (!vlan || !vlan->tinfo.tunnel_id)
+   return 0;
+
+   if (unlikely(!skb_vlan_tag_present(skb)))
+   return 0;
+
+   skb_dst_drop(skb);
+   tun_id = tunnel_id_to_key32(vlan->tinfo.tunnel_id);
+
+   err = skb_vlan_pop(skb);
+   if (err)
+   return err;
+
+   skb_dst_set(skb, dst_clone(&vlan->tinfo.tunnel_dst->dst));
+
+   return 0;
+}
-- 
1.7.10.4

[RFC PATCH net-next 3/5] bridge: uapi: add per vlan tunnel info

2017-01-20 Thread Roopa Prabhu

From: Roopa Prabhu 

New netlink api to associate tunnel info per vlan.
This is used by bridge driver to send tunnel metadata to
bridge ports in LWT tunnel dst metadata mode.

One example use for this is a vxlan bridging gateway or vtep
which maps vlans to vn-segments (or vnis). User can configure
per-vlan tunnel information which the bridge driver can use
to bridge vlan into the corresponding vn-segment.

This patch also introduces a bridge port flag IFLA_BRPORT_LWT_VLAN
to enable this feature on a tunnel bridge port. It is off by default.

Signed-off-by: Roopa Prabhu 
---
 include/uapi/linux/if_bridge.h |   11 +++
 include/uapi/linux/if_link.h   |1 +
 2 files changed, 12 insertions(+)

diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index ab92bca..a9e6244 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -118,6 +118,7 @@ enum {
IFLA_BRIDGE_FLAGS,
IFLA_BRIDGE_MODE,
IFLA_BRIDGE_VLAN_INFO,
+   IFLA_BRIDGE_VLAN_TUNNEL_INFO,
__IFLA_BRIDGE_MAX,
 };
 #define IFLA_BRIDGE_MAX (__IFLA_BRIDGE_MAX - 1)
@@ -134,6 +135,16 @@ struct bridge_vlan_info {
__u16 vid;
 };
 
+enum {
+   IFLA_BRIDGE_VLAN_TUNNEL_UNSPEC,
+   IFLA_BRIDGE_VLAN_TUNNEL_ID,
+   IFLA_BRIDGE_VLAN_TUNNEL_VID,
+   IFLA_BRIDGE_VLAN_TUNNEL_FLAGS,
+   __IFLA_BRIDGE_VLAN_TUNNEL_MAX,
+};
+
+#define IFLA_BRIDGE_VLAN_TUNNEL_MAX (__IFLA_BRIDGE_VLAN_TUNNEL_MAX - 1)
+
 struct bridge_vlan_xstats {
__u64 rx_bytes;
__u64 rx_packets;
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 184b16e..f0356b9 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -321,6 +321,7 @@ enum {
IFLA_BRPORT_MULTICAST_ROUTER,
IFLA_BRPORT_PAD,
IFLA_BRPORT_MCAST_FLOOD,
+   IFLA_BRPORT_LWT_VLAN,
__IFLA_BRPORT_MAX
 };
 #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
-- 
1.7.10.4

[RFC PATCH net-next 2/5] vxlan: make COLLECT_METADATA mode bridge friendly

2017-01-20 Thread Roopa Prabhu

From: Roopa Prabhu 

This patch series makes vxlan COLLECT_METADATA mode bridge
and layer2 network friendly. Vxlan COLLECT_METADATA mode today
solves the per-vni netdev scalability problem in l3 networks.
When a vxlan collect metadata device participates in bridging
vlans to vn-segments, it can only get the vlan mapped vni in
the xmit tunnel dst metadata. It will need the vxlan driver to
continue to learn, hold forwarding state and remote destination
information similar to how it already does for non COLLECT_METADATA
vxlan netdevices today.

Changes introduced by this patch:
- allow learning and forwarding database state to vxlan netdev in
  COLLECT_METADATA mode. Current behaviour is not changed
  by default. tunnel info flag IP_TUNNEL_INFO_BRIDGE is used
  to support the new bridge friendly mode.
- A single fdb table hashed by (mac, vni) to allow fdb entries with
  multiple vnis in the same fdb table
- rx path already has the vni
- tx path expects a vni in the packet with dst_metadata
- prior to this series, fdb remote_dsts carried remote vni and
  the vxlan device carrying the fdb table represented the
  source vni. With the vxlan device now representing multiple vnis,
  this patch adds a src vni attribute to the fdb entry. The remote
  vni already uses NDA_VNI attribute. This patch introduces
  NDA_SRC_VNI netlink attribute to represent the src vni in a multi
  vni fdb table.
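
A single table keyed by (mac, vni), as described in the list above, can be sketched in user space as follows. This is an illustration under stated assumptions: all toy_* names are hypothetical, the hash is FNV-1a swapped in for the kernel's jhash, and the table is a fixed-size array of singly linked chains rather than RCU-protected hlists.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN      6
#define TOY_HASH_BITS 8
#define TOY_HASH_SIZE (1u << TOY_HASH_BITS)

/* Hypothetical, simplified fdb entry carrying its source vni. */
struct toy_fdb {
	struct toy_fdb *next;
	uint8_t         eth_addr[ETH_ALEN];
	uint32_t        vni;
};

static struct toy_fdb *toy_head[TOY_HASH_SIZE];

/* FNV-1a over the MAC and VNI bytes; a stand-in, not the kernel hash. */
static uint32_t toy_hash(const uint8_t *mac, uint32_t vni)
{
	uint32_t h = 2166136261u;

	for (int i = 0; i < ETH_ALEN; i++) {
		h ^= mac[i];
		h *= 16777619u;
	}
	for (int i = 0; i < 4; i++) {
		h ^= (vni >> (8 * i)) & 0xff;
		h *= 16777619u;
	}
	return h & (TOY_HASH_SIZE - 1);
}

static void toy_insert(struct toy_fdb *f)
{
	uint32_t b = toy_hash(f->eth_addr, f->vni);

	f->next = toy_head[b];
	toy_head[b] = f;
}

/* Both the MAC and the vni must match for a hit. */
static struct toy_fdb *toy_find(const uint8_t *mac, uint32_t vni)
{
	struct toy_fdb *f;

	for (f = toy_head[toy_hash(mac, vni)]; f; f = f->next)
		if (!memcmp(f->eth_addr, mac, ETH_ALEN) && f->vni == vni)
			return f;
	return NULL;
}
```

The same MAC can then appear once per vni in one table, and a lookup with a vni that has no entry for that MAC returns NULL.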

Signed-off-by: Roopa Prabhu 
---
 drivers/net/vxlan.c|  209 +---
 include/uapi/linux/neighbour.h |1 +
 2 files changed, 134 insertions(+), 76 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index ca7196c..fb114b3 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -57,6 +57,8 @@
 
 static const u8 all_zeros_mac[ETH_ALEN + 2];
 
+static u32 fdb_salt __read_mostly;
+
 static int vxlan_sock_add(struct vxlan_dev *vxlan);
 
 /* per-network namespace private data for this module */
@@ -75,6 +77,7 @@ struct vxlan_fdb {
struct list_head  remotes;
u8 eth_addr[ETH_ALEN];
u16 state; /* see ndm_state */
+   __be32 vni;
u8 flags; /* see ndm_flags */
 };
 
@@ -302,6 +305,10 @@ static int vxlan_fdb_info(struct sk_buff *skb, struct vxlan_dev *vxlan,
if (rdst->remote_vni != vxlan->default_dst.remote_vni &&
nla_put_u32(skb, NDA_VNI, be32_to_cpu(rdst->remote_vni)))
goto nla_put_failure;
+   if ((vxlan->flags & VXLAN_F_COLLECT_METADATA) && fdb->vni &&
+   nla_put_u32(skb, NDA_SRC_VNI,
+   be32_to_cpu(fdb->vni)))
+   goto nla_put_failure;
if (rdst->remote_ifindex &&
nla_put_u32(skb, NDA_IFINDEX, rdst->remote_ifindex))
goto nla_put_failure;
@@ -400,34 +407,50 @@ static u32 eth_hash(const unsigned char *addr)
return hash_64(value, FDB_HASH_BITS);
 }
 
+static u32 eth_vni_hash(const unsigned char *addr, __be32 vni)
+{
+   /* use 1 byte of OUI and 3 bytes of NIC */
+   u32 key = get_unaligned((u32 *)(addr + 2));
+   return jhash_2words(key, vni, fdb_salt) & (FDB_HASH_SIZE - 1);
+}
+
 /* Hash chain to use given mac address */
 static inline struct hlist_head *vxlan_fdb_head(struct vxlan_dev *vxlan,
-   const u8 *mac)
+   const u8 *mac, __be32 vni)
 {
-   return &vxlan->fdb_head[eth_hash(mac)];
+   if (vxlan->flags & VXLAN_F_COLLECT_METADATA)
+   return &vxlan->fdb_head[eth_vni_hash(mac, vni)];
+   else
+   return &vxlan->fdb_head[eth_hash(mac)];
 }
 
 /* Look up Ethernet address in forwarding table */
 static struct vxlan_fdb *__vxlan_find_mac(struct vxlan_dev *vxlan,
-   const u8 *mac)
+ const u8 *mac, __be32 vni)
 {
-   struct hlist_head *head = vxlan_fdb_head(vxlan, mac);
+   struct hlist_head *head = vxlan_fdb_head(vxlan, mac, vni);
struct vxlan_fdb *f;
 
hlist_for_each_entry_rcu(f, head, hlist) {
-   if (ether_addr_equal(mac, f->eth_addr))
-   return f;
+   if (ether_addr_equal(mac, f->eth_addr)) {
+   if (vxlan->flags & VXLAN_F_COLLECT_METADATA) {
+   if (vni == f->vni)
+   return f;
+   } else {
+   return f;
+   }
+   }
}
 
return NULL;
 }
 
 static struct vxlan_fdb *vxlan_find_mac(struct vxlan_dev *vxlan,
-   const u8 *mac)
+   const u8 *mac, __be32 vni)
 {
struct vxlan_fdb *f;
 
-   f = __vxlan_find_mac(vxlan, mac);
+

[RFC PATCH net-next 1/5] ip_tunnels: new IP_TUNNEL_INFO_BRIDGE flag for ip_tunnel_info mode

2017-01-20 Thread Roopa Prabhu

From: Roopa Prabhu 

New ip_tunnel_info flag to represent bridged tunnel metadata.
Used by bridge driver later in the series to pass per vlan dst
metadata to bridge ports.

Signed-off-by: Roopa Prabhu 
---
 include/net/ip_tunnels.h |1 +
 1 file changed, 1 insertion(+)

diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index 3d4ca4d..9505679 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -58,6 +58,7 @@ struct ip_tunnel_key {
 /* Flags for ip_tunnel_info mode. */
 #define IP_TUNNEL_INFO_TX      0x01    /* represents tx tunnel parameters */
 #define IP_TUNNEL_INFO_IPV6    0x02    /* key contains IPv6 addresses */
+#define IP_TUNNEL_INFO_BRIDGE  0x04    /* represents a bridged tunnel id */
 
 /* Maximum tunnel options length. */
 #define IP_TUNNEL_OPTS_MAX \
-- 
1.7.10.4

[RFC PATCH net-next 4/5] bridge: vlan lwt and dst_metadata netlink support

2017-01-20 Thread Roopa Prabhu

From: Roopa Prabhu 

This patch adds support to attach per vlan tunnel info dst
metadata. This enables the bridge driver to map vlan to tunnel_info
at ingress and egress.

The initial use case is vlan to vni bridging, but the api is generic
to extend to any tunnel_info in the future:
- Uapi to configure/unconfigure/dump per vlan tunnel data
- netlink functions to configure vlan and tunnel_info mapping
- Introduces bridge port flag BR_LWT_VLAN to enable attach/detach
dst_metadata to bridged packets on ports.

Use case:
An example use for this is a vxlan bridging gateway or vtep
which maps vlans to vn-segments (or vnis). User can configure
per-vlan tunnel information which the bridge driver can use
to bridge vlan into the corresponding tunnel.

CC: Nikolay Aleksandrov 
Signed-off-by: Roopa Prabhu 
---
CC'ing Nikolay for some more eyes as he has been trying to keep the
bridge driver fast path lite.

 include/linux/if_bridge.h |1 +
 net/bridge/br_input.c |1 +
 net/bridge/br_netlink.c   |  410 ++---
 net/bridge/br_private.h   |   18 ++
 net/bridge/br_vlan.c  |  138 ++-
 5 files changed, 507 insertions(+), 61 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index c6587c0..36ff611 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -46,6 +46,7 @@ struct br_ip_list {
 #define BR_LEARNING_SYNC       BIT(9)
 #define BR_PROXYARP_WIFI       BIT(10)
 #define BR_MCAST_FLOOD         BIT(11)
+#define BR_LWT_VLAN            BIT(12)
 
 #define BR_DEFAULT_AGEING_TIME (300 * HZ)
 
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 855b72f..83f356f 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "br_private.h"
 
 /* Hook for brouter */
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 71c7453..df997ad 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -17,17 +17,30 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "br_private.h"
 #include "br_private_stp.h"
 
-static int __get_num_vlan_infos(struct net_bridge_vlan_group *vg,
-   u32 filter_mask)
+static size_t br_get_vlan_tinfo_size(void)
 {
+   return nla_total_size(0) + /* nest IFLA_BRIDGE_VLAN_TUNNEL_INFO */
+ nla_total_size(sizeof(u32)) + /* IFLA_BRIDGE_VLAN_TUNNEL_ID */
+ nla_total_size(sizeof(u16)) + /* IFLA_BRIDGE_VLAN_TUNNEL_VID */
+ nla_total_size(sizeof(u16)); /* IFLA_BRIDGE_VLAN_TUNNEL_FLAGS */
+}
+
+static int __get_num_vlan_infos(struct net_bridge_port *p,
+   struct net_bridge_vlan_group *vg,
+   u32 filter_mask, int *num_vtinfos)
+{
+   struct net_bridge_vlan *vbegin = NULL, *vend = NULL;
+   struct net_bridge_vlan *vtbegin = NULL, *vtend = NULL;
struct net_bridge_vlan *v;
-   u16 vid_range_start = 0, vid_range_end = 0, vid_range_flags = 0;
+   bool get_tinfos = (p && p->flags & BR_LWT_VLAN) ? true: false;
+   bool vcontinue, vtcontinue;
+   int num_vinfos = 0;
u16 flags, pvid;
-   int num_vlans = 0;
 
if (!(filter_mask & RTEXT_FILTER_BRVLAN_COMPRESSED))
return 0;
@@ -36,6 +49,8 @@ static int __get_num_vlan_infos(struct net_bridge_vlan_group *vg,
/* Count number of vlan infos */
list_for_each_entry_rcu(v, &vg->vlan_list, vlist) {
flags = 0;
+   vcontinue = false;
+   vtcontinue = false;
/* only a context, bridge vlan not activated */
if (!br_vlan_should_use(v))
continue;
@@ -45,47 +60,79 @@ static int __get_num_vlan_infos(struct net_bridge_vlan_group *vg,
if (v->flags & BRIDGE_VLAN_INFO_UNTAGGED)
flags |= BRIDGE_VLAN_INFO_UNTAGGED;
 
-   if (vid_range_start == 0) {
-   goto initvars;
-   } else if ((v->vid - vid_range_end) == 1 &&
-   flags == vid_range_flags) {
-   vid_range_end = v->vid;
+   if (!vbegin) {
+   vbegin = v;
+   vend = v;
+   vcontinue = true;
+   } else if ((v->vid - vend->vid) == 1 &&
+   flags == vbegin->flags) {
+   vend = v;
+   vcontinue = true;
+   }
+
+   if (!vcontinue) {
+   if ((vend->vid - vbegin->vid) > 0)
+   num_vinfos += 2;
+   else
+   num_vinfos += 1;
+   }
+
+   if (!get_tinfos && !v->tinfo.tunnel_id)
continue;
-   } else {

[RFC PATCH net-next 0/5] bridge: per vlan lwt and dst_metadata support

2017-01-20 Thread Roopa Prabhu

From: Roopa Prabhu 

High level summary:
lwt and dst_metadata/collect_metadata have enabled vxlan l3 deployments
to use a single vxlan netdev for multiple vnis eliminating the scalability
problem with using a single vxlan netdev per vni. This series tries to
do the same for vxlan netdevs in pure l2 bridged networks.
Use-case/deployment and details are below.

Deployment scenario details:
As we know VXLAN is used to build layer 2 virtual networks across the
underlay layer3 infrastructure. A VXLAN tunnel endpoint (VTEP)
originates and terminates VXLAN tunnels. And a VTEP can be a TOR switch
or a vswitch in the hypervisor. This patch series mainly
focuses on the TOR switch configured as a Vtep. Vxlan segment ID (vni)
along with vlan id is used to identify layer 2 segments in a vxlan
overlay network. Vxlan bridging is the function provided by Vteps to terminate
vxlan tunnels and map the vxlan vni to traditional end host vlan. This is
covered in the "VXLAN Deployment Scenarios" in sections 6 and 6.1 in RFC 7348.
To provide vxlan bridging function, a vtep has to map vlan to a vni. The rfc
says that the ingress VTEP device shall remove the IEEE 802.1Q VLAN tag in
the original Layer 2 packet if there is one before encapsulating the packet
into the VXLAN format to transmit it through the underlay network. The remote
VTEP devices have information about the VLAN in which the packet will be
placed based on their own VLAN-to-VXLAN VNI mapping configurations.

Existing solution:
Without this patch series one can deploy such a vtep configuration by
adding the local ports and vxlan netdevs into a vlan filtering bridge.
The local ports are configured as trunk ports carrying all vlans.
A vxlan netdev per vni is added to the bridge. Vlan mapping to vni is
achieved by configuring the vlan as pvid on the corresponding vxlan netdev.
The vxlan netdev only receives traffic corresponding to the vlan it is mapped
to. This configuration maps traffic belonging to a vlan to the corresponding
vxlan segment.

      --------------------------------------------
     |                  bridge                    |
     |                                            |
      --------------------------------------------
        |100,200        |100 (pvid)     |200 (pvid)
        |               |               |
       swp1          vxlan1000       vxlan2000

This provides the required vxlan bridging function but poses a
scalability problem with using a single vxlan netdev for each vni.

Solution in this patch series:
The goal is to use a single vxlan device to carry all vnis, similar
to the vxlan collect metadata mode, but with the vxlan driver still
carrying all the forwarding information.
- vxlan driver changes:
    - enable collect metadata mode device to be used with learning,
      replication, fdb
    - A single fdb table hashed by (mac, vni)
    - rx path already has the vni
    - tx path expects a vni in the packet with dst_metadata and vxlan
      driver has all the forwarding information for the vni in the
      dst_metadata.

- Bridge driver changes: per vlan LWT and dst_metadata support:
    - Our use case is vxlan and 1-1 mapping between vlan and vni, but I have
      kept the api generic for any tunnel info
    - Uapi to configure/unconfigure/dump per vlan tunnel data
    - new bridge port flag to turn this feature on/off. off by default
    - ingress hook:
        - if port is a lwt tunnel port, use tunnel info in
          attached dst_metadata to map it to a local vlan
    - egress hook:
        - if port is a lwt tunnel port, use tunnel info attached to vlan
          to set dst_metadata on the skb
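
The bridge-side mapping described above can be pictured with a small
userspace sketch. It is not code from this series: struct vlan_tunnel_map
and the vlan_tunnel_* helpers are hypothetical names, and the flat table
only models the 1-1 vlan to vni mapping that the ingress and egress hooks
would consult.

```c
#include <stdint.h>

#define VLAN_VID_MAX 4094      /* valid 802.1Q VLAN IDs are 1..4094 */
#define VNI_MAX      0xFFFFFF  /* a VXLAN VNI is 24 bits wide */

/* Hypothetical per-port table: index is the vlan id, value is the vni
 * (0 means "no tunnel info configured for this vlan").
 */
struct vlan_tunnel_map {
	uint32_t vni_by_vid[VLAN_VID_MAX + 1];
};

/* Egress direction: vlan -> tunnel id to place in dst_metadata.
 * Returns 0 when the vlan has no mapping.
 */
uint32_t vlan_tunnel_egress_vni(const struct vlan_tunnel_map *m, uint16_t vid)
{
	if (vid == 0 || vid > VLAN_VID_MAX)
		return 0;
	return m->vni_by_vid[vid];
}

/* Ingress direction: tunnel id taken from dst_metadata -> local vlan.
 * Returns 0 when no vlan is mapped to this vni.
 */
uint16_t vlan_tunnel_ingress_vid(const struct vlan_tunnel_map *m, uint32_t vni)
{
	uint16_t vid;

	if (vni == 0)
		return 0;
	for (vid = 1; vid <= VLAN_VID_MAX; vid++)
		if (m->vni_by_vid[vid] == vni)
			return vid;
	return 0;
}

/* Configure one vlan <-> vni pair. Returns 0 on success, -1 on invalid
 * input or when the vni is already mapped to a different vlan (which
 * would break the 1-1 assumption).
 */
int vlan_tunnel_add(struct vlan_tunnel_map *m, uint16_t vid, uint32_t vni)
{
	uint16_t other;

	if (vid == 0 || vid > VLAN_VID_MAX || vni == 0 || vni > VNI_MAX)
		return -1;
	other = vlan_tunnel_ingress_vid(m, vni);
	if (other != 0 && other != vid)
		return -1;
	m->vni_by_vid[vid] = vni;
	return 0;
}
```

With the vlan 100 / vni 1000 and vlan 200 / vni 2000 pairs from the
diagram above, a single table replaces the two per-vni netdevs. A real
implementation would likely use a hash keyed on the tunnel id rather
than the linear scan used here for brevity.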

Other approaches tried and vetoed:
- tc vlan push/pop and tunnel metadata dst:
    - poses a tc rule scalability problem (2 rules per vni)
    - cannot handle the case where a packet needs to be replicated to
      multiple vxlan remote tunnel end-points, which the vxlan driver
      can do today by having multiple remote destinations per fdb.
- making vxlan driver understand vlan-vni mapping:
    - I had a series almost ready with this one but soon realized
      it duplicated a lot of vlan handling code in the vxlan driver

This series is briefly tested for functionality. Sending it out as RFC while
I continue to test it more. There are some rough edges which I am in the process
of fixing.

Signed-off-by: Roopa Prabhu 

Roopa Prabhu (5):
  ip_tunnels: new IP_TUNNEL_INFO_BRIDGE flag for ip_tunnel_info mode
  vxlan: make COLLECT_METADATA mode bridge friendly
  bridge: uapi: add per vlan tunnel info
  bridge: vlan lwt and dst_metadata netlink support
  bridge: vlan lwt dst_metadata hooks in ingress and egress paths

 drivers/net/vxlan.c|  209 
 include/linux/if_bridge.h  |1 +
 include/net/ip_tunnels.h   |1 +
 include/uapi/linux/if_bridge.h |   11 ++
 include/uapi/linux/if_link.h   |1 +
 include/uapi/linux/neighbour.h |1 +

Re: [PATCH] net: qcom/emac: claim the irq only when the device is opened

2017-01-20 Thread Lino Sanfilippo



On 20.01.2017 22:36, Timur Tabi wrote:
> On 01/20/2017 03:31 PM, Lino Sanfilippo wrote:
>
>> In emac_mac_down() however we need synchronize_irq(), since it ensures
>> that the irq handler is not running any more when it (synchronize_irq)
>> returns.
>
> So in general, if a driver disables an interrupt but does not free it,
> it should call synchronize_irq()?

Yes, that's right.

Regards,
Lino

[PATCH v3 net-next] Introduce a sysctl that modifies the value of PROT_SOCK.

2017-01-20 Thread Krister Johansen

Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
that denotes the first unprivileged inet port in the namespace.  To
disable all privileged ports set this to zero.  It also checks for
overlap with the local port range.  The privileged and local range may
not overlap.

The use case for this change is to allow containerized processes to bind
to privileged ports, but prevent them from ever being allowed to modify
their container's network configuration.  The latter is accomplished by
ensuring that the network namespace is not a child of the user
namespace.  This modification was needed to allow the container manager
to disable a namespace's privileged port restrictions without exposing
control of the network namespace to processes in the user namespace.
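
As a rough illustration of the semantics described above, here is a
userspace sketch. It is not the kernel code: struct netns_model and the
helper names are hypothetical. The bind check mirrors the condition in
the inet_bind() hunk below. The overlap rule follows the commit message
and, as an assumption, is modelled as "the low end of the local port
range must not fall below the first unprivileged port".

```c
#include <stdbool.h>

#define DEFAULT_PROT_SOCK 1024  /* default first unprivileged port */

/* Hypothetical model of the per-namespace state involved. */
struct netns_model {
	int unprivileged_port_start;  /* models ip_unprivileged_port_start */
	int local_port_lo;            /* low end of the local port range */
	int local_port_hi;            /* high end of the local port range */
};

void netns_model_init(struct netns_model *ns, int lo, int hi)
{
	ns->unprivileged_port_start = DEFAULT_PROT_SOCK;
	ns->local_port_lo = lo;
	ns->local_port_hi = hi;
}

/* Port 0 asks the kernel to pick a port and never needs privilege.
 * Any other port below the threshold needs CAP_NET_BIND_SERVICE.
 */
bool bind_allowed(const struct netns_model *ns, unsigned short port,
		  bool has_cap_net_bind_service)
{
	if (port && port < ns->unprivileged_port_start &&
	    !has_cap_net_bind_service)
		return false;
	return true;
}

/* Returns 0 on success, -1 if the value is out of range or would make
 * the privileged range overlap the local port range.
 */
int set_unprivileged_port_start(struct netns_model *ns, int start)
{
	if (start < 0 || start > 65535)
		return -1;
	if (ns->local_port_lo < start)
		return -1;
	ns->unprivileged_port_start = start;
	return 0;
}
```

Setting the threshold to 0 makes every port bindable without
CAP_NET_BIND_SERVICE, which is the container-manager use case described
above.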

Signed-off-by: Krister Johansen 
---
 Documentation/networking/ip-sysctl.txt |  9 ++
 include/net/ip.h   | 10 +++
 include/net/netns/ipv4.h   |  1 +
 net/ipv4/af_inet.c |  5 +++-
 net/ipv4/sysctl_net_ipv4.c | 50 +-
 net/ipv6/af_inet6.c|  3 +-
 net/netfilter/ipvs/ip_vs_ctl.c |  7 ++---
 net/sctp/socket.c  | 10 ---
 security/selinux/hooks.c   |  3 +-
 9 files changed, 86 insertions(+), 12 deletions(-)

Changes v1 -> v2:

Remove LOWPORT_SYSCTL config option.  This is now always enabled as long
as CONFIG_SYSCTL is.

Changes v2 -> v3:

Add documentation to ip-sysctl.txt.
Rename "protected" variables and functions to "privileged."

diff --git a/Documentation/networking/ip-sysctl.txt 
b/Documentation/networking/ip-sysctl.txt
index aa1bb49..17f2e77 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -822,6 +822,15 @@ ip_local_reserved_ports - list of comma separated ranges
 
Default: Empty
 
+ip_unprivileged_port_start - INTEGER
+   This is a per-namespace sysctl.  It defines the first
+   unprivileged port in the network namespace.  Privileged ports
+   require root or CAP_NET_BIND_SERVICE in order to bind to them.
+   To disable all privileged ports, set this to 0.  It may not
+   overlap with the ip_local_reserved_ports range.
+
+   Default: 1024
+
 ip_nonlocal_bind - BOOLEAN
If set, allows processes to bind() to non-local IP addresses,
which can be quite useful - but may break some applications.
diff --git a/include/net/ip.h b/include/net/ip.h
index ab6761a..bf264a8 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -263,11 +263,21 @@ static inline bool sysctl_dev_name_is_allowed(const char 
*name)
return strcmp(name, "default") != 0  && strcmp(name, "all") != 0;
 }
 
+static inline int inet_prot_sock(struct net *net)
+{
+   return net->ipv4.sysctl_ip_prot_sock;
+}
+
 #else
 static inline int inet_is_local_reserved_port(struct net *net, int port)
 {
return 0;
 }
+
+static inline int inet_prot_sock(struct net *net)
+{
+   return PROT_SOCK;
+}
 #endif
 
 __be32 inet_current_timestamp(void);
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 8e3f5b6..e365732 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -135,6 +135,7 @@ struct netns_ipv4 {
 
 #ifdef CONFIG_SYSCTL
unsigned long *sysctl_local_reserved_ports;
+   int sysctl_ip_prot_sock;
 #endif
 
 #ifdef CONFIG_IP_MROUTE
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index aae410b..28fe8da 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -479,7 +479,7 @@ int inet_bind(struct socket *sock, struct sockaddr *uaddr, 
int addr_len)
 
snum = ntohs(addr->sin_port);
err = -EACCES;
-   if (snum && snum < PROT_SOCK &&
+   if (snum && snum < inet_prot_sock(net) &&
!ns_capable(net->user_ns, CAP_NET_BIND_SERVICE))
goto out;
 
@@ -1700,6 +1700,9 @@ static __net_init int inet_init_net(struct net *net)
net->ipv4.sysctl_ip_default_ttl = IPDEFTTL;
net->ipv4.sysctl_ip_dynaddr = 0;
net->ipv4.sysctl_ip_early_demux = 1;
+#ifdef CONFIG_SYSCTL
+   net->ipv4.sysctl_ip_prot_sock = PROT_SOCK;
+#endif
 
return 0;
 }
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index c8d2836..1b86199 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -35,6 +35,8 @@ static int ip_local_port_range_min[] = { 1, 1 };
 static int ip_local_port_range_max[] = { 65535, 65535 };
 static int tcp_adv_win_scale_min = -31;
 static int tcp_adv_win_scale_max = 31;
+static int ip_privileged_port_min;
+static int ip_privileged_port_max = 65535;
 static int ip_ttl_min = 1;
 static int ip_ttl_max = 255;
 static int tcp_syn_retries_min = 1;
@@ -79,7 +81,12 @@ static int ipv4_local_port_range(struct ctl_table *table, 
int write,
ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
 
if (write && ret == 0) {
-

Re: [PATCH net-next 8/8] net: dsa: mv88e6xxx: Fix typ0 when configuring 2.5Gbps

2017-01-20 Thread Vivien Didelot

Hi Andrew,

Andrew Lunn  writes:

> In order to enable 2.5Gbps mode, we need the base speed of 10G, plus
> the Alt bit setting. Fix a typ0 that used 1Gb base speed.
>
> Signed-off-by: Andrew Lunn 

Reviewed-by: Vivien Didelot 

Thanks,

Vivien

Re: [PATCH net-next 5/8] net: dsa: mv88e6xxx: Workaround missing PHY ID on mv88e6390

2017-01-20 Thread Vivien Didelot

Hi Andrew,

Andrew Lunn  writes:

> The internal PHYs of the mv88e6390 do not have a model ID. Trap any
> calls to the ID register, and if it is zero, return the ID for the
> mv88e6390. The Marvell PHY driver can then bind to this ID.

This, in addition to the temperature code not working (despite what the
datasheet says) makes me wonder if this is intentional from Marvell. Do
we have a revision number for the 88E6390X's on the ZII Dev Rev C board?

It would be interesting to ask Gregory maybe about that. This looks not
"production-ready".

Other than that, I have no objection on the patch itself if that is
indeed expected from them...

Thanks,

Vivien

[PATCH net v2 1/2] net: Specify the owning module for lwtunnel ops

2017-01-20 Thread Robert Shearman

Modules implementing lwtunnel ops should not be allowed to unload
while there is state alive using those ops, so specify the owning
module for all lwtunnel ops.

Signed-off-by: Robert Shearman 
---
 include/net/lwtunnel.h| 2 ++
 net/core/lwt_bpf.c| 1 +
 net/ipv4/ip_tunnel_core.c | 2 ++
 net/ipv6/ila/ila_lwt.c| 1 +
 net/ipv6/seg6_iptunnel.c  | 1 +
 net/mpls/mpls_iptunnel.c  | 1 +
 6 files changed, 8 insertions(+)

diff --git a/include/net/lwtunnel.h b/include/net/lwtunnel.h
index 0b585f1fd340..73dd87647460 100644
--- a/include/net/lwtunnel.h
+++ b/include/net/lwtunnel.h
@@ -44,6 +44,8 @@ struct lwtunnel_encap_ops {
int (*get_encap_size)(struct lwtunnel_state *lwtstate);
int (*cmp_encap)(struct lwtunnel_state *a, struct lwtunnel_state *b);
int (*xmit)(struct sk_buff *skb);
+
+   struct module *owner;
 };
 
 #ifdef CONFIG_LWTUNNEL
diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index 71bb3e2eca08..b3eef90b2df9 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -386,6 +386,7 @@ static const struct lwtunnel_encap_ops bpf_encap_ops = {
.fill_encap = bpf_fill_encap_info,
.get_encap_size = bpf_encap_nlsize,
.cmp_encap  = bpf_encap_cmp,
+   .owner  = THIS_MODULE,
 };
 
 static int __init bpf_lwt_init(void)
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index fed3d29f9eb3..0fd1976ab63b 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -313,6 +313,7 @@ static const struct lwtunnel_encap_ops ip_tun_lwt_ops = {
.fill_encap = ip_tun_fill_encap_info,
.get_encap_size = ip_tun_encap_nlsize,
.cmp_encap = ip_tun_cmp_encap,
+   .owner = THIS_MODULE,
 };
 
 static const struct nla_policy ip6_tun_policy[LWTUNNEL_IP6_MAX + 1] = {
@@ -403,6 +404,7 @@ static const struct lwtunnel_encap_ops ip6_tun_lwt_ops = {
.fill_encap = ip6_tun_fill_encap_info,
.get_encap_size = ip6_tun_encap_nlsize,
.cmp_encap = ip_tun_cmp_encap,
+   .owner = THIS_MODULE,
 };
 
 void __init ip_tunnel_core_init(void)
diff --git a/net/ipv6/ila/ila_lwt.c b/net/ipv6/ila/ila_lwt.c
index a7bc54ab46e2..13b5e85fe0d5 100644
--- a/net/ipv6/ila/ila_lwt.c
+++ b/net/ipv6/ila/ila_lwt.c
@@ -238,6 +238,7 @@ static const struct lwtunnel_encap_ops ila_encap_ops = {
.fill_encap = ila_fill_encap_info,
.get_encap_size = ila_encap_nlsize,
.cmp_encap = ila_encap_cmp,
+   .owner = THIS_MODULE,
 };
 
 int ila_lwt_init(void)
diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
index 1d60cb132835..c46f8cbf5ab5 100644
--- a/net/ipv6/seg6_iptunnel.c
+++ b/net/ipv6/seg6_iptunnel.c
@@ -422,6 +422,7 @@ static const struct lwtunnel_encap_ops seg6_iptun_ops = {
.fill_encap = seg6_fill_encap_info,
.get_encap_size = seg6_encap_nlsize,
.cmp_encap = seg6_encap_cmp,
+   .owner = THIS_MODULE,
 };
 
 int __init seg6_iptunnel_init(void)
diff --git a/net/mpls/mpls_iptunnel.c b/net/mpls/mpls_iptunnel.c
index 2f7ccd934416..1d281c1ff7c1 100644
--- a/net/mpls/mpls_iptunnel.c
+++ b/net/mpls/mpls_iptunnel.c
@@ -215,6 +215,7 @@ static const struct lwtunnel_encap_ops mpls_iptun_ops = {
.fill_encap = mpls_fill_encap_info,
.get_encap_size = mpls_encap_nlsize,
.cmp_encap = mpls_encap_cmp,
+   .owner = THIS_MODULE,
 };
 
 static int __init mpls_iptunnel_init(void)
-- 
2.1.4

[PATCH net v2 0/2] net: Fix oops on state free after lwt module unload

2017-01-20 Thread Robert Shearman

An oops is seen in lwtstate_free after an lwt ops module has been
unloaded. This patchset fixes this by preventing modules implementing
lwtunnel ops from being unloaded whilst there's state alive using
those ops. The first patch adds and fills in a new owner field in all
lwt ops and the second patch makes use of this to reference count the
modules as state is built and destroyed using them.
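
The reference-counting idea can be sketched in userspace as follows.
This is an illustrative model only: struct module_model and the *_model
helpers are hypothetical stand-ins for struct module, try_module_get()
and module_put(), and the sketch takes the reference only on the path
where build_state is actually called.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module. */
struct module_model {
	int refcnt;
	bool unloading;
};

/* Like try_module_get(): a NULL owner (built-in code) always succeeds. */
bool module_get_model(struct module_model *m)
{
	if (!m)
		return true;
	if (m->unloading)
		return false;
	m->refcnt++;
	return true;
}

/* Like module_put(): a NULL owner is a no-op. */
void module_put_model(struct module_model *m)
{
	if (m)
		m->refcnt--;
}

bool module_can_unload(const struct module_model *m)
{
	return m->refcnt == 0;
}

/* Hypothetical stand-in for struct lwtunnel_encap_ops. */
struct encap_ops_model {
	int (*build_state)(void);
	struct module_model *owner;
};

/* Pin the owning module for as long as the built state is alive. */
int build_state_model(const struct encap_ops_model *ops)
{
	int ret = -1;  /* stands in for -EOPNOTSUPP */

	if (ops && ops->build_state && module_get_model(ops->owner)) {
		ret = ops->build_state();
		if (ret)
			module_put_model(ops->owner);  /* no state was created */
	}
	return ret;
}

/* Freeing the state releases the reference taken at build time. */
void state_free_model(const struct encap_ops_model *ops)
{
	module_put_model(ops->owner);
}

/* Two trivial build_state callbacks for exercising the model. */
int build_ok(void)   { return 0; }
int build_fail(void) { return -22; }
```

While any state built through the ops is alive, module_can_unload()
stays false, so the ops table cannot disappear underneath
lwtstate_free().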

Changes in v2:
 - specify module owner for all modules as suggested by DaveM
 - reference count all modules building lwt state, not just those ops
   implementing destroy_state, as also suggested by DaveM.
 - rebased on top of David Ahern's lwtunnel changes

Robert Shearman (2):
  net: Specify the owning module for lwtunnel ops
  lwtunnel: Fix oops on state free after encap module unload

 include/net/lwtunnel.h| 2 ++
 net/core/lwt_bpf.c| 1 +
 net/core/lwtunnel.c   | 9 +++--
 net/ipv4/ip_tunnel_core.c | 2 ++
 net/ipv6/ila/ila_lwt.c| 1 +
 net/ipv6/seg6_iptunnel.c  | 1 +
 net/mpls/mpls_iptunnel.c  | 1 +
 7 files changed, 15 insertions(+), 2 deletions(-)

-- 
2.1.4

[PATCH net v2 2/2] lwtunnel: Fix oops on state free after encap module unload

2017-01-20 Thread Robert Shearman

When attempting to free lwtunnel state after the module for the encap
has been unloaded an oops occurs:

BUG: unable to handle kernel NULL pointer dereference at 0008
IP: lwtstate_free+0x18/0x40
[..]
task: 88003e372380 task.stack: c91fc000
RIP: 0010:lwtstate_free+0x18/0x40
RSP: 0018:88003fd83e88 EFLAGS: 00010246
RAX:  RBX: 88002bbb3380 RCX: 88000c91a300
[..]
Call Trace:
 
 free_fib_info_rcu+0x195/0x1a0
 ? rt_fibinfo_free+0x50/0x50
 rcu_process_callbacks+0x2d3/0x850
 ? rcu_process_callbacks+0x296/0x850
 __do_softirq+0xe4/0x4cb
 irq_exit+0xb0/0xc0
 smp_apic_timer_interrupt+0x3d/0x50
 apic_timer_interrupt+0x93/0xa0
[..]
Code: e8 6e c6 fc ff 89 d8 5b 5d c3 bb de ff ff ff eb f4 66 90 66 66 66 66 90 
55 48 89 e5 53 0f b7 07 48 89 fb 48 8b 04 c5 00 81 d5 81 <48> 8b 40 08 48 85 c0 
74 13 ff d0 48 8d 7b 20 be 20 00 00 00 e8

The problem is that after the module for the encap is unloaded, the
corresponding ops is removed and is thus NULL here.

Modules implementing lwtunnel ops should not be allowed to unload
while there is state alive using those ops, so grab the module
reference for the ops on creating lwtunnel state and of course release
the reference when freeing the state.

Fixes: 1104d9ba443a ("lwtunnel: Add destroy state operation")
Signed-off-by: Robert Shearman 
---
 net/core/lwtunnel.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
index 47b1dd65947b..ebfaffaa777b 100644
--- a/net/core/lwtunnel.c
+++ b/net/core/lwtunnel.c
@@ -115,8 +115,12 @@ int lwtunnel_build_state(struct net_device *dev, u16 
encap_type,
ret = -EOPNOTSUPP;
rcu_read_lock();
ops = rcu_dereference(lwtun_encaps[encap_type]);
-   if (likely(ops && ops->build_state))
-   ret = ops->build_state(dev, encap, family, cfg, lws);
+   if (likely(ops)) {
+   if (likely(try_module_get(ops->owner) && ops->build_state))
+   ret = ops->build_state(dev, encap, family, cfg, lws);
+   if (ret)
+   module_put(ops->owner);
+   }
rcu_read_unlock();
 
return ret;
@@ -194,6 +198,7 @@ void lwtstate_free(struct lwtunnel_state *lws)
} else {
kfree(lws);
}
+   module_put(ops->owner);
 }
 EXPORT_SYMBOL(lwtstate_free);
 
-- 
2.1.4

Re: [PATCH net-next 4/8] net: dsa: mv88e6xxx: Set the CMODE for mv88e6390 ports 9 & 10

2017-01-20 Thread Vivien Didelot

Hi Andrew,

Andrew Lunn  writes:

> Unlike most ports, ports 9 and 10 of the 6390X family have configurable
> PHY modes. Set the mode as part of adjust_link().
>
> Ordering is important, because the SERDES interfaces connected to
> ports 9 and 10 can be split and assigned to other ports. The CMODE has
> to be correctly set before the SERDES interface on another port can be
> configured. Such configuration is likely to be performed in
> port_enable() and port_disabled(), called on slave_open() and
> slave_close().
>
> The simple case is port 9 and 10 are used for 'CPU' or 'DSA'. In this
> case, the CMODE is set via a phy-mode in dsa_cpu_dsa_setup(), which is
> called early in the switch setup.
>
> When ports 9 or 10 are used as user ports, and have a fixed-phy, when
> the fixed-phy is attached, dsa_slave_adjust_link() is called,
> which results in the adjust_link function being called, setting the
> cmode. The port_enable() for other ports will be called much
> later.
>
> When ports 9 or 10 are used as user ports and have a real phy attached
> which does not use all the available SERDES interface, e.g. a 1Gbps
> SGMII, there is currently no mechanism in place to set the CMODE of
> the port from software. It must be hoped the strapping resistors are
> correct.
>
> At the same time, add a function to get the cmode. This will be needed
> when configuring the SERDES interfaces.
>
> Signed-off-by: Andrew Lunn 

Reviewed-by: Vivien Didelot 

Thanks for the very descriptive message, the patch looks perfect.

Vivien

Re: [PATCH net-next 2/8] net: phy: Add 2000base-x, 2500base-x and rxaui modes

2017-01-20 Thread Florian Fainelli

On 01/20/2017 03:30 PM, Andrew Lunn wrote:
> The mv88e6390 ports 9 and 10 support some additional PHY modes. Add
> these modes to the PHY core so they can be used in the binding.
> 
> Signed-off-by: Andrew Lunn 

Reviewed-by: Florian Fainelli 

Can you also send a Device Tree specification patch with these updates?
Thanks!
-- 
Florian

Re: [PATCH net-next 3/8] net: dsa: mv88e6xxx: Fix ATU age timer for MV88E6390

2017-01-20 Thread Vivien Didelot

Hi Andrew,

Andrew Lunn  writes:

> The MV88E6390 family uses a different ATU age timer coefficient.
> Fix the the info structures.

Redundant "the" here. Otherwise good catch, the minimum age time is 3.75
seconds.

> Signed-off-by: Andrew Lunn 

Reviewed-by: Vivien Didelot 

Thanks,

Vivien

[PATCH 1/4] xfrm: Constify xfrm_user arguments and xfrm_mgr callback APIs

2017-01-20 Thread Kevin Cernekee

This provides a better sense of the data flow and inputs/outputs.  No
change to code size or functionality.

Signed-off-by: Kevin Cernekee 
---
 include/net/xfrm.h |  36 --
 net/key/af_key.c   |  34 +++--
 net/xfrm/xfrm_policy.c |   8 +-
 net/xfrm/xfrm_state.c  |   2 +-
 net/xfrm/xfrm_user.c   | 342 +
 5 files changed, 253 insertions(+), 169 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 31947b9c21d6..34298d78ba45 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -228,7 +228,7 @@ struct xfrm_state {
void*data;
 };
 
-static inline struct net *xs_net(struct xfrm_state *x)
+static inline struct net *xs_net(const struct xfrm_state *x)
 {
return read_pnet(&x->xs_net);
 }
@@ -587,12 +587,23 @@ struct xfrm_migrate {
 struct xfrm_mgr {
struct list_headlist;
char*id;
-   int (*notify)(struct xfrm_state *x, const struct 
km_event *c);
-   int (*acquire)(struct xfrm_state *x, struct 
xfrm_tmpl *, struct xfrm_policy *xp);
-   struct xfrm_policy  *(*compile_policy)(struct sock *sk, int opt, u8 
*data, int len, int *dir);
-   int (*new_mapping)(struct xfrm_state *x, 
xfrm_address_t *ipaddr, __be16 sport);
-   int (*notify_policy)(struct xfrm_policy *x, int 
dir, const struct km_event *c);
-   int (*report)(struct net *net, u8 proto, struct 
xfrm_selector *sel, xfrm_address_t *addr);
+   int (*notify)(const struct xfrm_state *x,
+ const struct km_event *c);
+   int (*acquire)(struct xfrm_state *x,
+  const struct xfrm_tmpl *,
+  const struct xfrm_policy *xp);
+   struct xfrm_policy  *(*compile_policy)(struct sock *sk,
+  int opt, u8 *data,
+  int len, int *dir);
+   int (*new_mapping)(struct xfrm_state *x,
+  const xfrm_address_t *ipaddr,
+  __be16 sport);
+   int (*notify_policy)(const struct xfrm_policy *x,
+int dir,
+const struct km_event *c);
+   int (*report)(struct net *net, u8 proto,
+ const struct xfrm_selector *sel,
+ const xfrm_address_t *addr);
int (*migrate)(const struct xfrm_selector *sel,
   u8 dir, u8 type,
   const struct xfrm_migrate *m,
@@ -1432,7 +1443,7 @@ static inline void xfrm_sysctl_fini(struct net *net)
 void xfrm_state_walk_init(struct xfrm_state_walk *walk, u8 proto,
  struct xfrm_address_filter *filter);
 int xfrm_state_walk(struct net *net, struct xfrm_state_walk *walk,
-   int (*func)(struct xfrm_state *, int, void*), void *);
+   int (*func)(const struct xfrm_state *, int, void*), void *);
 void xfrm_state_walk_done(struct xfrm_state_walk *walk, struct net *net);
 struct xfrm_state *xfrm_state_alloc(struct net *net);
 struct xfrm_state *xfrm_state_find(const xfrm_address_t *daddr,
@@ -1584,13 +1595,13 @@ struct xfrm_policy *xfrm_policy_alloc(struct net *net, 
gfp_t gfp);
 
 void xfrm_policy_walk_init(struct xfrm_policy_walk *walk, u8 type);
 int xfrm_policy_walk(struct net *net, struct xfrm_policy_walk *walk,
-int (*func)(struct xfrm_policy *, int, int, void*),
+int (*func)(const struct xfrm_policy *, int, int, void*),
 void *);
 void xfrm_policy_walk_done(struct xfrm_policy_walk *walk, struct net *net);
 int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl);
 struct xfrm_policy *xfrm_policy_bysel_ctx(struct net *net, u32 mark,
  u8 type, int dir,
- struct xfrm_selector *sel,
+ const struct xfrm_selector *sel,
  struct xfrm_sec_ctx *ctx, int delete,
  int *err);
 struct xfrm_policy *xfrm_policy_byid(struct net *net, u32 mark, u8, int dir,
@@ -1695,7 +1706,7 @@ static inline int xfrm_acquire_is_on(struct net *net)
 }
 #endif
 
-static inline int aead_len(struct xfrm_algo_aead *alg)
+static inline int aead_len(const struct xfrm_algo_aead *alg)
 {
return sizeof(*alg) + ((alg->alg_key_len + 7) / 8);
 }
@@ -1710,7 +1721,8 @@ static inline

[PATCH 0/4] Make xfrm usable by 32-bit programs

2017-01-20 Thread Kevin Cernekee

Several of the xfrm netlink and setsockopt() interfaces are not usable
from a 32-bit binary running on a 64-bit kernel due to struct padding
differences.  This has been the case for many, many years[0].  This
patch series deprecates the broken netlink messages and replaces them
with packed structs that are compatible between 64-bit and 32-bit
programs.  It retains support for legacy user programs (i.e. anything
that is currently working today), and allows legacy support to be
compiled out via CONFIG_XFRM_USER_LEGACY if it becomes unnecessary in
the future.

Earlier attempts at fixing the problem had implemented a compat layer.
A compat layer is helpful because it avoids the need to recompile old
user binaries, but there are many challenges involved in implementing
it.  I believe a compat layer is of limited value in this instance
because anybody who really needed to solve the problem without
recompiling their binaries has almost certainly found another solution
in the ~7 years since the compat patches were first proposed.

A benefit of this approach is that long-term, the broken netlink messages
will no longer be used.  A drawback is that in the short term, user
programs that want to adopt the new message formats will require a
modern kernel.  Projects like strongSwan and iproute2 bundle the xfrm.h
header inside their own source trees, so they will need to make a
judgment call on when to remove support for kernels that do not support
the new messages.  And programs built against the new kernel headers
will not work on old kernels.  (Perhaps this is an argument for naming
the new messages _NEW, rather than renaming the old messages to
_LEGACY.)

The following netlink messages are affected:

XFRM_MSG_NEWSA
XFRM_MSG_UPDSA
XFRM_MSG_DELSA
XFRM_MSG_GETSA
XFRM_MSG_NEWPOLICY
XFRM_MSG_UPDPOLICY
XFRM_MSG_DELPOLICY
XFRM_MSG_GETPOLICY
XFRM_MSG_ALLOCSPI
XFRM_MSG_ACQUIRE
XFRM_MSG_EXPIRE
XFRM_MSG_POLEXPIRE

The following setsockopt() settings are affected:

IP_XFRM_POLICY
IPV6_XFRM_POLICY

The root cause of the problem involves padding and alignment
incompatibilities in the following structs:

xfrm_usersa_info 220 bytes on i386 -> 224 bytes on amd64
xfrm_userpolicy_info 164 -> 168
xfrm_userspi_info 228 -> 232, offset mismatch on min
xfrm_user_acquire 276 -> 280, offset mismatch on aalgos
xfrm_user_expire 224 -> 232, offset mismatch on hard
xfrm_user_polexpire 168 -> 176, offset mismatch on hard

Most xfrm netlink messages consist of an xfrm_* struct followed by
additional attributes (struct nlattr TLV), so even cases where the
struct layout (sans padding) is identical will result in incompatible
messages.
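
The size and offset mismatches come from 64-bit members being 4-byte
aligned on i386 but 8-byte aligned on amd64. The sketch below reproduces
the effect with toy structs, not the real xfrm ones: the type and struct
names are hypothetical, and the GCC/Clang aligned attribute is used so
that both layouts can be inspected on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit integer with i386-style 4-byte alignment (GCC/Clang attribute). */
typedef uint64_t u64_i386 __attribute__((aligned(4)));
/* 64-bit integer with amd64-style 8-byte alignment. */
typedef uint64_t u64_amd64 __attribute__((aligned(8)));

/* Toy stand-in for a struct such as xfrm_usersa_info: the members sit at
 * the same offsets in both layouts, only the tail padding differs.
 */
struct info_i386  { u64_i386  bytes; uint32_t flags; };  /* 12 bytes */
struct info_amd64 { u64_amd64 bytes; uint32_t flags; };  /* 16 bytes */

/* Toy stand-in for a struct such as xfrm_user_expire: a trailing member
 * after the embedded struct lands at a different offset.
 */
struct expire_i386  { struct info_i386  state; uint8_t hard; };
struct expire_amd64 { struct info_amd64 state; uint8_t hard; };

_Static_assert(sizeof(struct info_i386) == 12, "i386-style: no tail pad");
_Static_assert(sizeof(struct info_amd64) == 16, "amd64-style: 4 byte pad");
_Static_assert(offsetof(struct expire_i386, hard) == 12, "hard at 12");
_Static_assert(offsetof(struct expire_amd64, hard) == 16, "hard at 16");
```

The same pattern explains the table above: the embedded struct grows by
4 bytes of tail padding, and anything that follows it, whether a struct
member or a netlink attribute, shifts accordingly.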

Some possible tweaks to this approach:

a) Name the new messages _NEW instead of renaming the old messages
_LEGACY.  This fixes the "new binary on old kernel" problem, but it
means that callers need to change every call site in their programs
to explicitly request the new interface.

b) Tweak xfrm.h so that user programs build against the legacy
interfaces by default, but can alter that behavior using a #define
flag.  Maybe in a few years, assume that everyone is running a modern
kernel and make the new interface the default.


[0] https://www.spinics.net/lists/netdev/msg126176.html


Kevin Cernekee (4):
  xfrm: Constify xfrm_user arguments and xfrm_mgr callback APIs
  xfrm_user: Allow common functions to be called from another file
  xfrm_user: Initial commit of xfrm_user_legacy.c
  xfrm_user: Add new 32/64-agnostic netlink messages

 include/net/xfrm.h  |   36 +-
 include/uapi/linux/xfrm.h   |  152 --
 net/key/af_key.c|   34 +-
 net/xfrm/Kconfig|   14 +
 net/xfrm/Makefile   |8 +-
 net/xfrm/xfrm_policy.c  |8 +-
 net/xfrm/xfrm_state.c   |2 +-
 net/xfrm/xfrm_user.c|  587 +-
 net/xfrm/xfrm_user.h|  165 +++
 net/xfrm/xfrm_user_legacy.c | 1140 +++
 security/selinux/nlmsgtab.c |   61 ++-
 11 files changed, 1890 insertions(+), 317 deletions(-)
 create mode 100644 net/xfrm/xfrm_user.h
 create mode 100644 net/xfrm/xfrm_user_legacy.c

-- 
2.11.0.483.g087da7b7c-goog

[PATCH 2/4] xfrm_user: Allow common functions to be called from another file

2017-01-20 Thread Kevin Cernekee

xfrm_user_legacy.c will need to call a few common functions.  Make
sure they have an "xfrm_" prefix, and declare them in a new xfrm_user.h
header.

Signed-off-by: Kevin Cernekee 
---
 net/xfrm/xfrm_user.c | 147 +--
 net/xfrm/xfrm_user.h |  90 +++
 2 files changed, 138 insertions(+), 99 deletions(-)
 create mode 100644 net/xfrm/xfrm_user.h

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index ed389aad4994..4d733f02c3a1 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -32,6 +32,7 @@
 #include 
 #endif
 #include 
+#include "xfrm_user.h"
 
 static int verify_one_alg(struct nlattr **attrs, enum xfrm_attr_type_t type)
 {
@@ -100,7 +101,7 @@ static void verify_one_addr(struct nlattr **attrs, enum 
xfrm_attr_type_t type,
*addrp = nla_data(rt);
 }
 
-static inline int verify_sec_ctx_len(struct nlattr **attrs)
+int xfrm_verify_sec_ctx_len(struct nlattr **attrs)
 {
struct nlattr *rt = attrs[XFRMA_SEC_CTX];
struct xfrm_user_sec_ctx *uctx;
@@ -148,8 +149,8 @@ static inline int verify_replay(const struct 
xfrm_usersa_info *p,
return 0;
 }
 
-static int verify_newsa_info(const struct xfrm_usersa_info *p,
-struct nlattr **attrs)
+int xfrm_verify_newsa_info(const struct xfrm_usersa_info *p,
+  struct nlattr **attrs)
 {
int err;
 
@@ -241,7 +242,7 @@ static int verify_newsa_info(const struct xfrm_usersa_info 
*p,
goto out;
if ((err = verify_one_alg(attrs, XFRMA_ALG_COMP)))
goto out;
-   if ((err = verify_sec_ctx_len(attrs)))
+   if ((err = xfrm_verify_sec_ctx_len(attrs)))
goto out;
if ((err = verify_replay(p, attrs)))
goto out;
@@ -460,17 +461,6 @@ static int xfrm_alloc_replay_state_esn(
return 0;
 }
 
-static inline int xfrm_user_sec_ctx_size(const struct xfrm_sec_ctx *xfrm_ctx)
-{
-   int len = 0;
-
-   if (xfrm_ctx) {
-   len += sizeof(struct xfrm_user_sec_ctx);
-   len += xfrm_ctx->ctx_len;
-   }
-   return len;
-}
-
 static void copy_from_user_state(struct xfrm_state *x,
 const struct xfrm_usersa_info *p)
 {
@@ -537,10 +527,10 @@ static void xfrm_update_ae_params(struct xfrm_state *x,
x->replay_maxdiff = nla_get_u32(rt);
 }
 
-static struct xfrm_state *xfrm_state_construct(struct net *net,
-  const struct xfrm_usersa_info *p,
-  struct nlattr **attrs,
-  int *errp)
+struct xfrm_state *xfrm_state_construct(struct net *net,
+   const struct xfrm_usersa_info *p,
+   struct nlattr **attrs,
+   int *errp)
 {
struct xfrm_state *x = xfrm_state_alloc(net);
int err = -ENOMEM;
@@ -634,7 +624,7 @@ static int xfrm_add_sa(struct sk_buff *skb, const struct 
nlmsghdr *nlh,
int err;
struct km_event c;
 
-   err = verify_newsa_info(p, attrs);
+   err = xfrm_verify_newsa_info(p, attrs);
if (err)
return err;
 
@@ -666,10 +656,10 @@ static int xfrm_add_sa(struct sk_buff *skb, const struct 
nlmsghdr *nlh,
return err;
 }
 
-static struct xfrm_state *xfrm_user_state_lookup(struct net *net,
-const struct xfrm_usersa_id *p,
-struct nlattr **attrs,
-int *errp)
+struct xfrm_state *xfrm_user_state_lookup(struct net *net,
+ const struct xfrm_usersa_id *p,
+ struct nlattr **attrs,
+ int *errp)
 {
struct xfrm_state *x = NULL;
struct xfrm_mark m;
@@ -757,14 +747,7 @@ static void copy_to_user_state(const struct xfrm_state *x,
p->seq = x->km.seq;
 }
 
-struct xfrm_dump_info {
-   struct sk_buff *in_skb;
-   struct sk_buff *out_skb;
-   u32 nlmsg_seq;
-   u16 nlmsg_flags;
-};
-
-static int copy_sec_ctx(const struct xfrm_sec_ctx *s, struct sk_buff *skb)
+int xfrm_copy_sec_ctx(const struct xfrm_sec_ctx *s, struct sk_buff *skb)
 {
struct xfrm_user_sec_ctx *uctx;
struct nlattr *attr;
@@ -785,8 +768,8 @@ static int copy_sec_ctx(const struct xfrm_sec_ctx *s, 
struct sk_buff *skb)
return 0;
 }
 
-static int copy_to_user_auth(const struct xfrm_algo_auth *auth,
-struct sk_buff *skb)
+int xfrm_copy_to_user_auth(const struct xfrm_algo_auth *auth,
+  struct sk_buff *skb)
 {
struct xfrm_algo *algo;
struct nlattr *nla;
@@ -837,7 +820,7 @@ static int

[PATCH 4/4] xfrm_user: Add new 32/64-agnostic netlink messages

2017-01-20 Thread Kevin Cernekee

Add several new message types to address longstanding 32-bit/64-bit
compatibility issues.  Use xfrm_user_legacy to handle the existing
message types, which will retain the old IDs for compatibility with
existing binaries.

For user->kernel messages, the nlmsg_type will determine whether to use
the old format or the new format (for both requests and replies).  For
kernel->user multicasts, both types will be sent.

setsockopt() will deduce the format from the length.

Signed-off-by: Kevin Cernekee 
---
 include/uapi/linux/xfrm.h   | 152 ++-
 net/xfrm/xfrm_user.c| 136 ---
 net/xfrm/xfrm_user.h|  75 
 net/xfrm/xfrm_user_legacy.c | 169 
 security/selinux/nlmsgtab.c |  61 +---
 5 files changed, 466 insertions(+), 127 deletions(-)

diff --git a/include/uapi/linux/xfrm.h b/include/uapi/linux/xfrm.h
index 1fc62b239f1b..ae5f97681989 100644
--- a/include/uapi/linux/xfrm.h
+++ b/include/uapi/linux/xfrm.h
@@ -1,6 +1,7 @@
 #ifndef _LINUX_XFRM_H
 #define _LINUX_XFRM_H
 
+#include 
 #include 
 #include 
 
@@ -157,34 +158,34 @@ enum {
 enum {
XFRM_MSG_BASE = 0x10,
 
-   XFRM_MSG_NEWSA = 0x10,
-#define XFRM_MSG_NEWSA XFRM_MSG_NEWSA
-   XFRM_MSG_DELSA,
-#define XFRM_MSG_DELSA XFRM_MSG_DELSA
-   XFRM_MSG_GETSA,
-#define XFRM_MSG_GETSA XFRM_MSG_GETSA
-
-   XFRM_MSG_NEWPOLICY,
-#define XFRM_MSG_NEWPOLICY XFRM_MSG_NEWPOLICY
-   XFRM_MSG_DELPOLICY,
-#define XFRM_MSG_DELPOLICY XFRM_MSG_DELPOLICY
-   XFRM_MSG_GETPOLICY,
-#define XFRM_MSG_GETPOLICY XFRM_MSG_GETPOLICY
-
-   XFRM_MSG_ALLOCSPI,
-#define XFRM_MSG_ALLOCSPI XFRM_MSG_ALLOCSPI
-   XFRM_MSG_ACQUIRE,
-#define XFRM_MSG_ACQUIRE XFRM_MSG_ACQUIRE
-   XFRM_MSG_EXPIRE,
-#define XFRM_MSG_EXPIRE XFRM_MSG_EXPIRE
-
-   XFRM_MSG_UPDPOLICY,
-#define XFRM_MSG_UPDPOLICY XFRM_MSG_UPDPOLICY
-   XFRM_MSG_UPDSA,
-#define XFRM_MSG_UPDSA XFRM_MSG_UPDSA
-
-   XFRM_MSG_POLEXPIRE,
-#define XFRM_MSG_POLEXPIRE XFRM_MSG_POLEXPIRE
+   XFRM_MSG_NEWSA_LEGACY = 0x10,
+#define XFRM_MSG_NEWSA_LEGACY XFRM_MSG_NEWSA_LEGACY
+   XFRM_MSG_DELSA_LEGACY,
+#define XFRM_MSG_DELSA_LEGACY XFRM_MSG_DELSA_LEGACY
+   XFRM_MSG_GETSA_LEGACY,
+#define XFRM_MSG_GETSA_LEGACY XFRM_MSG_GETSA_LEGACY
+
+   XFRM_MSG_NEWPOLICY_LEGACY,
+#define XFRM_MSG_NEWPOLICY_LEGACY XFRM_MSG_NEWPOLICY_LEGACY
+   XFRM_MSG_DELPOLICY_LEGACY,
+#define XFRM_MSG_DELPOLICY_LEGACY XFRM_MSG_DELPOLICY_LEGACY
+   XFRM_MSG_GETPOLICY_LEGACY,
+#define XFRM_MSG_GETPOLICY_LEGACY XFRM_MSG_GETPOLICY_LEGACY
+
+   XFRM_MSG_ALLOCSPI_LEGACY,
+#define XFRM_MSG_ALLOCSPI_LEGACY XFRM_MSG_ALLOCSPI_LEGACY
+   XFRM_MSG_ACQUIRE_LEGACY,
+#define XFRM_MSG_ACQUIRE_LEGACY XFRM_MSG_ACQUIRE_LEGACY
+   XFRM_MSG_EXPIRE_LEGACY,
+#define XFRM_MSG_EXPIRE_LEGACY XFRM_MSG_EXPIRE_LEGACY
+
+   XFRM_MSG_UPDPOLICY_LEGACY,
+#define XFRM_MSG_UPDPOLICY_LEGACY XFRM_MSG_UPDPOLICY_LEGACY
+   XFRM_MSG_UPDSA_LEGACY,
+#define XFRM_MSG_UPDSA_LEGACY XFRM_MSG_UPDSA_LEGACY
+
+   XFRM_MSG_POLEXPIRE_LEGACY,
+#define XFRM_MSG_POLEXPIRE_LEGACY XFRM_MSG_POLEXPIRE_LEGACY
 
XFRM_MSG_FLUSHSA,
 #define XFRM_MSG_FLUSHSA XFRM_MSG_FLUSHSA
@@ -214,6 +215,34 @@ enum {
 
XFRM_MSG_MAPPING,
 #define XFRM_MSG_MAPPING XFRM_MSG_MAPPING
+
+   XFRM_MSG_ALLOCSPI,
+#define XFRM_MSG_ALLOCSPI XFRM_MSG_ALLOCSPI
+   XFRM_MSG_ACQUIRE,
+#define XFRM_MSG_ACQUIRE XFRM_MSG_ACQUIRE
+   XFRM_MSG_EXPIRE,
+#define XFRM_MSG_EXPIRE XFRM_MSG_EXPIRE
+   XFRM_MSG_POLEXPIRE,
+#define XFRM_MSG_POLEXPIRE XFRM_MSG_POLEXPIRE
+
+   XFRM_MSG_NEWSA,
+#define XFRM_MSG_NEWSA XFRM_MSG_NEWSA
+   XFRM_MSG_UPDSA,
+#define XFRM_MSG_UPDSA XFRM_MSG_UPDSA
+   XFRM_MSG_DELSA,
+#define XFRM_MSG_DELSA XFRM_MSG_DELSA
+   XFRM_MSG_GETSA,
+#define XFRM_MSG_GETSA XFRM_MSG_GETSA
+
+   XFRM_MSG_NEWPOLICY,
+#define XFRM_MSG_NEWPOLICY XFRM_MSG_NEWPOLICY
+   XFRM_MSG_UPDPOLICY,
+#define XFRM_MSG_UPDPOLICY XFRM_MSG_UPDPOLICY
+   XFRM_MSG_DELPOLICY,
+#define XFRM_MSG_DELPOLICY XFRM_MSG_DELPOLICY
+   XFRM_MSG_GETPOLICY,
+#define XFRM_MSG_GETPOLICY XFRM_MSG_GETPOLICY
+
__XFRM_MSG_MAX
 };
 #define XFRM_MSG_MAX (__XFRM_MSG_MAX - 1)
@@ -221,7 +250,7 @@ enum {
 #define XFRM_NR_MSGTYPES (XFRM_MSG_MAX + 1 - XFRM_MSG_BASE)
 
 /*
- * Generic LSM security context for comunicating to user space
+ * Generic LSM security context for communicating to user space
  * NOTE: Same format as sadb_x_sec_ctx
  */
 struct xfrm_user_sec_ctx {
@@ -357,6 +386,22 @@ struct xfrmu_spdhthresh {
__u8 rbits;
 };
 
+/* Legacy structs are incompatible between 32-bit and 64-bit. */
+struct xfrm_usersa_info_legacy {
+   struct xfrm_selector sel;
+   struct xfrm_id  id;
+   xfrm_address_t  saddr;
+   struct xfrm_lifetime_cfg lft;
+   struct xfrm_lifetime_cur curlft;
+

[PATCH 3/4] xfrm_user: Initial commit of xfrm_user_legacy.c

2017-01-20 Thread Kevin Cernekee

Several xfrm_* structs are incompatible between 32-bit and 64-bit builds:

xfrm_usersa_info      220 bytes on i386 -> 224 bytes on amd64
xfrm_userpolicy_info  164 -> 168
xfrm_userspi_info     228 -> 232, offset mismatch on min
xfrm_user_acquire     276 -> 280, offset mismatch on aalgos
xfrm_user_expire      224 -> 232, offset mismatch on hard
xfrm_user_polexpire   168 -> 176, offset mismatch on hard

Fork all of the functions that handle these structs into a new file so
that it is possible to support both legacy + new layouts.

This commit contains an exact copy of the necessary functions from
xfrm_user.c, for ease of reviewing.  The next commit will contain all
of the changes needed to make these functions handle legacy messages
correctly.
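
As a rough illustration of where such size and offset mismatches come
from: on i386 a 64-bit integer only needs 4-byte alignment, while amd64
aligns it to 8 bytes. The sketch below emulates the i386 rule with the
GCC/Clang aligned attribute on a typedef. The struct and field names are
hypothetical and are not the real xfrm layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 64-bit integer that only requires 4-byte alignment, as on i386. */
typedef uint64_t __attribute__((aligned(4))) demo_u64_align4;

/* Hypothetical layout as a 32-bit build would see it. */
struct demo_expire_32 {
	uint32_t flags;		/* offset 0 */
	demo_u64_align4 bytes;	/* offset 4 */
	uint8_t hard;		/* offset 12 */
};				/* size 16, struct alignment 4 */

/* The same fields with the host's natural alignment rules. */
struct demo_expire_native {
	uint32_t flags;
	uint64_t bytes;
	uint8_t hard;
};

/* Sanity checks for the emulated 32-bit layout; aborts if they fail. */
static void demo_check_layout(void)
{
	assert(offsetof(struct demo_expire_32, bytes) == 4);
	assert(offsetof(struct demo_expire_32, hard) == 12);
	assert(sizeof(struct demo_expire_32) == 16);
}
```

On a typical 64-bit host the native struct is larger and its trailing
field sits at a different offset, which is the same kind of mismatch
listed in the table above.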

Signed-off-by: Kevin Cernekee 
---
 net/xfrm/Kconfig|   14 +
 net/xfrm/Makefile   |8 +-
 net/xfrm/xfrm_user_legacy.c | 1091 +++
 3 files changed, 1112 insertions(+), 1 deletion(-)
 create mode 100644 net/xfrm/xfrm_user_legacy.c

diff --git a/net/xfrm/Kconfig b/net/xfrm/Kconfig
index bda1a13628a8..317dcc411345 100644
--- a/net/xfrm/Kconfig
+++ b/net/xfrm/Kconfig
@@ -20,6 +20,20 @@ config XFRM_USER
 
  If unsure, say Y.
 
+config XFRM_USER_LEGACY
+   tristate "Legacy transformation user configuration interface"
+   depends on XFRM_USER
+   default y
+   ---help---
+ The original Transformation(XFRM) netlink messages were not
+ compatible between 32-bit programs and 64-bit kernels, so they
+ have been deprecated.  Enable this option if you have existing
+ binaries that rely on the old format messages.  Disable this
+ option if you know that all users of the interface have been
+ built against recent kernel headers.
+
+ If unsure, say Y.
+
 config XFRM_SUB_POLICY
bool "Transformation sub policy support"
depends on XFRM
diff --git a/net/xfrm/Makefile b/net/xfrm/Makefile
index c0e961983f17..6cf6f8da3dc8 100644
--- a/net/xfrm/Makefile
+++ b/net/xfrm/Makefile
@@ -7,5 +7,11 @@ obj-$(CONFIG_XFRM) := xfrm_policy.o xfrm_state.o xfrm_hash.o \
  xfrm_sysctl.o xfrm_replay.o
 obj-$(CONFIG_XFRM_STATISTICS) += xfrm_proc.o
 obj-$(CONFIG_XFRM_ALGO) += xfrm_algo.o
-obj-$(CONFIG_XFRM_USER) += xfrm_user.o
+
+xfrm-user-objs := xfrm_user.o
+ifneq ($(CONFIG_XFRM_USER_LEGACY),)
+xfrm-user-objs += xfrm_user_legacy.o
+endif
+obj-$(CONFIG_XFRM_USER) += xfrm-user.o
+
 obj-$(CONFIG_XFRM_IPCOMP) += xfrm_ipcomp.o
diff --git a/net/xfrm/xfrm_user_legacy.c b/net/xfrm/xfrm_user_legacy.c
new file mode 100644
index 000000000000..058accfefc83
--- /dev/null
+++ b/net/xfrm/xfrm_user_legacy.c
@@ -0,0 +1,1091 @@
+/* xfrm_user.c: User interface to configure xfrm engine.
+ *
+ * Copyright (C) 2002 David S. Miller (da...@redhat.com)
+ *
+ * Changes:
+ * Mitsuru KANDA @USAGI
+ * Kazunori MIYAZAWA @USAGI
+ * Kunihiro Ishiguro 
+ * IPv6 support
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "xfrm_user.h"
+
+static int xfrm_add_sa(struct sk_buff *skb, const struct nlmsghdr *nlh,
+  struct nlattr **attrs)
+{
+   struct net *net = sock_net(skb->sk);
+   const struct xfrm_usersa_info *p = nlmsg_data(nlh);
+   struct xfrm_state *x;
+   int err;
+   struct km_event c;
+
+   err = xfrm_verify_newsa_info(p, attrs);
+   if (err)
+   return err;
+
+   x = xfrm_state_construct(net, p, attrs, &err);
+   if (!x)
+   return err;
+
+   xfrm_state_hold(x);
+   if (nlh->nlmsg_type == XFRM_MSG_NEWSA)
+   err = xfrm_state_add(x);
+   else
+   err = xfrm_state_update(x);
+
+   xfrm_audit_state_add(x, err ? 0 : 1, true);
+
+   if (err < 0) {
+   x->km.state = XFRM_STATE_DEAD;
+   __xfrm_state_put(x);
+   goto out;
+   }
+
+   c.seq = nlh->nlmsg_seq;
+   c.portid = nlh->nlmsg_pid;
+   c.event = nlh->nlmsg_type;
+
+   km_state_notify(x, &c);
+out:
+   xfrm_state_put(x);
+   return err;
+}
+
+static int xfrm_del_sa(struct sk_buff *skb, const struct nlmsghdr *nlh,
+  struct nlattr **attrs)
+{
+   struct net *net = sock_net(skb->sk);
+   struct xfrm_state *x;
+   int err = -ESRCH;
+   struct km_event c;
+   const struct xfrm_usersa_id *p = nlmsg_data(nlh);
+
+   x = xfrm_user_state_lookup(net, p, attrs, &err);
+   if (x == NULL)
+   return err;
+
+   if ((err = security_xfrm_state_delete(x)) != 0)
+   goto out;
+
+   if (xfrm_state_kern(x)) {
+   err = -EPERM;
+   goto out;
+   }
+
+   err = xfrm_state_delete(x);
+
+   if (err < 0)
+

Re: [PATCH net-next 1/8] net: dsa: mv88e6xxx: Implement external MDIO bus on mv88e6390

2017-01-20 Thread Vivien Didelot

Hi Andrew,

Andrew Lunn  writes:

> The mv88e6390 has two MDIO busses. The internal MDIO bus is used for
> the internal PHYs. The external MDIO can be used for external PHYs.
> The external MDIO bus will be instantiated if there is an
> "mdio-external" node in the device tree.

Thanks for pushing the 88E6390 support. Some comments below.

> +static int mv88e6xxx_read_phy(struct mv88e6xxx_chip *chip, int addr, int reg,
> +static int mv88e6xxx_write_phy(struct mv88e6xxx_chip *chip, int addr, int 
> reg,
>  static int mv88e6xxx_phy_read(struct mv88e6xxx_chip *chip, int phy,
>  static int mv88e6xxx_phy_write(struct mv88e6xxx_chip *chip, int phy,

Adding mv88e6xxx_read/write_phy() in addition to existing
mv88e6xxx_phy_read/write() feels really confusing and hard to
maintain. Can that be done the other way around maybe?

> +static int mv88e6xxx_external_mdio_register(struct mv88e6xxx_chip *chip,
> + struct device_node *np)
> +static void mv88e6xxx_external_mdio_unregister(struct mv88e6xxx_chip *chip)
>  static int mv88e6xxx_mdio_register(struct mv88e6xxx_chip *chip,
>  struct device_node *np)

We already have mv88e6xxx_mdio_register/unregister(). Isn't it possible
to tweak them to take a struct mv88e6xxx_mdio_bus instance and use them
twice for both internal and external MDIO busses?

> + if (mv88e6xxx_has(chip, MV88E6XXX_FLAG_G2_EXTERNAL_MDIO)) {
> + err = mv88e6xxx_external_mdio_register(chip, np);
> + if (err)
> + goto out_mdio;
> + }

We are trying to get rid of the flags and family checks... Please don't
add new ones. If the external MDIO bus is a new feature of switches like
88E6390, isn't it better to add new external_phy_read/write ops and
register the bus if they are provided?

if (chip->info->ops->external_phy_read) {
struct mv88e6xxx_mdio_bus *external_mdio_bus;
...
err = mv88e6xxx_mdio_register(external_mdio_bus);
if (err)
...
}

Thanks,

Vivien
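
The ops-based check suggested above could look roughly like the sketch
below. It is a standalone model, not driver code: every demo_* name is
hypothetical, and the register helper only counts how many buses were
registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_chip;

struct demo_ops {
	int (*phy_read)(struct demo_chip *chip, int addr, int reg,
			uint16_t *val);
	/* Left NULL by chips that have no external MDIO bus. */
	int (*external_phy_read)(struct demo_chip *chip, int addr, int reg,
				 uint16_t *val);
};

struct demo_chip {
	const struct demo_ops *ops;
	int buses_registered;
};

/* Stand-in for a real register access. */
static int demo_read_stub(struct demo_chip *chip, int addr, int reg,
			  uint16_t *val)
{
	(void)chip;
	(void)addr;
	(void)reg;
	*val = 0;
	return 0;
}

static const struct demo_ops demo_ops_internal_only = {
	.phy_read = demo_read_stub,
};

static const struct demo_ops demo_ops_with_external = {
	.phy_read = demo_read_stub,
	.external_phy_read = demo_read_stub,
};

/* One register helper used for both buses. */
static int demo_mdio_register(struct demo_chip *chip, bool external)
{
	(void)external;
	chip->buses_registered++;
	return 0;
}

static int demo_register_buses(struct demo_chip *chip)
{
	int err;

	assert(chip && chip->ops);

	err = demo_mdio_register(chip, false);
	if (err)
		return err;

	/* Presence of the op, not a flag, decides whether to register. */
	if (chip->ops->external_phy_read) {
		err = demo_mdio_register(chip, true);
		if (err)
			return err;
	}

	return 0;
}
```

A chip using demo_ops_internal_only ends up with one bus, one using
demo_ops_with_external with two, and no family or flag test is needed.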

[PATCH net] net: dsa: Check return value of phy_connect_direct()

2017-01-20 Thread Florian Fainelli

We need to check the return value of phy_connect_direct() in
dsa_slave_phy_connect(), otherwise we may continue the
initialization of a slave network device with a PHY that is already
attached somewhere else, and which will soon fail because the PHY
device is in error.

The conditions for such an error to occur are that we have a port of our
switch that is not disabled, and has the same port number as a PHY
address (say both 5) that can be probed using the DSA slave MII bus. We
end up having this slave network device find a PHY at the same address
as our port number, and we try to attach to it.

A slave network device (e.g. port 0) has already attached to our PHY
device, and we try to re-attach it with a different network device, but
since we ignore the error we would end up initializing incorrect device
references by the time the slave network interface is opened.

The code has been (re)organized several times, making it hard to provide
an exact Fixes tag; this is a bugfix nonetheless.

Signed-off-by: Florian Fainelli 
---
 net/dsa/slave.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 68c9eea00518..ba1b6b9630d2 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1105,10 +1105,8 @@ static int dsa_slave_phy_connect(struct dsa_slave_priv 
*p,
/* Use already configured phy mode */
if (p->phy_interface == PHY_INTERFACE_MODE_NA)
p->phy_interface = p->phy->interface;
-   phy_connect_direct(slave_dev, p->phy, dsa_slave_adjust_link,
-  p->phy_interface);
-
-   return 0;
+   return phy_connect_direct(slave_dev, p->phy, dsa_slave_adjust_link,
+ p->phy_interface);
 }
 
 static int dsa_slave_phy_setup(struct dsa_slave_priv *p,
-- 
2.11.0
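
A minimal standalone model of the change above, assuming an attach
helper that refuses a second attach with -EBUSY. All demo_* names are
hypothetical; this is only meant to show the difference between dropping
and propagating the result.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct demo_phy {
	bool attached;
};

/* Fails if the PHY is already attached to some other device. */
static int demo_phy_attach(struct demo_phy *phy)
{
	assert(phy);

	if (phy->attached)
		return -EBUSY;

	phy->attached = true;
	return 0;
}

/* Old behaviour: the result is dropped and success is always reported. */
static int demo_connect_unchecked(struct demo_phy *phy)
{
	demo_phy_attach(phy);
	return 0;
}

/* New behaviour: the caller sees the failure and can stop the setup. */
static int demo_connect_checked(struct demo_phy *phy)
{
	return demo_phy_attach(phy);
}
```

With an already attached PHY, the unchecked variant still returns 0,
while the checked variant returns the error to its caller.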

Re: [PATCH net-next 6/8] net: phy: Marvell: Add mv88e6390 internal PHY

2017-01-20 Thread Florian Fainelli

On 01/20/2017 03:31 PM, Andrew Lunn wrote:
> The mv88e6390 Ethernet switch has internal PHYs. These PHYs don't have
> a model ID in the ID2 register. So the MDIO driver in the switch
> intercepts reads to this register, and returns the switch family ID.
> Extend the Marvell PHY driver by including this ID, and treat the PHY
> as a 88E1540.
> 
> Signed-off-by: Andrew Lunn 

Reviewed-by: Florian Fainelli 
-- 
Florian

Re: [PATCH net-next 5/8] net: dsa: mv88e6xxx: Workaround missing PHY ID on mv88e6390

2017-01-20 Thread Florian Fainelli

On 01/20/2017 03:30 PM, Andrew Lunn wrote:
> The internal PHYs of the mv88e6390 do not have a model ID. Trap any
> calls to the ID register, and if it is zero, return the ID for the
> mv88e6390. The Marvell PHY driver can then bind to this ID.
> 
> Signed-off-by: Andrew Lunn 

Reviewed-by: Florian Fainelli 

Nice and clean!
-- 
Florian

Re: [PATCH net-next 1/8] net: dsa: mv88e6xxx: Implement external MDIO bus on mv88e6390

2017-01-20 Thread Florian Fainelli

On 01/20/2017 03:30 PM, Andrew Lunn wrote:
> The mv88e6390 has two MDIO busses. The internal MDIO bus is used for
> the internal PHYs. The external MDIO can be used for external PHYs.
> The external MDIO bus will be instantiated if there is an
> "mdio-external" node in the device tree.

This looks fine, although I am not clear why we cannot utilize a
standard representation of a MDIO bus (with PHY devices as child nodes)
which has a specific compatible string, e.g:
marvell,mv88e6390-external-mdio, and that is a child node of the 6390
Ethernet switch itself, something like:

/* assuming this is, e.g: an independent or CPU Ethernet MAC MDIO bus */
 {
	switch@0 {
		compatible = "marvell,mv88e6390";
		reg = <0>;

		ports {
			#address-cells = <1>;
			#size-cells = <0>;

			port@0 {
				phy-handle = <&phy0>;
				reg = <0>;
			};
		};

		mdio {
			compatible = "marvell,mv88e6390-external-mdio";
			#address-cells = <1>;
			#size-cells = <0>;

			phy0: phy@0 {
				reg = <0>;
			};
		};
	};
};

In both cases (your proposal and this one), we still have a dependency
on the Ethernet switch driver being probed to create the internal and
external MDIO buses.

Thanks!
-- 
Florian

[PATCH net-next 8/8] net: dsa: mv88e6xxx: Fix typ0 when configuring 2.5Gbps

2017-01-20 Thread Andrew Lunn

In order to enable 2.5Gbps mode, we need the base speed of 10G, plus
the Alt bit setting. Fix a typ0 that used 1Gb base speed.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/port.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index e253ecb6624b..d543a6817d61 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -194,7 +194,7 @@ static int mv88e6xxx_port_set_speed(struct mv88e6xxx_chip 
*chip, int port,
ctrl = PORT_PCS_CTRL_SPEED_1000;
break;
case 2500:
-   ctrl = PORT_PCS_CTRL_SPEED_1000 | PORT_PCS_CTRL_ALTSPEED;
+   ctrl = PORT_PCS_CTRL_SPEED_10000 | PORT_PCS_CTRL_ALTSPEED;
break;
case 10000:
/* all bits set, fall through... */
-- 
2.11.0
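
For readers following along, the speed selection described above can be
modelled as below: 2.5Gbps is the 10G base speed plus the Alt bit. The
DEMO_* constants and their values are placeholders chosen for this
sketch, not the hardware encoding.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder encodings, assumed for illustration only. */
enum {
	DEMO_CTRL_SPEED_10    = 0x0,
	DEMO_CTRL_SPEED_100   = 0x1,
	DEMO_CTRL_SPEED_1000  = 0x2,
	DEMO_CTRL_SPEED_10000 = 0x3,
	DEMO_CTRL_ALTSPEED    = 0x1000,
};

/* Translate a link speed in Mbps into a control value. */
static int demo_speed_to_ctrl(int speed_mbps, uint16_t *ctrl)
{
	assert(ctrl);

	switch (speed_mbps) {
	case 10:
		*ctrl = DEMO_CTRL_SPEED_10;
		break;
	case 100:
		*ctrl = DEMO_CTRL_SPEED_100;
		break;
	case 1000:
		*ctrl = DEMO_CTRL_SPEED_1000;
		break;
	case 2500:
		/* 10G base speed plus the Alt bit, not the 1G base speed. */
		*ctrl = DEMO_CTRL_SPEED_10000 | DEMO_CTRL_ALTSPEED;
		break;
	case 10000:
		*ctrl = DEMO_CTRL_SPEED_10000;
		break;
	default:
		return -EOPNOTSUPP;
	}

	return 0;
}
```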

[PATCH net-next 3/8] net: dsa: mv88e6xxx: Fix ATU age timer for MV88E6390

2017-01-20 Thread Andrew Lunn

The MV88E6390 family uses a different ATU age timer coefficient.
Fix the info structures.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 77e960826402..e79c8d09a95f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3932,7 +3932,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.port_base_addr = 0x0,
.global1_addr = 0x1b,
.tag_protocol = DSA_TAG_PROTO_DSA,
-   .age_time_coeff = 15000,
+   .age_time_coeff = 3750,
.g1_irqs = 9,
.flags = MV88E6XXX_FLAGS_FAMILY_6390,
.ops = _ops,
@@ -3946,7 +3946,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.num_ports = 11,/* 10 + Z80 */
.port_base_addr = 0x0,
.global1_addr = 0x1b,
-   .age_time_coeff = 15000,
+   .age_time_coeff = 3750,
.g1_irqs = 9,
.tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6390,
@@ -3961,7 +3961,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.num_ports = 11,/* 10 + Z80 */
.port_base_addr = 0x0,
.global1_addr = 0x1b,
-   .age_time_coeff = 15000,
+   .age_time_coeff = 3750,
.g1_irqs = 9,
.tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6390,
@@ -3991,7 +3991,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.num_ports = 11,/* 10 + Z80 */
.port_base_addr = 0x0,
.global1_addr = 0x1b,
-   .age_time_coeff = 15000,
+   .age_time_coeff = 3750,
.g1_irqs = 9,
.tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6390,
@@ -4080,7 +4080,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.num_ports = 11,/* 10 + Z80 */
.port_base_addr = 0x0,
.global1_addr = 0x1b,
-   .age_time_coeff = 15000,
+   .age_time_coeff = 3750,
.g1_irqs = 9,
.tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6390,
@@ -4094,7 +4094,7 @@ static const struct mv88e6xxx_info mv88e6xxx_table[] = {
.num_ports = 11,/* 10 + Z80 */
.port_base_addr = 0x0,
.global1_addr = 0x1b,
-   .age_time_coeff = 15000,
+   .age_time_coeff = 3750,
.g1_irqs = 9,
.tag_protocol = DSA_TAG_PROTO_DSA,
.flags = MV88E6XXX_FLAGS_FAMILY_6390,
-- 
2.11.0
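
To show why the coefficient matters, here is a small sketch of turning
an ageing time in milliseconds into a register value. It assumes the
coefficient is the number of milliseconds per register unit and that the
register field is 8 bits wide; the helper name and the range handling
are assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>

/* Convert an ageing time in ms into register units of 'coeff' ms each. */
static int demo_age_time_to_reg(unsigned int msecs, unsigned int coeff,
				unsigned int *reg)
{
	unsigned int min, max;

	assert(reg && coeff);

	min = 0x01 * coeff;	/* smallest non-zero register value */
	max = 0xff * coeff;	/* largest value of an 8-bit field */

	if (msecs < min || msecs > max)
		return -ERANGE;

	/* Round to the closest register unit. */
	*reg = (msecs + coeff / 2) / coeff;
	return 0;
}
```

For a 300 second ageing time, a coefficient of 3750 gives a register
value of 80, while 15000 gives 20. Programming 20 on hardware that
counts in 3750 ms steps would age entries four times too fast.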

[PATCH net-next 1/8] net: dsa: mv88e6xxx: Implement external MDIO bus on mv88e6390

2017-01-20 Thread Andrew Lunn

The mv88e6390 has two MDIO busses. The internal MDIO bus is used for
the internal PHYs. The external MDIO can be used for external PHYs.
The external MDIO bus will be instantiated if there is an
"mdio-external" node in the device tree.

Signed-off-by: Andrew Lunn 
---
 .../devicetree/bindings/net/dsa/marvell.txt|   4 +
 drivers/net/dsa/mv88e6xxx/chip.c   | 136 +
 drivers/net/dsa/mv88e6xxx/global2.c|  10 +-
 drivers/net/dsa/mv88e6xxx/global2.h|   4 +-
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h  |  35 --
 5 files changed, 148 insertions(+), 41 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/dsa/marvell.txt 
b/Documentation/devicetree/bindings/net/dsa/marvell.txt
index b3dd6b40e0de..51b881f1645e 100644
--- a/Documentation/devicetree/bindings/net/dsa/marvell.txt
+++ b/Documentation/devicetree/bindings/net/dsa/marvell.txt
@@ -28,6 +28,10 @@ Optional properties:
 #interrupt-cells = <2> : Controller uses two cells, number and flag
 - mdio : container of PHY and devices on the switches MDIO
  bus
+
+Optional for the "marvell,mv88e6390"
+- mdio-external: A collection of PHY nodes on external bus
+
 Example:
 
mdio {
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index c7e08e13bb54..77e960826402 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -206,6 +206,12 @@ int mv88e6xxx_read(struct mv88e6xxx_chip *chip, int addr, 
int reg, u16 *val)
return 0;
 }
 
+static int mv88e6xxx_read_phy(struct mv88e6xxx_chip *chip, int addr, int reg,
+ u16 *val, bool external)
+{
+   return mv88e6xxx_read(chip, addr, reg, val);
+}
+
 int mv88e6xxx_write(struct mv88e6xxx_chip *chip, int addr, int reg, u16 val)
 {
int err;
@@ -222,26 +228,32 @@ int mv88e6xxx_write(struct mv88e6xxx_chip *chip, int 
addr, int reg, u16 val)
return 0;
 }
 
+static int mv88e6xxx_write_phy(struct mv88e6xxx_chip *chip, int addr, int reg,
+  u16 val, bool external)
+{
+   return mv88e6xxx_write(chip, addr, reg, val);
+}
+
 static int mv88e6xxx_phy_read(struct mv88e6xxx_chip *chip, int phy,
- int reg, u16 *val)
+ int reg, u16 *val, bool external)
 {
int addr = phy; /* PHY devices addresses start at 0x0 */
 
if (!chip->info->ops->phy_read)
return -EOPNOTSUPP;
 
-   return chip->info->ops->phy_read(chip, addr, reg, val);
+   return chip->info->ops->phy_read(chip, addr, reg, val, external);
 }
 
 static int mv88e6xxx_phy_write(struct mv88e6xxx_chip *chip, int phy,
-  int reg, u16 val)
+  int reg, u16 val, bool external)
 {
int addr = phy; /* PHY devices addresses start at 0x0 */
 
if (!chip->info->ops->phy_write)
return -EOPNOTSUPP;
 
-   return chip->info->ops->phy_write(chip, addr, reg, val);
+   return chip->info->ops->phy_write(chip, addr, reg, val, external);
 }
 
 static int mv88e6xxx_phy_page_get(struct mv88e6xxx_chip *chip, int phy, u8 
page)
@@ -249,7 +261,7 @@ static int mv88e6xxx_phy_page_get(struct mv88e6xxx_chip 
*chip, int phy, u8 page)
if (!mv88e6xxx_has(chip, MV88E6XXX_FLAG_PHY_PAGE))
return -EOPNOTSUPP;
 
-   return mv88e6xxx_phy_write(chip, phy, PHY_PAGE, page);
+   return mv88e6xxx_phy_write(chip, phy, PHY_PAGE, page, false);
 }
 
 static void mv88e6xxx_phy_page_put(struct mv88e6xxx_chip *chip, int phy)
@@ -257,7 +269,7 @@ static void mv88e6xxx_phy_page_put(struct mv88e6xxx_chip 
*chip, int phy)
int err;
 
/* Restore PHY page Copper 0x0 for access via the registered MDIO bus */
-   err = mv88e6xxx_phy_write(chip, phy, PHY_PAGE, PHY_PAGE_COPPER);
+   err = mv88e6xxx_phy_write(chip, phy, PHY_PAGE, PHY_PAGE_COPPER, false);
if (unlikely(err)) {
dev_err(chip->dev, "failed to restore PHY %d page Copper 
(%d)\n",
phy, err);
@@ -275,7 +287,7 @@ static int mv88e6xxx_phy_page_read(struct mv88e6xxx_chip 
*chip, int phy,
 
err = mv88e6xxx_phy_page_get(chip, phy, page);
if (!err) {
-   err = mv88e6xxx_phy_read(chip, phy, reg, val);
+   err = mv88e6xxx_phy_read(chip, phy, reg, val, false);
mv88e6xxx_phy_page_put(chip, phy);
}
 
@@ -293,7 +305,7 @@ static int mv88e6xxx_phy_page_write(struct mv88e6xxx_chip 
*chip, int phy,
 
err = mv88e6xxx_phy_page_get(chip, phy, page);
if (!err) {
-   err = mv88e6xxx_phy_write(chip, phy, PHY_PAGE, page);
+   err = mv88e6xxx_phy_write(chip, phy, PHY_PAGE, page, false);
mv88e6xxx_phy_page_put(chip, phy);
}
 
@@ -612,7 +624,7 @@ static void mv88e6xxx_ppu_state_destroy(struct

[PATCH net-next 2/8] net: phy: Add 2000base-x, 2500base-x and rxaui modes

2017-01-20 Thread Andrew Lunn

The mv88e6390 ports 9 and 10 support some additional PHY modes. Add
these modes to the PHY core so they can be used in the binding.

Signed-off-by: Andrew Lunn 
---
 Documentation/devicetree/bindings/net/ethernet.txt | 3 +++
 include/linux/phy.h| 9 +
 2 files changed, 12 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/ethernet.txt 
b/Documentation/devicetree/bindings/net/ethernet.txt
index 05150957ecfd..3a6916909d90 100644
--- a/Documentation/devicetree/bindings/net/ethernet.txt
+++ b/Documentation/devicetree/bindings/net/ethernet.txt
@@ -29,6 +29,9 @@ The following properties are common to the Ethernet 
controllers:
   * "smii"
   * "xgmii"
   * "trgmii"
+  * "2000base-x",
+  * "2500base-x",
+  * "rxaui"
 - phy-connection-type: the same as "phy-mode" property but described in ePAPR;
 - phy-handle: phandle, specifies a reference to a node representing a PHY
   device; this property is described in ePAPR and so preferred;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 5c9d2529685f..2252ce88efdd 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -82,6 +82,9 @@ typedef enum {
PHY_INTERFACE_MODE_MOCA,
PHY_INTERFACE_MODE_QSGMII,
PHY_INTERFACE_MODE_TRGMII,
+   PHY_INTERFACE_MODE_1000BASEX,
+   PHY_INTERFACE_MODE_2500BASEX,
+   PHY_INTERFACE_MODE_RXAUI,
PHY_INTERFACE_MODE_MAX,
 } phy_interface_t;
 
@@ -142,6 +145,12 @@ static inline const char *phy_modes(phy_interface_t 
interface)
return "qsgmii";
case PHY_INTERFACE_MODE_TRGMII:
return "trgmii";
+   case PHY_INTERFACE_MODE_1000BASEX:
+   return "1000base-x";
+   case PHY_INTERFACE_MODE_2500BASEX:
+   return "2500base-x";
+   case PHY_INTERFACE_MODE_RXAUI:
+   return "rxaui";
default:
return "unknown";
}
-- 
2.11.0

[PATCH net-next 5/8] net: dsa: mv88e6xxx: Workaround missing PHY ID on mv88e6390

2017-01-20 Thread Andrew Lunn

The internal PHYs of the mv88e6390 do not have a model ID. Trap any
calls to the ID register, and if it is zero, return the ID for the
mv88e6390. The Marvell PHY driver can then bind to this ID.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/global2.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/mv88e6xxx/global2.c 
b/drivers/net/dsa/mv88e6xxx/global2.c
index 5c9ccb6775ae..e2b00d1b0c28 100644
--- a/drivers/net/dsa/mv88e6xxx/global2.c
+++ b/drivers/net/dsa/mv88e6xxx/global2.c
@@ -518,7 +518,21 @@ int mv88e6xxx_g2_smi_phy_read(struct mv88e6xxx_chip *chip, 
int addr, int reg,
if (err)
return err;
 
-   return mv88e6xxx_g2_read(chip, GLOBAL2_SMI_PHY_DATA, val);
+   err = mv88e6xxx_g2_read(chip, GLOBAL2_SMI_PHY_DATA, val);
+   if (err)
+   return err;
+
+   if (reg == MII_PHYSID2) {
+   /* The mv88e6390 internal PHYS don't have a model number.
+* Use the switch family model number instead.
+*/
+   if (!(*val & 0x1ff)) {
+   if (chip->info->family == MV88E6XXX_FAMILY_6390)
+   *val |= PORT_SWITCH_ID_PROD_NUM_6390;
+   }
+   }
+
+   return 0;
 }
 
 int mv88e6xxx_g2_smi_phy_write(struct mv88e6xxx_chip *chip, int addr, int reg,
-- 
2.11.0
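
The read fixup above can be tried out in isolation with the sketch
below. The register number 0x03 is the standard MII PHY ID 2 register
and the mask mirrors the one in the patch; DEMO_FAMILY_PROD_NUM is a
placeholder value assumed for this example, and the helper name is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MII_PHYSID2	0x03	/* PHY ID 2 register */
#define DEMO_MODEL_MASK		0x1ff	/* same mask as in the patch */
#define DEMO_FAMILY_PROD_NUM	0x390	/* placeholder product number */

/* Patch a value just read from 'reg' if the model bits are empty. */
static void demo_fixup_physid2(int reg, uint16_t *val, bool is_6390_family)
{
	assert(val);

	if (reg != DEMO_MII_PHYSID2)
		return;

	/* Only substitute when the PHY reports no model number at all. */
	if (!(*val & DEMO_MODEL_MASK) && is_6390_family)
		*val |= DEMO_FAMILY_PROD_NUM;
}
```

Reads of other registers, PHYs that already report a model, and chips
outside the family all pass through unchanged.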

[net-next 0/8] More MV88E6390 patches

2017-01-20 Thread Andrew Lunn

This is the ongoing work to add support for the Marvell 6390 family of
switches. There are now two MDIO busses, one for the internal PHYs and
an external bus for external PHYs. Add support for this external bus.

This switch supports 2Gbps, 2.5Gbps and 10Gbps ports. Add phy-modes
for these speeds/interfaces. These modes are then used to configure
ports 9 and 10, which support these speeds.

The internal PHYs oddly use a Marvell Vendor ID, but have an empty
product ID. Trap reads of the product ID, and if it returns 0,
instead return the switch family ID. The Marvell PHY driver is then
extended to support this ID, treating the PHY as compatible with the
1540.

Both the internal and external MDIO busses are capable of clause 45
addressing, which is required to access the SERDES
interfaces. Implement support for clause 45.

Two fixes are also included. There is no need to port these to stable,
as the current support for the 6390 is not sufficient to be usable.

Andrew Lunn (8):
  net: dsa: mv88e6xxx: Implement external MDIO bus on mv88e6390
  net: phy: Add 2000base-x, 2500base-x and rxaui modes
  net: dsa: mv88e6xxx: Fix ATU age timer for MV88E6390
  net: dsa: mv88e6xxx: Set the CMODE for mv88e6390 ports 9 & 10
  net: dsa: mv88e6xxx: Workaround missing PHY ID on mv88e6390
  net: phy: Marvell: Add mv88e6390 internal PHY
  net: dsa: mv88e6xxx: Implement Clause 45 access to SMI devices
  net: dsa: mv88e6xxx: Fix typ0 when configuring 2.5Gbps

 .../devicetree/bindings/net/dsa/marvell.txt|   4 +
 Documentation/devicetree/bindings/net/ethernet.txt |   3 +
 drivers/net/dsa/mv88e6xxx/chip.c   | 168 -
 drivers/net/dsa/mv88e6xxx/global2.c| 130 +++-
 drivers/net/dsa/mv88e6xxx/global2.h|   4 +-
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h  |  55 +--
 drivers/net/dsa/mv88e6xxx/port.c   |  66 +++-
 drivers/net/dsa/mv88e6xxx/port.h   |   3 +
 drivers/net/phy/marvell.c  |  20 +++
 include/linux/marvell_phy.h|   6 +
 include/linux/phy.h|   9 ++
 11 files changed, 409 insertions(+), 59 deletions(-)

-- 
2.11.0

[PATCH net] net: phy: Avoid deadlock during phy_error()

2017-01-20 Thread Florian Fainelli

phy_error() is called in the PHY state machine workqueue context, and
calls phy_trigger_machine() which does a cancel_delayed_work_sync() of
the workqueue we execute from, causing a deadlock situation.

Augment phy_trigger_machine() with a sync boolean indicating
whether we should use cancel_*_sync() or just cancel_*_work().

Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
Reported-by: Russell King 
Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/phy.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 48da6e93c3f7..e687a9cb4a37 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -649,14 +649,18 @@ void phy_start_machine(struct phy_device *phydev)
  * phy_trigger_machine - trigger the state machine to run
  *
  * @phydev: the phy_device struct
+ * @sync: indicate whether we should wait for the workqueue cancelation
  *
  * Description: There has been a change in state which requires that the
  *   state machine runs.
  */
 
-static void phy_trigger_machine(struct phy_device *phydev)
+static void phy_trigger_machine(struct phy_device *phydev, bool sync)
 {
-   cancel_delayed_work_sync(&phydev->state_queue);
+   if (sync)
+   cancel_delayed_work_sync(&phydev->state_queue);
+   else
+   cancel_delayed_work(&phydev->state_queue);
queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, 0);
 }
 
@@ -693,7 +697,7 @@ static void phy_error(struct phy_device *phydev)
phydev->state = PHY_HALTED;
mutex_unlock(&phydev->lock);
 
-   phy_trigger_machine(phydev);
+   phy_trigger_machine(phydev, false);
 }
 
 /**
@@ -840,7 +844,7 @@ void phy_change(struct phy_device *phydev)
}
 
/* reschedule state queue work to run as soon as possible */
-   phy_trigger_machine(phydev);
+   phy_trigger_machine(phydev, true);
return;
 
 ignore:
@@ -942,7 +946,7 @@ void phy_start(struct phy_device *phydev)
if (do_resume)
phy_resume(phydev);
 
-   phy_trigger_machine(phydev);
+   phy_trigger_machine(phydev, true);
 }
 EXPORT_SYMBOL(phy_start);
 
-- 
2.11.0
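
The idea behind the sync flag can be modelled in plain C as below. This
is a simplified, single-threaded model with hypothetical demo_* names: a
synchronous cancel issued from inside the handler would wait for itself
forever, which the model reports as -EDEADLK instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct demo_work {
	bool pending;		/* work is queued to run */
	bool in_handler;	/* caller is running inside the work handler */
};

/* Cancel and wait for the handler to finish. */
static int demo_cancel_sync(struct demo_work *w)
{
	if (w->in_handler)
		return -EDEADLK;	/* a real wait here would never return */

	w->pending = false;
	return 0;
}

/* Cancel without waiting; safe from any context in this model. */
static void demo_cancel(struct demo_work *w)
{
	w->pending = false;
}

/* Re-queue the work, choosing the cancel flavour via 'sync'. */
static int demo_trigger(struct demo_work *w, bool sync)
{
	assert(w);

	if (sync) {
		int err = demo_cancel_sync(w);

		if (err)
			return err;
	} else {
		demo_cancel(w);
	}

	w->pending = true;
	return 0;
}
```

In this model the error path, which runs from the handler, passes false,
while callers in other contexts can keep passing true.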

[PATCH net-next 7/8] net: dsa: mv88e6xxx: Implement Clause 45 access to SMI devices

2017-01-20 Thread Andrew Lunn

The mv88e6390 SERDES devices need clause 45 MDIO to access them.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/global2.c   | 108 --
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |   7 +++
 2 files changed, 111 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/global2.c 
b/drivers/net/dsa/mv88e6xxx/global2.c
index e2b00d1b0c28..d1a0b7d3ebb8 100644
--- a/drivers/net/dsa/mv88e6xxx/global2.c
+++ b/drivers/net/dsa/mv88e6xxx/global2.c
@@ -501,8 +501,60 @@ static int mv88e6xxx_g2_smi_phy_cmd(struct mv88e6xxx_chip 
*chip, u16 cmd)
return mv88e6xxx_g2_smi_phy_wait(chip);
 }
 
-int mv88e6xxx_g2_smi_phy_read(struct mv88e6xxx_chip *chip, int addr, int reg,
- u16 *val, bool external)
+static int mv88e6xxx_g2_smi_phy_write_addr(struct mv88e6xxx_chip *chip,
+  int addr, int device, int reg,
+  bool external)
+{
+   int cmd = SMI_CMD_OP_45_WRITE_ADDR | (addr << 5) | device;
+   int err;
+
+   if (external)
+   cmd |= GLOBAL2_SMI_PHY_CMD_EXTERNAL;
+
+   err = mv88e6xxx_g2_smi_phy_wait(chip);
+   if (err)
+   return err;
+
+   err = mv88e6xxx_g2_write(chip, GLOBAL2_SMI_PHY_DATA, reg);
+   if (err)
+   return err;
+
+   return mv88e6xxx_g2_smi_phy_cmd(chip, cmd);
+}
+
+int mv88e6xxx_g2_smi_phy_read_c45(struct mv88e6xxx_chip *chip, int addr,
+ int reg_c45, u16 *val, bool external)
+{
+   int device = (reg_c45 >> 16) & 0x1f;
+   int reg = reg_c45 & 0xffff;
+   int err;
+   u16 cmd;
+
+   err = mv88e6xxx_g2_smi_phy_write_addr(chip, addr, device, reg,
+ external);
+   if (err)
+   return err;
+
+   cmd = GLOBAL2_SMI_PHY_CMD_OP_45_READ_DATA | (addr << 5) | device;
+
+   if (external)
+   cmd |= GLOBAL2_SMI_PHY_CMD_EXTERNAL;
+
+   err = mv88e6xxx_g2_smi_phy_cmd(chip, cmd);
+   if (err)
+   return err;
+
+   err = mv88e6xxx_g2_read(chip, GLOBAL2_SMI_PHY_DATA, val);
+   if (err)
+   return err;
+
+   err = *val;
+
+   return 0;
+}
+
+int mv88e6xxx_g2_smi_phy_read_c22(struct mv88e6xxx_chip *chip, int addr,
+ int reg, u16 *val, bool external)
 {
u16 cmd = GLOBAL2_SMI_PHY_CMD_OP_22_READ_DATA | (addr << 5) | reg;
int err;
@@ -535,8 +587,46 @@ int mv88e6xxx_g2_smi_phy_read(struct mv88e6xxx_chip *chip, 
int addr, int reg,
return 0;
 }
 
-int mv88e6xxx_g2_smi_phy_write(struct mv88e6xxx_chip *chip, int addr, int reg,
-  u16 val, bool external)
+int mv88e6xxx_g2_smi_phy_read(struct mv88e6xxx_chip *chip, int addr, int reg,
+ u16 *val, bool external)
+{
+   if (reg & MII_ADDR_C45)
+   return mv88e6xxx_g2_smi_phy_read_c45(chip, addr, reg, val,
+external);
+   return mv88e6xxx_g2_smi_phy_read_c22(chip, addr, reg, val, external);
+}
+
+int mv88e6xxx_g2_smi_phy_write_c45(struct mv88e6xxx_chip *chip, int addr,
+  int reg_c45, u16 val, bool external)
+{
+   int device = (reg_c45 >> 16) & 0x1f;
+   int reg = reg_c45 & 0xffff;
+   int err;
+   u16 cmd;
+
+   err = mv88e6xxx_g2_smi_phy_write_addr(chip, addr, device, reg,
+ external);
+   if (err)
+   return err;
+
+   cmd = GLOBAL2_SMI_PHY_CMD_OP_45_WRITE_DATA | (addr << 5) | device;
+
+   if (external)
+   cmd |= GLOBAL2_SMI_PHY_CMD_EXTERNAL;
+
+   err = mv88e6xxx_g2_write(chip, GLOBAL2_SMI_PHY_DATA, val);
+   if (err)
+   return err;
+
+   err = mv88e6xxx_g2_smi_phy_cmd(chip, cmd);
+   if (err)
+   return err;
+
+   return 0;
+}
+
+int mv88e6xxx_g2_smi_phy_write_c22(struct mv88e6xxx_chip *chip, int addr,
+  int reg, u16 val, bool external)
 {
u16 cmd = GLOBAL2_SMI_PHY_CMD_OP_22_WRITE_DATA | (addr << 5) | reg;
int err;
@@ -555,6 +645,16 @@ int mv88e6xxx_g2_smi_phy_write(struct mv88e6xxx_chip 
*chip, int addr, int reg,
return mv88e6xxx_g2_smi_phy_cmd(chip, cmd);
 }
 
+int mv88e6xxx_g2_smi_phy_write(struct mv88e6xxx_chip *chip, int addr, int reg,
+  u16 val, bool external)
+{
+   if (reg & MII_ADDR_C45)
+   return mv88e6xxx_g2_smi_phy_write_c45(chip, addr, reg, val,
+ external);
+
+   return mv88e6xxx_g2_smi_phy_write_c22(chip, addr, reg, val, external);
+}
+
 static void mv88e6xxx_g2_irq_mask(struct irq_data *d)
 {
struct mv88e6xxx_chip *chip = irq_data_get_irq_chip_data(d);
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
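
The read/write dispatchers in this patch pick the clause 45 path when the register number carries the MII_ADDR_C45 flag, then split the number into a device and a register. A minimal user-space sketch of that split follows. It assumes the flag is bit 30 and that the register occupies the low 16 bits, as in the kernel's usual clause 45 convention; all SKETCH_/c45_ names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors the kernel's MII_ADDR_C45 flag (bit 30). */
#define SKETCH_MII_ADDR_C45 (1u << 30)

struct c45_addr {
	unsigned int devad;	/* device (MMD) number, 5 bits */
	unsigned int reg;	/* register within the device, 16 bits */
};

/* Build a combined register number the way a caller would pass it in. */
static uint32_t c45_encode(unsigned int devad, unsigned int reg)
{
	return SKETCH_MII_ADDR_C45 | ((devad & 0x1f) << 16) | (reg & 0xffff);
}

/* Returns 1 and fills *out for a clause 45 access, 0 for clause 22. */
static int c45_decode(uint32_t regnum, struct c45_addr *out)
{
	if (!(regnum & SKETCH_MII_ADDR_C45))
		return 0;

	out->devad = (regnum >> 16) & 0x1f;
	out->reg = regnum & 0xffff;
	return 1;
}
```

A plain clause 22 register number such as 0x1f has the flag clear, so c45_decode() returns 0 and the caller would fall through to the clause 22 path.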

[PATCH net-next 6/8] net: phy: Marvell: Add mv88e6390 internal PHY

2017-01-20 Thread Andrew Lunn

The mv88e6390 Ethernet switch has internal PHYs. These PHYs don't have
a model ID in the ID2 register, so the MDIO driver in the switch
intercepts reads to this register and returns the switch family ID.
Extend the Marvell PHY driver by including this ID, and treat the PHY
as an 88E1540.

Signed-off-by: Andrew Lunn 
---
 drivers/net/phy/marvell.c   | 20 
 include/linux/marvell_phy.h |  6 ++
 2 files changed, 26 insertions(+)

diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 64229976ace1..d02bb58a8c99 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -2141,6 +2141,25 @@ static struct phy_driver marvell_drivers[] = {
.get_strings = marvell_get_strings,
.get_stats = marvell_get_stats,
},
+   {
+   .phy_id = MARVELL_PHY_ID_88E6390,
+   .phy_id_mask = MARVELL_PHY_ID_MASK,
+   .name = "Marvell 88E6390",
+   .features = PHY_GBIT_FEATURES,
+   .flags = PHY_HAS_INTERRUPT,
+   .probe = marvell_probe,
+   .config_init = _config_init,
+   .config_aneg = _config_aneg,
+   .read_status = _read_status,
+   .ack_interrupt = _ack_interrupt,
+   .config_intr = _config_intr,
+   .did_interrupt = _did_interrupt,
+   .resume = _resume,
+   .suspend = _suspend,
+   .get_sset_count = marvell_get_sset_count,
+   .get_strings = marvell_get_strings,
+   .get_stats = marvell_get_stats,
+   },
 };
 
 module_phy_driver(marvell_drivers);
@@ -2159,6 +2178,7 @@ static struct mdio_device_id __maybe_unused marvell_tbl[] 
= {
{ MARVELL_PHY_ID_88E1510, MARVELL_PHY_ID_MASK },
{ MARVELL_PHY_ID_88E1540, MARVELL_PHY_ID_MASK },
{ MARVELL_PHY_ID_88E3016, MARVELL_PHY_ID_MASK },
+   { MARVELL_PHY_ID_88E6390, MARVELL_PHY_ID_MASK },
{ }
 };
 
diff --git a/include/linux/marvell_phy.h b/include/linux/marvell_phy.h
index a57f0dfb6db7..3d616d7f65bf 100644
--- a/include/linux/marvell_phy.h
+++ b/include/linux/marvell_phy.h
@@ -19,6 +19,12 @@
 #define MARVELL_PHY_ID_88E1540 0x01410eb0
 #define MARVELL_PHY_ID_88E3016 0x01410e60
 
+/* The MV88e6390 Ethernet switch contains embedded PHYs. These PHYs do
+ * not have a model ID. So the switch driver traps reads to the ID2
+ * register and returns the switch family ID
+ */
+#define MARVELL_PHY_ID_88E6390 0x01410f90
+
 /* struct phy_device dev_flags definitions */
 #define MARVELL_PHY_M1145_FLAGS_RESISTANCE 0x0001
 #define MARVELL_PHY_M1118_DNS323_LEDS  0x0002
-- 
2.11.0
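
For readers unfamiliar with how an ID table entry like the one added above ends up selecting a driver, here is a rough user-space sketch of masked PHY ID matching. The two IDs are the values from marvell_phy.h in this patch; the mask value and every sketch_ name are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ID_88E1540 0x01410eb0u
#define SKETCH_ID_88E6390 0x01410f90u
/* Assumption: the low four bits hold the revision and are ignored. */
#define SKETCH_ID_MASK    0xfffffff0u

struct sketch_id_entry {
	uint32_t id;
	uint32_t mask;
	const char *name;
};

static const struct sketch_id_entry sketch_tbl[] = {
	{ SKETCH_ID_88E1540, SKETCH_ID_MASK, "88E1540" },
	{ SKETCH_ID_88E6390, SKETCH_ID_MASK, "88E6390" },
};

/* Return the name of the first entry whose masked ID matches, or NULL. */
static const char *sketch_match(uint32_t phy_id)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_tbl) / sizeof(sketch_tbl[0]); i++) {
		const struct sketch_id_entry *e = &sketch_tbl[i];

		if ((phy_id & e->mask) == (e->id & e->mask))
			return e->name;
	}
	return NULL;
}
```

With this mask any revision of the 6390 family ID matches the new entry, while an ID with no model bits at all (0x01410000) matches nothing, which is why the switch driver has to substitute the family ID.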

[PATCH net-next 4/8] net: dsa: mv88e6xxx: Set the CMODE for mv88e6390 ports 9 & 10

2017-01-20 Thread Andrew Lunn

Unlike most ports, ports 9 and 10 of the 6390X family have configurable
PHY modes. Set the mode as part of adjust_link().

Ordering is important, because the SERDES interfaces connected to
ports 9 and 10 can be split and assigned to other ports. The CMODE has
to be correctly set before the SERDES interface on another port can be
configured. Such configuration is likely to be performed in
port_enable() and port_disabled(), called on slave_open() and
slave_close().

The simple case is when ports 9 and 10 are used for 'CPU' or 'DSA'. In
this case, the CMODE is set via a phy-mode in dsa_cpu_dsa_setup(), which
is called early in the switch setup.

When ports 9 or 10 are used as user ports and have a fixed-phy,
dsa_slave_adjust_link() is called when the fixed-phy is attached. This
results in the adjust_link function being called, setting the cmode.
The port_enable() for other ports will be called much later.

When ports 9 or 10 are used as user ports and have a real phy attached
which does not use all the available SERDES interface, e.g. a 1Gbps
SGMII, there is currently no mechanism in place to set the CMODE of
the port from software. It must be hoped the strapping resistors are
correct.

At the same time, add a function to get the cmode. This will be needed
when configuring the SERDES interfaces.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c  |  8 +
 drivers/net/dsa/mv88e6xxx/mv88e6xxx.h |  9 +
 drivers/net/dsa/mv88e6xxx/port.c  | 64 +++
 drivers/net/dsa/mv88e6xxx/port.h  |  3 ++
 4 files changed, 84 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index e79c8d09a95f..4c53db6ceffc 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -718,6 +718,12 @@ static int mv88e6xxx_port_setup_mac(struct mv88e6xxx_chip 
*chip, int port,
goto restore_link;
}
 
+   if (chip->info->ops->port_set_cmode) {
+   err = chip->info->ops->port_set_cmode(chip, port, mode);
+   if (err && err != -EOPNOTSUPP)
+   goto restore_link;
+   }
+
err = 0;
 restore_link:
if (chip->info->ops->port_set_link(chip, port, link))
@@ -3497,6 +3503,7 @@ static const struct mv88e6xxx_ops mv88e6290_ops = {
.port_set_egress_unknowns = mv88e6351_port_set_egress_unknowns,
.port_set_ether_type = mv88e6351_port_set_ether_type,
.port_pause_config = mv88e6390_port_pause_config,
+   .port_set_cmode = mv88e6390x_port_set_cmode,
.stats_snapshot = mv88e6390_g1_stats_snapshot,
.stats_set_histogram = mv88e6390_g1_stats_set_histogram,
.stats_get_sset_count = mv88e6320_stats_get_sset_count,
@@ -3688,6 +3695,7 @@ static const struct mv88e6xxx_ops mv88e6390x_ops = {
.port_jumbo_config = mv88e6165_port_jumbo_config,
.port_egress_rate_limiting = mv88e6097_port_egress_rate_limiting,
.port_pause_config = mv88e6390_port_pause_config,
+   .port_set_cmode = mv88e6390x_port_set_cmode,
.stats_snapshot = mv88e6390_g1_stats_snapshot,
.stats_set_histogram = mv88e6390_g1_stats_set_histogram,
.stats_get_sset_count = mv88e6320_stats_get_sset_count,
diff --git a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h 
b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
index b7f60a1bd4dc..7961467f709e 100644
--- a/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
+++ b/drivers/net/dsa/mv88e6xxx/mv88e6xxx.h
@@ -58,6 +58,9 @@
 #define PORT_STATUS_CMODE_100BASE_X0x8
 #define PORT_STATUS_CMODE_1000BASE_X   0x9
 #define PORT_STATUS_CMODE_SGMII0xa
+#define PORT_STATUS_CMODE_2500BASEX0xb
+#define PORT_STATUS_CMODE_XAUI 0xc
+#define PORT_STATUS_CMODE_RXAUI0xd
 #define PORT_PCS_CTRL  0x01
 #define PORT_PCS_CTRL_RGMII_DELAY_RXCLKBIT(15)
 #define PORT_PCS_CTRL_RGMII_DELAY_TXCLKBIT(14)
@@ -833,6 +836,12 @@ struct mv88e6xxx_ops {
int (*port_egress_rate_limiting)(struct mv88e6xxx_chip *chip, int port);
int (*port_pause_config)(struct mv88e6xxx_chip *chip, int port);
 
+   /* CMODE control what PHY mode the MAC will use, eg. SGMII, RGMII, etc.
+* Some chips allow this to be configured on specific ports.
+*/
+   int (*port_set_cmode)(struct mv88e6xxx_chip *chip, int port,
+ phy_interface_t mode);
+
/* Snapshot the statistics for a port. The statistics can then
 * be read back a leisure but still with a consistent view.
 */
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c
index 0db7fa0373ae..e253ecb6624b 100644
--- a/drivers/net/dsa/mv88e6xxx/port.c
+++ b/drivers/net/dsa/mv88e6xxx/port.c
@@ -11,6 +11,7 @@
  * (at your option) any later version.
  */
 
+#include 
 #include "mv88e6xxx.h"
 #include "port.h"
 
@@ -304,6 +305,69 @@ int mv88e6390x_port_set_speed(struct
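
The port.c hunk is cut off above, so the following is only a sketch of the general shape such a CMODE setter and its caller could take, not the code from the patch. The CMODE values are the PORT_STATUS_CMODE_* defines shown earlier; the enum, the function names, and the choice of -EINVAL for unsupported modes are assumptions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's phy_interface_t. */
enum sketch_iface {
	SK_IFACE_RGMII,
	SK_IFACE_SGMII,
	SK_IFACE_1000BASEX,
	SK_IFACE_2500BASEX,
	SK_IFACE_XAUI,
	SK_IFACE_RXAUI,
};

/* Values taken from the PORT_STATUS_CMODE_* defines in the patch. */
#define SK_CMODE_1000BASE_X 0x9
#define SK_CMODE_SGMII      0xa
#define SK_CMODE_2500BASEX  0xb
#define SK_CMODE_XAUI       0xc
#define SK_CMODE_RXAUI      0xd

/* Only ports 9 and 10 are configurable; other ports are "not supported". */
static int sketch_port_set_cmode(int port, enum sketch_iface mode, int *cmode)
{
	if (port != 9 && port != 10)
		return -EOPNOTSUPP;

	switch (mode) {
	case SK_IFACE_1000BASEX:
		*cmode = SK_CMODE_1000BASE_X;
		break;
	case SK_IFACE_SGMII:
		*cmode = SK_CMODE_SGMII;
		break;
	case SK_IFACE_2500BASEX:
		*cmode = SK_CMODE_2500BASEX;
		break;
	case SK_IFACE_XAUI:
		*cmode = SK_CMODE_XAUI;
		break;
	case SK_IFACE_RXAUI:
		*cmode = SK_CMODE_RXAUI;
		break;
	default:
		return -EINVAL;	/* sketch's own choice for unknown modes */
	}
	return 0;
}

/* Caller pattern from the chip.c hunk: -EOPNOTSUPP is not a failure. */
static int sketch_setup_mac(int port, enum sketch_iface mode, int *cmode)
{
	int err = sketch_port_set_cmode(port, mode, cmode);

	if (err && err != -EOPNOTSUPP)
		return err;
	return 0;
}
```

Treating -EOPNOTSUPP as success lets the same setup path run for every port without the caller needing to know which ports have a configurable CMODE.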

[PATCH] [net-next] net: qcom/emac: rename emac_phy to emac_sgmii and move it

2017-01-20 Thread Timur Tabi

The EMAC has an internal PHY that is often called the "SGMII".  This
SGMII is also connected to an external PHY, which is managed by phylib.
These dual PHYs often cause confusion.  In this case, the data structure
for managing the SGMII was mis-named and located in the wrong header file.

Structure emac_phy is renamed to emac_sgmii to clearly indicate it applies
to the internal PHY only.  It is also moved from emac-phy.h (which
supports the external PHY) to emac-sgmii.h (where it belongs).

To keep the changes minimal, only the structure name is changed, not
the names of any variables of that type.

Signed-off-by: Timur Tabi 
---
 drivers/net/ethernet/qualcomm/emac/emac-phy.c   |  2 --
 drivers/net/ethernet/qualcomm/emac/emac-phy.h   | 13 -
 drivers/net/ethernet/qualcomm/emac/emac-sgmii-fsm9900.c |  2 +-
 drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2400.c |  2 +-
 drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2432.c |  2 +-
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.c |  8 
 drivers/net/ethernet/qualcomm/emac/emac-sgmii.h | 13 +
 drivers/net/ethernet/qualcomm/emac/emac.c   |  2 +-
 drivers/net/ethernet/qualcomm/emac/emac.h   |  3 ++-
 9 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.c 
b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
index 2851b4c..1d7852f 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.c
@@ -22,8 +22,6 @@
 #include 
 #include "emac.h"
 #include "emac-mac.h"
-#include "emac-phy.h"
-#include "emac-sgmii.h"
 
 /* EMAC base register offsets */
 #define EMAC_MDIO_CTRL0x001414
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-phy.h 
b/drivers/net/ethernet/qualcomm/emac/emac-phy.h
index 49f3701..c0c301c 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-phy.h
+++ b/drivers/net/ethernet/qualcomm/emac/emac-phy.h
@@ -13,19 +13,6 @@
 #ifndef _EMAC_PHY_H_
 #define _EMAC_PHY_H_
 
-typedef int (*emac_sgmii_initialize)(struct emac_adapter *adpt);
-
-/** emac_phy - internal emac phy
- * @base base address
- * @digital per-lane digital block
- * @initialize initialization function
- */
-struct emac_phy {
-   void __iomem*base;
-   void __iomem*digital;
-   emac_sgmii_initialize   initialize;
-};
-
 struct emac_adapter;
 
 int emac_phy_config(struct platform_device *pdev, struct emac_adapter *adpt);
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii-fsm9900.c 
b/drivers/net/ethernet/qualcomm/emac/emac-sgmii-fsm9900.c
index af690e1..10de8d0 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii-fsm9900.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii-fsm9900.c
@@ -214,7 +214,7 @@ static void emac_reg_write_all(void __iomem *base,
 
 int emac_sgmii_init_fsm9900(struct emac_adapter *adpt)
 {
-   struct emac_phy *phy = &adpt->phy;
+   struct emac_sgmii *phy = &adpt->phy;
unsigned int i;
 
emac_reg_write_all(phy->base, physical_coding_sublayer_programming,
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2400.c 
b/drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2400.c
index 5b84194..f62c215 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2400.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2400.c
@@ -174,7 +174,7 @@ static void emac_reg_write_all(void __iomem *base,
 
 int emac_sgmii_init_qdf2400(struct emac_adapter *adpt)
 {
-   struct emac_phy *phy = &adpt->phy;
+   struct emac_sgmii *phy = &adpt->phy;
void __iomem *phy_regs = phy->base;
void __iomem *laned = phy->digital;
unsigned int i;
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2432.c 
b/drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2432.c
index 6170200..b9c0df7 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2432.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii-qdf2432.c
@@ -167,7 +167,7 @@ static void emac_reg_write_all(void __iomem *base,
 
 int emac_sgmii_init_qdf2432(struct emac_adapter *adpt)
 {
-   struct emac_phy *phy = &adpt->phy;
+   struct emac_sgmii *phy = &adpt->phy;
void __iomem *phy_regs = phy->base;
void __iomem *laned = phy->digital;
unsigned int i;
diff --git a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c 
b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
index bf722a9..0149b52 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-sgmii.c
@@ -50,7 +50,7 @@
 static int emac_sgmii_link_init(struct emac_adapter *adpt)
 {
struct phy_device *phydev = adpt->phydev;
-   struct emac_phy *phy = &adpt->phy;
+   struct emac_sgmii *phy = &adpt->phy;
u32 val;
 
val = readl(phy->base + EMAC_SGMII_PHY_AUTONEG_CFG2);
@@ -89,7 +89,7 @@ static int emac_sgmii_link_init(struct emac_adapter *adpt)
 
 static int

[PATCH] [net-next][v2] net: qcom/emac: claim the irq only when the device is opened

2017-01-20 Thread Timur Tabi

During reset, functions emac_mac_down() and emac_mac_up() are called,
so we don't want to free and claim the IRQ unnecessarily.  Move those
operations to open/close.

Signed-off-by: Timur Tabi 
---

Notes:
v2: keep synchronize_irq call where it is

 drivers/net/ethernet/qualcomm/emac/emac-mac.c | 13 -
 drivers/net/ethernet/qualcomm/emac/emac.c | 11 +++
 drivers/net/ethernet/qualcomm/emac/emac.h |  1 -
 3 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac-mac.c 
b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
index 98570eb..e4793d7 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac-mac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac-mac.c
@@ -314,8 +314,6 @@ struct emac_skb_cb {
RX_PKT_INT2 |\
RX_PKT_INT3)
 
-#define EMAC_MAC_IRQ_RES   "core0"
-
 void emac_mac_multicast_addr_set(struct emac_adapter *adpt, u8 *addr)
 {
u32 crc32, bit, reg, mta;
@@ -977,26 +975,16 @@ static void emac_adjust_link(struct net_device *netdev)
 int emac_mac_up(struct emac_adapter *adpt)
 {
struct net_device *netdev = adpt->netdev;
-   struct emac_irq *irq = &adpt->irq;
int ret;
 
emac_mac_rx_tx_ring_reset_all(adpt);
emac_mac_config(adpt);
-
-   ret = request_irq(irq->irq, emac_isr, 0, EMAC_MAC_IRQ_RES, irq);
-   if (ret) {
-   netdev_err(adpt->netdev, "could not request %s irq\n",
-  EMAC_MAC_IRQ_RES);
-   return ret;
-   }
-
emac_mac_rx_descs_refill(adpt, &adpt->rx_q);
 
ret = phy_connect_direct(netdev, adpt->phydev, emac_adjust_link,
 PHY_INTERFACE_MODE_SGMII);
if (ret) {
netdev_err(adpt->netdev, "could not connect phy\n");
-   free_irq(irq->irq, irq);
return ret;
}
 
@@ -1030,7 +1018,6 @@ void emac_mac_down(struct emac_adapter *adpt)
writel(DIS_INT, adpt->base + EMAC_INT_STATUS);
writel(0, adpt->base + EMAC_INT_MASK);
synchronize_irq(adpt->irq.irq);
-   free_irq(adpt->irq.irq, &adpt->irq);
 
phy_disconnect(adpt->phydev);
 
diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c 
b/drivers/net/ethernet/qualcomm/emac/emac.c
index b74ec7f..3e1be91 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac.c
@@ -256,18 +256,27 @@ static int emac_change_mtu(struct net_device *netdev, int 
new_mtu)
 static int emac_open(struct net_device *netdev)
 {
struct emac_adapter *adpt = netdev_priv(netdev);
+   struct emac_irq *irq = &adpt->irq;
int ret;
 
+   ret = request_irq(irq->irq, emac_isr, 0, "emac-core0", irq);
+   if (ret) {
+   netdev_err(adpt->netdev, "could not request emac-core0 irq\n");
+   return ret;
+   }
+
/* allocate rx/tx dma buffer & descriptors */
ret = emac_mac_rx_tx_rings_alloc_all(adpt);
if (ret) {
netdev_err(adpt->netdev, "error allocating rx/tx rings\n");
+   free_irq(irq->irq, irq);
return ret;
}
 
ret = emac_mac_up(adpt);
if (ret) {
emac_mac_rx_tx_rings_free_all(adpt);
+   free_irq(irq->irq, irq);
return ret;
}
 
@@ -286,6 +295,8 @@ static int emac_close(struct net_device *netdev)
emac_mac_down(adpt);
emac_mac_rx_tx_rings_free_all(adpt);
 
+   free_irq(adpt->irq.irq, &adpt->irq);
+
mutex_unlock(&adpt->reset_lock);
 
return 0;
diff --git a/drivers/net/ethernet/qualcomm/emac/emac.h 
b/drivers/net/ethernet/qualcomm/emac/emac.h
index 1368440..2725507 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.h
+++ b/drivers/net/ethernet/qualcomm/emac/emac.h
@@ -331,7 +331,6 @@ struct emac_adapter {
 
 int emac_reinit_locked(struct emac_adapter *adpt);
 void emac_reg_update32(void __iomem *addr, u32 mask, u32 val);
-irqreturn_t emac_isr(int irq, void *data);
 
 void emac_set_ethtool_ops(struct net_device *netdev);
 void emac_update_hw_stats(struct emac_adapter *adpt);
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
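
The ownership rule this patch establishes (IRQ claimed in open, released in close, untouched by the down/up pair used for reset) can be modelled in a few lines of user-space C. This is a sketch under stated assumptions: the struct, the flags and the failure injection are all hypothetical, and it uses goto-style unwinding where the patch repeats free_irq() in each error branch.

```c
#include <assert.h>

/* Hypothetical device state; each flag stands in for a real resource. */
struct sketch_dev {
	int irq_held;		/* request_irq() succeeded */
	int rings_allocated;	/* rx/tx rings exist */
	int mac_up;		/* MAC is running */
	int fail_rings;		/* inject a ring allocation failure */
	int fail_mac_up;	/* inject a MAC bring-up failure */
};

static int sketch_open(struct sketch_dev *d)
{
	d->irq_held = 1;		/* claim the IRQ first */

	if (d->fail_rings)
		goto err_free_irq;
	d->rings_allocated = 1;

	if (d->fail_mac_up)
		goto err_free_rings;
	d->mac_up = 1;
	return 0;

err_free_rings:
	d->rings_allocated = 0;
err_free_irq:
	d->irq_held = 0;		/* every failure path releases the IRQ */
	return -1;
}

/* Reset is down followed by up; the IRQ stays claimed throughout. */
static void sketch_reset(struct sketch_dev *d)
{
	d->mac_up = 0;
	d->mac_up = 1;
}

static void sketch_close(struct sketch_dev *d)
{
	d->mac_up = 0;
	d->rings_allocated = 0;
	d->irq_held = 0;		/* release only on close */
}
```

The point of the model is that sketch_reset() never touches irq_held, so a reset cannot fail because the IRQ could not be re-requested.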

Re: [PATCH v5 2/2] net: dsa: mv88e6xxx: Add support for ethernet switch 88E6341

2017-01-20 Thread Andrew Lunn

On Fri, Jan 20, 2017 at 12:30:16PM -0500, Vivien Didelot wrote:
> Hi Gregory,
> 
> Gregory CLEMENT  writes:
> 
> > If there a series about to be merged I can rebase my series on it. Else
> > I propose to keep it and convert the family check to ops when you will
> > send the series for it.
> 
> I am reworking the VTU operations, but not these port operations yet.

I'm working on them. I have a patch ready for testing on my hardware.

Andrew

[PATCH] net: adaptec: starfire: add checks for dma mapping errors

2017-01-20 Thread Alexey Khoroshilov

init_ring() and refill_rx_ring() don't check if mapping dma memory succeeds.
The patch adds the checks and failure handling.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov 
---
 drivers/net/ethernet/adaptec/starfire.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/ethernet/adaptec/starfire.c 
b/drivers/net/ethernet/adaptec/starfire.c
index c12d2618eebf..e27043d051e1 100644
--- a/drivers/net/ethernet/adaptec/starfire.c
+++ b/drivers/net/ethernet/adaptec/starfire.c
@@ -1152,6 +1152,12 @@ static void init_ring(struct net_device *dev)
if (skb == NULL)
break;
np->rx_info[i].mapping = pci_map_single(np->pci_dev, skb->data, 
np->rx_buf_sz, PCI_DMA_FROMDEVICE);
+   if (pci_dma_mapping_error(np->pci_dev,
+ np->rx_info[i].mapping)) {
+   dev_kfree_skb(skb);
+   np->rx_info[i].skb = NULL;
+   break;
+   }
/* Grrr, we cannot offset to correctly align the IP header. */
np->rx_ring[i].rxaddr = cpu_to_dma(np->rx_info[i].mapping | 
RxDescValid);
}
@@ -1569,6 +1575,12 @@ static void refill_rx_ring(struct net_device *dev)
break;  /* Better luck next round. */
np->rx_info[entry].mapping =
pci_map_single(np->pci_dev, skb->data, 
np->rx_buf_sz, PCI_DMA_FROMDEVICE);
+   if (pci_dma_mapping_error(np->pci_dev,
+   np->rx_info[entry].mapping)) {
+   dev_kfree_skb(skb);
+   np->rx_info[entry].skb = NULL;
+   break;
+   }
np->rx_ring[entry].rxaddr =
cpu_to_dma(np->rx_info[entry].mapping | 
RxDescValid);
}
-- 
2.7.4
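
The failure handling added by this patch follows a common ring-fill pattern: allocate a buffer, map it, and on a mapping error free the buffer, clear the slot and stop filling. A user-space sketch of that pattern is below. The mapper, the error sentinel and all sk_ names are hypothetical stand-ins; nothing here calls the real PCI DMA API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_RING      4
#define SK_MAP_ERROR 0UL	/* hypothetical "mapping failed" sentinel */

struct sk_slot {
	void *buf;
	unsigned long mapping;
};

/* Hypothetical mapper: fails for the slot index given by fail_at. */
static unsigned long sk_map(void *buf, int idx, int fail_at)
{
	return idx == fail_at ? SK_MAP_ERROR : (unsigned long)(uintptr_t)buf;
}

/* Fill the ring; returns how many slots were successfully populated. */
static int sk_fill_ring(struct sk_slot *ring, int fail_at)
{
	int i;

	for (i = 0; i < SK_RING; i++) {
		void *buf = malloc(64);

		if (!buf)
			break;

		ring[i].mapping = sk_map(buf, i, fail_at);
		if (ring[i].mapping == SK_MAP_ERROR) {
			/* Undo the allocation and leave the slot empty. */
			free(buf);
			ring[i].buf = NULL;
			break;
		}
		ring[i].buf = buf;
	}
	return i;
}

static void sk_free_ring(struct sk_slot *ring)
{
	int i;

	for (i = 0; i < SK_RING; i++) {
		free(ring[i].buf);
		ring[i].buf = NULL;
	}
}
```

Stopping at the first failure leaves a shorter but consistent ring: every populated slot has both a buffer and a valid mapping.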

Re: fs, net: deadlock between bind/splice on af_unix

2017-01-20 Thread Dmitry Vyukov

On Fri, Jan 20, 2017 at 5:57 AM, Cong Wang  wrote:
>> > Why do we do autobind there, anyway, and why is it conditional on
>> > SOCK_PASSCRED?  Note that e.g. for SOCK_STREAM we can bloody well get
>> > to sending stuff without autobind ever done - just use socketpair()
>> > to create that sucker and we won't be going through the connect()
>> > at all.
>>
>> In the case Dmitry reported, unix_dgram_sendmsg() calls unix_autobind(),
>> not SOCK_STREAM.
>
> Yes, I've noticed.  What I'm asking is what in there needs autobind 
> triggered
> on sendmsg and why doesn't the same need affect the SOCK_STREAM case?
>
>> I guess some lock, perhaps the u->bindlock could be dropped before
>> acquiring the next one (sb_writer), but I need to double check.
>
> Bad idea, IMO - do you *want* autobind being able to come through while
> bind(2) is busy with mknod?


 Ping. This is still happening on HEAD.

>>>
>>> Thanks for your reminder. Mind to give the attached patch (compile only)
>>> a try? I take another approach to fix this deadlock, which moves the
>>> unix_mknod() out of unix->bindlock. Not sure if there is any unexpected
>>> impact with this way.
>>
>>
>> I instantly hit:
>>
>
> Oh, sorry about it, I forgot to initialize struct path...
>
> Attached is the updated version, I just did a boot test, no crash at least. ;)
>
> Thanks!

This works! I did not see the deadlock warning, nor any other related crashes.

Tested-by: Dmitry Vyukov

Re: [PATCH v3] net/irda: fix lockdep annotation

2017-01-20 Thread Dmitry Vyukov

On Thu, Jan 19, 2017 at 5:27 PM, David Miller  wrote:
> From: Dmitry Vyukov 
> Date: Thu, 19 Jan 2017 11:05:36 +0100
>
>> Thanks for looking into it! This particular issue bothers my fuzzers
>> considerably. I agree that removing recursion is better.
>> So how do we proceed? Will you mail this as a real patch?
>
> Someone needs to test this:


I've stressed this with the fuzzer for a day.
It gets rid of the lockdep warning, and I did not notice any other
related crashes.

Tested-by: Dmitry Vyukov 

Thanks!


> diff --git a/net/irda/irqueue.c b/net/irda/irqueue.c
> index acbe61c..160dc89 100644
> --- a/net/irda/irqueue.c
> +++ b/net/irda/irqueue.c
> @@ -383,9 +383,6 @@ EXPORT_SYMBOL(hashbin_new);
>   *for deallocating this structure if it's complex. If not the user can
>   *just supply kfree, which should take care of the job.
>   */
> -#ifdef CONFIG_LOCKDEP
> -static int hashbin_lock_depth = 0;
> -#endif
>  int hashbin_delete( hashbin_t* hashbin, FREE_FUNC free_func)
>  {
> irda_queue_t* queue;
> @@ -396,22 +393,27 @@ int hashbin_delete( hashbin_t* hashbin, FREE_FUNC 
> free_func)
> IRDA_ASSERT(hashbin->magic == HB_MAGIC, return -1;);
>
> /* Synchronize */
> -   if ( hashbin->hb_type & HB_LOCK ) {
> -   spin_lock_irqsave_nested(&hashbin->hb_spinlock, flags,
> -hashbin_lock_depth++);
> -   }
> +   if (hashbin->hb_type & HB_LOCK)
> +   spin_lock_irqsave(&hashbin->hb_spinlock, flags);
>
> /*
>  *  Free the entries in the hashbin, TODO: use hashbin_clear when
>  *  it has been shown to work
>  */
> for (i = 0; i < HASHBIN_SIZE; i ++ ) {
> -   queue = dequeue_first((irda_queue_t**) &hashbin->hb_queue[i]);
> -   while (queue ) {
> -   if (free_func)
> -   (*free_func)(queue);
> -   queue = dequeue_first(
> -   (irda_queue_t**) &hashbin->hb_queue[i]);
> +   while (1) {
> +   queue = dequeue_first((irda_queue_t**) &hashbin->hb_queue[i]);
> +
> +   if (!queue)
> +   break;
> +
> +   if (free_func) {
> +   if (hashbin->hb_type & HB_LOCK)
> +   spin_unlock_irqrestore(&hashbin->hb_spinlock, flags);
> +   free_func(queue);
> +   if (hashbin->hb_type & HB_LOCK)
> +   spin_lock_irqsave(&hashbin->hb_spinlock, flags);
> +   }
> +   }
> }
> }
>
> @@ -420,12 +422,8 @@ int hashbin_delete( hashbin_t* hashbin, FREE_FUNC 
> free_func)
> hashbin->magic = ~HB_MAGIC;
>
> /* Release lock */
> -   if ( hashbin->hb_type & HB_LOCK) {
> +   if (hashbin->hb_type & HB_LOCK)
> spin_unlock_irqrestore(&hashbin->hb_spinlock, flags);
> -#ifdef CONFIG_LOCKDEP
> -   hashbin_lock_depth--;
> -#endif
> -   }
>
> /*
>  *  Free the hashbin structure
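
The idea in the quoted diff is to drop the lock around each free_func() call so the callback may itself take the lock without nesting. The sketch below models that in user space with a flag that asserts on re-entry in place of a real spinlock. The list type, the ctx argument on the callback and all sk_ names are assumptions for illustration, not the irda code.

```c
#include <assert.h>
#include <stddef.h>

struct sk_node {
	struct sk_node *next;
};

struct sk_bin {
	struct sk_node *head;
	int locked;	/* stand-in for a non-recursive spinlock */
};

/* Taking the lock twice trips the assert, like a self-deadlock would. */
static void sk_lock(struct sk_bin *b)
{
	assert(!b->locked);
	b->locked = 1;
}

static void sk_unlock(struct sk_bin *b)
{
	assert(b->locked);
	b->locked = 0;
}

typedef void (*sk_free_fn)(struct sk_node *n, void *ctx);

/* Dequeue every node; the lock is released while the callback runs. */
static int sk_delete_all(struct sk_bin *b, sk_free_fn fn, void *ctx)
{
	int freed = 0;

	sk_lock(b);
	for (;;) {
		struct sk_node *n = b->head;

		if (!n)
			break;
		b->head = n->next;

		if (fn) {
			sk_unlock(b);
			fn(n, ctx);
			sk_lock(b);
		}
		freed++;
	}
	sk_unlock(b);
	return freed;
}

/* Example callback that takes the same lock, as a nested delete might. */
static void sk_touch_bin(struct sk_node *n, void *ctx)
{
	struct sk_bin *b = ctx;

	(void)n;
	sk_lock(b);
	sk_unlock(b);
}
```

Because each node is unlinked before the lock is dropped, the callback never sees a node that is still reachable from the list.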

Re: [PATCH v5 0/2] Add support for the ethernet switch on the ESPRESSObin

2017-01-20 Thread Andrew Lunn

> Actually I didn't find anything related to the temperature measurement
> in the datasheet I have. For the 6390 there is a dedicated datasheet for
> the PHY part; for the 6352 it is part of the same datasheet.

Hi Gregory

The temperature sensor changes have landed in net-next. If you have
time, please rebase to it and do some tests. Here are the likely
outcomes:

1) Like the 6390, it does not have a valid PHY product ID. Hence the
Marvell PHY driver is not loaded. You can see the PHY ID in

/sys/bus/mdio_bus/devices/*/phy_id

If it is 0x0141, there is no product ID. I have a workaround for
this.

2) It has a valid phy_id, but it is not known to the marvell driver.
Add an entry to the table at the bottom of drivers/net/phy/marvell.c,
and a new entry in marvell_drivers. I would copy the 1540.


3) The Marvell PHY driver does recognise it, and makes the temperature
available in /sys/class/hwmon/hwmon*/temp1_input. It always returns
-25000mC. Same problem i have with the 6390. No idea how to fix it yet.

4) The Marvell PHY driver does recognise it, and makes the temperature
available in /sys/class/hwmon/hwmon*/temp1_input. The value is O.K. It
all works :-)

Personally, i'm not betting on 4 :-)


Andrew

Re: [PATCH] net: qcom/emac: claim the irq only when the device is opened

2017-01-20 Thread Timur Tabi


On 01/20/2017 03:31 PM, Lino Sanfilippo wrote:


In emac_mac_down() however we need synchronize_irq(), since it ensures
that the irq handler is not running any more when it (synchronize_irq)
returns.


So in general, if a driver disables an interrupt but does not free it, it
should call synchronize_irq()?


--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

Re: [PATCH] net: qcom/emac: claim the irq only when the device is opened

2017-01-20 Thread Lino Sanfilippo




On 20.01.2017 22:05, Timur Tabi wrote:

On 01/20/2017 02:44 PM, Lino Sanfilippo wrote:



On 18.01.2017 22:42, Timur Tabi wrote:

@@ -1029,8 +1017,6 @@ void emac_mac_down(struct emac_adapter *adpt)
   */
  writel(DIS_INT, adpt->base + EMAC_INT_STATUS);
  writel(0, adpt->base + EMAC_INT_MASK);
-synchronize_irq(adpt->irq.irq);


There is no reason to remove the irq synchronization, is there?
Note that the descriptors are freed after that so we must be sure that
the irq handler is not running any more.


I'm moving it to stay with the free_irq().

@@ -283,6 +292,9 @@ static int emac_close(struct net_device *netdev)

 mutex_lock(&adpt->reset_lock);

+synchronize_irq(adpt->irq.irq);
+free_irq(adpt->irq.irq, &adpt->irq);
+

However, I'll admit that I don't know why we call synchronize_irq() at 
all.




free_irq() will call synchronize_irq() if necessary, so it is pointless
to call synchronize_irq() right before free_irq().
In emac_mac_down() however we need synchronize_irq(), since it ensures
that the irq handler is not running any more when it (synchronize_irq)
returns.

Regards,
Lino

[PATCH v3 09/37] RDS: IB: Remove an unused structure member

2017-01-20 Thread Bart Van Assche

Signed-off-by: Bart Van Assche 
Acked-by: Santosh Shilimkar 
Cc: David S. Miller 
Cc: linux-r...@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: rds-de...@oss.oracle.com
---
 net/rds/ib_mr.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/rds/ib_mr.h b/net/rds/ib_mr.h
index 1c754f4acbe5..24c086db4511 100644
--- a/net/rds/ib_mr.h
+++ b/net/rds/ib_mr.h
@@ -45,7 +45,6 @@
 
 struct rds_ib_fmr {
struct ib_fmr   *fmr;
-   u64 *dma;
 };
 
 enum rds_ib_fr_state {
-- 
2.11.0

Re: [PATCHv4 net-next 3/5] sctp: implement sender-side procedures for SSN/TSN Reset Request Parameter

2017-01-20 Thread marcelo . leitner

On Fri, Jan 20, 2017 at 02:00:34PM -0500, David Miller wrote:
> From: Marcelo Ricardo Leitner 
> Date: Fri, 20 Jan 2017 16:25:22 -0200
> 
> > I talked offline with Xin about this and we cannot do it this way.
> > Unfortunately we will have to take the long road here, because then
> > we may send data while sending the request, as the streams are not
> > closed yet.  We really need to close them, send the request, and
> > re-open if the send fails.
> 
> I am expecting another spin of this series, correct?

Yes.

Re: [PATCH] net: qcom/emac: claim the irq only when the device is opened

2017-01-20 Thread Timur Tabi


On 01/20/2017 02:44 PM, Lino Sanfilippo wrote:



On 18.01.2017 22:42, Timur Tabi wrote:

@@ -1029,8 +1017,6 @@ void emac_mac_down(struct emac_adapter *adpt)
   */
  writel(DIS_INT, adpt->base + EMAC_INT_STATUS);
  writel(0, adpt->base + EMAC_INT_MASK);
-synchronize_irq(adpt->irq.irq);


There is no reason to remove the irq synchronization, is there?
Note that the descriptors are freed after that so we must be sure that
the irq handler is not running any more.


I'm moving it to stay with the free_irq().

@@ -283,6 +292,9 @@ static int emac_close(struct net_device *netdev)

mutex_lock(&adpt->reset_lock);

+   synchronize_irq(adpt->irq.irq);
+   free_irq(adpt->irq.irq, &adpt->irq);
+

However, I'll admit that I don't know why we call synchronize_irq() at all.

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

[PATCH net v2] net: mpls: Fix multipath selection for LSR use case

2017-01-20 Thread David Ahern

MPLS multipath for LSR is broken -- always selecting the first nexthop
in the one label case. For example:

$ ip -f mpls ro ls
100
nexthop as to 200 via inet 172.16.2.2  dev virt12
nexthop as to 300 via inet 172.16.3.2  dev virt13
101
nexthop as to 201 via inet6 2000:2::2  dev virt12
nexthop as to 301 via inet6 2000:3::2  dev virt13

In this example incoming packets have a single MPLS label, which means
the BOS bit is set. The BOS bit is passed from mpls_forward down to
mpls_multipath_hash which never processes the hash loop because BOS is 1.

Update mpls_multipath_hash to process the entire label stack. mpls_hdr_len
tracks the total mpls header length on each pass (on pass N mpls_hdr_len
is N * sizeof(mpls_shim_hdr)). When the label is found with the BOS set
it verifies the skb has sufficient header for ipv4 or ipv6, and finds the
IPv4 or IPv6 header by using the last mpls_hdr pointer and adding 1 to
advance past it.

With these changes I have verified the code correctly sees the label,
BOS, IPv4 and IPv6 addresses in the network header and icmp/tcp/udp
traffic for ipv4 and ipv6 are distributed across the nexthops.

Fixes: 1c78efa8319ca ("mpls: flow-based multipath selection")
Acked-by: Robert Shearman 
Signed-off-by: David Ahern 
---
v2
- rebase against net/master; v1 was mistakenly based against net-next
- updated commit message based on Robert's comment about skipping the
  first label

 net/mpls/af_mpls.c | 48 +---
 1 file changed, 25 insertions(+), 23 deletions(-)

diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 15fe97644ffe..5b77377e5a15 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -98,18 +98,19 @@ bool mpls_pkt_too_big(const struct sk_buff *skb, unsigned 
int mtu)
 }
 EXPORT_SYMBOL_GPL(mpls_pkt_too_big);
 
-static u32 mpls_multipath_hash(struct mpls_route *rt,
-  struct sk_buff *skb, bool bos)
+static u32 mpls_multipath_hash(struct mpls_route *rt, struct sk_buff *skb)
 {
struct mpls_entry_decoded dec;
+   unsigned int mpls_hdr_len = 0;
struct mpls_shim_hdr *hdr;
bool eli_seen = false;
int label_index;
u32 hash = 0;
 
-   for (label_index = 0; label_index < MAX_MP_SELECT_LABELS && !bos;
+   for (label_index = 0; label_index < MAX_MP_SELECT_LABELS;
 label_index++) {
-   if (!pskb_may_pull(skb, sizeof(*hdr) * label_index))
+   mpls_hdr_len += sizeof(*hdr);
+   if (!pskb_may_pull(skb, mpls_hdr_len))
break;
 
/* Read and decode the current label */
@@ -134,37 +135,38 @@ static u32 mpls_multipath_hash(struct mpls_route *rt,
eli_seen = true;
}
 
-   bos = dec.bos;
-   if (bos && pskb_may_pull(skb, sizeof(*hdr) * label_index +
-sizeof(struct iphdr))) {
+   if (!dec.bos)
+   continue;
+
+   /* found bottom label; does skb have room for a header? */
+   if (pskb_may_pull(skb, mpls_hdr_len + sizeof(struct iphdr))) {
const struct iphdr *v4hdr;
 
-   v4hdr = (const struct iphdr *)(mpls_hdr(skb) +
-  label_index);
+   v4hdr = (const struct iphdr *)(hdr + 1);
if (v4hdr->version == 4) {
hash = jhash_3words(ntohl(v4hdr->saddr),
ntohl(v4hdr->daddr),
v4hdr->protocol, hash);
} else if (v4hdr->version == 6 &&
-   pskb_may_pull(skb, sizeof(*hdr) * label_index +
- sizeof(struct ipv6hdr))) {
+  pskb_may_pull(skb, mpls_hdr_len +
+sizeof(struct ipv6hdr))) {
const struct ipv6hdr *v6hdr;
 
-   v6hdr = (const struct ipv6hdr *)(mpls_hdr(skb) +
-   label_index);
-
+   v6hdr = (const struct ipv6hdr *)(hdr + 1);
hash = __ipv6_addr_jhash(&v6hdr->saddr, hash);
hash = __ipv6_addr_jhash(&v6hdr->daddr, hash);
hash = jhash_1word(v6hdr->nexthdr, hash);
}
}
+
+   break;
}
 
return hash;
 }
 
 static struct mpls_nh *mpls_select_multipath(struct mpls_route *rt,
-struct sk_buff *skb, bool bos)
+struct sk_buff *skb)
 {
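
The label walk described in the commit message can be tried outside the kernel. The sketch below steps through a raw label stack, growing a running header length by one shim per pass, and once it sees the bottom-of-stack bit it reads the IP version nibble that follows. It assumes the standard MPLS shim layout (20-bit label, 3-bit TC, 1-bit S, 8-bit TTL); the label limit and all sk_ names are hypothetical, and no hashing is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_LABELS 4	/* assumption; stands in for MAX_MP_SELECT_LABELS */
#define SK_SHIM_LEN   4	/* one MPLS shim header is 32 bits */

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t sk_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Walk the label stack. On success returns the IP version nibble found
 * after the bottom label and stores the total MPLS header length in
 * *hdr_len. Returns 0 if no bottom label is found within the limits.
 */
static int sk_payload_version(const uint8_t *pkt, size_t len, size_t *hdr_len)
{
	size_t mpls_hdr_len = 0;
	int i;

	for (i = 0; i < SK_MAX_LABELS; i++) {
		uint32_t entry;

		mpls_hdr_len += SK_SHIM_LEN;
		if (len < mpls_hdr_len)
			break;

		entry = sk_be32(pkt + mpls_hdr_len - SK_SHIM_LEN);
		if (!((entry >> 8) & 1))
			continue;	/* S bit clear: more labels follow */

		/* Bottom label found; need at least one payload byte. */
		if (len < mpls_hdr_len + 1)
			break;

		*hdr_len = mpls_hdr_len;
		return pkt[mpls_hdr_len] >> 4;
	}
	return 0;
}
```

The single-label case is the one the fix targets: the first pass already sees the S bit set, so the payload is inspected after exactly one shim header.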

Re: [PATCH] net: qcom/emac: claim the irq only when the device is opened

2017-01-20 Thread Lino Sanfilippo


Hi,

On 18.01.2017 22:42, Timur Tabi wrote:

@@ -1029,8 +1017,6 @@ void emac_mac_down(struct emac_adapter *adpt)
 */
writel(DIS_INT, adpt->base + EMAC_INT_STATUS);
writel(0, adpt->base + EMAC_INT_MASK);
-   synchronize_irq(adpt->irq.irq);


There is no reason to remove the irq synchronization, is there?
Note that the descriptors are freed after that so we must be sure that
the irq handler is not running any more.

Regards,
Lino

[PATCH net-next 7/7] net: phy: bcm7xxx: Implement EGPHY workaround for 7278

2017-01-20 Thread Florian Fainelli

Implement the HW design team recommended workaround for 7278. Since
the GPHY now returns its revision information in MII_PHYS_ID[23] we need
to check whether the revision provided in flags is 0 or not.

Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/bcm7xxx.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/drivers/net/phy/bcm7xxx.c b/drivers/net/phy/bcm7xxx.c
index fb11927de0ff..aa01020ab1b9 100644
--- a/drivers/net/phy/bcm7xxx.c
+++ b/drivers/net/phy/bcm7xxx.c
@@ -167,6 +167,31 @@ static int bcm7xxx_28nm_e0_plus_afe_config_init(struct phy_device *phydev)
return 0;
 }
 
+static int bcm7xxx_28nm_a0_patch_afe_config_init(struct phy_device *phydev)
+{
+   /* +1 RC_CAL codes for RL centering for both LT and HT conditions */
+   bcm_phy_write_misc(phydev, AFE_RXCONFIG_2, 0xd003);
+
+   /* Cut master bias current by 2% to compensate for RC_CAL offset */
+   bcm_phy_write_misc(phydev, DSP_TAP10, 0x791b);
+
+   /* Improve hybrid leakage */
+   bcm_phy_write_misc(phydev, AFE_HPF_TRIM_OTHERS, 0x10e3);
+
+   /* Change rx_on_tune 8 to 0xf */
+   bcm_phy_write_misc(phydev, 0x21, 0x2, 0x87f6);
+
+   /* Change 100Tx EEE bandwidth */
+   bcm_phy_write_misc(phydev, 0x22, 0x2, 0x017d);
+
+   /* Enable ffe zero detection for Vitesse interoperability */
+   bcm_phy_write_misc(phydev, 0x26, 0x2, 0x0015);
+
+   r_rc_cal_reset(phydev);
+
+   return 0;
+}
+
 static int bcm7xxx_28nm_config_init(struct phy_device *phydev)
 {
u8 rev = PHY_BRCM_7XXX_REV(phydev->dev_flags);
@@ -174,6 +199,12 @@ static int bcm7xxx_28nm_config_init(struct phy_device *phydev)
u8 count;
int ret = 0;
 
+   /* Newer devices have moved the revision information back into a
+* standard location in MII_PHYS_ID[23]
+*/
+   if (rev == 0)
+   rev = phydev->phy_id & ~phydev->drv->phy_id_mask;
+
pr_info_once("%s: %s PHY revision: 0x%02x, patch: %d\n",
 phydev_name(phydev), phydev->drv->name, rev, patch);
 
@@ -197,6 +228,9 @@ static int bcm7xxx_28nm_config_init(struct phy_device *phydev)
case 0x10:
ret = bcm7xxx_28nm_e0_plus_afe_config_init(phydev);
break;
+   case 0x01:
+   ret = bcm7xxx_28nm_a0_patch_afe_config_init(phydev);
+   break;
default:
break;
}
-- 
2.9.3
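
The revision fallback in this patch can be tried outside the kernel. The sketch below assumes a driver ID mask of 0xfffffff0, so the low four bits of the PHY ID carry the revision; `resolve_rev` and `PHY_ID_MASK_ASSUMED` are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define PHY_ID_BCM7278      0xae0251a0u /* value from the brcmphy.h hunk in 6/7 */
#define PHY_ID_MASK_ASSUMED 0xfffffff0u /* assumption: low nibble is the revision */

/*
 * Prefer the revision passed through the device flags. When it is 0,
 * fall back to the bits of the PHY ID that the driver mask ignores.
 */
static uint32_t resolve_rev(uint32_t flags_rev, uint32_t phy_id, uint32_t id_mask)
{
	assert(id_mask != 0);
	if (flags_rev == 0)
		return phy_id & ~id_mask;
	return flags_rev;
}
```

With these assumptions, a device reporting ID 0xae0251a1 and no revision in its flags resolves to revision 0x01, which is the case the new switch arm handles.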

[PATCH net-next 6/7] net: phy: bcm7xxx: Add entry for BCM7278

2017-01-20 Thread Florian Fainelli

Add support for the BCM7278 28nm process Gigabit Ethernet PHY.

Signed-off-by: Florian Fainelli 
---
 drivers/net/phy/bcm7xxx.c | 2 ++
 include/linux/brcmphy.h   | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/net/phy/bcm7xxx.c b/drivers/net/phy/bcm7xxx.c
index 264b085d796b..fb11927de0ff 100644
--- a/drivers/net/phy/bcm7xxx.c
+++ b/drivers/net/phy/bcm7xxx.c
@@ -416,6 +416,7 @@ static int bcm7xxx_28nm_probe(struct phy_device *phydev)
 
 static struct phy_driver bcm7xxx_driver[] = {
BCM7XXX_28NM_GPHY(PHY_ID_BCM7250, "Broadcom BCM7250"),
+   BCM7XXX_28NM_GPHY(PHY_ID_BCM7278, "Broadcom BCM7278"),
BCM7XXX_28NM_GPHY(PHY_ID_BCM7364, "Broadcom BCM7364"),
BCM7XXX_28NM_GPHY(PHY_ID_BCM7366, "Broadcom BCM7366"),
BCM7XXX_28NM_GPHY(PHY_ID_BCM7439, "Broadcom BCM7439"),
@@ -430,6 +431,7 @@ static struct phy_driver bcm7xxx_driver[] = {
 
 static struct mdio_device_id __maybe_unused bcm7xxx_tbl[] = {
{ PHY_ID_BCM7250, 0xfff0, },
+   { PHY_ID_BCM7278, 0xfff0, },
{ PHY_ID_BCM7364, 0xfff0, },
{ PHY_ID_BCM7366, 0xfff0, },
{ PHY_ID_BCM7346, 0xfff0, },
diff --git a/include/linux/brcmphy.h b/include/linux/brcmphy.h
index 4f7d8be9ddbf..295fb3e73de5 100644
--- a/include/linux/brcmphy.h
+++ b/include/linux/brcmphy.h
@@ -24,6 +24,7 @@
 #define PHY_ID_BCM57780 0x03625d90
 
 #define PHY_ID_BCM7250 0xae025280
+#define PHY_ID_BCM7278 0xae0251a0
 #define PHY_ID_BCM7364 0xae025260
 #define PHY_ID_BCM7366 0x600d8490
 #define PHY_ID_BCM7346 0x600d8650
-- 
2.9.3

[PATCH net-next 1/7] net: dsa: bcm_sf2: Make SF2_IO64_MACRO() utilize 32-bit macro

2017-01-20 Thread Florian Fainelli

There is no point inlining the 32-bit direct register read/write part,
just infer it from the existing macro. This will make it easier to
centralize the address rewriting that we are going to introduce later
on.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.h b/drivers/net/dsa/bcm_sf2.h
index 44692673e1d5..4531c2333e86 100644
--- a/drivers/net/dsa/bcm_sf2.h
+++ b/drivers/net/dsa/bcm_sf2.h
@@ -125,7 +125,7 @@ static inline u64 name##_readq(struct bcm_sf2_priv *priv, u32 off)  \
 {  \
u32 indir, dir; \
spin_lock(&priv->indir_lock);   \
-   dir = __raw_readl(priv->name + off);\
+   dir = name##_readl(priv, off);  \
indir = reg_readl(priv, REG_DIR_DATA_READ); \
spin_unlock(&priv->indir_lock); \
return (u64)indir << 32 | dir;  \
@@ -135,7 +135,7 @@ static inline void name##_writeq(struct bcm_sf2_priv *priv, u64 val,\
 {  \
spin_lock(&priv->indir_lock);   \
reg_writel(priv, upper_32_bits(val), REG_DIR_DATA_WRITE);   \
-   __raw_writel(lower_32_bits(val), priv->name + off); \
+   name##_writel(priv, lower_32_bits(val), off);   \
spin_unlock(&priv->indir_lock); \
 }
 
-- 
2.9.3
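
The 64-bit accessors above split a value into a directly accessed low word and an indirectly accessed high word. A small user-space sketch of just that arithmetic; `combine64`, `upper32` and `lower32` are hypothetical stand-ins for the macro body and for the kernel's `upper_32_bits()`/`lower_32_bits()`:

```c
#include <assert.h>
#include <stdint.h>

/* Reassemble a 64-bit value from the indirect (high) and direct (low) words. */
static uint64_t combine64(uint32_t indir, uint32_t dir)
{
	return (uint64_t)indir << 32 | dir;
}

/* High word, written through the indirect data register on the write path. */
static uint32_t upper32(uint64_t val)
{
	return (uint32_t)(val >> 32);
}

/* Low word, written directly at the register offset. */
static uint32_t lower32(uint64_t val)
{
	return (uint32_t)val;
}
```

The cast before the shift matters: shifting a 32-bit value left by 32 would be undefined, so the high word is widened first.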

[PATCH net-next 3/7] net: dsa: bcm_sf2: Add support for BCM7278 integrated switch

2017-01-20 Thread Florian Fainelli

Add support for the integrated switch found on BCM7278:

- core_reg_align is set to 1, to force a translation into the target
  address space which is 8 bytes aligned
- an alternate SWITCH_REG layout is provided since registers are largely
  bit/masks compatible but have different offsets
- conditional for all CORE_STS_OVERRIDE_{IMP,GMII_P} since those got
  moved way out of the traditional register space

Signed-off-by: Florian Fainelli 
---
 .../bindings/net/brcm,bcm7445-switch-v4.0.txt  |  2 +-
 drivers/net/dsa/b53/b53_common.c   | 12 +
 drivers/net/dsa/b53/b53_priv.h |  4 +-
 drivers/net/dsa/bcm_sf2.c  | 56 ++
 drivers/net/dsa/bcm_sf2_regs.h |  4 ++
 5 files changed, 68 insertions(+), 10 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt b/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt
index fb40891ee606..e1b2c3e32859 100644
--- a/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt
+++ b/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt
@@ -2,7 +2,7 @@
 
 Required properties:
 
-- compatible: should be "brcm,bcm7445-switch-v4.0"
+- compatible: should be "brcm,bcm7445-switch-v4.0" or "brcm,bcm7278-switch-v4.0"
 - reg: addresses and length of the register sets for the device, must be 6
   pairs of register addresses and lengths
 - interrupts: interrupts for the devices, must be two interrupts
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 5102a3701a1a..5cbb14f6a03b 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1685,6 +1685,18 @@ static const struct b53_chip_data b53_switch_chips[] = {
.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
},
+   {
+   .chip_id = BCM7278_DEVICE_ID,
+   .dev_name = "BCM7278",
+   .vlans = 4096,
+   .enabled_ports = 0x1ff,
+   .arl_entries = 4,
+   .cpu_port = B53_CPU_PORT,
+   .vta_regs = B53_VTA_REGS,
+   .duplex_reg = B53_DUPLEX_STAT_GE,
+   .jumbo_pm_reg = B53_JUMBO_PORT_MASK,
+   .jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+   },
 };
 
 static int b53_switch_init(struct b53_device *dev)
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 86f125d55aaf..a8031b382c55 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -62,6 +62,7 @@ enum {
BCM53019_DEVICE_ID = 0x53019,
BCM58XX_DEVICE_ID = 0x5800,
BCM7445_DEVICE_ID = 0x7445,
+   BCM7278_DEVICE_ID = 0x7278,
 };
 
 #define B53_N_PORTS 9
@@ -179,7 +180,8 @@ static inline int is5301x(struct b53_device *dev)
 static inline int is58xx(struct b53_device *dev)
 {
return dev->chip_id == BCM58XX_DEVICE_ID ||
-   dev->chip_id == BCM7445_DEVICE_ID;
+   dev->chip_id == BCM7445_DEVICE_ID ||
+   dev->chip_id == BCM7278_DEVICE_ID;
 }
 
 #define B53_CPU_PORT_25 5
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index d952099afc60..02afa0598b24 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -64,7 +64,12 @@ static void bcm_sf2_imp_vlan_setup(struct dsa_switch *ds, int cpu_port)
 static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port)
 {
struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
-   u32 reg, val;
+   u32 reg, val, offset;
+
+   if (priv->type == BCM7445_DEVICE_ID)
+   offset = CORE_STS_OVERRIDE_IMP;
+   else
+   offset = CORE_STS_OVERRIDE_IMP2;
 
/* Enable the port memories */
reg = core_readl(priv, CORE_MEM_PSM_VDD_CTRL);
@@ -121,9 +126,9 @@ static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port)
core_writel(priv, reg, CORE_BRCM_HDR_TX_DIS);
 
/* Force link status for IMP port */
-   reg = core_readl(priv, CORE_STS_OVERRIDE_IMP);
+   reg = core_readl(priv, offset);
reg |= (MII_SW_OR | LINK_STS);
-   core_writel(priv, reg, CORE_STS_OVERRIDE_IMP);
+   core_writel(priv, reg, offset);
 }
 
 static void bcm_sf2_eee_enable_set(struct dsa_switch *ds, int port, bool enable)
@@ -591,7 +596,12 @@ static void bcm_sf2_sw_adjust_link(struct dsa_switch *ds, int port,
struct ethtool_eee *p = &priv->port_sts[port].eee;
u32 id_mode_dis = 0, port_mode;
const char *str = NULL;
-   u32 reg;
+   u32 reg, offset;
+
+   if (priv->type == BCM7445_DEVICE_ID)
+   offset = CORE_STS_OVERRIDE_GMIIP_PORT(port);
+   else
+   offset = CORE_STS_OVERRIDE_GMIIP2_PORT(port);
 
switch (phydev->interface) {
case PHY_INTERFACE_MODE_RGMII:
@@ -662,7 +672,7 @@ static void bcm_sf2_sw_adjust_link(struct dsa_switch

[PATCH net-next 4/7] net: dsa: bcm_sf2: Move code enabling Broadcom tags

2017-01-20 Thread Florian Fainelli

In preparation for enabling Broadcom tags on different ports based on
configuration information, dedicate a function that is responsible for
enabling Broadcom tags for a given port and update the IMP port setup to
call it.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.c | 61 ++-
 1 file changed, 34 insertions(+), 27 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 02afa0598b24..571e112c8e34 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -61,34 +61,9 @@ static void bcm_sf2_imp_vlan_setup(struct dsa_switch *ds, int cpu_port)
}
 }
 
-static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port)
+static void bcm_sf2_brcm_hdr_setup(struct bcm_sf2_priv *priv, int port)
 {
-   struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
-   u32 reg, val, offset;
-
-   if (priv->type == BCM7445_DEVICE_ID)
-   offset = CORE_STS_OVERRIDE_IMP;
-   else
-   offset = CORE_STS_OVERRIDE_IMP2;
-
-   /* Enable the port memories */
-   reg = core_readl(priv, CORE_MEM_PSM_VDD_CTRL);
-   reg &= ~P_TXQ_PSM_VDD(port);
-   core_writel(priv, reg, CORE_MEM_PSM_VDD_CTRL);
-
-   /* Enable Broadcast, Multicast, Unicast forwarding to IMP port */
-   reg = core_readl(priv, CORE_IMP_CTL);
-   reg |= (RX_BCST_EN | RX_MCST_EN | RX_UCST_EN);
-   reg &= ~(RX_DIS | TX_DIS);
-   core_writel(priv, reg, CORE_IMP_CTL);
-
-   /* Enable forwarding */
-   core_writel(priv, SW_FWDG_EN, CORE_SWMODE);
-
-   /* Enable IMP port in dumb mode */
-   reg = core_readl(priv, CORE_SWITCH_CTRL);
-   reg |= MII_DUMB_FWDG_EN;
-   core_writel(priv, reg, CORE_SWITCH_CTRL);
+   u32 reg, val;
 
/* Resolve which bit controls the Broadcom tag */
switch (port) {
@@ -124,6 +99,38 @@ static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port)
reg = core_readl(priv, CORE_BRCM_HDR_TX_DIS);
reg &= ~(1 << port);
core_writel(priv, reg, CORE_BRCM_HDR_TX_DIS);
+}
+
+static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port)
+{
+   struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
+   u32 reg, offset;
+
+   if (priv->type == BCM7445_DEVICE_ID)
+   offset = CORE_STS_OVERRIDE_IMP;
+   else
+   offset = CORE_STS_OVERRIDE_IMP2;
+
+   /* Enable the port memories */
+   reg = core_readl(priv, CORE_MEM_PSM_VDD_CTRL);
+   reg &= ~P_TXQ_PSM_VDD(port);
+   core_writel(priv, reg, CORE_MEM_PSM_VDD_CTRL);
+
+   /* Enable Broadcast, Multicast, Unicast forwarding to IMP port */
+   reg = core_readl(priv, CORE_IMP_CTL);
+   reg |= (RX_BCST_EN | RX_MCST_EN | RX_UCST_EN);
+   reg &= ~(RX_DIS | TX_DIS);
+   core_writel(priv, reg, CORE_IMP_CTL);
+
+   /* Enable forwarding */
+   core_writel(priv, SW_FWDG_EN, CORE_SWMODE);
+
+   /* Enable IMP port in dumb mode */
+   reg = core_readl(priv, CORE_SWITCH_CTRL);
+   reg |= MII_DUMB_FWDG_EN;
+   core_writel(priv, reg, CORE_SWITCH_CTRL);
+
+   bcm_sf2_brcm_hdr_setup(priv, port);
 
/* Force link status for IMP port */
reg = core_readl(priv, offset);
-- 
2.9.3

[PATCH net-next 5/7] net: dsa: bcm_sf2: Allow non-IMP ports to have Broadcom tags enabled

2017-01-20 Thread Florian Fainelli

Parse the "brcm,use-bcm-hdr" boolean property during ports
identification to fill a bitmask of ports that should have Broadcom tags
enabled. This is needed in some configurations where per-packet metadata
can be exchanged using Broadcom tags between the switch and an on-chip
acceleration device.

Signed-off-by: Florian Fainelli 
---
 .../devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt  | 8 
 drivers/net/dsa/bcm_sf2.c | 7 +++
 drivers/net/dsa/bcm_sf2.h | 3 +++
 3 files changed, 18 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt b/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt
index e1b2c3e32859..9a734d808aa7 100644
--- a/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt
+++ b/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt
@@ -41,6 +41,13 @@ Optional properties:
   Admission Control Block supports reporting the number of packets in-flight in a
   switch queue
 
+Port subnodes:
+
+Optional properties:
+
+- brcm,use-bcm-hdr: boolean property, if present, indicates that the switch
+  port has Broadcom tags enabled (per-packet metadata)
+
 Example:
 
 switch_top@f0b0 {
@@ -114,6 +121,7 @@ switch_top@f0b0 {
port@0 {
label = "gphy";
reg = <0>;
+   brcm,use-bcm-hdr;
};
...
};
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 571e112c8e34..8eecfd227e06 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -236,6 +236,10 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int port,
reg &= ~P_TXQ_PSM_VDD(port);
core_writel(priv, reg, CORE_MEM_PSM_VDD_CTRL);
 
+   /* Enable Broadcom tags for that port if requested */
+   if (priv->brcm_tag_mask & BIT(port))
+   bcm_sf2_brcm_hdr_setup(priv, port);
+
/* Clear the Rx and Tx disable bits and set to no spanning tree */
core_writel(priv, 0, CORE_G_PCTL_PORT(port));
 
@@ -515,6 +519,9 @@ static void bcm_sf2_identify_ports(struct bcm_sf2_priv *priv,
 
if (mode == PHY_INTERFACE_MODE_MOCA)
priv->moca_port = port_num;
+
+   if (of_property_read_bool(port, "brcm,use-bcm-hdr"))
+   priv->brcm_tag_mask |= 1 << port_num;
}
 }
 
diff --git a/drivers/net/dsa/bcm_sf2.h b/drivers/net/dsa/bcm_sf2.h
index a1430866bd79..6e1f74e4d471 100644
--- a/drivers/net/dsa/bcm_sf2.h
+++ b/drivers/net/dsa/bcm_sf2.h
@@ -100,6 +100,9 @@ struct bcm_sf2_priv {
struct device_node  *master_mii_dn;
struct mii_bus  *slave_mii_bus;
struct mii_bus  *master_mii_bus;
+
+   /* Bitmask of ports needing BRCM tags */
+   unsigned int brcm_tag_mask;
 };
 
 static inline struct bcm_sf2_priv *bcm_sf2_to_priv(struct dsa_switch *ds)
-- 
2.9.3
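
The per-port bitmask introduced by this patch can be illustrated without device tree parsing. The sketch below assumes a flat array of port descriptions in place of the device tree port subnodes; `struct port_desc`, `build_tag_mask` and `port_needs_tag` are hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one parsed port subnode. */
struct port_desc {
	unsigned int num;  /* port number, i.e. the "reg" value */
	bool use_bcm_hdr;  /* true when "brcm,use-bcm-hdr" is present */
};

/* Collect the ports that asked for Broadcom tags into one bitmask. */
static unsigned int build_tag_mask(const struct port_desc *ports, size_t n)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(ports[i].num < 32);
		if (ports[i].use_bcm_hdr)
			mask |= 1u << ports[i].num;
	}
	return mask;
}

/* Later, at port setup time, test the bit for the port being enabled. */
static bool port_needs_tag(unsigned int mask, unsigned int port)
{
	return port < 32 && (mask & (1u << port));
}
```

Building the mask once at identification time keeps the port setup path to a single bit test, which is the split the patch makes between `bcm_sf2_identify_ports()` and `bcm_sf2_port_setup()`.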

[PATCH net-next 0/7] net: dsa: bcm_sf2: Add support for BCM7278

2017-01-20 Thread Florian Fainelli

Hi all,

This patch series adds support for the Broadcom BCM7278 integrated switch
which is a successor of the BCM7445 switch. We have a little bit of
register shuffling going on, which is why most of the functional changes
are to deal with that.

Thanks!

Florian Fainelli (7):
  net: dsa: bcm_sf2: Make SF2_IO64_MACRO() utilize 32-bit macro
  net: dsa: bcm_sf2: Prepare for different register layouts
  net: dsa: bcm_sf2: Add support for BCM7278 integrated switch
  net: dsa: bcm_sf2: Move code enabling Broadcom tags
  net: dsa: bcm_sf2: Allow non-IMP ports to have Broadcom tags enabled
  net: phy: bcm7xxx: Add entry for BCM7278
  net: phy: bcm7xxx: Implement EGPHY workaround for 7278

 .../bindings/net/brcm,bcm7445-switch-v4.0.txt  |  10 +-
 drivers/net/dsa/b53/b53_common.c   |  12 ++
 drivers/net/dsa/b53/b53_priv.h |   4 +-
 drivers/net/dsa/bcm_sf2.c  | 167 -
 drivers/net/dsa/bcm_sf2.h  |  41 -
 drivers/net/dsa/bcm_sf2_regs.h |  47 +++---
 drivers/net/phy/bcm7xxx.c  |  36 +
 include/linux/brcmphy.h|   1 +
 8 files changed, 261 insertions(+), 57 deletions(-)

-- 
2.9.3

[PATCH net-next 2/7] net: dsa: bcm_sf2: Prepare for different register layouts

2017-01-20 Thread Florian Fainelli

In preparation for supporting a new device with a slightly different
register layout, affecting the SWITCH_REG and SWITCH_CORE address
spaces, perform a few preparatory steps:

- allow matching the compatible string against a data description
- convert the SWITCH_REG register accesses into an indirection table
- prepare for supporting a SWITCH_CORE register alignment requirement

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.c  | 57 +-
 drivers/net/dsa/bcm_sf2.h  | 34 +++--
 drivers/net/dsa/bcm_sf2_regs.h | 43 ++-
 3 files changed, 109 insertions(+), 25 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 31d017086f8b..d952099afc60 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1009,10 +1009,49 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
.port_fdb_del   = b53_fdb_del,
 };
 
+struct bcm_sf2_of_data {
+   u32 type;
+   const u16 *reg_offsets;
+   unsigned int core_reg_align;
+};
+
+/* Register offsets for the SWITCH_REG_* block */
+static const u16 bcm_sf2_7445_reg_offsets[] = {
+   [REG_SWITCH_CNTRL]    = 0x00,
+   [REG_SWITCH_STATUS]   = 0x04,
+   [REG_DIR_DATA_WRITE]  = 0x08,
+   [REG_DIR_DATA_READ]   = 0x0C,
+   [REG_SWITCH_REVISION] = 0x18,
+   [REG_PHY_REVISION]    = 0x1C,
+   [REG_SPHY_CNTRL]      = 0x2C,
+   [REG_RGMII_0_CNTRL]   = 0x34,
+   [REG_RGMII_1_CNTRL]   = 0x40,
+   [REG_RGMII_2_CNTRL]   = 0x4c,
+   [REG_LED_0_CNTRL]     = 0x90,
+   [REG_LED_1_CNTRL]     = 0x94,
+   [REG_LED_2_CNTRL]     = 0x98,
+};
+
+static const struct bcm_sf2_of_data bcm_sf2_7445_data = {
+   .type           = BCM7445_DEVICE_ID,
+   .core_reg_align = 0,
+   .reg_offsets    = bcm_sf2_7445_reg_offsets,
+};
+
+static const struct of_device_id bcm_sf2_of_match[] = {
+   { .compatible = "brcm,bcm7445-switch-v4.0",
+ .data = &bcm_sf2_7445_data
+   },
+   { /* sentinel */ },
+};
+MODULE_DEVICE_TABLE(of, bcm_sf2_of_match);
+
 static int bcm_sf2_sw_probe(struct platform_device *pdev)
 {
const char *reg_names[BCM_SF2_REGS_NUM] = BCM_SF2_REGS_NAME;
struct device_node *dn = pdev->dev.of_node;
+   const struct of_device_id *of_id = NULL;
+   const struct bcm_sf2_of_data *data;
struct b53_platform_data *pdata;
struct dsa_switch_ops *ops;
struct bcm_sf2_priv *priv;
@@ -1040,11 +1079,22 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev)
if (!pdata)
return -ENOMEM;
 
+   of_id = of_match_node(bcm_sf2_of_match, dn);
+   if (!of_id || !of_id->data)
+   return -EINVAL;
+
+   data = of_id->data;
+
+   /* Set SWITCH_REG register offsets and SWITCH_CORE align factor */
+   priv->type = data->type;
+   priv->reg_offsets = data->reg_offsets;
+   priv->core_reg_align = data->core_reg_align;
+
/* Auto-detection using standard registers will not work, so
 * provide an indication of what kind of device we are for
 * b53_common to work with
 */
-   pdata->chip_id = BCM7445_DEVICE_ID;
+   pdata->chip_id = priv->type;
dev->pdata = pdata;
 
priv->dev = dev;
@@ -1190,11 +1240,6 @@ static int bcm_sf2_resume(struct device *dev)
 static SIMPLE_DEV_PM_OPS(bcm_sf2_pm_ops,
 bcm_sf2_suspend, bcm_sf2_resume);
 
-static const struct of_device_id bcm_sf2_of_match[] = {
-   { .compatible = "brcm,bcm7445-switch-v4.0" },
-   { /* sentinel */ },
-};
-MODULE_DEVICE_TABLE(of, bcm_sf2_of_match);
 
 static struct platform_driver bcm_sf2_driver = {
.probe  = bcm_sf2_sw_probe,
diff --git a/drivers/net/dsa/bcm_sf2.h b/drivers/net/dsa/bcm_sf2.h
index 4531c2333e86..a1430866bd79 100644
--- a/drivers/net/dsa/bcm_sf2.h
+++ b/drivers/net/dsa/bcm_sf2.h
@@ -61,6 +61,11 @@ struct bcm_sf2_priv {
void __iomem *fcb;
void __iomem *acb;
 
+   /* Register offsets indirection tables */
+   u32 type;
+   const u16   *reg_offsets;
+   unsigned int core_reg_align;
+
/* spinlock protecting access to the indirect registers */
spinlock_t  indir_lock;
 
@@ -104,6 +109,11 @@ static inline struct bcm_sf2_priv *bcm_sf2_to_priv(struct dsa_switch *ds)
return dev->priv;
 }
 
+static inline u32 bcm_sf2_mangle_addr(struct bcm_sf2_priv *priv, u32 off)
+{
+   return off << priv->core_reg_align;
+}
+
 #define SF2_IO_MACRO(name) \
 static inline u32 name##_readl(struct bcm_sf2_priv *priv, u32 off) \
 {  \
@@ -153,8 +163,28 @@ static inline void intrl2_##which##_mask_set(struct 
bcm_sf2_priv *priv, \
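
The preparation steps in this patch (match data per compatible string, an offset indirection table, and an alignment shift for core registers) can be sketched in user space. Everything below is illustrative: the compatible strings, the second chip's offsets and the helper names are assumptions, not values from the driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum sketch_reg { REG_CNTRL, REG_STATUS, REG_NUM };

struct chip_data {
	const char *compatible;      /* hypothetical match string */
	const uint16_t *reg_offsets; /* indexed by enum sketch_reg */
	unsigned int core_reg_align; /* shift applied to core offsets */
};

static const uint16_t chip_a_offsets[REG_NUM] = {
	[REG_CNTRL]  = 0x00,
	[REG_STATUS] = 0x04,
};

/* Made-up layout for a second chip with the same registers elsewhere. */
static const uint16_t chip_b_offsets[REG_NUM] = {
	[REG_CNTRL]  = 0x00,
	[REG_STATUS] = 0x08,
};

static const struct chip_data chips[] = {
	{ "example,chip-a", chip_a_offsets, 0 },
	{ "example,chip-b", chip_b_offsets, 1 },
};

/* Rough analogue of matching a node and fetching its match data. */
static const struct chip_data *chip_match(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(chips) / sizeof(chips[0]); i++)
		if (strcmp(chips[i].compatible, compatible) == 0)
			return &chips[i];
	return NULL;
}

/* SWITCH_REG style access: look the offset up instead of hard-coding it. */
static uint32_t reg_addr(const struct chip_data *c, enum sketch_reg r)
{
	assert(r < REG_NUM);
	return c->reg_offsets[r];
}

/* SWITCH_CORE style access: same offsets, scaled by the alignment shift. */
static uint32_t core_addr(const struct chip_data *c, uint32_t off)
{
	return off << c->core_reg_align;
}
```

Keeping the layout differences in data means the register accessors stay identical for every chip, and only the table selected at probe time changes.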

Re: [PATCH net 0/2] net: Fix oops on state free after lwt module unload

2017-01-20 Thread Robert Shearman

On 20/01/17 17:03, David Miller wrote:

From: Robert Shearman 
Date: Wed, 18 Jan 2017 15:32:01 +

This patchset fixes an oops in lwtstate_free and a memory leak that
would otherwise be exposed by ensuring that references are taken on
modules that need to stay around to clean up lwt state. To facilitate
this all ops that implement destroy_state and that can be configured
to build as a module are changed to specify the owner module in the
ops. The intersection of those two sets is just ila at the moment.

Two things:

1) Under no circumstances should we allow a lwtunnel ops implementing
   module to unload while there is a rule using those ops which is
   alive.

   Therefore, we should not special case the destroy op.  We should
   unconditionally grab the module reference.

2) Please add the new 'owner' field and add an appropriate assignment
   for ops->owner to _every_ lwtunnel implementation, and do so in
   your first patch.  Please do not only do this for ILA.

Thanks.

Very clear, makes sense, will do.

Thanks,
Rob

Re: [PATCH net v3] bridge: netlink: call br_changelink() during br_dev_newlink()

2017-01-20 Thread David Miller

From: Jiri Pirko 
Date: Fri, 20 Jan 2017 19:10:42 +0100

> Fri, Jan 20, 2017 at 06:12:17PM CET, c...@cera.cz wrote:
>>Any bridge options specified during link creation (e.g. ip link add)
>>are ignored as br_dev_newlink() does not process them.
>>Use br_changelink() to do it.
>>
>>Fixes: 1332351 ("bridge: implement rtnl_link_ops->changelink")
> 
> Should have 12 chars. Other than that,
> 
> Reviewed-by: Jiri Pirko 

I fixed up the SHA1-ID.

Applied and queued up for -stable, thanks.

Re: [Xen-devel] xennet_start_xmit assumptions

2017-01-20 Thread Sowmini Varadhan

On (01/20/17 14:30), David Miller wrote:
> 
> CAP_SYS_RAWIO or not, the contract we have with the device is that
> there will be at least enough bytes to cover a link layer header.

I see. If that's the case (for all the kernel-driver interfaces), 
then the xen_netfront driver is probably not required to check for
variants of sk_buffs like pure-non-linear etc.

> This probably requires a little bit of an adjustment to the calling
> convention.  Perhaps:
> 
>   int dev_validate_header(const struct net_device *dev,
>   char *ll_header, int len);
> 
> So then you can go:
> 
>   new_len = dev_validate_header(dev, skb->data, len);
>   if (new_len < 0)
>   goto out_cleanup_err;
>   if (new_len > len)
>   __skb_put(skb, new_len - len);
> 
> Or something like that.

ok let me work with that and get back (hopefully with an 
RFC patch).

--Sowmini

Re: [PATCH net] net: mpls: Fix multipath selection for LSR use case

2017-01-20 Thread David Miller

From: David Ahern 
Date: Thu, 19 Jan 2017 16:51:03 -0800

> MPLS multipath for LSR is broken -- always selecting the first nexthop
> in the one label case. For example:
 ...

David, this doesn't apply cleanly to the net tree, please respin.

Thanks.

Re: [PATCH v2 0/2] net: dsa: Move temperature sensor code into PHY.

2017-01-20 Thread David Miller

From: Andrew Lunn 
Date: Fri, 20 Jan 2017 01:37:48 +0100

> Marvell Ethernet switches contain a temperature sensor. There appears
> to be one sensor, which is shared by each of the internal PHYs. Each
> PHY has independent registers to read this sensor, and to set a limit
> for when an alarm should be raised.
> 
> Some Marvell discrete PHY also have the same sensor and registers.
> Moving the HWMON code from DSA into the PHY makes the sensor available
> in discrete PHYs, and removes the layering violation, the switch
> driver poking around in PHY registers.
> 
> While moving the code into the PHY driver, it has been re-written to
> use the new HWMON APIs.
> 
> v2:
> 
> Better Cover note explaining one sensor, but multiple independent
> registers
> 
> Simplify error checking.

I know there was minor request for a respin, but I'm not going to hold
this up any more just for that.

Series applied, thanks Andrew.

Re: [PATCH] inet: don't use sk_v6_rcv_saddr directly

2017-01-20 Thread David Miller

From: Josef Bacik 
Date: Thu, 19 Jan 2017 17:47:46 -0500

> When comparing two sockets we need to use inet6_rcv_saddr so we get a NULL
> sk_v6_rcv_saddr if the socket isn't AF_INET6, otherwise our comparison
> function can be wrong.
> 
> Fixes: 637bc8b ("inet: reset tb->fastreuseport when adding a reuseport sk")
> Signed-off-by: Josef Bacik 

Applied, thanks.

Re: [PATCH net-next v2 0/2] net: ipv6: Improve user experience with multipath routes

2017-01-20 Thread David Ahern

On 1/19/17 11:10 PM, David Ahern wrote:
> This series closes a couple of gaps between IPv4 and IPv6 with respect
> to multipath routes:
...
> In both cases, the new behavior requires users to opt in by setting a new
> flag, RTM_F_ALL_NEXTHOPS, in the rtm_flags of struct rtmsg which is
> expected to be the ancillary header in the netlink request received from
> the user. A program must opt in to the new behavior so as to not break
> any existing applications.
> 
> The opt-in behavior works for both route deletes and dumps (the two
> differences noted above), but not for notifications as notifications
> do not take user input to specify flags. The only way to have
> notifications generate RTA_MULTIPATH encodings is to have a global
> flag -- e.g., sysctl. I'd prefer not to add a sysctl knob for this
> backwards compatibility.

BTW, I am in favor of not requiring a user API for this but just doing it. I
can't imagine anyone working with multipath routes not wanting the efficiency
of the RTA_MULTIPATH attribute. These patches require an API only because of
the rule not to break userspace. If we decide to just do it without an API,
multipath_add and multipath_del need to be modified to send only one
notification, at the end of the actions.

Re: [Xen-devel] xennet_start_xmit assumptions

2017-01-20 Thread David Miller

From: Sowmini Varadhan 
Date: Thu, 19 Jan 2017 17:41:23 -0500

> On (01/19/17 13:47), Sowmini Varadhan wrote:
>> > Specifically I'm talking about the dev_validate_header() check.
>> > That is supposed to protect us from these kinds of situations.
>> 
>> ah, but I run my pf_packet application as root, so I have 
>> capable(CAP_SYS_RAWIO), so I slip through the dev_validate_header()
>> check.
> 
> and in that light, should dev_validate_header()
> always return false if len == 0?
> 
> that will take care of all the send paths in af_packet.c
> but it impacts all drivers as well (even though it is the
> logically correct thing to do..)

I think dev_validate_header() almost does the correct thing in
the SYS_RAWIO case.

It clears out the not-provided hard header bytes, but it doesn't
adjust the skb->len.  I think that is a real requirement in this
situation.

CAP_SYS_RAWIO or not, the contract we have with the device is that
there will be at least enough bytes to cover a link layer header.

This probably requires a little bit of an adjustment to the calling
convention.  Perhaps:

int dev_validate_header(const struct net_device *dev,
char *ll_header, int len);

So then you can go:

new_len = dev_validate_header(dev, skb->data, len);
if (new_len < 0)
goto out_cleanup_err;
if (new_len > len)
__skb_put(skb, new_len - len);

Or something like that.
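
The calling convention proposed above returns the possibly extended length instead of a boolean. A user-space sketch of that idea follows; it assumes a plain buffer in place of an skb and a boolean in place of the capability check, and `validate_header_len` is a hypothetical name, not the kernel function:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return the header length the caller should account for, or a negative
 * value when the header is too short and may not be padded.
 *
 * Assumption: ll_header has room for at least hard_header_len bytes.
 */
static int validate_header_len(char *ll_header, int len, int hard_header_len,
			       bool privileged)
{
	assert(ll_header != NULL && len >= 0 && hard_header_len >= 0);

	if (len >= hard_header_len)
		return len;
	if (!privileged)
		return -1;

	/* Zero the bytes the sender did not provide and report the new length. */
	memset(ll_header + len, 0, (size_t)(hard_header_len - len));
	return hard_header_len;
}
```

A caller would then grow its own length bookkeeping by `new_len - len` whenever the returned value exceeds the original length, which is the step the `__skb_put()` call performs in the snippet above.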

Re: [PATCH v5 0/2] Add support for the ethernet switch on the ESPRESSObin

2017-01-20 Thread David Miller

From: Gregory CLEMENT 
Date: Thu, 19 Jan 2017 22:49:32 +0100

> I created a new family for this switch and filled the ops structure
> by selecting what seem to be the most appropriate functions. I rebased
> the series on net-next/master, which allowed me to benefit from the
> eeprom functions introduced for the 6390.

It looks like there will be at least one more respin of this series,
specifically to remove the new family as Vivien seems to object to
this.

Re: [pull request][net-next 00/15] Mellanox mlx5 updates 2017-01-19

2017-01-20 Thread David Miller

From: Saeed Mahameed 
Date: Fri, 20 Jan 2017 00:38:53 +0200

> This pull request includes some small mlx5 updates and two new features,
> The 1st exposes new HW counters to "ethtool -S" and the other introduces
> mlx5 ptp 1pps support. Details are down below.
> 
> Please pull and let me know if there's any problem.

Pulled, thank you.

Re: [PATCHv4 net-next 3/5] sctp: implement sender-side procedures for SSN/TSN Reset Request Parameter

2017-01-20 Thread David Miller

From: Marcelo Ricardo Leitner 
Date: Fri, 20 Jan 2017 16:25:22 -0200

> I talked offline with Xin about this and we cannot do it this way.
> Unfortunately we will have to take the long road here, because then
> we may send data while sending the request, as the streams are not
> closed yet.  We really need to close them, send the request, and
> re-open if the send fails.

I am expecting another spin of this series, correct?

Re: [PATCH v2] xen-netfront: Fix Rx stall during network stress and OOM

2017-01-20 Thread David Miller

From: Vineeth Remanan Pillai 
Date: Thu, 19 Jan 2017 08:35:39 -0800

> From: Vineeth Remanan Pillai 
> 
> During an OOM scenario, request slots could not be created as skb
> allocation fails. So the netback cannot pass in packets and netfront
> wrongly assumes that there is no more work to be done and it disables
> polling. This causes Rx to stall.
> 
> The issue is with the retry logic which schedules the timer if the
> created slots are less than NET_RX_SLOTS_MIN. The count of new request
> slots to be pushed are calculated as a difference between new req_prod
> and rsp_cons which could be more than the actual slots, if there are
> unconsumed responses.
> 
> The fix is to calculate the count of newly created slots as the
> difference between new req_prod and old req_prod.
> 
> Signed-off-by: Vineeth Remanan Pillai 
> Reviewed-by: Juergen Gross 
> ---
> Changes in v2:
>   - Removed the old implementation of enabling polling on
> skb allocation error.
>   - Corrected the refill timer logic to schedule when newly
> created slots since last push is less than NET_RX_SLOTS_MIN.

Applied.
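
The counting error described in this commit message is easy to reproduce with plain ring indices. The sketch below assumes free-running 32-bit producer and consumer counters; the function names and the threshold of 8 are illustrative, not taken from the driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOTS_MIN_SKETCH 8u /* illustrative threshold, not NET_RX_SLOTS_MIN */

/* Old calculation: also counts responses that were not consumed yet. */
static uint32_t slots_vs_rsp_cons(uint32_t new_req_prod, uint32_t rsp_cons)
{
	return new_req_prod - rsp_cons;
}

/* Fixed calculation: only the requests created in this refill pass. */
static uint32_t slots_created(uint32_t new_req_prod, uint32_t old_req_prod)
{
	return new_req_prod - old_req_prod;
}

/* The refill timer should fire when too few slots were actually created. */
static bool need_refill_timer(uint32_t created)
{
	return created < SLOTS_MIN_SKETCH;
}
```

For example, with an old req_prod of 100, rsp_cons at 40 and only two buffers allocated before memory runs out, the old calculation yields 62 and skips the retry, while the fixed one yields 2 and schedules the timer. Unsigned subtraction keeps both results correct across counter wrap-around.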

[PATCH net-next 2/2] net: systemport: Add support for SYSTEMPORT Lite

2017-01-20 Thread Florian Fainelli

Add support for the SYSTEMPORT Lite Ethernet controller, this piece of hardware
is largely based on the full-blown SYSTEMPORT and differs in the following:

- no full-blown UniMAC, instead we have the MagicPacket matching from UniMAC at
  same offset, and a GMII Interface Block (GIB) for the MAC-level stuff, since
  we are always interfaced to an Ethernet switch which is fully Ethernet
  compliant, shortcuts could be made

- 16 transmit queues, whose interrupts are moved into the first Level-2
  interrupt controller bank

- slight TDMA offset change (a register was inserted after TDMA_STATUS, *sigh*)

- 256 RX descriptors (512 words) and 256 TX descriptors (not visible)

As a consequence of these differences, update the code paths accordingly to
differentiate the full-blown from the light version.

Signed-off-by: Florian Fainelli 
---
 .../devicetree/bindings/net/brcm,systemport.txt|   5 +-
 drivers/net/ethernet/broadcom/bcmsysport.c | 323 +
 drivers/net/ethernet/broadcom/bcmsysport.h |  78 -
 3 files changed, 327 insertions(+), 79 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/brcm,systemport.txt b/Documentation/devicetree/bindings/net/brcm,systemport.txt
index 877da34145b0..83f29e0e11ba 100644
--- a/Documentation/devicetree/bindings/net/brcm,systemport.txt
+++ b/Documentation/devicetree/bindings/net/brcm,systemport.txt
@@ -1,7 +1,10 @@
 * Broadcom BCM7xxx Ethernet Systemport Controller (SYSTEMPORT)
 
 Required properties:
-- compatible: should be one of "brcm,systemport-v1.00" or "brcm,systemport"
+- compatible: should be one of:
+ "brcm,systemport-v1.00"
+ "brcm,systemportlite-v1.00" or
+ "brcm,systemport"
 - reg: address and length of the register set for the device.
 - interrupts: interrupts for the device, first cell must be for the rx
   interrupts, and the second cell should be for the transmit queues. An
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 31bb2c3696ec..a68d4889f5db 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -43,14 +43,43 @@ static inline void name##_writel(struct bcm_sysport_priv *priv, \
 BCM_SYSPORT_IO_MACRO(intrl2_0, SYS_PORT_INTRL2_0_OFFSET);
 BCM_SYSPORT_IO_MACRO(intrl2_1, SYS_PORT_INTRL2_1_OFFSET);
 BCM_SYSPORT_IO_MACRO(umac, SYS_PORT_UMAC_OFFSET);
+BCM_SYSPORT_IO_MACRO(gib, SYS_PORT_GIB_OFFSET);
 BCM_SYSPORT_IO_MACRO(tdma, SYS_PORT_TDMA_OFFSET);
-BCM_SYSPORT_IO_MACRO(rdma, SYS_PORT_RDMA_OFFSET);
 BCM_SYSPORT_IO_MACRO(rxchk, SYS_PORT_RXCHK_OFFSET);
 BCM_SYSPORT_IO_MACRO(txchk, SYS_PORT_TXCHK_OFFSET);
 BCM_SYSPORT_IO_MACRO(rbuf, SYS_PORT_RBUF_OFFSET);
 BCM_SYSPORT_IO_MACRO(tbuf, SYS_PORT_TBUF_OFFSET);
 BCM_SYSPORT_IO_MACRO(topctrl, SYS_PORT_TOPCTRL_OFFSET);
 
+/* On SYSTEMPORT Lite, any register after RDMA_STATUS has the exact
+ * same layout, except it has been moved by 4 bytes up, *sigh*
+ */
+static inline u32 rdma_readl(struct bcm_sysport_priv *priv, u32 off)
+{
+   if (priv->is_lite && off >= RDMA_STATUS)
+   off += 4;
+   return __raw_readl(priv->base + SYS_PORT_RDMA_OFFSET + off);
+}
+
+static inline void rdma_writel(struct bcm_sysport_priv *priv, u32 val, u32 off)
+{
+   if (priv->is_lite && off >= RDMA_STATUS)
+   off += 4;
+   __raw_writel(val, priv->base + SYS_PORT_RDMA_OFFSET + off);
+}
+
+static inline u32 tdma_control_bit(struct bcm_sysport_priv *priv, u32 bit)
+{
+   if (!priv->is_lite) {
+   return BIT(bit);
+   } else {
+   if (bit >= ACB_ALGO)
+   return BIT(bit + 1);
+   else
+   return BIT(bit);
+   }
+}
+
 /* L2-interrupt masking/unmasking helpers, does automatic saving of the applied
  * mask in a software copy to avoid CPU_MASK_STATUS reads in hot-paths.
   */
@@ -143,9 +172,9 @@ static int bcm_sysport_set_tx_csum(struct net_device *dev,
priv->tsb_en = !!(wanted & (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM));
reg = tdma_readl(priv, TDMA_CONTROL);
if (priv->tsb_en)
-   reg |= TSB_EN;
+   reg |= tdma_control_bit(priv, TSB_EN);
else
-   reg &= ~TSB_EN;
+   reg &= ~tdma_control_bit(priv, TSB_EN);
tdma_writel(priv, reg, TDMA_CONTROL);
 
return 0;
@@ -281,11 +310,35 @@ static void bcm_sysport_set_msglvl(struct net_device *dev, u32 enable)
priv->msg_enable = enable;
 }
 
+static inline bool bcm_sysport_lite_stat_valid(enum bcm_sysport_stat_type type)
+{
+   switch (type) {
+   case BCM_SYSPORT_STAT_NETDEV:
+   case BCM_SYSPORT_STAT_RXCHK:
+   case BCM_SYSPORT_STAT_RBUF:
+   case BCM_SYSPORT_STAT_SOFT:
+   return true;
+   default:
+   return false;
+   }
+}
+
 static int bcm_sysport_get_sset_count(struct net_device *dev, int
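
The two remapping helpers in the hunk above (shifted RDMA register offsets and shifted TDMA_CONTROL bits on the Lite variant) can be exercised in isolation. A minimal sketch, assuming placeholder values for RDMA_STATUS, TSB_EN and ACB_ALGO; the real values live in bcmsysport.h and are not shown in this excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only, not taken from the header. */
#define RDMA_STATUS 0x3cu
#define TSB_EN      0u
#define ACB_ALGO    1u
#define BIT(n)      (1u << (n))

/* On the Lite variant, registers at or above RDMA_STATUS sit 4 bytes higher. */
static uint32_t rdma_offset(bool is_lite, uint32_t off)
{
	if (is_lite && off >= RDMA_STATUS)
		off += 4;
	return off;
}

/* On the Lite variant, control bits at or above ACB_ALGO move up by one. */
static uint32_t tdma_control_bit(bool is_lite, uint32_t bit)
{
	assert(bit < 31); /* keep bit + 1 inside a 32-bit register */
	if (is_lite && bit >= ACB_ALGO)
		return BIT(bit + 1);
	return BIT(bit);
}
```

Keeping the remap in one helper per register block means call sites stay identical for both hardware generations, which is the approach the patch takes with tdma_control_bit(priv, TSB_EN).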

[PATCH net-next 1/2] net: systemport: Dynamically allocate number of TX rings

2017-01-20 Thread Florian Fainelli

In preparation for adding SYSTEMPORT Lite, which has half as many transmit
queues as SYSTEMPORT, make sure we allocate TX rings based on the
systemport,txq property to get an appropriate memory footprint.

Signed-off-by: Florian Fainelli 
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 11 +++
 drivers/net/ethernet/broadcom/bcmsysport.h |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 744ed6ddaf37..31bb2c3696ec 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -1752,6 +1752,10 @@ static int bcm_sysport_probe(struct platform_device *pdev)
if (of_property_read_u32(dn, "systemport,num-rxq", &rxq))
rxq = 1;
 
+   /* Sanity check the number of transmit queues */
+   if (!txq || txq > TDMA_NUM_RINGS)
+   return -EINVAL;
+
dev = alloc_etherdev_mqs(sizeof(*priv), txq, rxq);
if (!dev)
return -ENOMEM;
@@ -1759,6 +1763,13 @@ static int bcm_sysport_probe(struct platform_device *pdev)
/* Initialize private members */
priv = netdev_priv(dev);
 
+   /* Allocate number of TX rings */
+   priv->tx_rings = devm_kcalloc(&pdev->dev, txq,
+ sizeof(struct bcm_sysport_tx_ring),
+ GFP_KERNEL);
+   if (!priv->tx_rings)
+   return -ENOMEM;
+
priv->irq0 = platform_get_irq(pdev, 0);
priv->irq1 = platform_get_irq(pdev, 1);
priv->wol_irq = platform_get_irq(pdev, 2);
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.h b/drivers/net/ethernet/broadcom/bcmsysport.h
index 1c82e3da69a7..f051356b0274 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.h
+++ b/drivers/net/ethernet/broadcom/bcmsysport.h
@@ -659,7 +659,7 @@ struct bcm_sysport_priv {
int wol_irq;
 
/* Transmit rings */
-   struct bcm_sysport_tx_ring tx_rings[TDMA_NUM_RINGS];
+   struct bcm_sysport_tx_ring *tx_rings;
 
/* Receive queue */
void __iomem*rx_bds;
-- 
2.11.0
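
The change above replaces a fixed-size array with a count-checked dynamic allocation. A user-space sketch of the same pattern follows, with calloc() standing in for devm_kcalloc(). The struct layouts and the TDMA_NUM_RINGS value are assumptions for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Assumed hardware maximum for this sketch. */
#define TDMA_NUM_RINGS 32u

/* Hypothetical, trimmed-down stand-ins for the driver structures. */
struct tx_ring {
	unsigned int index;
};

struct priv {
	struct tx_ring *tx_rings;
	unsigned int num_tx_rings;
};

/* Validate the requested queue count, then allocate exactly that many rings. */
static int alloc_tx_rings(struct priv *priv, unsigned int txq)
{
	assert(priv != NULL);

	/* Sanity check the number of transmit queues */
	if (!txq || txq > TDMA_NUM_RINGS)
		return -EINVAL;

	priv->tx_rings = calloc(txq, sizeof(*priv->tx_rings));
	if (!priv->tx_rings)
		return -ENOMEM;

	priv->num_tx_rings = txq;
	return 0;
}
```

Unlike devm_kcalloc(), plain calloc() is not tied to a device lifetime, so the caller of this sketch has to free the array itself.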

[PATCH net-next 0/2] net: systemport: Add support for SYSTEMPORT lite

2017-01-20 Thread Florian Fainelli

Hi David,

This patch series adds support for SYSTEMPORT Lite which is an evolution
of the existing SYSTEMPORT adapter.

The two generations are largely identical as far as the transmit/receive
path are concerned, and there were just a few control path changes here
and there.

Thanks!

Florian Fainelli (2):
  net: systemport: Dynamically allocate number of TX rings
  net: systemport: Add support for SYSTEMPORT Lite

 .../devicetree/bindings/net/brcm,systemport.txt|   5 +-
 drivers/net/ethernet/broadcom/bcmsysport.c | 334 +
 drivers/net/ethernet/broadcom/bcmsysport.h |  80 -
 3 files changed, 339 insertions(+), 80 deletions(-)

-- 
2.11.0

[PATCH v2 iproute2] f_flower: don't set TCA_FLOWER_KEY_ETH_TYPE for "protocol all"

2017-01-20 Thread Benjamin LaHaise

v2 - update to address changes in 00697ca19ae3e1118f2af82c3b41ac4335fe918b.

When using the tc flower filter, rules marked with "protocol all" do not
actually match all packets.  This is due to a bug in f_flower.c that passes
in ETH_P_ALL in the TCA_FLOWER_KEY_ETH_TYPE attribute when adding a rule.
Fix this by omitting TCA_FLOWER_KEY_ETH_TYPE if the protocol is set to
ETH_P_ALL.

Fixes: 488b41d020fb ("tc: flower no need to specify the ethertype")
Cc: Jamal Hadi Salim 
Signed-off-by: Benjamin LaHaise 
Signed-off-by: Benjamin LaHaise 

diff --git a/tc/f_flower.c b/tc/f_flower.c
index 314c2dd..145a856 100644
--- a/tc/f_flower.c
+++ b/tc/f_flower.c
@@ -529,9 +529,11 @@ parse_done:
if (ret)
return ret;
 
-   ret = addattr16(n, MAX_MSG, TCA_FLOWER_KEY_ETH_TYPE, eth_type);
-   if (ret)
-   return ret;
+   if (eth_type != htons(ETH_P_ALL)) {
+   ret = addattr16(n, MAX_MSG, TCA_FLOWER_KEY_ETH_TYPE, eth_type);
+   if (ret)
+   return ret;
+   }
 
tail->rta_len = (((void *)n)+n->nlmsg_len) - (void *)tail;
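
The decision made by the patch above is small enough to isolate: the ethertype attribute is emitted only when the protocol is not the ETH_P_ALL wildcard. A minimal sketch, where should_emit_eth_type() is a hypothetical helper name and the constants are defined locally to keep the example portable.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Defined locally for portability; same values as <linux/if_ether.h>. */
#ifndef ETH_P_ALL
#define ETH_P_ALL 0x0003
#endif
#ifndef ETH_P_IP
#define ETH_P_IP 0x0800
#endif

/* eth_type_be is in network byte order, as it would be in the netlink
 * message. Return true when the ethertype attribute should be emitted.
 */
static bool should_emit_eth_type(uint16_t eth_type_be)
{
	return eth_type_be != htons(ETH_P_ALL);
}
```

Omitting the attribute leaves the ethertype unconstrained in the filter key, which is what "protocol all" is meant to express.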

Re: [PATCH v3 2/3] NFC: trf7970a: Add device tree option of 1.8 Volt IO voltage

2017-01-20 Thread Mark Greer

On Wed, Dec 21, 2016 at 11:18:33PM -0500, Geoff Lansberry wrote:
> The TRF7970A has configuration options for supporting hardware designs
> with 1.8 Volt or 3.3 Volt IO.   This commit adds a device tree option,
> using a fixed regulator binding, for setting the io voltage to match
> the hardware configuration. If no option is supplied it defaults to
> 3.3 volt configuration.
> 
> Signed-off-by: Geoff Lansberry 
> ---
>  .../devicetree/bindings/net/nfc/trf7970a.txt   |  2 ++
>  drivers/nfc/trf7970a.c | 26 
> +-

Acked-by: Mark Greer

Re: [PATCH iproute2 net-next V5] tc: flower: Refactor matching flags to be more user friendly

2017-01-20 Thread Stephen Hemminger

On Thu, 19 Jan 2017 16:27:53 +0200
Paul Blakey  wrote:

> Instead of "magic numbers" we can now specify each flag
> by name. A prefix of "no" (e.g. nofrag) unsets the flag,
> otherwise it will be set.
> 
> Example:
> # add a flower filter that will drop fragmented packets
> tc filter add dev ens4f0 protocol ip parent : \
> flower \
> src_mac e4:1d:2d:fd:8b:01 \
> dst_mac e4:1d:2d:fd:8b:02 \
> indev ens4f0 \
> ip_flags frag \
> action drop
> 
> # add a flower filter that will drop non-fragmented packets
> tc filter add dev ens4f0 protocol ip parent : \
> flower \
> src_mac e4:1d:2d:fd:8b:01 \
> dst_mac e4:1d:2d:fd:8b:02 \
> indev ens4f0 \
> ip_flags nofrag \
> action drop
> 
> Fixes: 22a8f019891c ('tc: flower: support matching flags')
> Signed-off-by: Paul Blakey 
> Reviewed-by: Roi Dayan 
> ---
> 
> Hi,
> Added a framework to add new flags more easily, such 
> as the upcoming tcp_flags (see kernel cls_flower), and other ip_flags.
> 
> Thanks,
>  Paul.
> 
> 
> Changelog:
> 
> v5:
> Fixed wrong use of strtok to skip old prefix.
> 
> v4:
> Changed prefix in manpage as well.
> 
> v3:
> Changed prefix to "no" instead of "no_".
> 
> v2:
> Changed delimiter to "/" to avoid shell pipe errors.
> 
> 
>  man/man8/tc-flower.8 |  12 +-
>  tc/f_flower.c| 117 
> ---
>  2 files changed, 102 insertions(+), 27 deletions(-)
> 

Applied to net-next (defuzzed)
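
The "no" prefix convention described in the quoted patch can be sketched as a small parser that fills a value/mask pair. This is an illustration under assumptions, not the f_flower.c code: FLAG_FRAG, the table and parse_flag() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bit for this sketch. */
#define FLAG_FRAG 0x1u

struct flag_name {
	const char *name;
	uint32_t bit;
};

static const struct flag_name flag_names[] = {
	{ "frag", FLAG_FRAG },
};

/* Parse one token such as "frag" or "nofrag".
 * The mask always gains the bit; the value gains it only without "no".
 * Returns 0 on success, -1 for an unknown flag name.
 */
static int parse_flag(const char *tok, uint32_t *value, uint32_t *mask)
{
	int negate;
	const char *name;
	size_t i;

	assert(tok && value && mask);

	negate = strncmp(tok, "no", 2) == 0;
	name = negate ? tok + 2 : tok;

	for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
		if (strcmp(name, flag_names[i].name) != 0)
			continue;
		*mask |= flag_names[i].bit;
		if (negate)
			*value &= ~flag_names[i].bit;
		else
			*value |= flag_names[i].bit;
		return 0;
	}
	return -1;
}
```

A table-driven lookup like this makes adding further flags a one-line change. Note that this simple prefix test would misparse a flag whose own name starts with "no".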

Re: [PATCHv2 iproute2 net-next 1/5] iplink: bridge: add support for IFLA_BR_FDB_FLUSH

2017-01-20 Thread Stephen Hemminger

On Wed, 18 Jan 2017 14:12:47 +0800
Hangbin Liu  wrote:

> This patch implements support for the IFLA_BR_FDB_FLUSH attribute
> in iproute2 so it can flush bridge fdb dynamic entries.
> 
> Reviewed-by: Nikolay Aleksandrov 
> Signed-off-by: Hangbin Liu 
> ---
>  ip/iplink_bridge.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 

Applied all of these to net-next, please update man pages.

Re: [PATCHv4 net-next 3/5] sctp: implement sender-side procedures for SSN/TSN Reset Request Parameter

2017-01-20 Thread Marcelo Ricardo Leitner

On Sat, Jan 21, 2017 at 02:00:37AM +0800, Xin Long wrote:
> This patch is to implement Sender-Side Procedures for the SSN/TSN
> Reset Request Parameter described in rfc6525 section 5.1.4.
> 
> It is also to add sockopt SCTP_RESET_ASSOC in rfc6525 section 6.3.3
> for users.
> 
> Signed-off-by: Xin Long 
> ---
>  include/net/sctp/sctp.h   |  1 +
>  include/uapi/linux/sctp.h |  1 +
>  net/sctp/socket.c | 29 +
>  net/sctp/stream.c | 37 +
>  4 files changed, 68 insertions(+)
> 
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index 3cfd365b..b93820f 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -198,6 +198,7 @@ int sctp_offload_init(void);
>   */
>  int sctp_send_reset_streams(struct sctp_association *asoc,
>   struct sctp_reset_streams *params);
> +int sctp_send_reset_assoc(struct sctp_association *asoc);
>  
>  /*
>   * Module global variables
> diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h
> index 03c27ce..c0bd8c3 100644
> --- a/include/uapi/linux/sctp.h
> +++ b/include/uapi/linux/sctp.h
> @@ -117,6 +117,7 @@ typedef __s32 sctp_assoc_t;
>  #define SCTP_PR_ASSOC_STATUS 115
>  #define SCTP_ENABLE_STREAM_RESET 118
>  #define SCTP_RESET_STREAMS   119
> +#define SCTP_RESET_ASSOC 120
>  
>  /* PR-SCTP policies */
>  #define SCTP_PR_SCTP_NONE0x
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index bee4dd3..2c5c9ca 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -3812,6 +3812,32 @@ static int sctp_setsockopt_reset_streams(struct sock *sk,
>   return retval;
>  }
>  
> +static int sctp_setsockopt_reset_assoc(struct sock *sk,
> +char __user *optval,
> +unsigned int optlen)
> +{
> + struct sctp_association *asoc;
> + sctp_assoc_t associd;
> + int retval = -EINVAL;
> +
> + if (optlen != sizeof(associd))
> + goto out;
> +
> + if (copy_from_user(&associd, optval, optlen)) {
> + retval = -EFAULT;
> + goto out;
> + }
> +
> + asoc = sctp_id2assoc(sk, associd);
> + if (!asoc)
> + goto out;
> +
> + retval = sctp_send_reset_assoc(asoc);
> +
> +out:
> + return retval;
> +}
> +
>  /* API 6.2 setsockopt(), getsockopt()
>   *
>   * Applications use setsockopt() and getsockopt() to set or retrieve
> @@ -3984,6 +4010,9 @@ static int sctp_setsockopt(struct sock *sk, int level, int optname,
>   case SCTP_RESET_STREAMS:
>   retval = sctp_setsockopt_reset_streams(sk, optval, optlen);
>   break;
> + case SCTP_RESET_ASSOC:
> + retval = sctp_setsockopt_reset_assoc(sk, optval, optlen);
> + break;
>   default:
>   retval = -ENOPROTOOPT;
>   break;
> diff --git a/net/sctp/stream.c b/net/sctp/stream.c
> index 53c67d6..3b872a8 100644
> --- a/net/sctp/stream.c
> +++ b/net/sctp/stream.c
> @@ -166,3 +166,40 @@ int sctp_send_reset_streams(struct sctp_association *asoc,
>  out:
>   return retval;
>  }
> +
> +int sctp_send_reset_assoc(struct sctp_association *asoc)
> +{
> + struct sctp_chunk *chunk = NULL;
> + int retval;
> + __u16 i;
> +
> + if (!asoc->peer.reconf_capable ||
> + !(asoc->strreset_enable & SCTP_ENABLE_RESET_ASSOC_REQ))
> + return -ENOPROTOOPT;
> +
> + if (asoc->strreset_outstanding)
> + return -EINPROGRESS;
> +
> + chunk = sctp_make_strreset_tsnreq(asoc);
> + if (!chunk)
> + return -ENOMEM;
> +
> + asoc->strreset_chunk = chunk;
> + sctp_chunk_hold(asoc->strreset_chunk);
> +
> + retval = sctp_send_reconf(asoc, chunk);
> + if (retval) {
> + sctp_chunk_put(asoc->strreset_chunk);
> + asoc->strreset_chunk = NULL;
> +
> + return retval;
> + }
> +
> + /* Block further xmit of data until this request is completed */
> + for (i = 0; i < asoc->stream->outcnt; i++)
> + asoc->stream->out[i].state = SCTP_STREAM_CLOSED;

I talked offline with Xin about this and we cannot do it this way.
Unfortunately we will have to take the long road here, because then we
may send data while sending the request, as the streams are not closed
yet.
We really need to close them, send the request, and re-open if the send
fails.

  Marcelo

> +
> + asoc->strreset_outstanding = 1;
> +
> + return 0;
> +}
> -- 
> 2.1.0
> 
>
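
The ordering requested in the review comment (close the streams, send the request, re-open if the send fails) can be sketched with a toy association. Everything here is a hypothetical stand-in: the struct, the state names and the send callback are not the SCTP kernel types.

```c
#include <assert.h>
#include <stddef.h>

enum stream_state { STREAM_OPEN, STREAM_CLOSED };

/* Hypothetical, trimmed-down association for this sketch. */
struct assoc {
	enum stream_state out[4];
	size_t outcnt;
	int outstanding;
};

typedef int (*send_fn)(struct assoc *);

static void set_all_streams(struct assoc *a, enum stream_state s)
{
	size_t i;

	for (i = 0; i < a->outcnt; i++)
		a->out[i] = s;
}

/* Close first so no data can slip out while the request is being sent,
 * then roll the state back if the send fails.
 */
static int send_reset_assoc(struct assoc *a, send_fn send)
{
	int ret;

	assert(a->outcnt <= sizeof(a->out) / sizeof(a->out[0]));

	set_all_streams(a, STREAM_CLOSED);

	ret = send(a);
	if (ret) {
		set_all_streams(a, STREAM_OPEN);
		return ret;
	}

	a->outstanding = 1;
	return 0;
}

/* Stub senders used to exercise both paths. */
static int send_ok(struct assoc *a)   { (void)a; return 0; }
static int send_fail(struct assoc *a) { (void)a; return -12; }
```

The trade-off against the posted version is an extra loop on the failure path in exchange for never having open streams while the request is in flight.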

[PATCH net v1 2/2] amd-xgbe: Check xgbe_init() return code

2017-01-20 Thread Tom Lendacky

The xgbe_init() routine returns a return code indicating success or
failure, but the return code is not checked. Add code to xgbe_init()
to issue a message when failures are seen and add code to check the
xgbe_init() return code.

Signed-off-by: Tom Lendacky 
---
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c |4 +++-
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c |4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
index c8e8a4a..a7d16db 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
@@ -3407,8 +3407,10 @@ static int xgbe_init(struct xgbe_prv_data *pdata)
 
/* Flush Tx queues */
ret = xgbe_flush_tx_queues(pdata);
-   if (ret)
+   if (ret) {
+   netdev_err(pdata->netdev, "error flushing TX queues\n");
return ret;
+   }
 
/*
 * Initialize DMA related features
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index 9943629..1c87cc2 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1070,7 +1070,9 @@ static int xgbe_start(struct xgbe_prv_data *pdata)
 
DBGPR("-->xgbe_start\n");
 
-   hw_if->init(pdata);
+   ret = hw_if->init(pdata);
+   if (ret)
+   return ret;
 
xgbe_napi_enable(pdata, 1);
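
The pattern in this patch is to log at the point of failure and then propagate the code so the caller stops early. A minimal sketch with hypothetical stand-ins (struct dev, dev_init(), dev_start()) for the driver functions:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical device state for this sketch. */
struct dev {
	int flush_fails;   /* simulate a TX queue flush failure */
	int napi_enabled;
};

/* Report the failure where it happens, then hand the code back. */
static int dev_init(struct dev *d)
{
	assert(d != NULL);

	if (d->flush_fails) {
		fprintf(stderr, "error flushing TX queues\n");
		return -EBUSY;
	}
	return 0;
}

/* Check the init result before continuing with the start sequence. */
static int dev_start(struct dev *d)
{
	int ret;

	ret = dev_init(d);
	if (ret)
		return ret;

	d->napi_enabled = 1;
	return 0;
}
```

Without the check in dev_start(), the start sequence would continue on hardware that failed to initialize.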

[PATCH net v1 0/2] amd-xgbe: AMD XGBE driver fixes 2017-01-20

2017-01-20 Thread Tom Lendacky

This patch series addresses some issues in the AMD XGBE driver.

The following fixes are included in this driver update series:

- Add a fix for a version of the hardware that uses different register
  offset values for a device with the same PCI device ID
- Add support to check the return code from the xgbe_init() function

This patch series is based on net.

---

Tom Lendacky (2):
  amd-xgbe: Add a hardware quirk for register definitions
  amd-xgbe: Check xgbe_init() return code


 drivers/net/ethernet/amd/xgbe/xgbe-common.h |2 ++
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c|8 +---
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c|4 +++-
 drivers/net/ethernet/amd/xgbe/xgbe-pci.c|   15 ++-
 drivers/net/ethernet/amd/xgbe/xgbe.h|2 ++
 5 files changed, 26 insertions(+), 5 deletions(-)

-- 
Tom Lendacky

[PATCH net v1 1/2] amd-xgbe: Add a hardware quirk for register definitions

2017-01-20 Thread Tom Lendacky

A newer version of the hardware is using the same PCI ids for the network
device but has altered register definitions for determining the window
settings for the indirect PCS access.  Add support to check for this
hardware and if found use the new register values.

Signed-off-by: Tom Lendacky 
---
 drivers/net/ethernet/amd/xgbe/xgbe-common.h |2 ++
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c|4 ++--
 drivers/net/ethernet/amd/xgbe/xgbe-pci.c|   15 ++-
 drivers/net/ethernet/amd/xgbe/xgbe.h|2 ++
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-common.h b/drivers/net/ethernet/amd/xgbe/xgbe-common.h
index 5b7ba25..8a280e7 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-common.h
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-common.h
@@ -891,6 +891,8 @@
 #define PCS_V1_WINDOW_SELECT   0x03fc
 #define PCS_V2_WINDOW_DEF  0x9060
 #define PCS_V2_WINDOW_SELECT   0x9064
+#define PCS_V2_RV_WINDOW_DEF   0x1060
+#define PCS_V2_RV_WINDOW_SELECT0x1064
 
 /* PCS register entry bit positions and sizes */
 #define PCS_V2_WINDOW_DEF_OFFSET_INDEX 6
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
index aaf0350..c8e8a4a 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
@@ -1151,7 +1151,7 @@ static int xgbe_read_mmd_regs_v2(struct xgbe_prv_data *pdata, int prtad,
offset = pdata->xpcs_window + (mmd_address & pdata->xpcs_window_mask);
 
spin_lock_irqsave(&pdata->xpcs_lock, flags);
-   XPCS32_IOWRITE(pdata, PCS_V2_WINDOW_SELECT, index);
+   XPCS32_IOWRITE(pdata, pdata->xpcs_window_sel_reg, index);
mmd_data = XPCS16_IOREAD(pdata, offset);
spin_unlock_irqrestore(&pdata->xpcs_lock, flags);
 
@@ -1183,7 +1183,7 @@ static void xgbe_write_mmd_regs_v2(struct xgbe_prv_data *pdata, int prtad,
offset = pdata->xpcs_window + (mmd_address & pdata->xpcs_window_mask);
 
spin_lock_irqsave(&pdata->xpcs_lock, flags);
-   XPCS32_IOWRITE(pdata, PCS_V2_WINDOW_SELECT, index);
+   XPCS32_IOWRITE(pdata, pdata->xpcs_window_sel_reg, index);
XPCS16_IOWRITE(pdata, offset, mmd_data);
spin_unlock_irqrestore(&pdata->xpcs_lock, flags);
 }
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-pci.c b/drivers/net/ethernet/amd/xgbe/xgbe-pci.c
index e76b7f6..c2730f1 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-pci.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-pci.c
@@ -265,6 +265,7 @@ static int xgbe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
struct xgbe_prv_data *pdata;
struct device *dev = &pdev->dev;
void __iomem * const *iomap_table;
+   struct pci_dev *rdev;
unsigned int ma_lo, ma_hi;
unsigned int reg;
int bar_mask;
@@ -326,8 +327,20 @@ static int xgbe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if (netif_msg_probe(pdata))
dev_dbg(dev, "xpcs_regs  = %p\n", pdata->xpcs_regs);
 
+   /* Set the PCS indirect addressing definition registers */
+   rdev = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0));
+   if (rdev &&
+   (rdev->vendor == PCI_VENDOR_ID_AMD) && (rdev->device == 0x15d0)) {
+   pdata->xpcs_window_def_reg = PCS_V2_RV_WINDOW_DEF;
+   pdata->xpcs_window_sel_reg = PCS_V2_RV_WINDOW_SELECT;
+   } else {
+   pdata->xpcs_window_def_reg = PCS_V2_WINDOW_DEF;
+   pdata->xpcs_window_sel_reg = PCS_V2_WINDOW_SELECT;
+   }
+   pci_dev_put(rdev);
+
/* Configure the PCS indirect addressing support */
-   reg = XPCS32_IOREAD(pdata, PCS_V2_WINDOW_DEF);
+   reg = XPCS32_IOREAD(pdata, pdata->xpcs_window_def_reg);
pdata->xpcs_window = XPCS_GET_BITS(reg, PCS_V2_WINDOW_DEF, OFFSET);
pdata->xpcs_window <<= 6;
pdata->xpcs_window_size = XPCS_GET_BITS(reg, PCS_V2_WINDOW_DEF, SIZE);
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe.h b/drivers/net/ethernet/amd/xgbe/xgbe.h
index f52a9bd..0010881 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe.h
+++ b/drivers/net/ethernet/amd/xgbe/xgbe.h
@@ -955,6 +955,8 @@ struct xgbe_prv_data {
 
/* XPCS indirect addressing lock */
spinlock_t xpcs_lock;
+   unsigned int xpcs_window_def_reg;
+   unsigned int xpcs_window_sel_reg;
unsigned int xpcs_window;
unsigned int xpcs_window_size;
unsigned int xpcs_window_mask;
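
The quirk in this patch picks one of two register pairs once at probe time and stores it, so the read and write paths need no further branching. A stand-alone sketch of that selection follows. The register offsets and device ID 0x15d0 are taken from the diff, while struct pcs_regs and pick_pcs_regs() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_VENDOR_ID_AMD       0x1022
#define PCS_V2_WINDOW_DEF       0x9060
#define PCS_V2_WINDOW_SELECT    0x9064
#define PCS_V2_RV_WINDOW_DEF    0x1060
#define PCS_V2_RV_WINDOW_SELECT 0x1064

/* Hypothetical container for the two per-device register offsets. */
struct pcs_regs {
	unsigned int window_def_reg;
	unsigned int window_sel_reg;
};

/* Choose the register pair from the root device identity, if one was found. */
static struct pcs_regs pick_pcs_regs(bool have_root, uint16_t vendor,
				     uint16_t device)
{
	struct pcs_regs regs;

	if (have_root && vendor == PCI_VENDOR_ID_AMD && device == 0x15d0) {
		regs.window_def_reg = PCS_V2_RV_WINDOW_DEF;
		regs.window_sel_reg = PCS_V2_RV_WINDOW_SELECT;
	} else {
		regs.window_def_reg = PCS_V2_WINDOW_DEF;
		regs.window_sel_reg = PCS_V2_WINDOW_SELECT;
	}

	assert(regs.window_sel_reg == regs.window_def_reg + 4);
	return regs;
}
```

Keying the choice on the root device rather than the network device ID is needed here because, as the commit message says, both hardware versions share the same PCI IDs for the network device.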

Re: [PATCH net v3] bridge: netlink: call br_changelink() during br_dev_newlink()

2017-01-20 Thread Jiri Pirko

Fri, Jan 20, 2017 at 06:12:17PM CET, c...@cera.cz wrote:
>Any bridge options specified during link creation (e.g. ip link add)
>are ignored as br_dev_newlink() does not process them.
>Use br_changelink() to do it.
>
>Fixes: 1332351 ("bridge: implement rtnl_link_ops->changelink")

Should have 12 chars. Other than that,

Reviewed-by: Jiri Pirko

[PATCHv4 net-next 5/5] sctp: implement sender-side procedures for Add Incoming/Outgoing Streams Request Parameter

2017-01-20 Thread Xin Long

This patch is to implement Sender-Side Procedures for the Add
Outgoing and Incoming Streams Request Parameter described in
rfc6525 section 5.1.5-5.1.6.

It is also to add sockopt SCTP_ADD_STREAMS in rfc6525 section
6.3.4 for users.

Signed-off-by: Xin Long 
---
 include/net/sctp/sctp.h   |  2 ++
 include/uapi/linux/sctp.h |  7 
 net/sctp/socket.c | 29 +
 net/sctp/stream.c | 81 +++
 4 files changed, 119 insertions(+)

diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index b93820f..68ee1a6 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -199,6 +199,8 @@ int sctp_offload_init(void);
 int sctp_send_reset_streams(struct sctp_association *asoc,
struct sctp_reset_streams *params);
 int sctp_send_reset_assoc(struct sctp_association *asoc);
+int sctp_send_add_streams(struct sctp_association *asoc,
+ struct sctp_add_streams *params);
 
 /*
  * Module global variables
diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h
index c0bd8c3..a91a9cc 100644
--- a/include/uapi/linux/sctp.h
+++ b/include/uapi/linux/sctp.h
@@ -118,6 +118,7 @@ typedef __s32 sctp_assoc_t;
 #define SCTP_ENABLE_STREAM_RESET   118
 #define SCTP_RESET_STREAMS 119
 #define SCTP_RESET_ASSOC   120
+#define SCTP_ADD_STREAMS   121
 
 /* PR-SCTP policies */
 #define SCTP_PR_SCTP_NONE  0x
@@ -1027,4 +1028,10 @@ struct sctp_reset_streams {
uint16_t srs_stream_list[]; /* list if srs_num_streams is not 0 */
 };
 
+struct sctp_add_streams {
+   sctp_assoc_t sas_assoc_id;
+   uint16_t sas_instrms;
+   uint16_t sas_outstrms;
+};
+
 #endif /* _UAPI_SCTP_H */
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 2c5c9ca..ae0a99e 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -3838,6 +3838,32 @@ static int sctp_setsockopt_reset_assoc(struct sock *sk,
return retval;
 }
 
+static int sctp_setsockopt_add_streams(struct sock *sk,
+  char __user *optval,
+  unsigned int optlen)
+{
+   struct sctp_association *asoc;
+   struct sctp_add_streams params;
+   int retval = -EINVAL;
+
+   if (optlen != sizeof(params))
+   goto out;
+
+   if (copy_from_user(&params, optval, optlen)) {
+   retval = -EFAULT;
+   goto out;
+   }
+
+   asoc = sctp_id2assoc(sk, params.sas_assoc_id);
+   if (!asoc)
+   goto out;
+
+   retval = sctp_send_add_streams(asoc, &params);
+
+out:
+   return retval;
+}
+
 /* API 6.2 setsockopt(), getsockopt()
  *
  * Applications use setsockopt() and getsockopt() to set or retrieve
@@ -4013,6 +4039,9 @@ static int sctp_setsockopt(struct sock *sk, int level, int optname,
case SCTP_RESET_ASSOC:
retval = sctp_setsockopt_reset_assoc(sk, optval, optlen);
break;
+   case SCTP_ADD_STREAMS:
+   retval = sctp_setsockopt_add_streams(sk, optval, optlen);
+   break;
default:
retval = -ENOPROTOOPT;
break;
diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 3b872a8..cb255e6 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -203,3 +203,84 @@ int sctp_send_reset_assoc(struct sctp_association *asoc)
 
return 0;
 }
+
+int sctp_send_add_streams(struct sctp_association *asoc,
+ struct sctp_add_streams *params)
+{
+   struct sctp_stream *stream = asoc->stream;
+   struct sctp_chunk *chunk = NULL;
+   int retval = -ENOMEM;
+   __u16 out, in, nums;
+
+   if (!asoc->peer.reconf_capable ||
+   !(asoc->strreset_enable & SCTP_ENABLE_CHANGE_ASSOC_REQ)) {
+   retval = -ENOPROTOOPT;
+   goto out;
+   }
+
+   if (asoc->strreset_outstanding) {
+   retval = -EINPROGRESS;
+   goto out;
+   }
+
+   out = params->sas_outstrms;
+   in  = params->sas_instrms;
+   if (stream->outcnt + out > SCTP_MAX_STREAM ||
+   stream->incnt + in > SCTP_MAX_STREAM || (!out && !in)) {
+   retval = -EINVAL;
+   goto out;
+   }
+
+   nums = stream->outcnt + out;
+   /* Use ksize to check if stream array really needs to realloc */
+   if (out && ksize(stream->out) < nums * sizeof(*stream->out)) {
+   struct sctp_stream_out *streamout;
+
+   streamout = kcalloc(nums, sizeof(*streamout), GFP_KERNEL);
+   if (!streamout)
+   goto out;
+
+   memcpy(streamout, stream->out,
+  sizeof(*streamout) * stream->outcnt);
+
+   kfree(stream->out);
+   stream->out = streamout;
+   }
+
+   nums = stream->incnt + in;
+   if (in && ksize(stream->in) < nums * sizeof(*stream->in)) {
+   struct
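
The bounds check and grow-and-copy step visible in the (truncated) hunk above can be sketched in user space for the outgoing direction only. This is an illustration under assumptions: an explicit capacity field replaces the ksize() query, calloc()/free() replace the kernel allocators, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SCTP_MAX_STREAM 0xffffu

/* Hypothetical, trimmed-down stream bookkeeping for this sketch. */
struct stream_out {
	uint16_t ssn;
	uint8_t state;
};

struct stream {
	struct stream_out *out;
	uint16_t outcnt;   /* streams currently in use */
	size_t outcap;     /* allocated entries, stands in for ksize() */
};

/* Make room for 'add' more outgoing streams without changing outcnt. */
static int reserve_out_streams(struct stream *s, uint16_t add)
{
	uint32_t nums = (uint32_t)s->outcnt + add;

	assert(s->outcnt <= s->outcap);

	if (!add || nums > SCTP_MAX_STREAM)
		return -EINVAL;

	if (nums > s->outcap) {
		struct stream_out *grown = calloc(nums, sizeof(*grown));

		if (!grown)
			return -ENOMEM;
		if (s->outcnt)
			memcpy(grown, s->out, sizeof(*grown) * s->outcnt);
		free(s->out);
		s->out = grown;
		s->outcap = nums;
	}
	return 0;
}
```

The sum is computed in a 32-bit variable so that the comparison against SCTP_MAX_STREAM cannot be defeated by 16-bit wraparound.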

[PATCHv4 net-next 2/5] sctp: add support for generating stream reconf ssn/tsn reset request chunk

2017-01-20 Thread Xin Long

This patch is to define SSN/TSN Reset Request Parameter described
in rfc6525 section 4.3.

Signed-off-by: Xin Long 
---
 include/linux/sctp.h |  5 +
 include/net/sctp/sm.h|  2 ++
 net/sctp/sm_make_chunk.c | 29 +
 3 files changed, 36 insertions(+)

diff --git a/include/linux/sctp.h b/include/linux/sctp.h
index a9e7906..95b8ed3 100644
--- a/include/linux/sctp.h
+++ b/include/linux/sctp.h
@@ -737,4 +737,9 @@ struct sctp_strreset_inreq {
__u16 list_of_streams[0];
 } __packed;
 
+struct sctp_strreset_tsnreq {
+   sctp_paramhdr_t param_hdr;
+   __u32 request_seq;
+} __packed;
+
 #endif /* __LINUX_SCTP_H__ */
diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index 430ed13..ac37c17 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -265,6 +265,8 @@ struct sctp_chunk *sctp_make_strreset_req(
const struct sctp_association *asoc,
__u16 stream_num, __u16 *stream_list,
bool out, bool in);
+struct sctp_chunk *sctp_make_strreset_tsnreq(
+   const struct sctp_association *asoc);
 void sctp_chunk_assign_tsn(struct sctp_chunk *);
 void sctp_chunk_assign_ssn(struct sctp_chunk *);
 
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index ad3445b..801450c 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -3660,3 +3660,32 @@ struct sctp_chunk *sctp_make_strreset_req(
 
return retval;
 }
+
+/* RE-CONFIG 4.3 (SSN/TSN RESET ALL)
+ *   0   1   2   3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  | Parameter Type = 15   |  Parameter Length = 8 |
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  | Re-configuration Request Sequence Number  |
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ */
+struct sctp_chunk *sctp_make_strreset_tsnreq(
+   const struct sctp_association *asoc)
+{
+   struct sctp_strreset_tsnreq tsnreq;
+   __u16 length = sizeof(tsnreq);
+   struct sctp_chunk *retval;
+
+   retval = sctp_make_reconf(asoc, length);
+   if (!retval)
+   return NULL;
+
+   tsnreq.param_hdr.type = SCTP_PARAM_RESET_TSN_REQUEST;
+   tsnreq.param_hdr.length = htons(length);
+   tsnreq.request_seq = htonl(asoc->strreset_outseq);
+
+   sctp_addto_chunk(retval, sizeof(tsnreq), &tsnreq);
+
+   return retval;
+}
-- 
2.1.0
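
The wire layout drawn in the comment above (type 15, length 8, then a 32-bit request sequence number) can be checked with a small serializer. A sketch only: build_tsn_reset_request() and the local constants are hypothetical names, and the byte-order conversion is done explicitly with htons()/htonl().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PARAM_RESET_TSN_REQUEST 15u /* parameter type from the diagram */
#define TSNREQ_LEN              8u  /* type + length + sequence number */

/* Write the parameter into buf in network byte order.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t build_tsn_reset_request(uint8_t *buf, size_t buflen,
				      uint32_t request_seq)
{
	const uint16_t type = htons(PARAM_RESET_TSN_REQUEST);
	const uint16_t length = htons(TSNREQ_LEN);
	const uint32_t seq = htonl(request_seq);

	assert(buf != NULL);

	if (buflen < TSNREQ_LEN)
		return 0;

	memcpy(buf, &type, sizeof(type));
	memcpy(buf + 2, &length, sizeof(length));
	memcpy(buf + 4, &seq, sizeof(seq));
	return TSNREQ_LEN;
}
```

Copying field by field with memcpy() avoids relying on struct packing attributes, at the cost of spelling out the offsets by hand.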

[PATCHv4 net-next 3/5] sctp: implement sender-side procedures for SSN/TSN Reset Request Parameter

2017-01-20 Thread Xin Long

This patch is to implement Sender-Side Procedures for the SSN/TSN
Reset Request Parameter described in rfc6525 section 5.1.4.

It is also to add sockopt SCTP_RESET_ASSOC in rfc6525 section 6.3.3
for users.

Signed-off-by: Xin Long 
---
 include/net/sctp/sctp.h   |  1 +
 include/uapi/linux/sctp.h |  1 +
 net/sctp/socket.c | 29 +
 net/sctp/stream.c | 37 +
 4 files changed, 68 insertions(+)

diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 3cfd365b..b93820f 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -198,6 +198,7 @@ int sctp_offload_init(void);
  */
 int sctp_send_reset_streams(struct sctp_association *asoc,
struct sctp_reset_streams *params);
+int sctp_send_reset_assoc(struct sctp_association *asoc);
 
 /*
  * Module global variables
diff --git a/include/uapi/linux/sctp.h b/include/uapi/linux/sctp.h
index 03c27ce..c0bd8c3 100644
--- a/include/uapi/linux/sctp.h
+++ b/include/uapi/linux/sctp.h
@@ -117,6 +117,7 @@ typedef __s32 sctp_assoc_t;
 #define SCTP_PR_ASSOC_STATUS   115
 #define SCTP_ENABLE_STREAM_RESET   118
 #define SCTP_RESET_STREAMS 119
+#define SCTP_RESET_ASSOC   120
 
 /* PR-SCTP policies */
 #define SCTP_PR_SCTP_NONE  0x
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index bee4dd3..2c5c9ca 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -3812,6 +3812,32 @@ static int sctp_setsockopt_reset_streams(struct sock *sk,
return retval;
 }
 
+static int sctp_setsockopt_reset_assoc(struct sock *sk,
+  char __user *optval,
+  unsigned int optlen)
+{
+   struct sctp_association *asoc;
+   sctp_assoc_t associd;
+   int retval = -EINVAL;
+
+   if (optlen != sizeof(associd))
+   goto out;
+
+   if (copy_from_user(&associd, optval, optlen)) {
+   retval = -EFAULT;
+   goto out;
+   }
+
+   asoc = sctp_id2assoc(sk, associd);
+   if (!asoc)
+   goto out;
+
+   retval = sctp_send_reset_assoc(asoc);
+
+out:
+   return retval;
+}
+
 /* API 6.2 setsockopt(), getsockopt()
  *
  * Applications use setsockopt() and getsockopt() to set or retrieve
@@ -3984,6 +4010,9 @@ static int sctp_setsockopt(struct sock *sk, int level, int optname,
case SCTP_RESET_STREAMS:
retval = sctp_setsockopt_reset_streams(sk, optval, optlen);
break;
+   case SCTP_RESET_ASSOC:
+   retval = sctp_setsockopt_reset_assoc(sk, optval, optlen);
+   break;
default:
retval = -ENOPROTOOPT;
break;
diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 53c67d6..3b872a8 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -166,3 +166,40 @@ int sctp_send_reset_streams(struct sctp_association *asoc,
 out:
return retval;
 }
+
+int sctp_send_reset_assoc(struct sctp_association *asoc)
+{
+   struct sctp_chunk *chunk = NULL;
+   int retval;
+   __u16 i;
+
+   if (!asoc->peer.reconf_capable ||
+   !(asoc->strreset_enable & SCTP_ENABLE_RESET_ASSOC_REQ))
+   return -ENOPROTOOPT;
+
+   if (asoc->strreset_outstanding)
+   return -EINPROGRESS;
+
+   chunk = sctp_make_strreset_tsnreq(asoc);
+   if (!chunk)
+   return -ENOMEM;
+
+   asoc->strreset_chunk = chunk;
+   sctp_chunk_hold(asoc->strreset_chunk);
+
+   retval = sctp_send_reconf(asoc, chunk);
+   if (retval) {
+   sctp_chunk_put(asoc->strreset_chunk);
+   asoc->strreset_chunk = NULL;
+
+   return retval;
+   }
+
+   /* Block further xmit of data until this request is completed */
+   for (i = 0; i < asoc->stream->outcnt; i++)
+   asoc->stream->out[i].state = SCTP_STREAM_CLOSED;
+
+   asoc->strreset_outstanding = 1;
+
+   return 0;
+}
-- 
2.1.0
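
For readers who want to exercise the new option from userspace, here is a minimal sketch, assuming a connected SCTP socket on a kernel that carries this patch. The helper name request_assoc_reset is hypothetical and not part of the patch. The option value 120 is taken from the uapi hunk above, and the local #define only covers installed headers that predate it. Going by the kernel side shown above, the call would fail with EINVAL for a wrong optlen or an unknown association, ENOPROTOOPT when the peer is not reconf capable or SCTP_ENABLE_RESET_ASSOC_REQ is not enabled, and EINPROGRESS while an earlier request is outstanding.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_SCTP */

/* Value from the uapi hunk in this patch; defined here only as a
 * fallback for headers that do not have it yet. */
#ifndef SCTP_RESET_ASSOC
#define SCTP_RESET_ASSOC 120
#endif

/* Hypothetical helper (not part of the patch): ask the kernel to send an
 * SSN/TSN reset request for the association identified by assoc_id.
 * Returns 0 on success or a negative errno value on failure. */
static int request_assoc_reset(int fd, int32_t assoc_id)
{
    /* sctp_assoc_t is a __s32 and the kernel side rejects any optlen
     * other than sizeof(sctp_assoc_t), so pass exactly four bytes. */
    if (setsockopt(fd, IPPROTO_SCTP, SCTP_RESET_ASSOC,
                   &assoc_id, sizeof(assoc_id)) < 0)
        return -errno;

    return 0;
}
```

A caller would typically treat -EINPROGRESS as "try again later" and any other negative value as a hard failure.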

[PATCHv4 net-next 4/5] sctp: add support for generating stream reconf add incoming/outgoing streams request chunk

2017-01-20 Thread Xin Long

This patch defines the Add Incoming/Outgoing Streams Request
Parameters described in RFC 6525 sections 4.5 and 4.6. Both can
be carried in the same chunk, as RFC 6525 section 3.1-7 describes,
so a single function builds them.

Signed-off-by: Xin Long 
---
 include/linux/sctp.h |  7 +++
 include/net/sctp/sm.h|  3 +++
 net/sctp/sm_make_chunk.c | 46 ++
 3 files changed, 56 insertions(+)

diff --git a/include/linux/sctp.h b/include/linux/sctp.h
index 95b8ed3..f1f494f 100644
--- a/include/linux/sctp.h
+++ b/include/linux/sctp.h
@@ -742,4 +742,11 @@ struct sctp_strreset_tsnreq {
__u32 request_seq;
 } __packed;
 
+struct sctp_strreset_addstrm {
+   sctp_paramhdr_t param_hdr;
+   __u32 request_seq;
+   __u16 number_of_streams;
+   __u16 reserved;
+} __packed;
+
 #endif /* __LINUX_SCTP_H__ */
diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index ac37c17..3675fde 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -267,6 +267,9 @@ struct sctp_chunk *sctp_make_strreset_req(
bool out, bool in);
 struct sctp_chunk *sctp_make_strreset_tsnreq(
const struct sctp_association *asoc);
+struct sctp_chunk *sctp_make_strreset_addstrm(
+   const struct sctp_association *asoc,
+   __u16 out, __u16 in);
 void sctp_chunk_assign_tsn(struct sctp_chunk *);
 void sctp_chunk_assign_ssn(struct sctp_chunk *);
 
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 801450c..a44546d 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -3689,3 +3689,49 @@ struct sctp_chunk *sctp_make_strreset_tsnreq(
 
return retval;
 }
+
+/* RE-CONFIG 4.5/4.6 (ADD STREAM)
+ *   0                   1                   2                   3
+ *   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  |      Parameter Type = 17      |     Parameter Length = 12     |
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  |           Re-configuration Request Sequence Number            |
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ *  |     Number of new streams     |           Reserved            |
+ *  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ */
+struct sctp_chunk *sctp_make_strreset_addstrm(
+   const struct sctp_association *asoc,
+   __u16 out, __u16 in)
+{
+   struct sctp_strreset_addstrm addstrm;
+   __u16 size = sizeof(addstrm);
+   struct sctp_chunk *retval;
+
+   retval = sctp_make_reconf(asoc, (!!out + !!in) * size);
+   if (!retval)
+   return NULL;
+
+   if (out) {
+   addstrm.param_hdr.type = SCTP_PARAM_RESET_ADD_OUT_STREAMS;
+   addstrm.param_hdr.length = htons(size);
+   addstrm.number_of_streams = htons(out);
+   addstrm.request_seq = htonl(asoc->strreset_outseq);
+   addstrm.reserved = 0;
+
+   sctp_addto_chunk(retval, size, &addstrm);
+   }
+
+   if (in) {
+   addstrm.param_hdr.type = SCTP_PARAM_RESET_ADD_IN_STREAMS;
+   addstrm.param_hdr.length = htons(size);
+   addstrm.number_of_streams = htons(in);
+   addstrm.request_seq = htonl(asoc->strreset_outseq + !!out);
+   addstrm.reserved = 0;
+
+   sctp_addto_chunk(retval, size, &addstrm);
+   }
+
+   return retval;
+}
-- 
2.1.0
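
To make the wire layout concrete, the following userspace sketch packs the same 12-byte parameter into a byte buffer. It is an illustration only, not the kernel code: struct addstrm_param and build_add_streams are hypothetical names, and the type values 17 and 18 are the Add Outgoing and Add Incoming Streams Request Parameter types from RFC 6525. It mirrors one detail of the patch: when both parameters share a chunk, the incoming request uses the next sequence number (outseq + 1).

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parameter types from RFC 6525 sections 4.5 and 4.6. */
#define ADD_OUT_STREAMS_TYPE 17
#define ADD_IN_STREAMS_TYPE  18

/* Illustrative userspace mirror of struct sctp_strreset_addstrm. */
struct addstrm_param {
    uint16_t type;
    uint16_t length;
    uint32_t request_seq;
    uint16_t number_of_streams;
    uint16_t reserved;
} __attribute__((packed));

/* Hypothetical helper: append up to two Add Streams parameters to buf,
 * which must have room for 24 bytes. Returns the bytes written. */
static size_t build_add_streams(uint8_t *buf, uint32_t outseq,
                                uint16_t out, uint16_t in)
{
    struct addstrm_param p;
    size_t off = 0;

    assert(sizeof(p) == 12);    /* matches "Parameter Length = 12" */

    if (out) {
        p.type = htons(ADD_OUT_STREAMS_TYPE);
        p.length = htons(sizeof(p));
        p.request_seq = htonl(outseq);
        p.number_of_streams = htons(out);
        p.reserved = 0;
        memcpy(buf + off, &p, sizeof(p));
        off += sizeof(p);
    }

    if (in) {
        p.type = htons(ADD_IN_STREAMS_TYPE);
        p.length = htons(sizeof(p));
        /* A second request in the same chunk takes the next number. */
        p.request_seq = htonl(outseq + !!out);
        p.number_of_streams = htons(in);
        p.reserved = 0;
        memcpy(buf + off, &p, sizeof(p));
        off += sizeof(p);
    }

    return off;
}
```

Calling build_add_streams(buf, 100, 2, 3) on a 24-byte buffer writes two parameters back to back, which is why the kernel function sizes the chunk as (!!out + !!in) * size.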

[PATCHv4 net-next 0/5] sctp: add sender-side procedures for stream reconf asoc reset and add streams

2017-01-20 Thread Xin Long

Patch 3/5 implements the sender-side procedures for the SSN/TSN Reset
Request Parameter described in RFC 6525 section 5.1.4; patch 2/5
precedes it and defines a function that builds the request chunk.

Patch 5/5 implements the sender-side procedures for the Add Incoming
and Add Outgoing Streams Request Parameters described in RFC 6525
sections 5.1.5 and 5.1.6; patch 4/5 precedes it and defines a function
that builds the request chunk.

Patch 1/5 is a fix so that streams are closed only when the request is
sent successfully.

v1->v2:
  - put these into a smaller group.
  - rename some temporary variables in the code.
  - rename the titles of the commits and improve some changelogs.
v2->v3:
  - re-split the patchset and make sure it has no dead code for review.
  - move some code into stream.c from socket.c.
v3->v4:
  - add one more patch to fix a send reset stream request issue.
  - do the actual work only when the request is sent successfully.
  - reduce some indents in sctp_send_add_streams.

Xin Long (5):
  sctp: streams should be closed when stream reset request is sent
successfully.
  sctp: add support for generating stream reconf ssn/tsn reset request
chunk
  sctp: implement sender-side procedures for SSN/TSN Reset Request
Parameter
  sctp: add support for generating stream reconf add incoming/outgoing
streams request chunk
  sctp: implement sender-side procedures for Add Incoming/Outgoing
Streams Request Parameter

 include/linux/sctp.h  |  12 +
 include/net/sctp/sctp.h   |   3 ++
 include/net/sctp/sm.h |   5 ++
 include/uapi/linux/sctp.h |   8 +++
 net/sctp/sm_make_chunk.c  |  75 
 net/sctp/socket.c |  58 ++
 net/sctp/stream.c | 124 +-
 7 files changed, 284 insertions(+), 1 deletion(-)

-- 
2.1.0
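
The ordering that patch 1/5 and the v3->v4 notes describe (change local state only after the request has been sent) can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: struct toy_assoc, toy_send_reset and the two stand-in send functions are hypothetical, and the send callback takes the place of sctp_send_reconf.

```c
#include <errno.h>
#include <stddef.h>

enum stream_state { STREAM_OPEN, STREAM_CLOSED };

/* Toy stand-in for the fields of struct sctp_association used here. */
struct toy_assoc {
    int reconf_capable;     /* peer supports RE-CONFIG */
    int reset_enabled;      /* local policy allows the request */
    int outstanding;        /* a request is already in flight */
    size_t outcnt;          /* number of outgoing streams in use */
    enum stream_state out[8];
    int (*send)(struct toy_assoc *);  /* stand-in for sctp_send_reconf */
};

/* Stand-in senders: one always succeeds, one always fails. */
static int toy_send_ok(struct toy_assoc *a)   { (void)a; return 0; }
static int toy_send_fail(struct toy_assoc *a) { (void)a; return -ENOMEM; }

/* Check preconditions, try to send, and only then block the streams. */
static int toy_send_reset(struct toy_assoc *a)
{
    size_t i;
    int err;

    if (!a->reconf_capable || !a->reset_enabled)
        return -ENOPROTOOPT;

    if (a->outstanding)
        return -EINPROGRESS;

    err = a->send(a);
    if (err)
        return err;         /* nothing was changed, streams stay open */

    /* Block further transmission until the request completes. */
    for (i = 0; i < a->outcnt; i++)
        a->out[i] = STREAM_CLOSED;

    a->outstanding = 1;
    return 0;
}
```

With this ordering, a failed send leaves every stream open and no request marked outstanding, so the caller can simply retry.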
