date:20150528

Re: [PATCH v4 net-next 05/11] net: Add full IPv6 addresses to flow_keys

2015-05-28 Thread Tom Herbert

On Thu, May 28, 2015 at 2:44 PM, Eric Dumazet eric.duma...@gmail.com wrote:
 On Thu, 2015-05-28 at 11:19 -0700, Tom Herbert wrote:

 @@ -566,11 +640,15 @@ static const struct flow_dissector_key 
 flow_keys_dissector_keys[] = {
   },
   {
   .key_id = FLOW_DISSECTOR_KEY_IPV4_ADDRS,
 - .offset = offsetof(struct flow_keys, addrs),
 + .offset = offsetof(struct flow_keys, addrs.v4addrs),
 + },
 + {
 + .key_id = FLOW_DISSECTOR_KEY_IPV6_ADDRS,
 + .offset = offsetof(struct flow_keys, addrs.v6addrs),
   },
   {
   .key_id = FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS,
 - .offset = offsetof(struct flow_keys, addrs),
 + .offset = offsetof(struct flow_keys, addrs.v4addrs),

 Shouldn't it be offsetof(struct flow_keys, addrs.v6addrs), ?

This is to hash 128-bit IP addresses into 32-bit values, which fit in
the v4addrs area. This goes away completely in patch 07 of this set.
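
As a rough illustration of what hashing a 128-bit address into a 32-bit
slot means, here is a minimal userspace sketch. The name
fold_ipv6_to_u32() and the plain XOR fold are assumptions made for
illustration only; this is not the flow dissector's actual hash.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fold a 16-byte IPv6 address into 32 bits by
 * XOR-ing its four 32-bit words. Illustration only, not kernel code. */
uint32_t fold_ipv6_to_u32(const uint8_t addr[16])
{
	uint32_t words[4];

	memcpy(words, addr, sizeof(words)); /* avoids unaligned access */
	return words[0] ^ words[1] ^ words[2] ^ words[3];
}
```

A fold like this loses information (different addresses can collide), which
is acceptable for a hash input but not for storing the address itself.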



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH net-next 0/2] hv_netvsc: Implement NUMA aware memory allocation

2015-05-28 Thread K. Y. Srinivasan

Allocate both receive buffer and send buffer from the NUMA node assigned to the
primary channel.
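
The allocation pattern used in this series (try the preferred NUMA node
first, then fall back to an unconstrained allocation) can be modelled in
userspace roughly as follows. Both function names are hypothetical
stand-ins; the driver itself uses vzalloc_node() and vzalloc().

```c
#include <stdlib.h>

/* Hypothetical stand-in for a node-local allocator such as vzalloc_node();
 * here it simply fails for any node other than 0 so the fallback runs. */
void *alloc_zeroed_on_node(size_t size, int node)
{
	return node == 0 ? calloc(1, size) : NULL;
}

/* Try the preferred node first, then fall back to any node. */
void *alloc_zeroed_prefer_node(size_t size, int node)
{
	void *buf = alloc_zeroed_on_node(size, node);

	if (!buf)
		buf = calloc(1, size); /* stand-in for plain vzalloc() */
	return buf;
}
```

The caller still has to check the final result for NULL, since the
fallback can fail as well.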

K. Y. Srinivasan (2):
  hv_netvsc: Allocate the receive buffer from the correct NUMA node
  hv_netvsc: Allocate the sendbuf in a NUMA aware way

 drivers/net/hyperv/netvsc.c |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

-- 
1.7.4.1


[PATCH V2 net-next 1/2] hv_netvsc: Allocate the receive buffer from the correct NUMA node

2015-05-28 Thread K. Y. Srinivasan

Allocate the receive buffer from the NUMA node assigned to the primary
channel.

Signed-off-by: K. Y. Srinivasan k...@microsoft.com
---
V2: Specify the tree for this patch.

 drivers/net/hyperv/netvsc.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index b024968..d187965 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -227,13 +227,18 @@ static int netvsc_init_buf(struct hv_device *device)
 	struct netvsc_device *net_device;
 	struct nvsp_message *init_packet;
 	struct net_device *ndev;
+	int node;
 
 	net_device = get_outbound_net_device(device);
 	if (!net_device)
 		return -ENODEV;
 	ndev = net_device->ndev;
 
-	net_device->recv_buf = vzalloc(net_device->recv_buf_size);
+	node = cpu_to_node(device->channel->target_cpu);
+	net_device->recv_buf = vzalloc_node(net_device->recv_buf_size, node);
+	if (!net_device->recv_buf)
+		net_device->recv_buf = vzalloc(net_device->recv_buf_size);
+
 	if (!net_device->recv_buf) {
 		netdev_err(ndev, "unable to allocate receive "
 			"buffer of size %d\n", net_device->recv_buf_size);
-- 
1.7.4.1


[PATCH net-next 2/2] hv_netvsc: Allocate the sendbuf in a NUMA aware way

2015-05-28 Thread K. Y. Srinivasan

Allocate the send buffer in a NUMA aware way.

Signed-off-by: K. Y. Srinivasan k...@microsoft.com
---
 drivers/net/hyperv/netvsc.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index d187965..06de98a 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -326,7 +326,9 @@ static int netvsc_init_buf(struct hv_device *device)
 
 	/* Now setup the send buffer.
 	 */
-	net_device->send_buf = vzalloc(net_device->send_buf_size);
+	net_device->send_buf = vzalloc_node(net_device->send_buf_size, node);
+	if (!net_device->send_buf)
+		net_device->send_buf = vzalloc(net_device->send_buf_size);
 	if (!net_device->send_buf) {
 		netdev_err(ndev, "unable to allocate send "
 			   "buffer of size %d\n", net_device->send_buf_size);
-- 
1.7.4.1


Re: iproute2: missing patches in branch net-next

2015-05-28 Thread Stephen Hemminger

On Thu, 28 May 2015 18:32:32 +0200
Daniel Borkmann dan...@iogearbox.net wrote:

 On 05/28/2015 06:19 PM, Stephen Hemminger wrote:
  On Thu, 28 May 2015 13:31:08 +0200
  Nicolas Dichtel nicolas.dich...@6wind.com wrote:
 
  Hi Stephen,
 
  some patches that were recently included in iproute2 branch net-next are not
  visible anymore on kernel.org. It seems that the branch has been overridden
  (note the forced update when I've fetched it):
 
  $ git fetch
  remote: Counting objects: 65, done.
  remote: Compressing objects: 100% (65/65), done.
  remote: Total 65 (delta 58), reused 0 (delta 0)
  Unpacking objects: 100% (65/65), done.
From git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2
 + aacee2695a90...eb9d6e794b52 net-next   -> origin/net-next  (forced update)
   f043759dd492..c52827e9077f  master     -> origin/master
 
 
  The following patches are lost:
  aacee2695a90 tc: gred: Add support for TCA_GRED_LIMIT attribute
  b6ec53e3008a xfrmmonitor: allows to monitor in several netns
  449b824ad196 ipmonitor: allows to monitor in several netns
  3b0006f8183e ipmonitor: introduce print_headers
  0628cddd9d5c libnetlink: introduce rtnl_listen_filter_t
  2503247d58c3 man: update ip monitor page
  6fc1f8add30b iplink_bond: add support for ad_actor and port_key options
  df1c7d9138ea codel: add ce_threshold support to codel & fc_codel
  30eb304ecd1d tc: add support for Flower classifier
  1a4dda7103bc ss: add support for bytes_acked & bytes_received
  908755dc49df iproute2: GENEVE support
  f9b004020a89 Merge branch 'master' into net-next
  8f42ceaf2491 Update kernels for net-next
 
 
  Regards,
  Nicolas
 
  Ah found it was botched merge. The commits were still there locally.
  Should be fixed now, but had to force back to known good state on net-next 
  branch.
 
 Okay, but now the iproute2 -next patches from last days are gone,
 right? I noticed the tc man page bits applied from yesterday are
 not in the -next tree anymore. Do you re-push those on top of the
 current restored state?
 
 Thanks,
 Daniel

I will go back and recreate what is missing.
Sorry for the confusion.

Re: iproute2: missing patches in branch net-next

2015-05-28 Thread Daniel Borkmann


On 05/29/2015 01:12 AM, Stephen Hemminger wrote:
...

I will go back and recreate what is missing.
Sorry for the confusion.


Great thanks, no problem.

[PATCH 6/9] ssb: drop unneeded goto

2015-05-28 Thread Julia Lawall

From: Julia Lawall julia.law...@lip6.fr

Delete jump to a label on the next line, when that label is not
used elsewhere.

A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier l;
@@

-if (...) goto l;
-l:
// </smpl>

Also drop the unneeded err variable.
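
For readers unfamiliar with the pattern, the transformation can be shown
on a small standalone example. The functions below are hypothetical and
only mirror the shape of the change; create_attr() stands in for a call
such as device_create_file().

```c
/* Hypothetical stand-in for a call such as device_create_file(). */
int create_attr(int fail)
{
	return fail ? -1 : 0;
}

/* Before: the goto jumps to a label on the very next line. */
int init_before(int fail)
{
	int err;

	err = create_attr(fail);
	if (err)
		goto out;

out:
	return err;
}

/* After: the jump and the temporary are gone; behaviour is unchanged. */
int init_after(int fail)
{
	return create_attr(fail);
}
```

Both variants return the same value for every input, which is why the
jump and the temporary variable can be dropped safely.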

Signed-off-by: Julia Lawall julia.law...@lip6.fr

---
 drivers/ssb/pci.c |8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/ssb/pci.c b/drivers/ssb/pci.c
index 0f28c08..d6ca4d3 100644
--- a/drivers/ssb/pci.c
+++ b/drivers/ssb/pci.c
@@ -1173,17 +1173,11 @@ void ssb_pci_exit(struct ssb_bus *bus)
 int ssb_pci_init(struct ssb_bus *bus)
 {
 	struct pci_dev *pdev;
-	int err;
 
 	if (bus->bustype != SSB_BUSTYPE_PCI)
 		return 0;
 
 	pdev = bus->host_pci;
 	mutex_init(&bus->sprom_mutex);
-	err = device_create_file(&pdev->dev, &dev_attr_ssb_sprom);
-	if (err)
-		goto out;
-
-out:
-	return err;
+	return device_create_file(&pdev->dev, &dev_attr_ssb_sprom);
 }


[RFC 1/3] net: dsa: add basic support for VLAN ndo

2015-05-28 Thread Vivien Didelot

This patch adds the ndo_vlan_rx_add_vid, ndo_vlan_rx_kill_vid, and
ndo_bridge_setlink wrapper operations, used to create and remove VLAN
entries in a DSA switch VLAN database.

The switch drivers have to implement the port_vlan_add, port_vlan_kill,
and port_bridge_setlink functions, in order to support VLANs.

Signed-off-by: Vivien Didelot vivien.dide...@savoirfairelinux.com
---
 include/net/dsa.h |  9 +++
 net/dsa/slave.c   | 76 +--
 2 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index fbca63b..cf02357 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -19,6 +19,7 @@
 #include <linux/phy.h>
 #include <linux/phy_fixed.h>
 #include <linux/ethtool.h>
+#include <uapi/linux/if_bridge.h>
 
 enum dsa_tag_protocol {
DSA_TAG_PROTO_NONE = 0,
@@ -302,6 +303,14 @@ struct dsa_switch_driver {
   const unsigned char *addr, u16 vid);
int (*fdb_getnext)(struct dsa_switch *ds, int port,
   unsigned char *addr, bool *is_static);
+
+   /*
+* VLAN support
+*/
+   int (*port_vlan_add)(struct dsa_switch *ds, int port, u16 vid);
+   int (*port_vlan_kill)(struct dsa_switch *ds, int port, u16 vid);
+   int (*port_bridge_setlink)(struct dsa_switch *ds, int port,
+  struct bridge_vlan_info *vinfo);
 };
 
 void register_switch_driver(struct dsa_switch_driver *type);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 827cda56..72c3ff0 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -412,6 +412,71 @@ static netdev_tx_t dsa_slave_notag_xmit(struct sk_buff 
*skb,
return NETDEV_TX_OK;
 }
 
+static int dsa_slave_vlan_rx_add_vid(struct net_device *dev,
+				     __be16 proto, u16 vid)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+
+	if (!ds->drv->port_vlan_add)
+		return -EOPNOTSUPP;
+
+	netdev_dbg(dev, "adding to VLAN %d\n", vid);
+
+	return ds->drv->port_vlan_add(ds, p->port, vid);
+}
+
+static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev,
+				      __be16 proto, u16 vid)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+
+	if (!ds->drv->port_vlan_kill)
+		return -EOPNOTSUPP;
+
+	netdev_dbg(dev, "removing from VLAN %d\n", vid);
+
+	return ds->drv->port_vlan_kill(ds, p->port, vid);
+}
+
+static int dsa_slave_bridge_setlink(struct net_device *dev,
+				    struct nlmsghdr *nlh, u16 flags)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+	struct nlattr *afspec;
+	struct nlattr *attr;
+	struct bridge_vlan_info *vinfo = NULL;
+	int rem;
+
+	if (!ds->drv->port_bridge_setlink)
+		return -EOPNOTSUPP;
+
+	afspec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
+	if (!afspec)
+		return -EINVAL;
+
+	nla_for_each_nested(attr, afspec, rem) {
+		if (nla_type(attr) != IFLA_BRIDGE_VLAN_INFO)
+			continue;
+
+		if (nla_len(attr) != sizeof(struct bridge_vlan_info))
+			return -EINVAL;
+
+		vinfo = nla_data(attr);
+	}
+
+	if (!vinfo)
+		return -EINVAL;
+
+	netdev_dbg(dev, "setting link to VLAN %d%s%s\n", vinfo->vid,
+		   vinfo->flags & BRIDGE_VLAN_INFO_UNTAGGED ? " untagged" : "",
+		   vinfo->flags & BRIDGE_VLAN_INFO_PVID ? " (default)" : "");
+
+	return ds->drv->port_bridge_setlink(ds, p->port, vinfo);
+}
+
 
 /* ethtool operations ***/
 static int
@@ -673,6 +738,9 @@ static const struct net_device_ops dsa_slave_netdev_ops = {
.ndo_fdb_dump   = dsa_slave_fdb_dump,
.ndo_do_ioctl   = dsa_slave_ioctl,
.ndo_get_iflink = dsa_slave_get_iflink,
+   .ndo_vlan_rx_add_vid= dsa_slave_vlan_rx_add_vid,
+   .ndo_vlan_rx_kill_vid   = dsa_slave_vlan_rx_kill_vid,
+   .ndo_bridge_setlink = dsa_slave_bridge_setlink,
 };
 
 static const struct swdev_ops dsa_slave_swdev_ops = {
@@ -854,7 +922,9 @@ int dsa_slave_create(struct dsa_switch *ds, struct device 
*parent,
if (slave_dev == NULL)
return -ENOMEM;
 
-	slave_dev->features = master->vlan_features;
+	slave_dev->features = master->vlan_features |
+		NETIF_F_VLAN_FEATURES |
+		NETIF_F_HW_SWITCH_OFFLOAD;
 	slave_dev->ethtool_ops = &dsa_slave_ethtool_ops;
 	eth_hw_addr_inherit(slave_dev, master);
 	slave_dev->tx_queue_len = 0;
@@ -863,7 +933,9 @@ int dsa_slave_create(struct dsa_switch *ds, struct device 
*parent,
 
SET_NETDEV_DEV(slave_dev, parent);

[RFC 2/3] net: dsa: mv88e6xxx: add support for VTU operations

2015-05-28 Thread Vivien Didelot

This commit implements the port_vlan_add, port_vlan_kill, and
port_bridge_setlink dsa_switch_driver functions to access the VTU, and
thus add support for adding, removing VLANs, and joining ports to them.

Signed-off-by: Vivien Didelot vivien.dide...@savoirfairelinux.com
---
 drivers/net/dsa/mv88e6xxx.c | 309 
 drivers/net/dsa/mv88e6xxx.h |  28 
 2 files changed, 337 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index cf309aa9..2f4c99f 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2,6 +2,9 @@
  * net/dsa/mv88e6xxx.c - Marvell 88e6xxx switch chip support
  * Copyright (c) 2008 Marvell Semiconductor
  *
+ * Copyright (c) 2015 CMC Electronics, Inc.
+ * Added support for 802.1q VTU operations
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
@@ -1241,6 +1244,312 @@ static void mv88e6xxx_bridge_work(struct work_struct 
*work)
}
 }
 
+static int _mv88e6xxx_vtu_wait(struct dsa_switch *ds)
+{
+	return _mv88e6xxx_wait(ds, REG_GLOBAL, GLOBAL_VTU_OP,
+			       GLOBAL_VTU_OP_BUSY);
+}
+
+static int _mv88e6xxx_vtu_cmd(struct dsa_switch *ds, u16 op)
+{
+	int ret;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_OP, op);
+	if (ret < 0)
+		return ret;
+
+	return _mv88e6xxx_vtu_wait(ds);
+}
+
+static int _mv88e6xxx_stu_loadpurge(struct dsa_switch *ds, u8 sid, bool valid)
+{
+	int ret, data;
+
+	ret = _mv88e6xxx_vtu_wait(ds);
+	if (ret < 0)
+		return ret;
+
+	data = sid & GLOBAL_VTU_SID_MASK;
+	if (valid)
+		data |= GLOBAL_VTU_VID_VALID;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID, data);
+	if (ret < 0)
+		return ret;
+
+	/* Unused (yet) data registers */
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3, 0);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_4_7, 0);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_8_11, 0);
+	if (ret < 0)
+		return ret;
+
+	return _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_STU_LOAD_PURGE);
+}
+
+static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds, u16 vid,
+				  struct mv88e6xxx_vtu_entry *entry)
+{
+	int ret, i;
+
+	ret = _mv88e6xxx_vtu_wait(ds);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID,
+				   vid & GLOBAL_VTU_VID_MASK);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_VTU_GET_NEXT);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_VID);
+	if (ret < 0)
+		return ret;
+
+	entry->vid = ret & GLOBAL_VTU_VID_MASK;
+	entry->valid = !!(ret & GLOBAL_VTU_VID_VALID);
+
+	if (entry->valid) {
+		/* Ports 0-3, offsets 0, 4, 8, 12 */
+		ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3);
+		if (ret < 0)
+			return ret;
+
+		for (i = 0; i < 4; ++i)
+			entry->tags[i] = (ret >> (i * 4)) & 3;
+
+		/* Ports 4-6, offsets 0, 4, 8 */
+		ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_DATA_4_7);
+		if (ret < 0)
+			return ret;
+
+		for (i = 4; i < 7; ++i)
+			entry->tags[i] = (ret >> ((i - 4) * 4)) & 3;
+
+		ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_FID);
+		if (ret < 0)
+			return ret;
+
+		entry->fid = ret & GLOBAL_VTU_FID_MASK;
+
+		ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_SID);
+		if (ret < 0)
+			return ret;
+
+		entry->sid = ret & GLOBAL_VTU_SID_MASK;
+	}
+
+	return 0;
+}
+
+static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
+				    struct mv88e6xxx_vtu_entry *entry)
+{
+	u16 data = 0;
+	int ret, i;
+
+	ret = _mv88e6xxx_vtu_wait(ds);
+	if (ret < 0)
+		return ret;
+
+	if (entry->valid) {
+		/* Set Data Register, ports 0-3, offsets 0, 4, 8, 12 */
+		for (data = i = 0; i < 4; ++i)
+			data |= entry->tags[i] << (i * 4);
+		ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3,
+					   data);
+		if (ret < 0)
+			return ret;
+
+   /* Set

[PATCH 1/1] hv_netvsc: Allocate the receive buffer from the correct NUMA node

2015-05-28 Thread K. Y. Srinivasan

Allocate the receive buffer from the NUMA node assigned to the primary
channel.

Signed-off-by: K. Y. Srinivasan k...@microsoft.com
---
 drivers/net/hyperv/netvsc.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index b024968..d187965 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -227,13 +227,18 @@ static int netvsc_init_buf(struct hv_device *device)
 	struct netvsc_device *net_device;
 	struct nvsp_message *init_packet;
 	struct net_device *ndev;
+	int node;
 
 	net_device = get_outbound_net_device(device);
 	if (!net_device)
 		return -ENODEV;
 	ndev = net_device->ndev;
 
-	net_device->recv_buf = vzalloc(net_device->recv_buf_size);
+	node = cpu_to_node(device->channel->target_cpu);
+	net_device->recv_buf = vzalloc_node(net_device->recv_buf_size, node);
+	if (!net_device->recv_buf)
+		net_device->recv_buf = vzalloc(net_device->recv_buf_size);
+
 	if (!net_device->recv_buf) {
 		netdev_err(ndev, "unable to allocate receive "
 			"buffer of size %d\n", net_device->recv_buf_size);
-- 
1.7.4.1


[PATCH net 3/3] bna: fix soft lock-up during firmware initialization failure

2015-05-28 Thread Ivan Vecera

A bug in the driver initialization causes a soft lockup when the firmware
initialization timeout is reached. The polling function
bfa_ioc_poll_fwinit() incorrectly calls bfa_nw_iocpf_timeout() when the
timeout is reached. The problem is that bfa_nw_iocpf_timeout() in turn
calls bfa_ioc_poll_fwinit() again, and so on. bfa_ioc_poll_fwinit() should
instead send the timeout event to iocpf directly, and the same should be
done if the firmware download into the HW fails.
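
The control flow after the fix can be modelled with a small standalone
sketch. Everything here (struct ioc_model, the state names, the MODEL_*
constants) is hypothetical and only mirrors the idea; the driver itself
delivers the event with bfa_fsm_send_event() and IOCPF_E_TIMEOUT.

```c
enum iocpf_state { IOCPF_POLLING, IOCPF_FAILED };

struct ioc_model {
	enum iocpf_state state;
	unsigned int poll_time;   /* elapsed polling time */
	unsigned int poll_calls;  /* how often the poll function ran */
};

#define MODEL_TOV      3000u  /* hypothetical overall timeout */
#define MODEL_POLL_TOV 200u   /* hypothetical polling interval */

/* Deliver the timeout event straight to the state machine. */
void model_send_timeout_event(struct ioc_model *ioc)
{
	ioc->state = IOCPF_FAILED;
}

/* Poll step: on timeout, send the event directly instead of calling a
 * timer handler that would call this function again. */
void model_poll_fwinit(struct ioc_model *ioc)
{
	ioc->poll_calls++;
	if (ioc->poll_time >= MODEL_TOV)
		model_send_timeout_event(ioc);
	else
		ioc->poll_time += MODEL_POLL_TOV; /* real code re-arms a timer */
}
```

Because the timeout branch no longer re-enters the poll function, each
call terminates after a single step.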

Cc: Rasesh Mody rasesh.m...@qlogic.com
Signed-off-by: Ivan Vecera ivec...@redhat.com
---
 drivers/net/ethernet/brocade/bna/bfa_ioc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/brocade/bna/bfa_ioc.c 
b/drivers/net/ethernet/brocade/bna/bfa_ioc.c
index 594a2ab..68f3c13 100644
--- a/drivers/net/ethernet/brocade/bna/bfa_ioc.c
+++ b/drivers/net/ethernet/brocade/bna/bfa_ioc.c
@@ -2414,7 +2414,7 @@ bfa_ioc_boot(struct bfa_ioc *ioc, enum bfi_fwboot_type 
boot_type,
 	if (status == BFA_STATUS_OK)
 		bfa_ioc_lpu_start(ioc);
 	else
-		bfa_nw_iocpf_timeout(ioc);
+		bfa_fsm_send_event(&ioc->iocpf, IOCPF_E_TIMEOUT);
 
 	return status;
 }
@@ -3029,7 +3029,7 @@ bfa_ioc_poll_fwinit(struct bfa_ioc *ioc)
 	}
 
 	if (ioc->iocpf.poll_time >= BFA_IOC_TOV) {
-		bfa_nw_iocpf_timeout(ioc);
+		bfa_fsm_send_event(&ioc->iocpf, IOCPF_E_TIMEOUT);
 	} else {
 		ioc->iocpf.poll_time += BFA_IOC_POLL_TOV;
 		mod_timer(&ioc->iocpf_timer, jiffies +
-- 
2.3.6


[PATCH net 2/3] bna: remove unreasonable iocpf timer start

2015-05-28 Thread Ivan Vecera

The driver starts the iocpf timer prior to the bnad_ioceth_enable() call,
and this is unreasonable. This piece of code probably originates from the
Brocade/Qlogic out-of-box driver and came in during the initial import into
upstream. That driver uses only one timer and a queue to implement multiple
timers, and that timer is started at this place. The upstream driver uses
multiple timers instead.

Cc: Rasesh Mody rasesh.m...@qlogic.com
Signed-off-by: Ivan Vecera ivec...@redhat.com
---
 drivers/net/ethernet/brocade/bna/bnad.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/brocade/bna/bnad.c 
b/drivers/net/ethernet/brocade/bna/bnad.c
index 37072a8..caae6cb 100644
--- a/drivers/net/ethernet/brocade/bna/bnad.c
+++ b/drivers/net/ethernet/brocade/bna/bnad.c
@@ -3701,10 +3701,6 @@ bnad_pci_probe(struct pci_dev *pdev,
 	setup_timer(&bnad->bna.ioceth.ioc.sem_timer, bnad_iocpf_sem_timeout,
 				((unsigned long)bnad));
 
-	/* Now start the timer before calling IOC */
-	mod_timer(&bnad->bna.ioceth.ioc.iocpf_timer,
-		  jiffies + msecs_to_jiffies(BNA_IOC_TIMER_FREQ));
-
 	/*
 	 * Start the chip
 	 * If the call back comes with error, we bail out.
-- 
2.3.6


[RFC 3/3] net: dsa: mv88e6352: add support for VLAN

2015-05-28 Thread Vivien Didelot

This commit adds support for the VTU operations to the mv88e6352 driver.

Signed-off-by: Vivien Didelot vivien.dide...@savoirfairelinux.com
---
 drivers/net/dsa/mv88e6352.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 8b0d54f..8396a2e 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -554,6 +554,9 @@ struct dsa_switch_driver mv88e6352_switch_driver = {
 	.fdb_add		= mv88e6xxx_port_fdb_add,
 	.fdb_del		= mv88e6xxx_port_fdb_del,
 	.fdb_getnext		= mv88e6xxx_port_fdb_getnext,
+	.port_vlan_add		= mv88e6xxx_port_vlan_add,
+	.port_vlan_kill		= mv88e6xxx_port_vlan_kill,
+	.port_bridge_setlink	= mv88e6xxx_port_bridge_setlink,
 };
 
 MODULE_ALIAS("platform:mv88e6352");
-- 
2.4.1


[RFC 0/3] DSA and Marvell 88E6352 802.1q support

2015-05-28 Thread Vivien Didelot

This RFC is based on v4.1-rc3.

It is meant to give a glance at the commits that implement the
necessary NDOs between DSA and the Marvell 88E6352 switch driver.

With this support, I am able to create VLANs with (un)tagged ports, setting
their default VID, from a bridge.

To create a bridge containing all switch ports, with a VLAN ID 400, swp2 and
swp3 untagged (pvid), and swp4 tagged, the userspace commands look like this:

ip link add name br0 type bridge
[...]
ip link set dev swp2 up master br0
[...]
bridge vlan add vid 400 pvid untagged dev swp2
bridge vlan add vid 400 pvid untagged dev swp3
bridge vlan add vid 400 dev swp4
[...]
ip link add link br0 name br0.400 type vlan id 400
[...]
bridge vlan add dev br0 vid 400 self

The code is currently being rebased to the latest net-next/master.

Seems like the way to go now is through switchdev attr getter/setter...

Vivien Didelot (3):
  net: dsa: add basic support for VLAN ndo
  net: dsa: mv88e6xxx: add support for VTU operations
  net: dsa: mv88e6352: add support for VLAN

 drivers/net/dsa/mv88e6352.c |   3 +
 drivers/net/dsa/mv88e6xxx.c | 309 
 drivers/net/dsa/mv88e6xxx.h |  28 
 include/net/dsa.h   |   9 ++
 net/dsa/slave.c |  76 ++-
 5 files changed, 423 insertions(+), 2 deletions(-)

-- 
2.4.1


RE: [PATCH 1/1] hv_netvsc: Allocate the receive buffer from the correct NUMA node

2015-05-28 Thread KY Srinivasan



 -Original Message-
 From: K. Y. Srinivasan [mailto:k...@microsoft.com]
 Sent: Thursday, May 28, 2015 2:56 PM
 To: da...@davemloft.net; netdev@vger.kernel.org; linux-
 ker...@vger.kernel.org; de...@linuxdriverproject.org; o...@aepfle.de;
 a...@canonical.com; jasow...@redhat.com
 Cc: KY Srinivasan
 Subject: [PATCH 1/1] hv_netvsc: Allocate the receive buffer from the correct
 NUMA node
 
 Allocate the receive buffer from the NUMA node assigned to the primary
 channel.
 
 Signed-off-by: K. Y. Srinivasan k...@microsoft.com
 ---
  drivers/net/hyperv/netvsc.c |7 ++-
  1 files changed, 6 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
 index b024968..d187965 100644
 --- a/drivers/net/hyperv/netvsc.c
 +++ b/drivers/net/hyperv/netvsc.c
 @@ -227,13 +227,18 @@ static int netvsc_init_buf(struct hv_device *device)
  	struct netvsc_device *net_device;
  	struct nvsp_message *init_packet;
  	struct net_device *ndev;
 +	int node;
 
  	net_device = get_outbound_net_device(device);
  	if (!net_device)
  		return -ENODEV;
  	ndev = net_device->ndev;
 
 -	net_device->recv_buf = vzalloc(net_device->recv_buf_size);
 +	node = cpu_to_node(device->channel->target_cpu);
 +	net_device->recv_buf = vzalloc_node(net_device->recv_buf_size, node);
 +	if (!net_device->recv_buf)
 +		net_device->recv_buf = vzalloc(net_device->recv_buf_size);
 +
  	if (!net_device->recv_buf) {
  		netdev_err(ndev, "unable to allocate receive "
  			"buffer of size %d\n", net_device->recv_buf_size);
 --
 1.7.4.1

David,

Please drop this patch; I am going to resend this with another patch.

Regards,

K. Y

[PATCH 5/9] wl1251: drop unneeded goto

2015-05-28 Thread Julia Lawall

From: Julia Lawall julia.law...@lip6.fr

Delete jump to a label on the next line, when that label is not
used elsewhere.

A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier l;
@@

-if (...) goto l;
-l:
// </smpl>

Signed-off-by: Julia Lawall julia.law...@lip6.fr

---
 drivers/net/wireless/ti/wl1251/acx.c |3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/wireless/ti/wl1251/acx.c 
b/drivers/net/wireless/ti/wl1251/acx.c
index 5695628..d6fbdda 100644
--- a/drivers/net/wireless/ti/wl1251/acx.c
+++ b/drivers/net/wireless/ti/wl1251/acx.c
@@ -53,10 +53,7 @@ int wl1251_acx_station_id(struct wl1251 *wl)
 		mac->mac[i] = wl->mac_addr[ETH_ALEN - 1 - i];
 
 	ret = wl1251_cmd_configure(wl, DOT11_STATION_ID, mac, sizeof(*mac));
-	if (ret < 0)
-		goto out;
 
-out:
 	kfree(mac);
 	return ret;
 }


Re: Possible issue in iproute2 package

2015-05-28 Thread Daniel Borkmann


Hi Jose,

thanks for your report!

On 05/28/2015 11:12 PM, Guzman Mosqueda, Jose R wrote:
...

We're using iproute2 in a GNU-Linux project and I'm analyzing the code
to try to find possible issues/gaps/risks.
Since I'm not too familiar with the package yet I have a question about
a particular piece of code that could result in a memory corruption:

Version: 4.0.0
File: misc/ss.c
Function: static void tcp_show_info(...)
Line: ~1903
Description: There is a memory allocation for a s.cong_alg variable:
s.cong_alg = malloc(strlen(cong_attr + 1));
The length is calculated about next position of the starting character.
But next line there is a copy of the whole content:
strcpy(s.cong_alg, cong_attr);
I think there is a mistake and it should be something like:
s.cong_alg = malloc(strlen(cong_attr) + 1);
Is this the case? Is it a real bug?
Also I don't see any checking for the value returned by the malloc call,
what if it returns a NULL pointer?
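
To make the difference concrete: strlen(cong_attr + 1) measures the string
starting at its second character, so for a name like "cubic" it yields 4,
while the following strcpy() needs 6 bytes including the terminating NUL.
A minimal standalone sketch of the sizing the report proposes, with the
NULL check added; dup_string() is a hypothetical helper, not code from
ss.c:

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate a string, sizing the buffer as strlen(src) + 1 so the
 * terminating NUL fits, and reporting allocation failure to the caller. */
char *dup_string(const char *src)
{
	size_t size = strlen(src) + 1; /* not strlen(src + 1) */
	char *copy = malloc(size);

	if (!copy)
		return NULL;
	memcpy(copy, src, size);
	return copy;
}
```

The caller owns the returned buffer and is expected to free() it.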


Cc'ing Vadim for ...

commit 8250bc9ff4e55a3ef397ed8c7612f1392d164295
Author: Vadim Kochan vadi...@gmail.com
Date:   Tue Jan 20 16:14:24 2015 +0200

ss: Unify inet sockets output

Signed-off-by: Vadim Kochan vadi...@gmail.com


Also I found something similar about line 1903:
s.cong_alg = malloc(strlen(cong_attr + 1));
strcpy(s.cong_alg, cong_attr);

And another possible issue that I found:

File: tc/tc_util.c
Function: void print_rate(char *buf, int len, __u64 rate)
Line: ~264

In the case that user inputs a high value for rate, the for loop will
exit in the condition meaning that variable i get the value of 5 which
will be an invalid index for the units array due to that array has
only 5 elements.
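
One way to rule out such an out-of-range index is to stop scaling at the
last valid unit. The sketch below is a simplified, hypothetical
format_rate() written for illustration; it is not the print_rate() code
from tc_util.c and it truncates rather than rounds.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static const char *const rate_units[] = { "", "K", "M", "G", "T" };
#define N_RATE_UNITS (sizeof(rate_units) / sizeof(rate_units[0]))

/* Scale the rate by 1000 per step, but never advance past the last
 * entry of rate_units[], so the index stays within 0..4. */
int format_rate(char *buf, size_t len, uint64_t bits_per_sec)
{
	size_t i = 0;

	while (i < N_RATE_UNITS - 1 && bits_per_sec >= 1000) {
		bits_per_sec /= 1000;
		i++;
	}
	return snprintf(buf, len, "%llu%sbit",
			(unsigned long long)bits_per_sec, rate_units[i]);
}
```

With this bound, even the largest 64-bit input ends up in the "T" unit
instead of indexing one element past the array.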

I hope you can help me by checking these issues and tell me whether they
are real issues or not since you know much better the code.
Also I don't know if you have already this reported, I didn't find a
list of issues for this package. Can you tell me where is such list?

I really appreciate any help on this.

Thanks in advance.
Jose G.











[PATCH net-next 0/3] net: systemport: misc improvements

2015-05-28 Thread Florian Fainelli

Hi David,

These patches are highly inspired by changes from Petri on bcmgenet. The last
patch is a misc fix that I have had pending for a while, but it is not a
candidate for 'net' at this point.

Thanks!

Florian Fainelli (3):
  net: systemport: Pre-calculate and utilize cb->bd_addr
  net: systemport: rewrite bcm_sysport_rx_refill
  net: systemport: Add a check for oversized packets

 drivers/net/ethernet/broadcom/bcmsysport.c | 107 -
 drivers/net/ethernet/broadcom/bcmsysport.h |   2 -
 2 files changed, 58 insertions(+), 51 deletions(-)

-- 
2.1.0


RE: Request for advice on where to put Root Complex fix up code for downstream device

2015-05-28 Thread Casey Leedom

| From: Casey Leedom [lee...@chelsio.com]
| Sent: Thursday, May 07, 2015 4:31 PM
| 
| | From: Bjorn Helgaas [bhelg...@google.com]
| | Sent: Thursday, May 07, 2015 4:04 PM
| |
| | There are a lot of fixups in drivers/pci/quirks.c.  For things that have to
| | be worked around either before a driver claims the device or if there is no
| | driver at all, the fixup *has* to go in drivers/pci/quirks.c
| |
| | But for things like this, where the problem can only occur after a driver
| | claims the device, I think it makes more sense to put the fixup in the
| | driver itself.  The only wrinkle here is that the fixup has to be done on a
| | separate device, not the device claimed by the driver.  But I think it
| | probably still makes sense to put this fixup in the driver.
| ...
|   One complication to doing this in cxgb4 is that it attaches to Physical
| Function 4 of our T5 chip.  Meanwhile, a completely separate storage
| driver, csiostor, connects to PF5 and PF6 and there's no
| requirement at all that cxgb4 be loaded.  So if we go down the road of
| putting the fixup code in the cxgb4 driver, we'll also need to duplicate
| that code in the csiostor driver.

  I never heard back on this issue of needing to put the Root Complex fixup 
code in two different drivers -- cxgb4 and csiostor -- if we don't go down the 
path of using a PCI Quirk.  I'm happy doing either and have verified both 
solutions locally.  I'd just like to get a judgement call on this.

  It comes down to adding ~30 lines to

drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
drivers/scsi/csiostor/csio_init.c

or ~30 lines to

drivers/pci/quirks.c

| | Can you include a pointer to the relevant part of the spec?
| 
|   Sure:
| 
| 2.2.9. Completion Rules
| ...
| Completion headers must supply the same values for
| the Attribute as were supplied in the 20 header of
| the corresponding Request, except as explicitly
| allowed when IDO is used (see Section 2.2.6.4).
| ...
| 2.3.2. Completion Handling Rules
| ...
| If a received Completion matches the Transaction ID
| of an outstanding Request, but in some other way
| does not match the corresponding Request (e.g., a
| problem with Attributes, Traffic Class, Byte Count,
| Lower Address, etc), it is strongly recommended for
| the Receiver to handle the Completion as a Malformed
| TLP. However, if the Completion is otherwise properly
| formed, it is permitted[22] for the Receiver to
| handle the Completion as an Unexpected Completion.

| | Can you use pci_upstream_bridge() here?  There are a couple places where we
| | want to find the Root Port, so we might factor that out someday.  It'll be
| | easier to find all those places if they use pci_upstream_bridge().
| 
| It looks like pci_upstream_bridge() just traverses one link upstream toward
| the Root Complex?  Or am I misunderstanding that function?

[PATCH 2/9] ipv6: drop unneeded goto

2015-05-28 Thread Julia Lawall

From: Julia Lawall julia.law...@lip6.fr

Delete jump to a label on the next line, when that label is not
used elsewhere.

A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)

// smpl
@r@
identifier l;
@@

-if (...) goto l;
-l:
// /smpl

Also remove the unnecessary ret variable.

Signed-off-by: Julia Lawall julia.law...@lip6.fr

---
 net/ipv6/raw.c |8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 484a5c1..ca4700c 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1327,13 +1327,7 @@ static struct inet_protosw rawv6_protosw = {
 
 int __init rawv6_init(void)
 {
-	int ret;
-
-	ret = inet6_register_protosw(&rawv6_protosw);
-	if (ret)
-		goto out;
-out:
-	return ret;
+	return inet6_register_protosw(&rawv6_protosw);
 }
 
 void rawv6_exit(void)


Possible issue in iproute2 package

2015-05-28 Thread Guzman Mosqueda, Jose R


Hi all

I'm Jose Guzman from a security team at Intel.
We're using iproute2 in a GNU-Linux project and I'm analyzing the code
to try to find possible issues/gaps/risks.
Since I'm not too familiar with the package yet I have a question about
a particular piece of code that could result in a memory corruption:

Version: 4.0.0
File: misc/ss.c
Function: static void tcp_show_info(...)
Line: ~1903
Description: There is a memory allocation for a s.cong_alg variable:
s.cong_alg = malloc(strlen(cong_attr + 1));
The length is computed starting from the second character of the string.
But the next line copies the whole content:
strcpy(s.cong_alg, cong_attr);
I think there is a mistake and it should be something like:
s.cong_alg = malloc(strlen(cong_attr) + 1);
Is this the case? Is it a real bug?
Also I don't see any checking for the value returned by the malloc call,
what if it returns a NULL pointer?
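
To make the difference concrete: for a string such as "cubic",
strlen(cong_attr + 1) measures "ubic" and yields 4, while strcpy() writes
6 bytes (5 characters plus the terminating NUL). The following is only a
minimal sketch of the allocation pattern I would expect; dup_cong_name()
is a hypothetical helper name of mine and is not part of iproute2.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not iproute2 code): duplicate a C string,
 * reserving room for the terminating NUL and checking malloc(). */
char *dup_cong_name(const char *src)
{
	size_t len = strlen(src);	/* characters, excluding the NUL */
	char *dst = malloc(len + 1);	/* the +1 goes outside strlen() */

	if (!dst)
		return NULL;		/* caller must handle the failure */

	memcpy(dst, src, len + 1);	/* copies the NUL as well */
	return dst;
}
```

A caller would test the result for NULL before using it and free() it when
done.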

Also I found something similar about line 1903:
s.cong_alg = malloc(strlen(cong_attr + 1));
strcpy(s.cong_alg, cong_attr);

And another possible issue that I found:

File: tc/tc_util.c
Function: void print_rate(char *buf, int len, __u64 rate)
Line: ~264

If the user inputs a high value for rate, the for loop runs to
completion and exits on its loop condition, leaving the variable i with
the value 5. That is an invalid index for the units array, since the
array has only 5 elements (valid indices 0 to 4).
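
One way to keep the index in range is to stop the loop at the last valid
element. The sketch below is a simplified stand-in that I wrote for
illustration: format_rate() is a hypothetical name, it always divides by
1000 and truncates, and it is not the actual print_rate() from tc_util.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical simplified formatter, not the real tc_util.c code. */
void format_rate(char *buf, size_t len, uint64_t rate)
{
	static const char *units[] = { "", "K", "M", "G", "T" };
	const uint64_t kilo = 1000;
	size_t i;

	/* Stop at the last valid index so units[i] is always in bounds. */
	for (i = 0; i < ARRAY_SIZE(units) - 1; i++) {
		if (rate < kilo)
			break;
		rate /= kilo;
	}
	snprintf(buf, len, "%llu%sbit", (unsigned long long)rate, units[i]);
}
```

With this bound, even the largest 64-bit value ends up at index 4 ("T")
instead of running one past the end of the array.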

I hope you can help me by checking these issues and telling me whether
they are real, since you know the code much better.
Also, I don't know if these have already been reported; I couldn't find
a list of issues for this package. Can you tell me where such a list is?

I really appreciate any help on this.

Thanks in advance.
Jose G.

[PATCH 0/9] drop unneeded goto

2015-05-28 Thread Julia Lawall

These patches drop gotos that jump to a label that is at the next
instruction, in the case that the label is not used elsewhere in the
function.  The complete semantic patch that performs this transformation is
as follows:

// smpl
@r@
position p;
identifier l;
@@

if (...) goto l@p;
l:

@script:ocaml s@
p << r.p;
nm;
@@

nm := (List.hd p).current_element

@ok exists@
identifier s.nm,l;
position p != r.p;
@@

nm(...) {
<+... goto l@p; ...+>
}

@depends on !ok@
identifier s.nm;
position r.p;
identifier l;
@@

nm(...) {
...
- if(...) goto l@p; l:
...
}
// /smpl


Re: linux-next: build failure after merge of most of the trees

2015-05-28 Thread Eric Dumazet

On Thu, 2015-05-28 at 14:35 -0700, David Miller wrote:

 Bogus chunk in my local tree, didn't make it into the final commit I
 pushed out.

Thanks for taking care of this before me !




Re: [PATCH net v2] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware

2015-05-28 Thread Andy Gospodarek

On Thu, May 28, 2015 at 08:40:11AM -0700, Scott Feldman wrote:
 On Thu, May 28, 2015 at 2:42 AM, Jiri Pirko j...@resnulli.us wrote:
  Mon, May 18, 2015 at 10:19:16PM CEST, da...@davemloft.net wrote:
 From: Roopa Prabhu ro...@cumulusnetworks.com
 Date: Sun, 17 May 2015 16:42:05 -0700

  On most systems where you can offload routes to hardware,
  doing routing in software is not an option (the cpu limitations
  make routing impossible in software).

 You absolutely do not get to determine this policy, none of us
 do.

 What matters is that by default the damn switch device being there
 is 100% transparent to the user.

 And the way to achieve that default is to do software routes as
 a fallback.

 I am not going to entertain changes of this nature which fail
 route loading by default just because we've exceeded a device's
 HW capacity to offload.

 I thought I was _really_ clear about this at netdev 0.1

  I certainly agree that by default, transparency 1:1 sw:hw mapping is
  what we need for fib. The current code is a good start!

  I see couple of issues regarding switchdev_fib_ipv4_abort:
  1) If user adds an entry, switchdev_fib_ipv4_add fails, abort is
 executed - and, error returned. I would expect that route entry should
 be added in this case. The next attempt of adding the same entry will
 be successful.
 The current behaviour breaks the transparency you are referring to.
  2) When switchdev_fib_ipv4_abort happens to be executed, the offload is
 disabled for good (until reboot). That is certainly not nice, although
 I understand that is the easiest solution for now.

  I believe that we all agree that the 1:1 transparency, although it is a
  default, may not be optimal for real-life usage. HW resources are
  limited and user does not know them. The danger of hitting _abort and
  screwing-up the whole system is huge, unacceptable.

  So here, there are couple of more or less simple things that I suggest to
  do in order to move a little bit forward:
  1) Introduce system-wide option to switch _abort to just plain fail.
 When HW does not have capacity, do not flush and fallback to sw, but
 rather just fail to add the entry. This would not break anything.
 Userspace has to be prepared that entry add could fail.
  2) Introduce a way to propagate resources to userspace. Driver knows about
 resources used/available/potentially_available. Switchdev infra could
 be extended in order to propagate the info to the user.
  3) Introduce couple of flags for entry add that would alter the default
 behaviour. Something like:
  NLM_F_SKIP_KERNEL
  NLM_F_SKIP_OFFLOAD
 Again, this does not break the current users. On the other hand, this
 gives new users a leverage to instruct kernel where the entry should
 be added to (or not added to).

  Any thoughts? Objections?

 I don't like these.  Breaks transparency and forces the user in a
 position of having to know hardware failure modes (unique to each
 hardware device).  I presented an option d) which avoids these issues;
 was it not understood?

I actually really like the way Jiri succinctly covered the different
cases to move us forward from what we have today (Thanks, Jiri!).  I
completely agree with you on both of your problem statements and the
idea that what we have is fine for the short-term.  I see definite room to
improve the user experience available via upstream kernels.

Option 1 has appeal since userspace applications that control FDB, FIB,
etc entries could work without modification (when in this mode, the
kernel could choose to ignore any NLM_F_* flags Jiri proposed), but I
agree that a system-wide (or maybe offload-device-wide?) configuration
option needs to exist as this should not be the default behavior. 

Option 2 could also work as userspace applications could query for
space availability before attempting to add a route.  This could be
nice during bootup as then apps could periodically double check that
their view of the world is accurate.

Option 3 also has appeal since there exists the ability to allow
fine-grained control from userspace applications since less used routes
(or routes that could be summarized) could be combined in userspace if
needed.

The great part about all suggestions is that when combined they can
provide a great user experience, but doing all 3 at once is probably too
aggressive.  My vote would be to see if we can work together on a
combination of Option 1 and 3 together as they seem to provide a great
first start to this...
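
To make the combination of Option 1 and Option 3 a bit more concrete,
here is a rough userspace sketch of the placement decision. Everything in
it is an assumption for illustration: NLM_F_SKIP_KERNEL and
NLM_F_SKIP_OFFLOAD are only proposed names in this thread (the values
below are made up), fail_on_hw_full stands in for the system-wide knob,
and place_route() is not an existing kernel function. It models a single
entry only and ignores the flush that the current abort path performs.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical values: these flags were only proposed in this thread. */
#define NLM_F_SKIP_KERNEL	0x1u
#define NLM_F_SKIP_OFFLOAD	0x2u

struct route_placement {
	bool in_kernel;	/* entry installed in the software FIB */
	bool in_hw;	/* entry offloaded to the device */
	int err;	/* 0 or a negative errno */
};

/* fail_on_hw_full models the system-wide knob from Option 1. */
struct route_placement place_route(unsigned int flags, bool hw_has_room,
				   bool fail_on_hw_full)
{
	struct route_placement r = { false, false, 0 };
	bool want_kernel = !(flags & NLM_F_SKIP_KERNEL);
	bool want_hw = !(flags & NLM_F_SKIP_OFFLOAD);

	if (!want_kernel && !want_hw) {
		r.err = -EINVAL;	/* nowhere to put the entry */
		return r;
	}

	if (want_hw && !hw_has_room) {
		if (fail_on_hw_full || !want_kernel) {
			r.err = -ENOSPC;	/* plain failure, nothing installed */
			return r;
		}
		want_hw = false;	/* default: fall back to software */
	}

	r.in_kernel = want_kernel;
	r.in_hw = want_hw;
	return r;
}
```

With the knob off and no flags set, a full table still yields a
software-only route, which keeps the default transparent; with the knob
on, the add simply fails and userspace decides what to do next.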

If an application tried to add a route (called A) to the route table
in the kernel and code to support Option 1 existed (similar to what
Roopa posted to start this series) then the kernel could fail to add
route A.  

If the user noted that some other route (called B) was lower priority
for _any_ reason, the user could delete route B from the kernel and
hardware and add route A to hardware and kernel.  Then the

Re: linux-next: build failure after merge of most of the trees

2015-05-28 Thread David Miller

From: Joe Perches j...@perches.com
Date: Thu, 28 May 2015 11:51:15 -0700

 On Thu, 2015-05-28 at 11:42 -0700, David Miller wrote:
 I've applied the following to net-next, thanks for your report.
 
 
 [PATCH] treewide: Add missing vmalloc.h inclusion.
 
 All of these files were only building on non-x86 because of
 the indirect of inclusion of vmalloc.h by, of all things,
 net/inet_hashtables.h
 []
 diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
 []
 @@ -444,6 +444,7 @@ static int skcipher_recvmsg(struct kiocb *unused, struct socket *sock,
  err = skcipher_wait_for_data(sk, flags);
  if (err)
  goto unlock;
 +used = ctx->used;
 
 huh?

Bogus chunk in my local tree, didn't make it into the final commit I
pushed out.

But thanks for noticing.

Re: [PATCH v4 net-next 05/11] net: Add full IPv6 addresses to flow_keys

2015-05-28 Thread Eric Dumazet

On Thu, 2015-05-28 at 11:19 -0700, Tom Herbert wrote:

 @@ -566,11 +640,15 @@ static const struct flow_dissector_key flow_keys_dissector_keys[] = {
   },
   {
   .key_id = FLOW_DISSECTOR_KEY_IPV4_ADDRS,
 - .offset = offsetof(struct flow_keys, addrs),
 + .offset = offsetof(struct flow_keys, addrs.v4addrs),
 + },
 + {
 + .key_id = FLOW_DISSECTOR_KEY_IPV6_ADDRS,
 + .offset = offsetof(struct flow_keys, addrs.v6addrs),
   },
   {
   .key_id = FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS,
 - .offset = offsetof(struct flow_keys, addrs),
 + .offset = offsetof(struct flow_keys, addrs.v4addrs),

Shouldn't it be offsetof(struct flow_keys, addrs.v6addrs), ?




[PATCH net-next 1/3] net: systemport: Pre-calculate and utilize cb->bd_addr

2015-05-28 Thread Florian Fainelli

There is a 1:1 mapping between the software maintained control block in
priv->rx_cbs and the buffer address in priv->rx_bds, such that there is
no need to keep computing the buffer address when refilling a control
block.
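
As a rough userspace illustration of the idea (a sketch only, not the
driver code): each control block is bound to its descriptor address once
at ring initialisation, so the refill path can simply read cb->bd_addr.
The struct layout, the DESC_SIZE value and the init_ring() name below are
assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>

#define DESC_SIZE 8	/* assumed descriptor size in bytes */

struct ctrl_block {
	void *buf;		/* stand-in for the skb */
	unsigned char *bd_addr;	/* address of the matching descriptor */
};

/* Allocate the control blocks and bind cbs[i] to descriptor i once. */
struct ctrl_block *init_ring(unsigned char *bds, size_t num_bds)
{
	struct ctrl_block *cbs = calloc(num_bds, sizeof(*cbs));
	size_t i;

	if (!cbs)
		return NULL;

	for (i = 0; i < num_bds; i++)
		cbs[i].bd_addr = bds + i * DESC_SIZE;

	return cbs;
}
```

After this, no running "assign" pointer or index has to be kept in sync
with the ring.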

Signed-off-by: Florian Fainelli f.faine...@gmail.com
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 18 +-
 drivers/net/ethernet/broadcom/bcmsysport.h |  2 --
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 084a50a555de..267330ccd595 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -549,12 +549,7 @@ static int bcm_sysport_rx_refill(struct bcm_sysport_priv *priv,
 	}
 
 	dma_unmap_addr_set(cb, dma_addr, mapping);
-	dma_desc_set_addr(priv, priv->rx_bd_assign_ptr, mapping);
-
-	priv->rx_bd_assign_index++;
-	priv->rx_bd_assign_index &= (priv->num_rx_bds - 1);
-	priv->rx_bd_assign_ptr = priv->rx_bds +
-		(priv->rx_bd_assign_index * DESC_SIZE);
+	dma_desc_set_addr(priv, cb->bd_addr, mapping);
 
 	netif_dbg(priv, rx_status, ndev, "RX refill\n");
 
@@ -568,7 +563,7 @@ static int bcm_sysport_alloc_rx_bufs(struct bcm_sysport_priv *priv)
 	unsigned int i;
 
 	for (i = 0; i < priv->num_rx_bds; i++) {
-		cb = &priv->rx_cbs[priv->rx_bd_assign_index];
+		cb = &priv->rx_cbs[i];
 		if (cb->skb)
 			continue;
 
@@ -1330,14 +1325,14 @@ static inline int tdma_enable_set(struct bcm_sysport_priv *priv,
 
 static int bcm_sysport_init_rx_ring(struct bcm_sysport_priv *priv)
 {
+	struct bcm_sysport_cb *cb;
 	u32 reg;
 	int ret;
+	int i;
 
 	/* Initialize SW view of the RX ring */
 	priv->num_rx_bds = NUM_RX_DESC;
 	priv->rx_bds = priv->base + SYS_PORT_RDMA_OFFSET;
-	priv->rx_bd_assign_ptr = priv->rx_bds;
-	priv->rx_bd_assign_index = 0;
 	priv->rx_c_index = 0;
 	priv->rx_read_ptr = 0;
 	priv->rx_cbs = kcalloc(priv->num_rx_bds, sizeof(struct bcm_sysport_cb),
@@ -1347,6 +1342,11 @@ static int bcm_sysport_init_rx_ring(struct bcm_sysport_priv *priv)
 		return -ENOMEM;
 	}
 
+	for (i = 0; i < priv->num_rx_bds; i++) {
+		cb = priv->rx_cbs + i;
+		cb->bd_addr = priv->rx_bds + i * DESC_SIZE;
+	}
+
 	ret = bcm_sysport_alloc_rx_bufs(priv);
 	if (ret) {
 		netif_err(priv, hw, priv->netdev, "SKB allocation failed\n");
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.h b/drivers/net/ethernet/broadcom/bcmsysport.h
index 42a4b4a0bc14..f28bf545d7f4 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.h
+++ b/drivers/net/ethernet/broadcom/bcmsysport.h
@@ -663,8 +663,6 @@ struct bcm_sysport_priv {
 
 	/* Receive queue */
 	void __iomem		*rx_bds;
-	void __iomem		*rx_bd_assign_ptr;
-	unsigned int		rx_bd_assign_index;
 	struct bcm_sysport_cb	*rx_cbs;
 	unsigned int		num_rx_bds;
 	unsigned int		rx_read_ptr;
-- 
2.1.0


[PATCH net-next 2/3] net: systemport: rewrite bcm_sysport_rx_refill

2015-05-28 Thread Florian Fainelli

Currently, bcm_sysport_desc_rx() calls bcm_sysport_rx_refill() at the end of Rx
packet processing loop, after the current Rx packet has already been passed to
napi_gro_receive(). However, bcm_sysport_rx_refill() might fail to allocate a
new Rx skb, thus leaving a hole on the Rx queue where no valid Rx buffer exists.

To eliminate this situation:

1. Rewrite bcm_sysport_rx_refill() to retain the current Rx skb on the
Rx queue if a new replacement Rx skb can't be allocated and DMA-mapped.
In this case, the data on the current Rx skb is effectively dropped.

2. Modify bcm_sysport_desc_rx() to call bcm_sysport_rx_refill() at the
top of Rx packet processing loop, so that the new replacement Rx skb is
already in place before the current Rx skb is processed.

This is loosely inspired by d6707bec5986 ("net: bcmgenet: rewrite
bcmgenet_rx_refill()")
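
The "allocate the replacement first, then swap" pattern described above
can be sketched in plain userspace C as follows. This is only an analogy
under my own assumptions: struct ring_slot, alloc_fn, failing_alloc() and
slot_refill() are made-up names, and malloc() stands in for skb
allocation plus DMA mapping.

```c
#include <stddef.h>
#include <stdlib.h>

struct ring_slot {
	void *buf;	/* buffer currently posted on the ring, or NULL */
};

typedef void *(*alloc_fn)(size_t);

/* Allocator that always fails, to exercise the error path. */
void *failing_alloc(size_t len)
{
	(void)len;
	return NULL;
}

/*
 * Try to post a fresh buffer in the slot. On success the previous
 * buffer (possibly NULL for an empty slot) is handed back to the
 * caller. On allocation failure the slot is left untouched and NULL
 * is returned, so the ring never ends up with a hole.
 */
void *slot_refill(struct ring_slot *slot, size_t len, alloc_fn alloc)
{
	void *new_buf = alloc(len);
	void *old_buf;

	if (!new_buf)
		return NULL;	/* keep the old buffer on the ring */

	old_buf = slot->buf;
	slot->buf = new_buf;
	return old_buf;
}
```

Note that NULL is returned both when the allocation fails and when an
empty slot is filled for the first time, so an initial fill has to check
slot->buf afterwards, much like the -ENOMEM check on cb->skb in the
allocation loop of this patch.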

Signed-off-by: Florian Fainelli f.faine...@gmail.com
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 81 +++---
 1 file changed, 41 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 267330ccd595..d777b0db9e63 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -524,62 +524,70 @@ static void bcm_sysport_free_cb(struct bcm_sysport_cb *cb)
 	dma_unmap_addr_set(cb, dma_addr, 0);
 }
 
-static int bcm_sysport_rx_refill(struct bcm_sysport_priv *priv,
-				 struct bcm_sysport_cb *cb)
+static struct sk_buff *bcm_sysport_rx_refill(struct bcm_sysport_priv *priv,
+					     struct bcm_sysport_cb *cb)
 {
 	struct device *kdev = &priv->pdev->dev;
 	struct net_device *ndev = priv->netdev;
+	struct sk_buff *skb, *rx_skb;
 	dma_addr_t mapping;
-	int ret;
 
-	cb->skb = netdev_alloc_skb(priv->netdev, RX_BUF_LENGTH);
-	if (!cb->skb) {
+	/* Allocate a new SKB for a new packet */
+	skb = netdev_alloc_skb(priv->netdev, RX_BUF_LENGTH);
+	if (!skb) {
+		priv->mib.alloc_rx_buff_failed++;
 		netif_err(priv, rx_err, ndev, "SKB alloc failed\n");
-		return -ENOMEM;
+		return NULL;
 	}
 
-	mapping = dma_map_single(kdev, cb->skb->data,
+	mapping = dma_map_single(kdev, skb->data,
 				 RX_BUF_LENGTH, DMA_FROM_DEVICE);
-	ret = dma_mapping_error(kdev, mapping);
-	if (ret) {
+	if (dma_mapping_error(kdev, mapping)) {
 		priv->mib.rx_dma_failed++;
-		bcm_sysport_free_cb(cb);
+		dev_kfree_skb_any(skb);
 		netif_err(priv, rx_err, ndev, "DMA mapping failure\n");
-		return ret;
+		return NULL;
 	}
 
+	/* Grab the current SKB on the ring */
+	rx_skb = cb->skb;
+	if (likely(rx_skb))
+		dma_unmap_single(kdev, dma_unmap_addr(cb, dma_addr),
+				 RX_BUF_LENGTH, DMA_FROM_DEVICE);
+
+	/* Put the new SKB on the ring */
+	cb->skb = skb;
 	dma_unmap_addr_set(cb, dma_addr, mapping);
 	dma_desc_set_addr(priv, cb->bd_addr, mapping);
 
 	netif_dbg(priv, rx_status, ndev, "RX refill\n");
 
-	return 0;
+	/* Return the current SKB to the caller */
+	return rx_skb;
 }
 
 static int bcm_sysport_alloc_rx_bufs(struct bcm_sysport_priv *priv)
 {
 	struct bcm_sysport_cb *cb;
-	int ret = 0;
+	struct sk_buff *skb;
 	unsigned int i;
 
 	for (i = 0; i < priv->num_rx_bds; i++) {
 		cb = &priv->rx_cbs[i];
-		if (cb->skb)
-			continue;
-
-		ret = bcm_sysport_rx_refill(priv, cb);
-		if (ret)
-			break;
+		skb = bcm_sysport_rx_refill(priv, cb);
+		if (skb)
+			dev_kfree_skb(skb);
+		if (!cb->skb)
+			return -ENOMEM;
 	}
 
-	return ret;
+	return 0;
 }
 
 /* Poll the hardware for up to budget packets to process */
 static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
 					unsigned int budget)
 {
-	struct device *kdev = &priv->pdev->dev;
 	struct net_device *ndev = priv->netdev;
 	unsigned int processed = 0, to_process;
 	struct bcm_sysport_cb *cb;
@@ -587,7 +595,6 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
 	unsigned int p_index;
 	u16 len, status;
 	struct bcm_rsb *rsb;
-	int ret;
 
 	/* Determine how much we should process since last call */
 	p_index = rdma_readl(priv, RDMA_PROD_INDEX);
@@ -605,13 +612,8 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
 
 	while ((processed < to_process) && (processed < budget)) {
 		cb = &priv->rx_cbs[priv->rx_read_ptr];
-		skb = cb->skb;
-

[PATCH net-next 3/3] net: systemport: Add a check for oversized packets

2015-05-28 Thread Florian Fainelli

Occasionally we may get oversized packets from the hardware which exceed
the nominal 2KiB buffer size we allocate SKBs with. Add an early check
which drops the packet to avoid invoking skb_over_panic() and move on to
processing the next packet.

Signed-off-by: Florian Fainelli f.faine...@gmail.com
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index d777b0db9e63..909ad7a0d480 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -638,6 +638,14 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
 			  p_index, priv->rx_c_index, priv->rx_read_ptr,
 			  len, status);
 
+		if (unlikely(len > RX_BUF_LENGTH)) {
+			netif_err(priv, rx_status, ndev, "oversized packet\n");
+			ndev->stats.rx_length_errors++;
+			ndev->stats.rx_errors++;
+			dev_kfree_skb_any(skb);
+			goto next;
+		}
+
 		if (unlikely(!(status & DESC_EOP) || !(status & DESC_SOP))) {
 			netif_err(priv, rx_status, ndev, "fragmented packet!\n");
 			ndev->stats.rx_dropped++;
-- 
2.1.0


[PATCH net 0/3] bna: misc bugfixes

2015-05-28 Thread Ivan Vecera

These patches fix several bugs found during device initialization debugging.

Cc: Rasesh Mody rasesh.m...@qlogic.com

Ivan Vecera (3):
  bna: fix firmware loading on big-endian machines
  bna: remove unreasonable iocpf timer start
  bna: fix soft lock-up during firmware initialization failure

 drivers/net/ethernet/brocade/bna/bfa_ioc.c   | 4 ++--
 drivers/net/ethernet/brocade/bna/bnad.c  | 4 
 drivers/net/ethernet/brocade/bna/cna_fwimg.c | 7 +++
 3 files changed, 9 insertions(+), 6 deletions(-)

-- 
2.3.6


[PATCH net 1/3] bna: fix firmware loading on big-endian machines

2015-05-28 Thread Ivan Vecera

Firmware required by bna is stored in appropriate files as sequence
of LE32 integers. After loading by request_firmware() they need to be
byte-swapped on big-endian arches. Without this conversion the NIC
is unusable on big-endian machines.
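
For readers outside the kernel, the conversion amounts to the following
portable sketch. It is a userspace illustration only, not the driver
code: le32_array_to_host() is a made-up name, and the in-kernel patch
uses le32_to_cpus() instead of assembling the bytes by hand.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert n little-endian 32-bit words, in place, to host order. */
void le32_array_to_host(uint32_t *words, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned char b[4];

		/* Raw bytes exactly as they were stored in the file. */
		memcpy(b, &words[i], sizeof(b));
		words[i] = (uint32_t)b[0] |
			   ((uint32_t)b[1] << 8) |
			   ((uint32_t)b[2] << 16) |
			   ((uint32_t)b[3] << 24);
	}
}
```

On a little-endian host this leaves every word unchanged; on a big-endian
host it byte-swaps each word, which is the case this patch addresses.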

Cc: Rasesh Mody rasesh.m...@qlogic.com
Signed-off-by: Ivan Vecera ivec...@redhat.com
---
 drivers/net/ethernet/brocade/bna/cna_fwimg.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/brocade/bna/cna_fwimg.c 
b/drivers/net/ethernet/brocade/bna/cna_fwimg.c
index ebf462d..badea36 100644
--- a/drivers/net/ethernet/brocade/bna/cna_fwimg.c
+++ b/drivers/net/ethernet/brocade/bna/cna_fwimg.c
@@ -30,6 +30,7 @@ cna_read_firmware(struct pci_dev *pdev, u32 **bfi_image,
u32 *bfi_image_size, char *fw_name)
 {
const struct firmware *fw;
+   u32 n;
 
if (request_firmware(fw, fw_name, pdev-dev)) {
pr_alert(Can't locate firmware %s\n, fw_name);
@@ -40,6 +41,12 @@ cna_read_firmware(struct pci_dev *pdev, u32 **bfi_image,
*bfi_image_size = fw-size/sizeof(u32);
bfi_fw = fw;
 
+   /* Convert loaded firmware to host order as it is stored in file
+* as sequence of LE32 integers.
+*/
+   for (n = 0; n  *bfi_image_size; n++)
+   le32_to_cpus(*bfi_image + n);
+
return *bfi_image;
 error:
return NULL;
-- 
2.3.6


Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM

2015-05-28 Thread Doug Ledford

On Thu, 2015-05-28 at 22:05 +0300, Or Gerlitz wrote:
 On Thu, May 28, 2015 at 9:22 PM, Doug Ledford dledf...@redhat.com wrote:
 
  I don't think that is what Doug said.
 
  Indeed.  There is no need to scrap things, but if the design as it
  stands, and the intended means of creating objects for use in
  containers, is going to result in an unworkable network, then we have to
  re-evaluate how the container constructs are created, and that then has
  possible consequences for how we would get from an incoming packet to
  the proper container.
 
 To be precise, do we agree that the issue here isn't in the design as
 it stands but rather in a problem we found in the intended way of
 assigning IP addresses through DHCP for the containers?

No, I would say the problem *is* in the design.  But the problem is the
selected means of identifying the netdev to get to the namespace (and
the proposed means of creating non-default namespace devices to exist in
the container), not the namespace design itself.

  I'm not trying to stop the support train here, but at the same time,
  if the train is headed for a bridge that's out
 
 So what's your concrete saying here? where should we go from here?

This excerpt is from the commit log of patch 3/12:

The IB device and port, together with the P_Key and the IP address should
be enough to uniquely identify the ULP net device.

The problem here is that this is wrong.  If we allow more than one
device per pkey with the same GUID, then DHCP breaks, which is bad in
and of itself, but it also breaks ipv6 link local addressing.  Which
means that this hunk in patch 4/12:

+#if IS_ENABLED(CONFIG_IPV6)
+	case AF_INET6:
+		if (ipv6_chk_addr(net, &addr_in6->sin6_addr, dev, 1))
+			return true;
+
+		break;
+#endif

can now be tricked into returning true for incorrect devices.

Where do we go from here?

First, I'm inclined to say we should modify the add_child portion of
IPoIB to refuse to add links to a PKey if that GUID is already present
on that PKey.  You could then use different PKeys on the default GUID
for separate namespaces.  If you need separate namespaces on the same
PKey, then enable alias GUIDs for use on the local adapter and require
one GUID per namespace on the same PKey.

Then I'm inclined to say that we should map for namespaces using device,
port, guid/gid, pkey.  And in this situation, since a unique guid/gid on
any given pkey maps to a unique dhcp identifier and a unique ipv6
lladdr, this becomes freely interchangeable with device, port, pkey,
address mappings that this patchset was built around.
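
As a rough sketch of the check I have in mind for add_child (plain
userspace C, with made-up names; struct child_link and may_add_child()
are assumptions for illustration and not existing IPoIB code), the rule
is simply that a (PKey, GUID) pair may appear at most once:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one existing child link. */
struct child_link {
	uint16_t pkey;
	uint64_t guid;
};

/* Return true if adding (pkey, guid) keeps every pair unique. */
bool may_add_child(const struct child_link *links, size_t n,
		   uint16_t pkey, uint64_t guid)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (links[i].pkey == pkey && links[i].guid == guid)
			return false;	/* GUID already present on this PKey */
	}
	return true;
}
```

The same GUID on a different PKey, or an alias GUID on the same PKey, is
still accepted, which matches the two ways of separating namespaces
described above.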

-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD




Re: Drops in qdisc on ifb interface

2015-05-28 Thread Eric Dumazet

On Thu, 2015-05-28 at 12:33 -0400, jsulli...@opensourcedevel.com wrote:

 Our initial testing has been single flow but the ultimate purpose is
 processing real time video in a complex application which ingests associated
 meta data, post to consumer facing cloud, does reporting back - so lots of
 different traffics with very different demands - a perfect tc environment.

Wait, do you really plan using TCP for real time video ?



Connection tracking and soft lockups with certain field values

2015-05-28 Thread Andrey Korolyov

Hi,

I am currently playing with the SYNPROXY target to optimize SYN filtering
performance and happened to find that TCP SYN packets with port 0 can
cause a soft lockup when conntrack is enabled just by itself, given a
high packet rate (I've reached 450kpps so far with 60-byte packets on a
/32-to-/32 flood, with flow control enabled at the media level and a
mid-range E3 Xeon on the receiver side). The same flood with port > 0
runs fine, producing the same ceiling numbers but without visible lockups
in the kernel log. I've tested the issue on a broad range of 3.x kernels
and all of them seem to be affected. A quick and dirty grep revealed
special handling of port 0 only in protocol-specific helpers, but none
of them are present here.

Please find both captures and the traceback below.
[  152.001957] ixgbe :01:00.0 eth8: NIC Link is Up 10 Gbps, Flow Control: 
RX/TX
[  157.326410] sched: RT throttling activated
[  180.038105] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [rcuos/0:9]
[  180.038128] Modules linked in: xt_CT iptable_raw ipt_SYNPROXY 
nf_synproxy_core nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_conntrack 
nf_conntrack iptable_filter ip_tables x_tables tun openvswitch(O) libcrc32c 
nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc bridge stp llc 
w83627ehf hwmon_vid jc42 loop fuse dm_crypt joydev hid_generic usbhid 
x86_pkg_temp_thermal intel_powerclamp ast coretemp igb ttm drm_kms_helper 
kvm_intel(O) drm iTCO_wdt iTCO_vendor_support sg pcspkr kvm(O) syscopyarea 
sysfillrect sysimgblt video thermal tpm_tis i2c_algo_bit ipmi_si 
ipmi_msghandler tpm i2c_i801 8250_fintek fan battery shpchp button ie31200_edac 
edac_core xhci_pci lpc_ich mfd_core xhci_hcd processor crct10dif_pclmul 
crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw 
gf128mul glue_helper
[  180.038157]  ablk_helper cryptd mpt2sas raid_class ehci_pci ixgbe(O) 
ehci_hcd vxlan ip6_udp_tunnel udp_tunnel usbcore ptp usb_common pps_core dca 
dm_mirror dm_region_hash dm_log dm_mod
[  180.038165] CPU: 0 PID: 9 Comm: rcuos/0 Tainted: G   O   
3.18.10-default #5
[  180.038166] Hardware name: Supermicro X10SL7-F/X10SL7-F, BIOS 2.00 04/24/2014
[  180.038167] task: 880409624c20 ti: 88040963 task.ti: 
88040963
[  180.038168] RIP: 0010:[8155b972]  [8155b972] 
dev_gro_receive+0x182/0x370
[  180.038172] RSP: 0018:88041fc03d48  EFLAGS: 0296
[  180.038173] RAX: 81cf6920 RBX: 000180200020 RCX: 67632533
[  180.038174] RDX: 88040558 RSI: 8800d8392100 RDI: 8804054dc440
[  180.038175] RBP: 88041fc03d98 R08: 002e R09: 
[  180.038175] R10: 8800d8392100 R11: ea001019de00 R12: 88041fc03cb8
[  180.038176] R13: 8165b83d R14: 88041fc03d98 R15: 8800d8392100
[  180.038177] FS:  () GS:88041fc0() 
knlGS:
[  180.038178] CS:  0010 DS:  ES:  CR0: 80050033
[  180.038179] CR2: 7fed089d21b0 CR3: 01c15000 CR4: 001407f0
[  180.038180] DR0:  DR1:  DR2: 
[  180.038180] DR3:  DR6: fffe0ff0 DR7: 0400
[  180.038181] Stack:
[  180.038181]  0280 000e 000b 
81cf8860
[  180.038183]  88041fc03d98 8804054dc640 8800d8392100 
8804054dc440
[  180.038184]  000c 880407023fb0 88041fc03dc8 
8155c0d0
[  180.038185] Call Trace:
[  180.038186]  IRQ 

[  180.038189]  [8155c0d0] napi_gro_receive+0x30/0x100
[  180.038196]  [a012cc39] ixgbe_clean_rx_irq+0x8d9/0x1030 [ixgbe]
[  180.038200]  [a012e588] ixgbe_poll+0x478/0x690 [ixgbe]
[  180.038203]  [8150da90] ? show_no_turbo+0x90/0x90
[  180.038204]  [8155bd99] net_rx_action+0x149/0x250
[  180.038208]  [8106629f] __do_softirq+0xdf/0x260
[  180.038210]  [8165c4fc] do_softirq_own_stack+0x1c/0x30
[  180.038211]  EOI 

[  180.038213]  [810664d5] do_softirq+0x65/0x70
[  180.038214]  [81066574] __local_bh_enable_ip+0x94/0xa0
[  180.038216]  [810bfa15] rcu_nocb_kthread+0x155/0x580
[  180.038219]  [8109fcc0] ? finish_wait+0x80/0x80
[  180.038220]  [810bf8c0] ? rcu_eqs_exit_common.isra.60+0xe0/0xe0
[  180.038222]  [8107fad9] kthread+0xc9/0xe0
[  180.038224]  [8107fa10] ? kthread_create_on_node+0x1a0/0x1a0
[  180.038226]  [8165a798] ret_from_fork+0x58/0x90
[  180.038227]  [8107fa10] ? kthread_create_on_node+0x1a0/0x1a0
[  180.038228] Code: 00 48 8b 05 91 8a 79 00 48 89 45 c8 4c 8b 75 c8 49 81 fe 
e0 43 cf 81 49 8d 46 e0 0f 84 b6 01 00 00 66 44 39 28 74 0a 48 8b 40 20 eb db 
0f 1f 40 00 48 83 78 10 00 74 ef 48 8b 93 d8 00 00 00 48


0-80.pcap
Description: application/vnd.tcpdump.pcap


80-80.pcap
Description: application/vnd.tcpdump.pcap

Re: [PATCH net-next 4/4] net/mlx4_core: Make sure there are no pending async events when freeing CQ

2015-05-28 Thread Sergei Shtylyov


Hello.

On 05/28/2015 06:41 PM, Or Gerlitz wrote:


From: Matan Barak mat...@mellanox.com



When freeing a CQ, we need to make sure there are no
asynchronous events (on the ASYNC EQ) that could
relate to this CQ before freeing it.



This is done by introducing synchronize_irq.



Signed-off-by: Matan Barak mat...@mellanox.com
Signed-off-by: Ido Shamay i...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
  drivers/net/ethernet/mellanox/mlx4/cq.c |4 
  1 files changed, 4 insertions(+), 0 deletions(-)



diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c 
b/drivers/net/ethernet/mellanox/mlx4/cq.c
index 7431cd4..1fc1dc5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -369,6 +369,10 @@ void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq)
mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn);


synchronize_irq(priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq);
+   if (priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq !=
+   priv->eq_table.eq[MLX4_EQ_ASYNC].irq)
+   synchronize_irq(priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
+



   I think one empty line was enough.


spin_lock_irq(&cq_table->lock);
radix_tree_delete(&cq_table->tree, cq->cqn);


WBR, Sergei


Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels

2015-05-28 Thread Alexander Duyck


On 05/28/2015 01:40 AM, Steffen Klassert wrote:

On Thu, May 28, 2015 at 12:18:51AM -0700, Alexander Duyck wrote:

On 05/27/2015 10:36 PM, Steffen Klassert wrote:

On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:

This change makes it so that we use icmpv6_send to report PMTU issues back
into tunnels in the case that the resulting packet is larger than the MTU
of the outgoing interface.  Previously xfrm_local_error was being used in
this case, however this was resulting in no changes, I suspect due to the
fact that the tunnel itself was being kept out of the loop.

This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
behavior seen if the socket was orphaned.  Instead of requiring the socket
to be orphaned this patch simply defaults to using icmpv6_send in the case
that the frame came through a tunnel.

We can use icmpv6_send() just in the case that the packet
was already transmitted by a tunnel device, otherwise we
get the bug back that I mentioned in my other mail.

Not sure if we have something to know that the packet
traversed a tunnel device. That's what I asked in the
thread 'Looking for a lost patch'.

Okay I will try to do some more digging.  From what I can tell right
now it looks like my ping attempts are getting hung up on the
xfrm_local_error in __xfrm6_output.  I wonder if we couldn't somehow
make use of the skb-cb to store a pointer to the tunnel that could
be checked to determine if we are going through a VTI or not.

Maybe it is as easy as the patch below, could you please test it?

Subject: [PATCH RFC] vti6: Add pmtu handling to vti6_xmit.

We currently rely on the PMTU discovery of xfrm.
However if a packet is locally sent, the PMTU mechanism
of xfrm tries to do local socket notification, which
might not work for applications like ping that don't
check for this. So add pmtu handling to vti6_xmit to
report MTU changes immediately.

Signed-off-by: Steffen Klassert steffen.klass...@secunet.com
---
  net/ipv6/ip6_vti.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index ff3bd86..13cb771 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -434,6 +434,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
struct dst_entry *dst = skb_dst(skb);
struct net_device *tdev;
struct xfrm_state *x;
+   int mtu;
int err = -1;
  
  	if (!dst)

@@ -468,6 +469,15 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
skb_dst_set(skb, dst);
skb->dev = skb_dst(skb)->dev;
 
+   mtu = dst_mtu(dst);
+   if (!skb->ignore_df && skb->len > mtu) {
+   skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
+
+   icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+
+   return -EMSGSIZE;
+   }
+
err = dst_output(skb);
if (net_xmit_eval(err) == 0) {
struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);


That seems to be working for me.  I'm able to ping, and while the first
packet fails, the second one and all that follow make it through
correctly after the PMTU update.


- Alex
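
The size check that the RFC patch adds to vti6_xmit can be modelled outside the kernel roughly as follows. This is a minimal userspace sketch and not the kernel code: `struct pkt`, `check_pmtu` and the `toobig_sent` flag are hypothetical stand-ins for the sk_buff fields and for the icmpv6_send() call quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the sk_buff fields the check looks at. */
struct pkt {
	size_t len;        /* packet length in bytes */
	bool ignore_df;    /* true if local fragmentation is allowed */
	bool toobig_sent;  /* set when a "packet too big" report would be sent */
};

/*
 * Return 0 if the packet fits (or may be fragmented), -EMSGSIZE otherwise.
 * In the oversized case the sender is notified, modelled here by a flag.
 */
static int check_pmtu(struct pkt *p, size_t mtu)
{
	assert(p != NULL);

	if (!p->ignore_df && p->len > mtu) {
		/* kernel: icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu) */
		p->toobig_sent = true;
		return -EMSGSIZE;
	}
	return 0;
}
```

In this model the first oversized packet is rejected and reported, which is consistent with the observation above that the first ping fails and later ones succeed once the sender has learned the smaller MTU.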

[PATCH v4 net-next 05/11] net: Add full IPv6 addresses to flow_keys

2015-05-28 Thread Tom Herbert

This patch adds full IPv6 addresses into flow_keys and uses them as
input to the flow hash function. The implementation supports either
IPv4 or IPv6 addresses in a union, and a selector is used to determine
how many words to input to jhash2.

We also add flow_get_u32_dst and flow_get_u32_src functions which are
used to get a u32 representation of the source and destination
addresses. For IPv6, ipv6_addr_hash is called. These functions retain
getting the legacy values of src and dst in flow_keys.

With this patch, Ethertype and IP protocol are now included in the
flow hash input.

Signed-off-by: Tom Herbert t...@herbertland.com
---
 drivers/net/bonding/bond_main.c|   9 +-
 drivers/net/ethernet/cisco/enic/enic_clsf.c|   8 +-
 drivers/net/ethernet/cisco/enic/enic_ethtool.c |   4 +-
 include/net/flow_dissector.h   |  52 +++
 include/net/ip.h   |  19 +++-
 include/net/ipv6.h |  21 -
 net/core/flow_dissector.c  | 116 +
 net/ethernet/eth.c |   2 +-
 net/sched/cls_flow.c   |  14 ++-
 net/sched/cls_flower.c |  11 +--
 10 files changed, 193 insertions(+), 63 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2268438..19eb990 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3059,8 +3059,7 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
if (unlikely(!pskb_may_pull(skb, noff + sizeof(*iph))))
return false;
iph = ip_hdr(skb);
-   fk->addrs.src = iph->saddr;
-   fk->addrs.dst = iph->daddr;
+   iph_to_flow_copy_v4addrs(fk, iph);
noff += iph->ihl << 2;
if (!ip_is_fragment(iph))
proto = iph->protocol;
@@ -3068,8 +3067,7 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
if (unlikely(!pskb_may_pull(skb, noff + sizeof(*iph6))))
return false;
iph6 = ipv6_hdr(skb);
-   fk->addrs.src = (__force __be32)ipv6_addr_hash(&iph6->saddr);
-   fk->addrs.dst = (__force __be32)ipv6_addr_hash(&iph6->daddr);
+   iph_to_flow_copy_v6addrs(fk, iph6);
noff += sizeof(*iph6);
proto = iph6->nexthdr;
} else {
@@ -3103,7 +3101,8 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
hash = bond_eth_hash(skb);
else
hash = (__force u32)flow.ports.ports;
-   hash ^= (__force u32)flow.addrs.dst ^ (__force u32)flow.addrs.src;
+   hash ^= (__force u32)flow_get_u32_dst(&flow) ^
+   (__force u32)flow_get_u32_src(&flow);
hash ^= (hash >> 16);
hash ^= (hash >> 8);
 
diff --git a/drivers/net/ethernet/cisco/enic/enic_clsf.c 
b/drivers/net/ethernet/cisco/enic/enic_clsf.c
index a31b57a..d106186 100644
--- a/drivers/net/ethernet/cisco/enic/enic_clsf.c
+++ b/drivers/net/ethernet/cisco/enic/enic_clsf.c
@@ -33,8 +33,8 @@ int enic_addfltr_5t(struct enic *enic, struct flow_keys *keys, u16 rq)
return -EPROTONOSUPPORT;
};
data.type = FILTER_IPV4_5TUPLE;
-   data.u.ipv4.src_addr = ntohl(keys->addrs.src);
-   data.u.ipv4.dst_addr = ntohl(keys->addrs.dst);
+   data.u.ipv4.src_addr = ntohl(keys->addrs.v4addrs.src);
+   data.u.ipv4.dst_addr = ntohl(keys->addrs.v4addrs.dst);
data.u.ipv4.src_port = ntohs(keys->ports.src);
data.u.ipv4.dst_port = ntohs(keys->ports.dst);
data.u.ipv4.flags = FILTER_FIELDS_IPV4_5TUPLE;
@@ -158,8 +158,8 @@ static struct enic_rfs_fltr_node *htbl_key_search(struct hlist_head *h,
struct enic_rfs_fltr_node *tpos;
 
hlist_for_each_entry(tpos, h, node)
-   if (tpos->keys.addrs.src == k->addrs.src &&
-   tpos->keys.addrs.dst == k->addrs.dst &&
+   if (tpos->keys.addrs.v4addrs.src == k->addrs.v4addrs.src &&
+   tpos->keys.addrs.v4addrs.dst == k->addrs.v4addrs.dst &&
tpos->keys.ports.ports == k->ports.ports &&
tpos->keys.basic.ip_proto == k->basic.ip_proto &&
tpos->keys.basic.n_proto == k->basic.n_proto)
diff --git a/drivers/net/ethernet/cisco/enic/enic_ethtool.c 
b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
index 117c096..73874b2 100644
--- a/drivers/net/ethernet/cisco/enic/enic_ethtool.c
+++ b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
@@ -346,10 +346,10 @@ static int enic_grxclsrule(struct enic *enic, struct ethtool_rxnfc *cmd)
break;
}
 
-   fsp->h_u.tcp_ip4_spec.ip4src = n->keys.addrs.src;
+   fsp->h_u.tcp_ip4_spec.ip4src = flow_get_u32_src(&n->keys);
fsp->m_u.tcp_ip4_spec.ip4src = (__u32)~0;
 
-   fsp->h_u.tcp_ip4_spec.ip4dst =
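
To illustrate the idea behind flow_get_u32_src and flow_get_u32_dst described in the commit message (one u32 per address regardless of family), here is a minimal userspace sketch. All type and function names are hypothetical, and the XOR fold is only a simple stand-in: it is an assumption for illustration and not necessarily what ipv6_addr_hash computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical selector values; the patch uses a field in flow_keys. */
enum addr_type { ADDR_NONE = 0, ADDR_V4, ADDR_V6 };

struct v4addrs { uint32_t src, dst; };
struct v6addrs { uint32_t src[4], dst[4]; };

struct flow_addrs {
	enum addr_type type;   /* selects which union member is valid */
	union {
		struct v4addrs v4;
		struct v6addrs v6;
	} u;
};

/* Stand-in fold: XOR the four 32-bit words of an IPv6 address. */
static uint32_t fold_v6(const uint32_t a[4])
{
	return a[0] ^ a[1] ^ a[2] ^ a[3];
}

/* Return a 32-bit representation of the source address for legacy users. */
static uint32_t flow_u32_src(const struct flow_addrs *f)
{
	assert(f != NULL);

	switch (f->type) {
	case ADDR_V4:
		return f->u.v4.src;
	case ADDR_V6:
		return fold_v6(f->u.v6.src);
	default:
		return 0;
	}
}
```

A caller such as a bonding or classifier hash can then keep working with a single u32 while the full addresses remain available in the union for the flow hash itself.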

[PATCH v4 net-next 09/11] net: Add IPv6 flow label to flow_keys

2015-05-28 Thread Tom Herbert

In flow_dissector set the flow label in flow_keys for IPv6. This also
removes the short-circuiting of flow dissection when a non-zero label
is present; the flow label can be considered to provide additional
entropy for a hash.

Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/net/flow_dissector.h |  4 +++-
 net/core/flow_dissector.c| 31 +++
 2 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 08480fb..14d8483 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -28,7 +28,8 @@ struct flow_dissector_key_basic {
 };
 
 struct flow_dissector_key_tags {
-   u32 vlan_id:12;
+   u32 vlan_id:12,
+   flow_label:20;
 };
 
 /**
@@ -111,6 +112,7 @@ enum flow_dissector_key_id {
FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs 
*/
FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */
+   FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_flow_tags */
 
FLOW_DISSECTOR_KEY_MAX,
 };
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 5c66cb2..ba089d9 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -190,7 +190,7 @@ ip:
case htons(ETH_P_IPV6): {
const struct ipv6hdr *iph;
struct ipv6hdr _iph;
-   __be32 flow_label;
+   u32 flow_label;
 
 ipv6:
iph = __skb_header_pointer(skb, nhoff, sizeof(_iph), data, hlen, &_iph);
@@ -210,30 +210,17 @@ ipv6:
 
memcpy(key_ipv6_addrs, &iph->saddr, sizeof(*key_ipv6_addrs));
key_control->addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
-   goto flow_label;
}
-   break;
-flow_label:
+
flow_label = ip6_flowlabel(iph);
if (flow_label) {
-   /* Awesome, IPv6 packet has a flow label so we can
-* use that to represent the ports without any
-* further dissection.
-*/
-
-   key_basic->n_proto = proto;
-   key_basic->ip_proto = ip_proto;
-   key_control->thoff = (u16)nhoff;
-
if (skb_flow_dissector_uses_key(flow_dissector,
-   FLOW_DISSECTOR_KEY_PORTS)) {
-   key_ports = skb_flow_dissector_target(flow_dissector,
- FLOW_DISSECTOR_KEY_PORTS,
- target_container);
-   key_ports->ports = flow_label;
+   FLOW_DISSECTOR_KEY_FLOW_LABEL)) {
+   key_tags = skb_flow_dissector_target(flow_dissector,
+ FLOW_DISSECTOR_KEY_FLOW_LABEL,
+ target_container);
+   key_tags->flow_label = ntohl(flow_label);
}
-
-   return true;
}
 
break;
@@ -659,6 +646,10 @@ static const struct flow_dissector_key 
flow_keys_dissector_keys[] = {
.key_id = FLOW_DISSECTOR_KEY_VLANID,
.offset = offsetof(struct flow_keys, tags),
},
+   {
+   .key_id = FLOW_DISSECTOR_KEY_FLOW_LABEL,
+   .offset = offsetof(struct flow_keys, tags),
+   },
 };
 
 static const struct flow_dissector_key flow_keys_buf_dissector_keys[] = {
-- 
1.8.1
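
For readers unfamiliar with where the flow label lives: it is the low 20 bits of the first 32-bit word of the IPv6 header, after the 4-bit version and 8-bit traffic class. The sketch below extracts it in userspace and stores it in a bitfield shaped like the tags struct in this patch. The names `struct tags` and `ipv6_flow_label` are hypothetical, and this is not the kernel's ip6_flowlabel() helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shaped like the struct in the patch: both tags share one 32-bit word. */
struct tags {
	unsigned int vlan_id:12,
		     flow_label:20;
};

/* Read the 20-bit flow label from the first four bytes of an IPv6 header. */
static uint32_t ipv6_flow_label(const uint8_t *hdr)
{
	uint32_t word;

	assert(hdr != NULL);
	/* Assemble the big-endian word byte by byte (host-order result). */
	word = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
	       ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
	return word & 0x000FFFFFu;
}
```

Because the label is now stored as one more hash input instead of replacing the ports, dissection continues into the transport header even when the label is non-zero.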


[PATCH v4 net-next 03/11] net: Remove superfluous setting of key_basic

2015-05-28 Thread Tom Herbert

key_basic is set twice in __skb_flow_dissect, which seems unnecessary.
Remove the second one.

Acked-by: Jiri Pirko j...@resnulli.us
Signed-off-by: Tom Herbert t...@herbertland.com
---
 net/core/flow_dissector.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 7f69916..0763795 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -343,12 +343,6 @@ flow_label:
break;
}
 
-   /* It is ensured by skb_flow_dissector_init() that basic key will
-* be always present.
-*/
-   key_basic = skb_flow_dissector_target(flow_dissector,
- FLOW_DISSECTOR_KEY_BASIC,
- target_container);
key_basic->n_proto = proto;
key_basic->ip_proto = ip_proto;
key_basic->thoff = (u16) nhoff;
-- 
1.8.1


[PATCH v4 net-next 00/11] net: Increase inputs to flow_keys hashing

2015-05-28 Thread Tom Herbert

This patch set adds new fields to the flow_keys structure and hashes
over these fields to get a better flow hash. In particular, these
patches now include hashing over the full IPv6 addresses in order
to defend against address spoofing that always results in the
same hash. The new input also includes the Ethertype, L4 protocol,
VLAN, flow label, GRE keyid, and MPLS entropy label.

In order to increase hash inputs, we switch to using jhash2
which operates on an array of u32's. jhash2 operates on multiples of
three words. The data in the hash is constructed for that, and there
are two variants for IPv4 and IPv6 addressing. For IPv4 addresses,
jhash is performed over six u32's and for IPv6 it is done over twelve.

flow_keys can store either IPv4 or IPv6 addresses (addr_proto field
is a selector). ipv6_addr_hash is no longer used to convert addresses
for setting in flow table. For legacy uses of flow keys outside of
flow_dissector the flow_get_u32_src and flow_get_u32_dst functions
have been added to get u32 representations of addresses
in flow_keys.

For flow labels we also eliminate the short circuit in flow_dissector
for non-zero flow label. The flow label is now considered additional
input to ports.

Testing: Ran netperf TCP_RR for 200 flows using IPv4 and IPv6 comparing
before the patches and with the patches. Did not detect any performance
degradation.

v2:
  - Took out MPLS entropy label. Will add this later.
v3:
  - Ensure hash start offset is a four byte boundary. Add BUILD_BUG_ON
to check for this.
  - Fixes sparse error in GRE to get entropy from keyid.
v4:
  - Rebase to Jiri changes to generalize flow dissection
  - Support TIPC as its own address
  - Bring back MPLS entropy label dissection
  - Remove FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS

v5:
  - Minor fixes from feedback

Tom Herbert (11):
  net: Simplify GRE case in flow_dissector
  mpls: Add definition for IPPROTO_MPLS
  net: Remove superfluous setting of key_basic
  net: Get skb hash over flow_keys structure
  net: Add full IPv6 addresses to flow_keys
  net: Add keys for TIPC address
  net: Get rid of IPv6 hash addresses flow keys
  net: Add VLAN ID to flow_keys
  net: Add IPv6 flow label to flow_keys
  net: Add GRE keyid in flow_keys
  mpls: Add MPLS entropy label in flow_keys

 drivers/net/bonding/bond_main.c|   9 +-
 drivers/net/ethernet/cisco/enic/enic_clsf.c|   8 +-
 drivers/net/ethernet/cisco/enic/enic_ethtool.c |   4 +-
 include/linux/skbuff.h |   2 +-
 include/net/flow_dissector.h   |  97 ++--
 include/net/ip.h   |  21 +-
 include/net/ipv6.h |  23 +-
 include/uapi/linux/in.h|   2 +
 net/core/flow_dissector.c  | 329 ++---
 net/ethernet/eth.c |   2 +-
 net/sched/cls_flow.c   |  14 +-
 net/sched/cls_flower.c |  13 +-
 12 files changed, 388 insertions(+), 136 deletions(-)

-- 
1.8.1
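
The "six u32's for IPv4, twelve for IPv6" arrangement from the cover letter can be sketched as below. This is a userspace illustration under stated assumptions: `struct keys` is a hypothetical layout with four fixed words followed by the address union, and `hash_u32_array` is an FNV-1a style stand-in, NOT the kernel's jhash2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum addr_type { ADDR_V4 = 1, ADDR_V6 = 2 };   /* hypothetical selector */

/* Hypothetical key layout: four fixed words followed by the addresses. */
struct keys {
	uint32_t fixed[4];        /* e.g. protocols, tags, keyid, ports */
	union {
		uint32_t v4[2];   /* src, dst */
		uint32_t v6[8];   /* src[4], dst[4] */
	} addrs;
};

/* Number of u32 words to feed to the hash for a given address type. */
static size_t hash_words(enum addr_type type)
{
	size_t n = offsetof(struct keys, addrs) / sizeof(uint32_t);

	n += (type == ADDR_V6) ? 8 : 2;
	return n;
}

/* Stand-in word hash (FNV-1a style over 32-bit words); NOT jhash2. */
static uint32_t hash_u32_array(const uint32_t *w, size_t n, uint32_t seed)
{
	uint32_t h = 2166136261u ^ seed;

	assert(w != NULL);
	for (size_t i = 0; i < n; i++) {
		h ^= w[i];
		h *= 16777619u;
	}
	return h;
}

/* Hash only as many leading words as the selector says are valid. */
static uint32_t keys_hash(const struct keys *k, enum addr_type type,
			  uint32_t seed)
{
	uint32_t w[sizeof(struct keys) / sizeof(uint32_t)];
	size_t n = hash_words(type);

	assert(k != NULL && n <= sizeof(w) / sizeof(w[0]));
	memcpy(w, k, n * sizeof(uint32_t));
	return hash_u32_array(w, n, seed);
}
```

Placing the variable-length addresses last is what makes this work: the IPv4 case simply stops after the first two address words, so stale bytes in the rest of the union never reach the hash.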


[PATCH v4 net-next 02/11] mpls: Add definition for IPPROTO_MPLS

2015-05-28 Thread Tom Herbert

Add uapi define for MPLS over IP.

Acked-by: Jiri Pirko j...@resnulli.us
Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/uapi/linux/in.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 589ced0..641338b 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -69,6 +69,8 @@ enum {
 #define IPPROTO_SCTP   IPPROTO_SCTP
   IPPROTO_UDPLITE = 136,   /* UDP-Lite (RFC 3828)  */
 #define IPPROTO_UDPLITEIPPROTO_UDPLITE
+  IPPROTO_MPLS = 137,  /* MPLS in IP (RFC 4023)*/
+#define IPPROTO_MPLS   IPPROTO_MPLS
   IPPROTO_RAW = 255,   /* Raw IP packets   */
 #define IPPROTO_RAWIPPROTO_RAW
   IPPROTO_MAX
-- 
1.8.1


[PATCH v4 net-next 04/11] net: Get skb hash over flow_keys structure

2015-05-28 Thread Tom Herbert

This patch changes flow hashing to use jhash2 over the flow_keys
structure instead of just doing jhash_3words over src, dst, and ports.
This method will allow us to take more input into the hashing function
so that we can include full IPv6 addresses, VLAN, flow labels etc.
without needing to resort to xor'ing which makes for a poor hash.

Acked-by: Jiri Pirko j...@resnulli.us
Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/linux/skbuff.h   |  2 +-
 include/net/flow_dissector.h | 21 ++---
 include/net/ip.h |  2 ++
 include/net/ipv6.h   |  2 ++
 net/core/flow_dissector.c| 54 +---
 net/sched/cls_flower.c   |  2 ++
 6 files changed, 66 insertions(+), 17 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6b41c15..cc612fc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1943,7 +1943,7 @@ static inline void skb_probe_transport_header(struct sk_buff *skb,
if (skb_transport_header_was_set(skb))
return;
else if (skb_flow_dissect_flow_keys(skb, &keys))
-   skb_set_transport_header(skb, keys.basic.thoff);
+   skb_set_transport_header(skb, keys.control.thoff);
else
skb_set_transport_header(skb, offset_hint);
 }
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index bac9c14..cba6a10 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -7,15 +7,24 @@
 #include <uapi/linux/if_ether.h>
 
 /**
+ * struct flow_dissector_key_control:
+ * @thoff: Transport header offset
+ */
+struct flow_dissector_key_control {
+   u16 thoff;
+   u16 padding;
+};
+
+/**
  * struct flow_dissector_key_basic:
  * @thoff: Transport header offset
  * @n_proto: Network header protocol (eg. IPv4/IPv6)
  * @ip_proto: Transport header protocol (eg. TCP/UDP)
  */
 struct flow_dissector_key_basic {
-   u16 thoff;
__be16  n_proto;
u8  ip_proto;
+   u8  padding;
 };
 
 /**
@@ -70,6 +79,7 @@ struct flow_dissector_key_eth_addrs {
 };
 
 enum flow_dissector_key_id {
+   FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */
FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */
FLOW_DISSECTOR_KEY_IPV4_ADDRS, /* struct flow_dissector_key_addrs */
FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS, /* struct flow_dissector_key_addrs 
*/
@@ -109,11 +119,16 @@ static inline bool skb_flow_dissect(const struct sk_buff 
*skb,
 }
 
 struct flow_keys {
-   struct flow_dissector_key_addrs addrs;
-   struct flow_dissector_key_ports ports;
+   struct flow_dissector_key_control control;
+#define FLOW_KEYS_HASH_START_FIELD basic
struct flow_dissector_key_basic basic;
+   struct flow_dissector_key_ports ports;
+   struct flow_dissector_key_addrs addrs;
 };
 
+#define FLOW_KEYS_HASH_OFFSET  \
+   offsetof(struct flow_keys, FLOW_KEYS_HASH_START_FIELD)
+
 extern struct flow_dissector flow_keys_dissector;
 extern struct flow_dissector flow_keys_buf_dissector;
 
diff --git a/include/net/ip.h b/include/net/ip.h
index 9b976cf..16cfc87 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -360,6 +360,8 @@ static inline void inet_set_txhash(struct sock *sk)
struct inet_sock *inet = inet_sk(sk);
struct flow_keys keys;
 
+   memset(&keys, 0, sizeof(keys));
+
keys.addrs.src = inet->inet_saddr;
keys.addrs.dst = inet->inet_daddr;
keys.ports.src = inet->inet_sport;
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 35d485c..474ca46 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -699,6 +699,8 @@ static inline void ip6_set_txhash(struct sock *sk)
struct ipv6_pinfo *np = inet6_sk(sk);
struct flow_keys keys;
 
+   memset(&keys, 0, sizeof(keys));
+
keys.addrs.src = (__force __be32)ipv6_addr_hash(&np->saddr);
keys.addrs.dst = (__force __be32)ipv6_addr_hash(&sk->sk_v6_daddr);
keys.ports.src = inet->inet_sport;
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 0763795..55b5f29 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -57,10 +57,12 @@ void skb_flow_dissector_init(struct flow_dissector *flow_dissector,
flow_dissector->offset[key->key_id] = key->offset;
}
 
-   /* Ensure that the dissector always includes basic key. That way
-* we are able to avoid handling lack of it in fast path.
+   /* Ensure that the dissector always includes control and basic key.
+* That way we are able to avoid handling lack of these in fast path.
 */
BUG_ON(!skb_flow_dissector_uses_key(flow_dissector,
+   FLOW_DISSECTOR_KEY_CONTROL));
+   BUG_ON(!skb_flow_dissector_uses_key(flow_dissector,
FLOW_DISSECTOR_KEY_BASIC));
 }
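
The FLOW_KEYS_HASH_OFFSET idea in this patch (keep a control key at the front of the struct, start hashing at the basic key) can be sketched in userspace as follows. The struct and macro names here are simplified, hypothetical versions of those in the diff, and the compile-time checks only illustrate the alignment requirement; they are not the kernel's own checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct key_control { uint16_t thoff; uint16_t padding; };
struct key_basic   { uint16_t n_proto; uint8_t ip_proto; uint8_t padding; };
struct key_ports   { uint16_t src, dst; };
struct key_addrs   { uint32_t src, dst; };

struct keys {
	struct key_control control;   /* bookkeeping, not hashed */
	struct key_basic   basic;     /* hashing starts here */
	struct key_ports   ports;
	struct key_addrs   addrs;
};

#define HASH_OFFSET offsetof(struct keys, basic)
#define HASH_BYTES  (sizeof(struct keys) - HASH_OFFSET)

/* The hashed region must start on a u32 boundary and be whole words. */
_Static_assert(HASH_OFFSET % sizeof(uint32_t) == 0, "hash start not aligned");
_Static_assert(HASH_BYTES % sizeof(uint32_t) == 0, "hash length not whole words");

/* Copy the hashed region into a word array; returns the number of words. */
static size_t hash_input(const struct keys *k, uint32_t *out, size_t max)
{
	size_t n = HASH_BYTES / sizeof(uint32_t);

	assert(k != NULL && out != NULL && n <= max);
	memcpy(out, (const unsigned char *)k + HASH_OFFSET, HASH_BYTES);
	return n;
}
```

Two flows that differ only in the transport header offset produce the same hash input, since the control key sits before the offset. Zeroing the whole struct first, as the memset calls added to inet_set_txhash and ip6_set_txhash do, keeps padding bytes from contributing arbitrary values.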

[PATCH v4 net-next 01/11] net: Simplify GRE case in flow_dissector

2015-05-28 Thread Tom Herbert

Break when we see the routing flag or a non-zero version number in the
GRE header.

Acked-by: Jiri Pirko j...@resnulli.us
Signed-off-by: Tom Herbert t...@herbertland.com
---
 net/core/flow_dissector.c | 44 ++--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 1f2d893..7f69916 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -308,30 +308,30 @@ flow_label:
 * Only look inside GRE if version zero and no
 * routing
 */
-   if (!(hdr->flags & (GRE_VERSION|GRE_ROUTING))) {
-   proto = hdr->proto;
+   if (hdr->flags & (GRE_VERSION | GRE_ROUTING))
+   break;
+
+   proto = hdr->proto;
+   nhoff += 4;
+   if (hdr->flags & GRE_CSUM)
nhoff += 4;
-   if (hdr->flags & GRE_CSUM)
-   nhoff += 4;
-   if (hdr->flags & GRE_KEY)
-   nhoff += 4;
-   if (hdr->flags & GRE_SEQ)
-   nhoff += 4;
-   if (proto == htons(ETH_P_TEB)) {
-   const struct ethhdr *eth;
-   struct ethhdr _eth;
-
-   eth = __skb_header_pointer(skb, nhoff,
-  sizeof(_eth),
-  data, hlen, &_eth);
-   if (!eth)
-   return false;
-   proto = eth->h_proto;
-   nhoff += sizeof(*eth);
-   }
-   goto again;
+   if (hdr->flags & GRE_KEY)
+   nhoff += 4;
+   if (hdr->flags & GRE_SEQ)
+   nhoff += 4;
+   if (proto == htons(ETH_P_TEB)) {
+   const struct ethhdr *eth;
+   struct ethhdr _eth;
+
+   eth = __skb_header_pointer(skb, nhoff,
+  sizeof(_eth),
+  data, hlen, &_eth);
+   if (!eth)
+   return false;
+   proto = eth->h_proto;
+   nhoff += sizeof(*eth);
}
-   break;
+   goto again;
}
case IPPROTO_IPIP:
proto = htons(ETH_P_IP);
-- 
1.8.1


[PATCH v4 net-next 10/11] net: Add GRE keyid in flow_keys

2015-05-28 Thread Tom Herbert

In flow dissector if a GRE header contains a keyid this is saved in the
new keyid field of flow_keys. The GRE keyid is then represented
in the flow hash function input.

Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/net/flow_dissector.h |  6 ++
 net/core/flow_dissector.c| 24 +++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 14d8483..5d4257b 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -32,6 +32,10 @@ struct flow_dissector_key_tags {
flow_label:20;
 };
 
+struct flow_dissector_key_keyid {
+   u32 keyid;
+};
+
 /**
  * struct flow_dissector_key_ipv4_addrs:
  * @src: source ip address
@@ -113,6 +117,7 @@ enum flow_dissector_key_id {
FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs 
*/
FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */
FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_flow_tags */
+   FLOW_DISSECTOR_KEY_GRE_KEYID, /* struct flow_dissector_key_keyid */
 
FLOW_DISSECTOR_KEY_MAX,
 };
@@ -150,6 +155,7 @@ struct flow_keys {
 #define FLOW_KEYS_HASH_START_FIELD basic
struct flow_dissector_key_basic basic;
struct flow_dissector_key_tags tags;
+   struct flow_dissector_key_keyid keyid;
struct flow_dissector_key_ports ports;
struct flow_dissector_key_addrs addrs;
 };
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index ba089d9..ea318d5 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -127,6 +127,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
struct flow_dissector_key_addrs *key_addrs;
struct flow_dissector_key_ports *key_ports;
struct flow_dissector_key_tags *key_tags;
+   struct flow_dissector_key_keyid *key_keyid;
u8 ip_proto;
 
if (!data) {
@@ -315,8 +316,25 @@ ipv6:
nhoff += 4;
if (hdr->flags & GRE_CSUM)
nhoff += 4;
-   if (hdr->flags & GRE_KEY)
+   if (hdr->flags & GRE_KEY) {
+   const __be32 *keyid;
+   __be32 _keyid;
+
+   keyid = __skb_header_pointer(skb, nhoff, sizeof(_keyid),
+data, hlen, &_keyid);
+
+   if (!keyid)
+   return false;
+
+   if (skb_flow_dissector_uses_key(flow_dissector,
+   FLOW_DISSECTOR_KEY_GRE_KEYID)) {
+   key_keyid = skb_flow_dissector_target(flow_dissector,
+ FLOW_DISSECTOR_KEY_GRE_KEYID,
+ target_container);
+   key_keyid->keyid = ntohl(*keyid);
+   }
nhoff += 4;
+   }
if (hdr->flags & GRE_SEQ)
nhoff += 4;
if (proto == htons(ETH_P_TEB)) {
@@ -650,6 +668,10 @@ static const struct flow_dissector_key 
flow_keys_dissector_keys[] = {
.key_id = FLOW_DISSECTOR_KEY_FLOW_LABEL,
.offset = offsetof(struct flow_keys, tags),
},
+   {
+   .key_id = FLOW_DISSECTOR_KEY_GRE_KEYID,
+   .offset = offsetof(struct flow_keys, keyid),
+   },
 };
 
 static const struct flow_dissector_key flow_keys_buf_dissector_keys[] = {
-- 
1.8.1
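
The offset arithmetic used in the GRE branch (four bytes of base header, then four more for each of checksum, key and sequence number when flagged) can be tried in userspace with the sketch below. The `GRE_F_*` constants and `gre_parse` are hypothetical names; the flag values follow the GRE header layout in host byte order rather than the kernel's big-endian GRE_* macros. One deliberate simplification: where the kernel stops dissecting on a routing flag or non-zero version, this sketch just returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* GRE flag bits in host byte order (hypothetical names). */
#define GRE_F_CSUM    0x8000u
#define GRE_F_ROUTING 0x4000u
#define GRE_F_KEY     0x2000u
#define GRE_F_SEQ     0x1000u
#define GRE_F_VERSION 0x0007u

/* Read a big-endian 32-bit value. */
static uint32_t rd32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Parse a GRE header in buf[0..len). On success returns true and sets
 * *hdr_len; *key is written only when the key flag is set (*has_key).
 */
static bool gre_parse(const uint8_t *buf, size_t len,
		      size_t *hdr_len, bool *has_key, uint32_t *key)
{
	size_t off = 4;                 /* flags/version + protocol */
	unsigned int flags;

	assert(buf && hdr_len && has_key && key);
	if (len < 4)
		return false;
	flags = ((unsigned int)buf[0] << 8) | buf[1];

	/* Only version zero without routing is handled. */
	if (flags & (GRE_F_VERSION | GRE_F_ROUTING))
		return false;

	if (flags & GRE_F_CSUM)
		off += 4;               /* checksum + reserved */
	*has_key = false;
	if (flags & GRE_F_KEY) {
		if (len < off + 4)
			return false;
		*key = rd32(buf + off); /* key converted to host order */
		*has_key = true;
		off += 4;
	}
	if (flags & GRE_F_SEQ)
		off += 4;
	if (len < off)
		return false;
	*hdr_len = off;
	return true;
}
```

The key has to be read before the offset is advanced past it, which is the same ordering the patch uses around its `nhoff += 4`.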


Re: linux-next: build failure after merge of most of the trees

2015-05-28 Thread David Miller

From: Stephen Rothwell s...@canb.auug.org.au
Date: Thu, 28 May 2015 22:06:07 +1000

 Ouch :-(

The only thing I will say on this matter is that the _only_ way this
problem will go away is if someone does the work necessary to get rid
of that implicit vmalloc.h include that happens on all x86 platform
builds.

So if you want this to stop happening, work on that.

I've applied the following to net-next, thanks for your report.


[PATCH] treewide: Add missing vmalloc.h inclusion.

All of these files were only building on non-x86 because of
the indirect inclusion of vmalloc.h by, of all things,
net/inet_hashtables.h

None of this got caught during build testing, because on x86
there is an implicit vmalloc.h include via one of the arch asm/
headers.

This fixes all of these.

Reported-by: Stephen Rothwell s...@canb.auug.org.au
Signed-off-by: David S. Miller da...@davemloft.net
---
 crypto/algif_skcipher.c| 1 +
 drivers/scsi/qla2xxx/tcm_qla2xxx.c | 1 +
 drivers/target/iscsi/iscsi_target.c| 1 +
 drivers/target/target_core_file.c  | 1 +
 drivers/target/target_core_pr.c| 1 +
 drivers/target/target_core_transport.c | 1 +
 drivers/target/target_core_user.c  | 1 +
 drivers/vhost/scsi.c   | 1 +
 8 files changed, 8 insertions(+)

diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 37110fd..4d1c315 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -444,6 +444,7 @@ static int skcipher_recvmsg(struct kiocb *unused, struct socket *sock,
err = skcipher_wait_for_data(sk, flags);
if (err)
goto unlock;
+   used = ctx->used;
}
 
used = min_t(unsigned long, used, iov_iter_count(&msg->msg_iter));
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c 
b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 73f9fee..54c986a 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -27,6 +27,7 @@
 #include <linux/moduleparam.h>
 #include <generated/utsrelease.h>
 #include <linux/utsname.h>
+#include <linux/vmalloc.h>
 #include <linux/init.h>
 #include <linux/list.h>
 #include <linux/slab.h>
diff --git a/drivers/target/iscsi/iscsi_target.c 
b/drivers/target/iscsi/iscsi_target.c
index aebde32..f2ce95c 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -21,6 +21,7 @@
 #include <linux/crypto.h>
 #include <linux/completion.h>
 #include <linux/module.h>
+#include <linux/vmalloc.h>
 #include <linux/idr.h>
 #include <asm/unaligned.h>
 #include <scsi/scsi_device.h>
diff --git a/drivers/target/target_core_file.c 
b/drivers/target/target_core_file.c
index d836de2..5f8b119 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -30,6 +30,7 @@
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/module.h>
+#include <linux/vmalloc.h>
 #include <linux/falloc.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_host.h>
diff --git a/drivers/target/target_core_pr.c b/drivers/target/target_core_pr.c
index 283cf78..06cd53e 100644
--- a/drivers/target/target_core_pr.c
+++ b/drivers/target/target_core_pr.c
@@ -27,6 +27,7 @@
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/list.h>
+#include <linux/vmalloc.h>
 #include <linux/file.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
diff --git a/drivers/target/target_core_transport.c 
b/drivers/target/target_core_transport.c
index 0adc0f6..c99d2ea 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -34,6 +34,7 @@
 #include <linux/cdrom.h>
 #include <linux/module.h>
 #include <linux/ratelimit.h>
+#include <linux/vmalloc.h>
 #include <asm/unaligned.h>
 #include <net/sock.h>
 #include <net/tcp.h>
diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 1a1bcf7..ca43a10 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -21,6 +21,7 @@
 #include <linux/idr.h>
 #include <linux/timer.h>
 #include <linux/parser.h>
+#include <linux/vmalloc.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_host.h>
 #include <linux/uio_driver.h>
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index dc78d87..16b45ca 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -35,6 +35,7 @@
 #include <linux/compat.h>
 #include <linux/eventfd.h>
 #include <linux/fs.h>
+#include <linux/vmalloc.h>
 #include <linux/miscdevice.h>
 #include <asm/unaligned.h>
 #include <scsi/scsi.h>
-- 
2.1.0


Re: linux-next: build failure after merge of most of the trees

2015-05-28 Thread Joe Perches

On Thu, 2015-05-28 at 11:42 -0700, David Miller wrote:
 I've applied the following to net-next, thanks for your report.
 
 
 [PATCH] treewide: Add missing vmalloc.h inclusion.
 
 All of these files were only building on non-x86 because of
 the indirect inclusion of vmalloc.h by, of all things,
 net/inet_hashtables.h
[]
 diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
[]
 @@ -444,6 +444,7 @@ static int skcipher_recvmsg(struct kiocb *unused, struct 
 socket *sock,
   err = skcipher_wait_for_data(sk, flags);
   if (err)
   goto unlock;
 + used = ctx->used;

huh?




Re: [PATCH net-next] bpf: allow BPF programs access skb->skb_iif and skb->dev->ifindex fields

2015-05-28 Thread Daniel Borkmann


On 05/28/2015 12:30 AM, Alexei Starovoitov wrote:

classic BPF already exposes skb->dev->ifindex via SKF_AD_IFINDEX extension.
Allow eBPF program to access it as well. Note that classic aborts execution
of the program if 'skb->dev == NULL' (which is inconvenient for program
writers), whereas eBPF returns zero in such case.


That's better, yep.


Also expose the 'skb_iif' field, since programs triggered by redirected
packets need to know the original interface index.
Summary:
__skb->ifindex -> skb->dev->ifindex
__skb->ingress_ifindex -> skb->skb_iif

Signed-off-by: Alexei Starovoitov a...@plumgrid.com


Acked-by: Daniel Borkmann dan...@iogearbox.net
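
The behavioural difference described in the quoted commit message can be sketched in plain userspace C. This is only an illustration, not kernel or verifier code: `struct fake_dev`, `struct fake_skb` and the three helper functions are hypothetical stand-ins, and the "abort" of classic BPF is modelled as a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net_device and struct sk_buff. */
struct fake_dev { int ifindex; };
struct fake_skb { struct fake_dev *dev; int skb_iif; };

/* Classic BPF style: a NULL dev aborts the program (modelled by *aborted). */
static int load_ifindex_classic(const struct fake_skb *skb, bool *aborted)
{
	*aborted = (skb->dev == NULL);
	return *aborted ? 0 : skb->dev->ifindex;
}

/* eBPF style as described in the message: a NULL dev reads as zero. */
static int load_ifindex_ebpf(const struct fake_skb *skb)
{
	return skb->dev ? skb->dev->ifindex : 0;
}

/* The ingress index comes straight from skb_iif and needs no NULL check. */
static int load_ingress_ifindex(const struct fake_skb *skb)
{
	return skb->skb_iif;
}
```

With the eBPF behaviour a program can read the field unconditionally and treat zero as "no device", rather than being terminated.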

[PATCH net-next V5 10/11] net/mlx5: Ethernet resource handling files

2015-05-28 Thread Amir Vadai

This patch contains the resource handling files:
- flow_table.c: This file contains the code to handle the low level API
  to configure the hardware flow table. It is separate from
  en_flow_table.c because it will also be used in the
  future by Raw Ethernet QP in mlx5_ib.
- en_flow_table.[ch]: Ethernet flow steering handling. The flow table
  object contains a mapping between flow specs and TIRs.
  This mechanism will also be used to configure the e-switch
  in the future, when SR-IOV support is added.
- transobj.[ch]: Low level functions to create/modify/destroy the
  transport objects: RQ/SQ/TIR/TIS.
- vport.[ch]: Handles attributes of a virtual port (vPort) in the
  embedded switch. Currently this switch is a passthrough, until SR-IOV
  support is added.
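
The Ethernet address bookkeeping in en_flow_table.c (see mlx5e_hash_eth_addr() in the diff below) buckets addresses by their last byte. A minimal userspace sketch of that idea follows; the names `hash_eth_addr`, `same_eth_addr`, `SKETCH_ETH_ALEN` and `SKETCH_HASH_BUCKETS` are hypothetical, and the bucket count of 256 is an assumption derived from the index being a single byte.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN      6    /* bytes in an Ethernet (MAC) address */
#define SKETCH_HASH_BUCKETS  256  /* assumed: one bucket per last-byte value */

/* Bucket index is simply the last byte of the address. */
static int hash_eth_addr(const uint8_t *addr)
{
	return addr[SKETCH_ETH_ALEN - 1];
}

/* Addresses sharing a bucket still need a full compare. */
static int same_eth_addr(const uint8_t *a, const uint8_t *b)
{
	return memcmp(a, b, SKETCH_ETH_ALEN) == 0;
}
```

Two addresses that differ only in their first bytes land in the same bucket, which is why the lookup in the patch walks the bucket and compares the full address.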

Signed-off-by: Amir Vadai am...@mellanox.com
---
 .../ethernet/mellanox/mlx5/core/en_flow_table.c| 858 +
 .../net/ethernet/mellanox/mlx5/core/flow_table.c   | 422 ++
 drivers/net/ethernet/mellanox/mlx5/core/transobj.c | 169 
 drivers/net/ethernet/mellanox/mlx5/core/transobj.h |  47 ++
 drivers/net/ethernet/mellanox/mlx5/core/vport.c|  84 ++
 drivers/net/ethernet/mellanox/mlx5/core/vport.h|  41 +
 include/linux/mlx5/flow_table.h|  54 ++
 7 files changed, 1675 insertions(+)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/flow_table.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/transobj.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/transobj.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/vport.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/vport.h
 create mode 100644 include/linux/mlx5/flow_table.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
new file mode 100644
index 000..6feebda
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
@@ -0,0 +1,858 @@
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/list.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/tcp.h>
+#include <linux/mlx5/flow_table.h>
+#include "en.h"
+
+enum {
+   MLX5E_FULLMATCH = 0,
+   MLX5E_ALLMULTI  = 1,
+   MLX5E_PROMISC   = 2,
+};
+
+enum {
+   MLX5E_UC= 0,
+   MLX5E_MC_IPV4   = 1,
+   MLX5E_MC_IPV6   = 2,
+   MLX5E_MC_OTHER  = 3,
+};
+
+enum {
+   MLX5E_ACTION_NONE = 0,
+   MLX5E_ACTION_ADD  = 1,
+   MLX5E_ACTION_DEL  = 2,
+};
+
+struct mlx5e_eth_addr_hash_node {
+   struct hlist_node  hlist;
+   u8 action;
+   struct mlx5e_eth_addr_info ai;
+};
+
+static inline int mlx5e_hash_eth_addr(u8 *addr)
+{
+   return addr[5];
+}
+
+static void mlx5e_add_eth_addr_to_hash(struct hlist_head *hash, u8 *addr)
+{
+   struct mlx5e_eth_addr_hash_node *hn;
+   int ix = mlx5e_hash_eth_addr(addr);
+   int found = 0;
+
+   hlist_for_each_entry(hn, &hash[ix], hlist)
+   if (ether_addr_equal_64bits(hn->ai.addr, addr)) {
+   found = 1;
+   break;
+   }
+
+   if (found) {
+   hn->action = MLX5E_ACTION_NONE;
+   return;
+   }
+
+   hn = kzalloc(sizeof(*hn), GFP_ATOMIC);
+   if (!hn)
+   return;
+
+   ether_addr_copy(hn->ai.addr, addr);
+

[PATCH net-next V5 05/11] net/mlx5_core: Implement access functions of ptys register fields

2015-05-28 Thread Amir Vadai

From: Saeed Mahameed sae...@mellanox.com

Those registers will be used by the ethtool to set/get settings.

Signed-off-by: Rana Shahout ra...@mellanox.com
Signed-off-by: Saeed Mahameed sae...@mellanox.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 77 ++
 include/linux/mlx5/driver.h| 14 +
 2 files changed, 91 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c 
b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 49e90f2..6e2d99c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -102,3 +102,80 @@ int mlx5_set_port_caps(struct mlx5_core_dev *dev, u8 
port_num, u32 caps)
return err;
 }
 EXPORT_SYMBOL_GPL(mlx5_set_port_caps);
+
+int mlx5_query_port_ptys(struct mlx5_core_dev *dev, u32 *ptys,
+int ptys_size, int proto_mask)
+{
+   u32 in[MLX5_ST_SZ_DW(ptys_reg)];
+   int err;
+
+   memset(in, 0, sizeof(in));
+   MLX5_SET(ptys_reg, in, local_port, 1);
+   MLX5_SET(ptys_reg, in, proto_mask, proto_mask);
+
+   err = mlx5_core_access_reg(dev, in, sizeof(in), ptys,
+  ptys_size, MLX5_REG_PTYS, 0, 0);
+
+   return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_ptys);
+
+int mlx5_query_port_proto_cap(struct mlx5_core_dev *dev,
+ u32 *proto_cap, int proto_mask)
+{
+   u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+   int err;
+
+   err = mlx5_query_port_ptys(dev, out, sizeof(out), proto_mask);
+   if (err)
+   return err;
+
+   if (proto_mask == MLX5_PTYS_EN)
+   *proto_cap = MLX5_GET(ptys_reg, out, eth_proto_capability);
+   else
+   *proto_cap = MLX5_GET(ptys_reg, out, ib_proto_capability);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_proto_cap);
+
+int mlx5_query_port_proto_admin(struct mlx5_core_dev *dev,
+   u32 *proto_admin, int proto_mask)
+{
+   u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+   int err;
+
+   err = mlx5_query_port_ptys(dev, out, sizeof(out), proto_mask);
+   if (err)
+   return err;
+
+   if (proto_mask == MLX5_PTYS_EN)
+   *proto_admin = MLX5_GET(ptys_reg, out, eth_proto_admin);
+   else
+   *proto_admin = MLX5_GET(ptys_reg, out, ib_proto_admin);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_proto_admin);
+
+int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 proto_admin,
+   int proto_mask)
+{
+   u32 in[MLX5_ST_SZ_DW(ptys_reg)];
+   u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+   int err;
+
+   memset(in, 0, sizeof(in));
+
+   MLX5_SET(ptys_reg, in, local_port, 1);
+   MLX5_SET(ptys_reg, in, proto_mask, proto_mask);
+   if (proto_mask == MLX5_PTYS_EN)
+   MLX5_SET(ptys_reg, in, eth_proto_admin, proto_admin);
+   else
+   MLX5_SET(ptys_reg, in, ib_proto_admin, proto_admin);
+
+   err = mlx5_core_access_reg(dev, in, sizeof(in), out,
+  sizeof(out), MLX5_REG_PTYS, 0, 1);
+   return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_set_port_proto);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 6b91991..266d549 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -504,6 +504,11 @@ enum {
MLX5_COMP_EQ_SIZE = 1024,
 };
 
+enum {
+   MLX5_PTYS_IB = 1 << 0,
+   MLX5_PTYS_EN = 1 << 2,
+};
+
 struct mlx5_db_pgdir {
struct list_headlist;
DECLARE_BITMAP(bitmap, MLX5_DB_PER_PAGE);
@@ -686,7 +691,16 @@ void mlx5_qp_debugfs_cleanup(struct mlx5_core_dev *dev);
 int mlx5_core_access_reg(struct mlx5_core_dev *dev, void *data_in,
 int size_in, void *data_out, int size_out,
 u16 reg_num, int arg, int write);
+
 int mlx5_set_port_caps(struct mlx5_core_dev *dev, u8 port_num, u32 caps);
+int mlx5_query_port_ptys(struct mlx5_core_dev *dev, u32 *ptys,
+int ptys_size, int proto_mask);
+int mlx5_query_port_proto_cap(struct mlx5_core_dev *dev,
+ u32 *proto_cap, int proto_mask);
+int mlx5_query_port_proto_admin(struct mlx5_core_dev *dev,
+   u32 *proto_admin, int proto_mask);
+int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 proto_admin,
+   int proto_mask);
 
 int mlx5_debug_eq_add(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
 void mlx5_debug_eq_remove(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
-- 
1.9.3
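
The accessors in this patch all follow one pattern: the protocol mask selects whether the Ethernet or the InfiniBand field of the PTYS register is used. A small userspace sketch of that selection, under stated assumptions: `struct fake_ptys_reg` is a hypothetical flat struct standing in for the real register layout (which the driver accesses through MLX5_GET/MLX5_SET), and the mask values are read from the enum in the patch as `1 << 0` and `1 << 2`.

```c
#include <stdint.h>

/* Mask values as read from the patch's enum (assumed 1 << 0 and 1 << 2). */
enum { SKETCH_PTYS_IB = 1 << 0, SKETCH_PTYS_EN = 1 << 2 };

/* Hypothetical flat view of the two capability fields. */
struct fake_ptys_reg {
	uint32_t eth_proto_capability;
	uint32_t ib_proto_capability;
};

/* Mirror of the if/else in mlx5_query_port_proto_cap(): EN picks eth, else ib. */
static uint32_t pick_proto_cap(const struct fake_ptys_reg *reg, int proto_mask)
{
	if (proto_mask == SKETCH_PTYS_EN)
		return reg->eth_proto_capability;
	return reg->ib_proto_capability;
}
```

Note that, as in the patch, any mask other than the Ethernet one falls through to the InfiniBand field.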


[PATCH net-next V5 01/11] net/mlx5_core,mlx5_ib: Do not use vmap() on coherent memory

2015-05-28 Thread Amir Vadai

As David Daney pointed out for the mlx4_core driver [1], mlx5_core is also
misusing the DMA-API.

This patch removes the code that calls vmap() on memory allocated by
dma_alloc_coherent().

After this patch, users of this driver might fail to allocate resources
on systems with fragmented memory.  This will be fixed later on.

[1] - https://patchwork.ozlabs.org/patch/458531/

CC: David Daney david.da...@cavium.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/infiniband/hw/mlx5/cq.c |  3 +-
 drivers/infiniband/hw/mlx5/qp.c |  2 +-
 drivers/infiniband/hw/mlx5/srq.c|  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/alloc.c | 96 +
 drivers/net/ethernet/mellanox/mlx5/core/eq.c|  3 +-
 include/linux/mlx5/driver.h |  9 +--
 6 files changed, 22 insertions(+), 93 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 2ee6b10..4e88b18 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -590,8 +590,7 @@ static int alloc_cq_buf(struct mlx5_ib_dev *dev, struct 
mlx5_ib_cq_buf *buf,
 {
int err;
 
-   err = mlx5_buf_alloc(dev->mdev, nent * cqe_size,
-PAGE_SIZE * 2, &buf->buf);
+   err = mlx5_buf_alloc(dev->mdev, nent * cqe_size, &buf->buf);
if (err)
return err;
 
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index d35f62d..426eb88 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -768,7 +768,7 @@ static int create_kernel_qp(struct mlx5_ib_dev *dev,
qp->sq.offset = qp->rq.wqe_cnt << qp->rq.wqe_shift;
qp->buf_size = err + (qp->rq.wqe_cnt << qp->rq.wqe_shift);
 
-   err = mlx5_buf_alloc(dev->mdev, qp->buf_size, PAGE_SIZE * 2, &qp->buf);
+   err = mlx5_buf_alloc(dev->mdev, qp->buf_size, &qp->buf);
if (err) {
mlx5_ib_dbg(dev, "err %d\n", err);
goto err_uuar;
diff --git a/drivers/infiniband/hw/mlx5/srq.c b/drivers/infiniband/hw/mlx5/srq.c
index 02d77a2..4242e1d 100644
--- a/drivers/infiniband/hw/mlx5/srq.c
+++ b/drivers/infiniband/hw/mlx5/srq.c
@@ -165,7 +165,7 @@ static int create_srq_kernel(struct mlx5_ib_dev *dev, 
struct mlx5_ib_srq *srq,
return err;
}
 
-   if (mlx5_buf_alloc(dev->mdev, buf_size, PAGE_SIZE * 2, &srq->buf)) {
+   if (mlx5_buf_alloc(dev->mdev, buf_size, &srq->buf)) {
mlx5_ib_dbg(dev, "buf alloc failed\n");
err = -ENOMEM;
goto err_db;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c 
b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
index ac0f7bf..0715b49 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c
@@ -42,95 +42,36 @@
 #include "mlx5_core.h"
 
 /* Handling for queue buffers -- we allocate a bunch of memory and
- * register it in a memory region at HCA virtual address 0.  If the
- * requested size is > max_direct, we split the allocation into
- * multiple pages, so we don't require too much contiguous memory.
+ * register it in a memory region at HCA virtual address 0.
  */
 
-int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, int max_direct,
-  struct mlx5_buf *buf)
+int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, struct mlx5_buf *buf)
 {
dma_addr_t t;
 
buf->size = size;
-   if (size <= max_direct) {
-   buf->nbufs= 1;
-   buf->npages   = 1;
-   buf->page_shift   = (u8)get_order(size) + PAGE_SHIFT;
-   buf->direct.buf   = dma_zalloc_coherent(&dev->pdev->dev,
-   size, &t, GFP_KERNEL);
-   if (!buf->direct.buf)
-   return -ENOMEM;
-
-   buf->direct.map = t;
-
-   while (t & ((1 << buf->page_shift) - 1)) {
-   --buf->page_shift;
-   buf->npages *= 2;
-   }
-   } else {
-   int i;
-
-   buf->direct.buf  = NULL;
-   buf->nbufs   = (size + PAGE_SIZE - 1) / PAGE_SIZE;
-   buf->npages  = buf->nbufs;
-   buf->page_shift  = PAGE_SHIFT;
-   buf->page_list   = kcalloc(buf->nbufs, sizeof(*buf->page_list),
-  GFP_KERNEL);
-   if (!buf->page_list)
-   return -ENOMEM;
-
-   for (i = 0; i < buf->nbufs; i++) {
-   buf->page_list[i].buf =
-   dma_zalloc_coherent(&dev->pdev->dev, PAGE_SIZE,
-   &t, GFP_KERNEL);
-   if (!buf->page_list[i].buf)
-   goto err_free;
-
-   buf->page_list[i].map = t;
-   }
-
-   if (BITS_PER_LONG == 64) {
-   struct

[PATCH net-next V5 07/11] net/mlx5_core: Modify CQ moderation parameters

2015-05-28 Thread Amir Vadai

From: Rana Shahout ra...@mellanox.com

Introduce mlx5_core_modify_cq_moderation() to be used by the netdev, to
set hardware coalescing.

Signed-off-by: Rana Shahout ra...@mellanox.com
Signed-off-by: Saeed Mahameed sae...@mellanox.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx5/core/cq.c | 18 ++
 include/linux/mlx5/cq.h  |  3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cq.c 
b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
index eb0cf81..04ab7e4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
@@ -219,6 +219,24 @@ int mlx5_core_modify_cq(struct mlx5_core_dev *dev, struct 
mlx5_core_cq *cq,
 }
 EXPORT_SYMBOL(mlx5_core_modify_cq);
 
+int mlx5_core_modify_cq_moderation(struct mlx5_core_dev *dev,
+  struct mlx5_core_cq *cq,
+  u16 cq_period,
+  u16 cq_max_count)
+{
+   struct mlx5_modify_cq_mbox_in in;
+
+   memset(&in, 0, sizeof(in));
+
+   in.cqn  = cpu_to_be32(cq->cqn);
+   in.ctx.cq_period= cpu_to_be16(cq_period);
+   in.ctx.cq_max_count = cpu_to_be16(cq_max_count);
+   in.field_select = cpu_to_be32(MLX5_CQ_MODIFY_PERIOD |
+ MLX5_CQ_MODIFY_COUNT);
+
+   return mlx5_core_modify_cq(dev, cq, &in, sizeof(in));
+}
+
 int mlx5_init_cq_table(struct mlx5_core_dev *dev)
 {
struct mlx5_cq_table *table = &dev->priv.cq_table;
diff --git a/include/linux/mlx5/cq.h b/include/linux/mlx5/cq.h
index 2695ced..abc4767 100644
--- a/include/linux/mlx5/cq.h
+++ b/include/linux/mlx5/cq.h
@@ -169,6 +169,9 @@ int mlx5_core_query_cq(struct mlx5_core_dev *dev, struct 
mlx5_core_cq *cq,
   struct mlx5_query_cq_mbox_out *out);
 int mlx5_core_modify_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
struct mlx5_modify_cq_mbox_in *in, int in_sz);
+int mlx5_core_modify_cq_moderation(struct mlx5_core_dev *dev,
+  struct mlx5_core_cq *cq, u16 cq_period,
+  u16 cq_max_count);
 int mlx5_debug_cq_add(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq);
 void mlx5_debug_cq_remove(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq);
 
-- 
1.9.3
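
mlx5_core_modify_cq_moderation() fills a zeroed mailbox with big-endian values and a field-select mask naming the two fields being changed. The following is a userspace sketch of that pattern only: `struct sketch_modify_cq` is a hypothetical, simplified layout (not the real mlx5_modify_cq_mbox_in), the `SKETCH_MODIFY_*` bit values are placeholders, and POSIX htonl()/htons() stand in for cpu_to_be32()/cpu_to_be16().

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Placeholder select bits; the real values live in the mlx5 headers. */
enum { SKETCH_MODIFY_PERIOD = 1 << 0, SKETCH_MODIFY_COUNT = 1 << 1 };

/* Hypothetical, simplified mailbox: all fields are stored big-endian. */
struct sketch_modify_cq {
	uint32_t cqn;
	uint32_t field_select;
	uint16_t cq_period;
	uint16_t cq_max_count;
};

static void fill_moderation(struct sketch_modify_cq *in, uint32_t cqn,
			    uint16_t period, uint16_t max_count)
{
	memset(in, 0, sizeof(*in));	/* unselected fields stay zero */
	in->cqn = htonl(cqn);
	in->cq_period = htons(period);
	in->cq_max_count = htons(max_count);
	in->field_select = htonl(SKETCH_MODIFY_PERIOD | SKETCH_MODIFY_COUNT);
}
```

Zeroing the structure first matters because only the fields named in the select mask are meant to carry new values.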


[PATCH net-next V5 04/11] net/mlx5_core: New device capabilities handling

2015-05-28 Thread Amir Vadai

From: Saeed Mahameed sae...@mellanox.com

- Query all supported types of dev caps on driver load.
- Store the Cap data outbox per cap type into driver private data.
- Introduce new Macros to access/dump stored caps (using the auto
  generated data types).
- Obsolete SW representation of dev caps (no need for SW copy for each
  cap).
- Modify IB driver to use new macros for checking caps.
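
As an illustration of the last point, the CQ size checks in the diff below stop comparing against a stored max_cqes value and instead derive the limit from a log2 capability field. A minimal userspace sketch of that comparison, assuming a capability value below 32; `roundup_pow2` and `cq_entries_ok` are hypothetical names, not driver functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Smallest power of two >= v (for v >= 1); 64-bit to avoid overflow. */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Mirrors the create-CQ check: round entries + 1 up to a power of two and
 * reject it if it exceeds 1 << log_max_cq_sz (log_max_cq_sz assumed < 32).
 */
static bool cq_entries_ok(uint32_t entries, unsigned int log_max_cq_sz)
{
	uint64_t rounded = roundup_pow2((uint64_t)entries + 1);

	return rounded <= (UINT64_C(1) << log_max_cq_sz);
}
```

For example, with a log size of 8 a request for 255 entries rounds to exactly 256 and passes, while 256 entries rounds to 512 and is rejected.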

Signed-off-by: Saeed Mahameed sae...@mellanox.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/infiniband/hw/mlx5/cq.c|   8 +-
 drivers/infiniband/hw/mlx5/mad.c   |   2 +-
 drivers/infiniband/hw/mlx5/main.c  | 113 ---
 drivers/infiniband/hw/mlx5/mlx5_ib.h   |   6 +-
 drivers/infiniband/hw/mlx5/mr.c|   3 +-
 drivers/infiniband/hw/mlx5/odp.c   |  47 +++
 drivers/infiniband/hw/mlx5/qp.c|  84 +--
 drivers/infiniband/hw/mlx5/srq.c   |   7 +-
 drivers/net/ethernet/mellanox/mlx5/core/eq.c   |   4 +-
 drivers/net/ethernet/mellanox/mlx5/core/fw.c   |  90 +++-
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 154 +++--
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h|  10 +-
 drivers/net/ethernet/mellanox/mlx5/core/uar.c  |   7 +-
 include/linux/mlx5/device.h|  66 -
 include/linux/mlx5/driver.h|  58 +---
 15 files changed, 310 insertions(+), 349 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 4e88b18..e2bea9a 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -753,7 +753,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev, 
int entries,
return ERR_PTR(-EINVAL);
 
entries = roundup_pow_of_two(entries + 1);
-   if (entries > dev->mdev->caps.gen.max_cqes)
+   if (entries > (1 << MLX5_CAP_GEN(dev->mdev, log_max_cq_sz)))
return ERR_PTR(-EINVAL);
 
cq = kzalloc(sizeof(*cq), GFP_KERNEL);
@@ -920,7 +920,7 @@ int mlx5_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 
cq_period)
int err;
u32 fsel;
 
-   if (!(dev->mdev->caps.gen.flags & MLX5_DEV_CAP_FLAG_CQ_MODER))
+   if (!MLX5_CAP_GEN(dev->mdev, cq_moderation))
return -ENOSYS;
 
in = kzalloc(sizeof(*in), GFP_KERNEL);
@@ -1075,7 +1075,7 @@ int mlx5_ib_resize_cq(struct ib_cq *ibcq, int entries, 
struct ib_udata *udata)
int uninitialized_var(cqe_size);
unsigned long flags;
 
-   if (!(dev->mdev->caps.gen.flags & MLX5_DEV_CAP_FLAG_RESIZE_CQ)) {
+   if (!MLX5_CAP_GEN(dev->mdev, cq_resize)) {
pr_info("Firmware does not support resize CQ\n");
return -ENOSYS;
}
@@ -1084,7 +1084,7 @@ int mlx5_ib_resize_cq(struct ib_cq *ibcq, int entries, 
struct ib_udata *udata)
return -EINVAL;
 
entries = roundup_pow_of_two(entries + 1);
-   if (entries > dev->mdev->caps.gen.max_cqes + 1)
+   if (entries > (1 << MLX5_CAP_GEN(dev->mdev, log_max_cq_sz)) + 1)
return -EINVAL;
 
if (entries == ibcq->cqe + 1)
diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c
index 9cf9a37..f2d9e70 100644
--- a/drivers/infiniband/hw/mlx5/mad.c
+++ b/drivers/infiniband/hw/mlx5/mad.c
@@ -129,7 +129,7 @@ int mlx5_query_ext_port_caps(struct mlx5_ib_dev *dev, u8 
port)
 
packet_error = be16_to_cpu(out_mad->status);
 
-   dev->mdev->caps.gen.ext_port_cap[port - 1] = (!err && !packet_error) ?
+   dev->mdev->port_caps[port - 1].ext_port_cap = (!err && !packet_error) ?
MLX_EXT_PORT_CAP_FLAG_EXTENDED_PORT_INFO : 0;
 
 out:
diff --git a/drivers/infiniband/hw/mlx5/main.c 
b/drivers/infiniband/hw/mlx5/main.c
index 57c9809..9075649 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -66,15 +66,13 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
struct ib_device_attr *props)
 {
struct mlx5_ib_dev *dev = to_mdev(ibdev);
+   struct mlx5_core_dev *mdev = dev->mdev;
struct ib_smp *in_mad  = NULL;
struct ib_smp *out_mad = NULL;
-   struct mlx5_general_caps *gen;
int err = -ENOMEM;
int max_rq_sg;
int max_sq_sg;
-   u64 flags;
 
-   gen = &dev->mdev->caps.gen;
in_mad  = kzalloc(sizeof(*in_mad), GFP_KERNEL);
out_mad = kmalloc(sizeof(*out_mad), GFP_KERNEL);
if (!in_mad || !out_mad)
@@ -96,18 +94,18 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
IB_DEVICE_PORT_ACTIVE_EVENT |
IB_DEVICE_SYS_IMAGE_GUID|
IB_DEVICE_RC_RNR_NAK_GEN;
-   flags = gen->flags;
-   if (flags & MLX5_DEV_CAP_FLAG_BAD_PKEY_CNTR)
+
+   if (MLX5_CAP_GEN(mdev, pkv))
props->device_cap_flags |=

[PATCH net-next V5 11/11] net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality

2015-05-28 Thread Amir Vadai

This is the Ethernet part of the driver for the Mellanox ConnectX(R)-4
Single/Dual-Port Adapter supporting 100Gb/s with VPI.  The driver
extends the existing mlx5 driver with Ethernet functionality.

This patch contains the driver entry points but does not include the
transmit and receive routines (see the previous patch in the series).

It also adds the option MLX5_CORE_EN to Kconfig to enable/disable the
Ethernet functionality. Currently, Kconfig is programmed to make
Ethernet and Infiniband functionality mutually exclusive.
MLX5_INFINIBAND is also changed to depend on MLX5_CORE instead of
selecting it, since MLX5_CORE could be selected without MLX5_INFINIBAND
being selected.

Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/infiniband/hw/mlx5/Kconfig |4 +-
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig|   14 +-
 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |3 +
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c  |   19 -
 drivers/net/ethernet/mellanox/mlx5/core/en.h   |  520 ++
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |  679 +++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 1899 
 drivers/net/ethernet/mellanox/mlx5/core/main.c |   74 +-
 .../net/ethernet/mellanox/mlx5/core/mlx5_core.h|9 +-
 include/linux/mlx5/device.h|   19 +
 include/linux/mlx5/driver.h|1 +
 11 files changed, 3213 insertions(+), 28 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_main.c

diff --git a/drivers/infiniband/hw/mlx5/Kconfig 
b/drivers/infiniband/hw/mlx5/Kconfig
index 10df386..bce263b 100644
--- a/drivers/infiniband/hw/mlx5/Kconfig
+++ b/drivers/infiniband/hw/mlx5/Kconfig
@@ -1,8 +1,6 @@
 config MLX5_INFINIBAND
tristate "Mellanox Connect-IB HCA support"
-   depends on NETDEVICES && ETHERNET && PCI
-   select NET_VENDOR_MELLANOX
-   select MLX5_CORE
+   depends on NETDEVICES && ETHERNET && PCI && MLX5_CORE
---help---
  This driver provides low-level InfiniBand support for
  Mellanox Connect-IB PCI Express host channel adapters (HCAs).
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig 
b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index 8ff57e8..0d7aef0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -3,6 +3,18 @@
 #
 
 config MLX5_CORE
-   tristate
+   tristate "Mellanox Technologies ConnectX-4 and Connect-IB core driver"
depends on PCI
default n
+   ---help---
+ Core driver for low level functionality of the ConnectX-4 and
+ Connect-IB cards by Mellanox Technologies.
+
+config MLX5_CORE_EN
+   bool "Mellanox Technologies ConnectX-4 Ethernet support"
+   depends on MLX5_INFINIBAND=n && NETDEVICES && ETHERNET && PCI && MLX5_CORE
+   default n
+   ---help---
+ Ethernet support in Mellanox Technologies ConnectX-4 NIC.
+ Ethernet and Infiniband support in ConnectX-4 are currently mutually
+ exclusive.
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile 
b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index 105780b..87e9e60 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -3,3 +3,6 @@ obj-$(CONFIG_MLX5_CORE) += mlx5_core.o
 mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
health.o mcg.o cq.o srq.o alloc.o qp.o port.o mr.o pd.o   \
mad.o
+mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o flow_table.o vport.o transobj.o \
+   en_main.o en_flow_table.o en_ethtool.o en_tx.o en_rx.o \
+   en_txrx.o
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c 
b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 2f22cd2..75ff58d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -75,25 +75,6 @@ enum {
MLX5_CMD_DELIVERY_STAT_CMD_DESCR_ERR= 0x10,
 };
 
-enum {
-   MLX5_CMD_STAT_OK= 0x0,
-   MLX5_CMD_STAT_INT_ERR   = 0x1,
-   MLX5_CMD_STAT_BAD_OP_ERR= 0x2,
-   MLX5_CMD_STAT_BAD_PARAM_ERR = 0x3,
-   MLX5_CMD_STAT_BAD_SYS_STATE_ERR = 0x4,
-   MLX5_CMD_STAT_BAD_RES_ERR   = 0x5,
-   MLX5_CMD_STAT_RES_BUSY  = 0x6,
-   MLX5_CMD_STAT_LIM_ERR   = 0x8,
-   MLX5_CMD_STAT_BAD_RES_STATE_ERR = 0x9,
-   MLX5_CMD_STAT_IX_ERR= 0xa,
-   MLX5_CMD_STAT_NO_RES_ERR= 0xf,
-   MLX5_CMD_STAT_BAD_INP_LEN_ERR   = 0x50,
-   MLX5_CMD_STAT_BAD_OUTP_LEN_ERR  = 0x51,
-

[PATCH net-next V5 06/11] net/mlx5_core: Implement get/set port status

2015-05-28 Thread Amir Vadai

From: Rana Shahout ra...@mellanox.com

Implement low level get/set port status functions to be exposed by the
netdev.

Signed-off-by: Rana Shahout ra...@mellanox.com
Signed-off-by: Saeed Mahameed sae...@mellanox.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 32 ++
 include/linux/mlx5/driver.h|  8 +++
 2 files changed, 40 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c 
b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 6e2d99c..742a6fb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -179,3 +179,35 @@ int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 
proto_admin,
return err;
 }
 EXPORT_SYMBOL_GPL(mlx5_set_port_proto);
+
+int mlx5_set_port_status(struct mlx5_core_dev *dev,
+enum mlx5_port_status status)
+{
+   u32 in[MLX5_ST_SZ_DW(paos_reg)];
+   u32 out[MLX5_ST_SZ_DW(paos_reg)];
+
+   memset(in, 0, sizeof(in));
+
+   MLX5_SET(paos_reg, in, admin_status, status);
+   MLX5_SET(paos_reg, in, ase, 1);
+
+   return mlx5_core_access_reg(dev, in, sizeof(in), out,
+   sizeof(out), MLX5_REG_PAOS, 0, 1);
+}
+
+int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 *status)
+{
+   u32 in[MLX5_ST_SZ_DW(paos_reg)];
+   u32 out[MLX5_ST_SZ_DW(paos_reg)];
+   int err;
+
+   memset(in, 0, sizeof(in));
+
+   err = mlx5_core_access_reg(dev, in, sizeof(in), out,
+  sizeof(out), MLX5_REG_PAOS, 0, 0);
+   if (err)
+   return err;
+
+   *status = MLX5_GET(paos_reg, out, oper_status);
+   return err;
+}
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 266d549..6438444 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -149,6 +149,11 @@ enum mlx5_dev_event {
MLX5_DEV_EVENT_CLIENT_REREG,
 };
 
+enum mlx5_port_status {
+   MLX5_PORT_UP= 1 << 1,
+   MLX5_PORT_DOWN  = 1 << 2,
+};
+
 struct mlx5_uuar_info {
struct mlx5_uar*uars;
int num_uars;
@@ -701,6 +706,9 @@ int mlx5_query_port_proto_admin(struct mlx5_core_dev *dev,
u32 *proto_admin, int proto_mask);
 int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 proto_admin,
int proto_mask);
+int mlx5_set_port_status(struct mlx5_core_dev *dev,
+enum mlx5_port_status status);
+int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 *status);
 
 int mlx5_debug_eq_add(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
 void mlx5_debug_eq_remove(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
-- 
1.9.3


[PATCH net-next V5 08/11] net/mlx5_core: Set/Query port MTU commands

2015-05-28 Thread Amir Vadai

From: Saeed Mahameed sae...@mellanox.com

Introduce low level set/query functions to access the MTU in hardware, to be
used by the netdev.

Signed-off-by: Saeed Mahameed sae...@mellanox.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 53 ++
 include/linux/mlx5/driver.h|  4 ++
 2 files changed, 57 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c 
b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 742a6fb..7d3d0f9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -211,3 +211,56 @@ int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 
*status)
*status = MLX5_GET(paos_reg, out, oper_status);
return err;
 }
+
+static int mlx5_query_port_mtu(struct mlx5_core_dev *dev,
+  int *admin_mtu, int *max_mtu, int *oper_mtu)
+{
+   u32 in[MLX5_ST_SZ_DW(pmtu_reg)];
+   u32 out[MLX5_ST_SZ_DW(pmtu_reg)];
+   int err;
+
+   memset(in, 0, sizeof(in));
+
+   MLX5_SET(pmtu_reg, in, local_port, 1);
+
+   err = mlx5_core_access_reg(dev, in, sizeof(in), out,
+  sizeof(out), MLX5_REG_PMTU, 0, 0);
+   if (err)
+   return err;
+
+   if (max_mtu)
+   *max_mtu  = MLX5_GET(pmtu_reg, out, max_mtu);
+   if (oper_mtu)
+   *oper_mtu = MLX5_GET(pmtu_reg, out, oper_mtu);
+   if (admin_mtu)
+   *admin_mtu = MLX5_GET(pmtu_reg, out, admin_mtu);
+
+   return 0;
+}
+
+int mlx5_set_port_mtu(struct mlx5_core_dev *dev, int mtu)
+{
+   u32 in[MLX5_ST_SZ_DW(pmtu_reg)];
+   u32 out[MLX5_ST_SZ_DW(pmtu_reg)];
+
+   memset(in, 0, sizeof(in));
+
+   MLX5_SET(pmtu_reg, in, admin_mtu, mtu);
+   MLX5_SET(pmtu_reg, in, local_port, 1);
+
+   return mlx5_core_access_reg(dev, in, sizeof(in), out, sizeof(out),
+   MLX5_REG_PMTU, 0, 1);
+}
+EXPORT_SYMBOL_GPL(mlx5_set_port_mtu);
+
+int mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, int *max_mtu)
+{
+   return mlx5_query_port_mtu(dev, NULL, max_mtu, NULL);
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_max_mtu);
+
+int mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, int *oper_mtu)
+{
+   return mlx5_query_port_mtu(dev, NULL, NULL, oper_mtu);
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_oper_mtu);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 6438444..5173847 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -710,6 +710,10 @@ int mlx5_set_port_status(struct mlx5_core_dev *dev,
 enum mlx5_port_status status);
 int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 *status);
 
+int mlx5_set_port_mtu(struct mlx5_core_dev *dev, int mtu);
+int mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, int *max_mtu);
+int mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, int *oper_mtu);
+
 int mlx5_debug_eq_add(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
 void mlx5_debug_eq_remove(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
 int mlx5_core_eq_query(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
-- 
1.9.3
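
The MTU helpers in this patch share one query routine whose output pointers are all optional; the exported wrappers pass NULL for the values they do not want. A userspace sketch of that pattern, where `struct sketch_pmtu_reg` is a hypothetical flat stand-in for the PMTU register and the function names are illustrative only:

```c
#include <stddef.h>

/* Hypothetical flat view of the three MTU fields in the register. */
struct sketch_pmtu_reg {
	int max_mtu;
	int admin_mtu;
	int oper_mtu;
};

/* Copy out only the values the caller asked for (non-NULL pointers). */
static int query_mtu(const struct sketch_pmtu_reg *reg,
		     int *admin_mtu, int *max_mtu, int *oper_mtu)
{
	if (max_mtu)
		*max_mtu = reg->max_mtu;
	if (oper_mtu)
		*oper_mtu = reg->oper_mtu;
	if (admin_mtu)
		*admin_mtu = reg->admin_mtu;
	return 0;
}

/* Thin wrappers, as in the patch: each requests a single value. */
static int query_max_mtu(const struct sketch_pmtu_reg *reg, int *max_mtu)
{
	return query_mtu(reg, NULL, max_mtu, NULL);
}

static int query_oper_mtu(const struct sketch_pmtu_reg *reg, int *oper_mtu)
{
	return query_mtu(reg, NULL, NULL, oper_mtu);
}
```

This keeps a single register access path while letting each exported helper expose just one value.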


[PATCH net-next V5 09/11] net/mlx5: Ethernet Datapath files

2015-05-28 Thread Amir Vadai

en_[rt]x.c contains the data path code specific to tx or rx.
en_txrx.c contains data path code that is common to both rx and
tx; this is mainly NAPI related code.

Below are the objects that are used by the hardware and the driver
in the data path:

Channel - one channel per IRQ. Every channel object contains:
  RQ  - describes the rx queue.
  TIR - One TIR (Transport Interface Receive) object per flow type. A TIR
        contains attributes for a type of rx flow (e.g. IPv4, IPv6).
        A flow is defined in the Flow Table.
        Currently a TIR describes the RSS hash parameters, if they exist,
        and the LRO attributes.
  SQ  - describes a tx queue. There is one SQ (Send Queue) per
        TC (traffic class).
  TIS - There is one TIS (Transport Interface Send) per TC. It
        describes the TC and may later be extended to describe more
        transport properties.

Both RQ and SQ inherit from the object WQ (work queue). The common code
that describes the layout of CQEs and WQEs in memory is in the files wq.[ch].

For every channel there is one NAPI context that is used for both RX and
TX.

The driver uses netdev_alloc_skb() to allocate skbs.

Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c   | 249 
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   | 344 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 107 +++
 drivers/net/ethernet/mellanox/mlx5/core/wq.c  | 183 
 drivers/net/ethernet/mellanox/mlx5/core/wq.h  | 171 +++
 5 files changed, 1054 insertions(+)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/wq.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/wq.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
new file mode 100644
index 000..ce1317c
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -0,0 +1,249 @@
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/tcp.h>
+#include "en.h"
+
+static inline int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq,
+				     struct mlx5e_rx_wqe *wqe, u16 ix)
+{
+	struct sk_buff *skb;
+	dma_addr_t dma_addr;
+
+	skb = netdev_alloc_skb(rq->netdev, rq->wqe_sz);
+	if (unlikely(!skb))
+		return -ENOMEM;
+
+	skb_reserve(skb, MLX5E_NET_IP_ALIGN);
+
+	dma_addr = dma_map_single(rq->pdev,
+				  /* hw start padding */
+				  skb->data - MLX5E_NET_IP_ALIGN,
+				  /* hw   end padding */
+				  rq->wqe_sz,
+				  DMA_FROM_DEVICE);
+
+	if (unlikely(dma_mapping_error(rq->pdev, dma_addr)))
+		goto err_free_skb;
+
+	*((dma_addr_t *)skb->cb) = dma_addr;
+	wqe->data.addr = cpu_to_be64(dma_addr + MLX5E_NET_IP_ALIGN);
+
+	rq->skb[ix] = skb;
+
+	return 0;
+
+err_free_skb:
+	dev_kfree_skb(skb);
+
+	return -ENOMEM;
+}
+
+bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
+{
+	struct mlx5_wq_ll *wq = &rq->wq;
+
+	if (unlikely(!test_bit(MLX5E_RQ_STATE_POST_WQES_ENABLE, &rq->state)))
+		return false;
+
+   while

Re: [PATCH net-next] openvswitch: include datapath actions with sampled-packet upcall to userspace

2015-05-28 Thread Jesse Gross

On Wed, May 27, 2015 at 10:57 PM, Pravin Shelar pshe...@nicira.com wrote:
 On Wed, May 27, 2015 at 9:16 PM, Jesse Gross je...@nicira.com wrote:
 On Wed, May 27, 2015 at 7:46 PM, Pravin Shelar pshe...@nicira.com wrote:
 On Wed, May 27, 2015 at 2:10 PM, Jesse Gross je...@nicira.com wrote:
 On Fri, May 22, 2015 at 10:53 AM, Pravin Shelar pshe...@nicira.com wrote:
 On Wed, May 20, 2015 at 12:32 PM, Neil McKee neil.mc...@inmon.com wrote:
 diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
 index b491c1c..ee5760d 100644
 --- a/net/openvswitch/actions.c
 +++ b/net/openvswitch/actions.c
 @@ -608,7 +608,8 @@ static void do_output(struct datapath *dp, struct 
 sk_buff *skb, int out_port)
  }

  static int output_userspace(struct datapath *dp, struct sk_buff *skb,
 -   struct sw_flow_key *key, const struct nlattr 
 *attr)
 +   struct sw_flow_key *key, const struct nlattr 
 *attr,
 +   const struct nlattr *actions, int 
 actions_len)
  {
 struct ovs_tunnel_info info;
 struct dp_upcall_info upcall;
 @@ -619,6 +620,8 @@ static int output_userspace(struct datapath *dp, 
 struct sk_buff *skb,
 upcall.userdata = NULL;
 upcall.portid = 0;
 upcall.egress_tun_info = NULL;
 +   upcall.actions = actions;
 +   upcall.actions_len = actions_len;

 Rather than unconditionally passing actions to the upcall, there
 should be attribute in ovs_userspace_attr to request the actions list.

 Why? It seems simpler to just always pass the actions and I'm not sure
 that this is really performance critical (which is the only reason
 that comes to mind to not always pass this).

 This is only required for sFlow sampling so I do not think we should
 send it on every upcall.

 But what is the downside?

 This increases memory allocation in atomic context but if you think
 this makes code complicated then I am fine without the attribute.

OK, I see.

My guess is that there are only likely to be a significant set of
actions for sampling use cases anyways so if this is a real problem
then a flag is probably not going to make much of a difference.

One possibility is to retry with a smaller size if allocation fails
and not include the actions in that case. Userspace is already going
to have to handle the case where actions are omitted for existing
kernels.
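
The retry idea above could look roughly like this. It is a user-space
sketch, not kernel code: upcall_msg, upcall_msg_alloc and
small_only_alloc are made-up names that do not exist in openvswitch, and
the allocator hook is there only so that a failure can be simulated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an upcall message buffer. */
struct upcall_msg {
	size_t size;        /* bytes allocated for the message */
	bool has_actions;   /* whether room for the actions list was kept */
	unsigned char *buf;
};

/* Hypothetical allocator hook, so a failing allocation can be simulated. */
typedef void *(*alloc_fn)(size_t);

/*
 * Try to allocate room for the packet data plus the actions list.
 * If that fails, retry with the smaller size and leave the actions out.
 */
static bool upcall_msg_alloc(struct upcall_msg *msg, size_t base_len,
			     size_t actions_len, alloc_fn alloc)
{
	msg->buf = alloc(base_len + actions_len);
	if (msg->buf) {
		msg->size = base_len + actions_len;
		msg->has_actions = actions_len > 0;
		return true;
	}

	/* Fallback: smaller request, actions omitted. */
	msg->buf = alloc(base_len);
	msg->has_actions = false;
	msg->size = msg->buf ? base_len : 0;
	return msg->buf != NULL;
}

/* Test allocator that refuses any request above 1024 bytes. */
static void *small_only_alloc(size_t n)
{
	return n > 1024 ? NULL : malloc(n);
}
```

A caller would check has_actions before appending the actions attribute;
a message without it looks the same to userspace as one from an existing
kernel that never sends actions.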

Re: Drops in qdisc on ifb interface

2015-05-28 Thread jsulli...@opensourcedevel.com


 On May 28, 2015 at 1:17 PM Eric Dumazet eric.duma...@gmail.com wrote:


 On Thu, 2015-05-28 at 12:33 -0400, jsulli...@opensourcedevel.com wrote:

  Our initial testing has been single flow but the ultimate purpose is
  processing
  real time video in a complex application which ingests associated meta data,
  post to consumer facing cloud, does reporting back - so lots of different
  traffics with very different demands - a perfect tc environment.

 Wait, do you really plan using TCP for real time video ?


The overall product does but the video source feeds come over a different
network via UDP. There are, however, RTMP quality control feeds coming across
this connection.  There may also occasionally be test UDP source feeds on this
connection but those are not production.  Thanks - John

Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM

2015-05-28 Thread Jason Gunthorpe

On Thu, May 28, 2015 at 07:21:11PM +0300, Or Gerlitz wrote:

 Anything else except for that (you said reworking of the network scripts
 and NetworkManager assumptions to make it work)??

IPv6 becomes very broken: child interfaces will generate the same IPv6
addresses for radv and link local, resulting in duplicate address
scenarios.

About the only thing that will work properly is statically assigned
IPv4 addresses.

 I don't see why we should stop the whole RDMA containers support train just
 b/c we found out the IPoIB DHCP bug which was there for few years before
 this effort started.

I don't think that is what Doug said.

Jason

[PATCH v4 net-next 08/11] net: Add VLAN ID to flow_keys

2015-05-28 Thread Tom Herbert

In flow_dissector set vlan_id in flow_keys when VLAN is found.
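
For reference, a small sketch of what the 12-bit VLAN ID amounts to,
assuming the usual 802.1Q tag control layout (3 priority bits, 1 DEI
bit, 12 VLAN ID bits). VID_MASK and vlan_id_from_tci are illustrative
names, not kernel symbols.

```c
#include <stdint.h>

/* Low 12 bits of the tag control information carry the VLAN ID. */
#define VID_MASK 0x0fffu

/* Extract the VLAN ID from a TCI value given in host byte order. */
static uint16_t vlan_id_from_tci(uint16_t tci)
{
	return (uint16_t)(tci & VID_MASK);
}
```

Since only the ID bits are kept, two tags that differ only in priority
yield the same value.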

Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/net/flow_dissector.h |  6 ++
 net/core/flow_dissector.c| 14 ++
 2 files changed, 20 insertions(+)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 59f00f9..08480fb 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -27,6 +27,10 @@ struct flow_dissector_key_basic {
u8  padding;
 };
 
+struct flow_dissector_key_tags {
+   u32 vlan_id:12;
+};
+
 /**
  * struct flow_dissector_key_ipv4_addrs:
  * @src: source ip address
@@ -106,6 +110,7 @@ enum flow_dissector_key_id {
 	FLOW_DISSECTOR_KEY_PORTS, /* struct flow_dissector_key_ports */
 	FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
 	FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs */
+	FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */
 
FLOW_DISSECTOR_KEY_MAX,
 };
@@ -142,6 +147,7 @@ struct flow_keys {
struct flow_dissector_key_control control;
 #define FLOW_KEYS_HASH_START_FIELD basic
struct flow_dissector_key_basic basic;
+   struct flow_dissector_key_tags tags;
struct flow_dissector_key_ports ports;
struct flow_dissector_key_addrs addrs;
 };
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 5348a46..5c66cb2 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -126,6 +126,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
struct flow_dissector_key_basic *key_basic;
struct flow_dissector_key_addrs *key_addrs;
struct flow_dissector_key_ports *key_ports;
+   struct flow_dissector_key_tags *key_tags;
u8 ip_proto;
 
if (!data) {
@@ -246,6 +247,15 @@ flow_label:
if (!vlan)
return false;
 
+		if (skb_flow_dissector_uses_key(flow_dissector,
+						FLOW_DISSECTOR_KEY_VLANID)) {
+			key_tags = skb_flow_dissector_target(flow_dissector,
+							     FLOW_DISSECTOR_KEY_VLANID,
+							     target_container);
+
+			key_tags->vlan_id = skb_vlan_tag_get_id(skb);
+		}
+
 		proto = vlan->h_vlan_encapsulated_proto;
nhoff += sizeof(*vlan);
goto again;
@@ -645,6 +655,10 @@ static const struct flow_dissector_key 
flow_keys_dissector_keys[] = {
.key_id = FLOW_DISSECTOR_KEY_PORTS,
.offset = offsetof(struct flow_keys, ports),
},
+   {
+   .key_id = FLOW_DISSECTOR_KEY_VLANID,
+   .offset = offsetof(struct flow_keys, tags),
+   },
 };
 
 static const struct flow_dissector_key flow_keys_buf_dissector_keys[] = {
-- 
1.8.1


[PATCH v4 net-next 07/11] net: Get rid of IPv6 hash addresses flow keys

2015-05-28 Thread Tom Herbert

We don't need to return the IPv6 address hash as part of flow keys.
In general, using the IPv6 address hash is risky in a hash value
since the underlying use of xor provides no entropy. If someone
really needs the hash value they can get it from the full IPv6
addresses in flow keys (e.g. from flow_get_u32_src).
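
To illustrate the concern, here is a sketch that assumes the address
hash is a plain xor of the four 32-bit words of the address.
xor_fold128 is a made-up name used only for this example.

```c
#include <stdint.h>

/* Fold a 128-bit address, given as four 32-bit words, into 32 bits by xor. */
static uint32_t xor_fold128(const uint32_t w[4])
{
	return w[0] ^ w[1] ^ w[2] ^ w[3];
}
```

With such a fold, addresses that differ only in which word carries a
given bit collide, and an address whose two 64-bit halves are equal
folds to zero. Anyone who controls the low bits of an address can
therefore steer the result.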

Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/net/flow_dissector.h |  1 -
 net/core/flow_dissector.c| 17 -
 2 files changed, 18 deletions(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 3ee606a..59f00f9 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -103,7 +103,6 @@ enum flow_dissector_key_id {
 	FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */
 	FLOW_DISSECTOR_KEY_IPV4_ADDRS, /* struct flow_dissector_key_ipv4_addrs */
 	FLOW_DISSECTOR_KEY_IPV6_ADDRS, /* struct flow_dissector_key_ipv6_addrs */
-	FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS, /* struct flow_dissector_key_addrs */
 	FLOW_DISSECTOR_KEY_PORTS, /* struct flow_dissector_key_ports */
 	FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
 	FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs */
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 91861c3..5348a46 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -200,19 +200,6 @@ ipv6:
nhoff += sizeof(struct ipv6hdr);
 
 		if (skb_flow_dissector_uses_key(flow_dissector,
-						FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS)) {
-			key_addrs = skb_flow_dissector_target(flow_dissector,
-							      FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS,
-							      target_container);
-
-			key_addrs->v4addrs.src =
-				(__force __be32)ipv6_addr_hash(&iph->saddr);
-			key_addrs->v4addrs.dst =
-				(__force __be32)ipv6_addr_hash(&iph->daddr);
-			key_control->addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
-			goto flow_label;
-		}
-		if (skb_flow_dissector_uses_key(flow_dissector,
 						FLOW_DISSECTOR_KEY_IPV6_ADDRS)) {
 			struct flow_dissector_key_ipv6_addrs *key_ipv6_addrs;
 
@@ -651,10 +638,6 @@ static const struct flow_dissector_key 
flow_keys_dissector_keys[] = {
.offset = offsetof(struct flow_keys, addrs.v6addrs),
},
{
-   .key_id = FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS,
-   .offset = offsetof(struct flow_keys, addrs.v4addrs),
-   },
-   {
.key_id = FLOW_DISSECTOR_KEY_TIPC_ADDRS,
.offset = offsetof(struct flow_keys, addrs.tipcaddrs),
},
-- 
1.8.1


[PATCH v4 net-next 06/11] net: Add keys for TIPC address

2015-05-28 Thread Tom Herbert

Add a new flow key for TIPC addresses.

Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/net/flow_dissector.h | 10 ++
 net/core/flow_dissector.c| 18 +-
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 306d461..3ee606a 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -50,6 +50,14 @@ struct flow_dissector_key_ipv6_addrs {
 };
 
 /**
+ * struct flow_dissector_key_tipc_addrs:
+ * @srcnode: source node address
+ */
+struct flow_dissector_key_tipc_addrs {
+   __be32 srcnode;
+};
+
+/**
  * struct flow_dissector_key_addrs:
  * @v4addrs: IPv4 addresses
  * @v6addrs: IPv6 addresses
@@ -58,6 +66,7 @@ struct flow_dissector_key_addrs {
union {
struct flow_dissector_key_ipv4_addrs v4addrs;
struct flow_dissector_key_ipv6_addrs v6addrs;
+   struct flow_dissector_key_tipc_addrs tipcaddrs;
};
 };
 
@@ -97,6 +106,7 @@ enum flow_dissector_key_id {
 	FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS, /* struct flow_dissector_key_addrs */
 	FLOW_DISSECTOR_KEY_PORTS, /* struct flow_dissector_key_ports */
 	FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
+	FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs */
 
FLOW_DISSECTOR_KEY_MAX,
 };
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index ca9d224..91861c3 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -294,13 +294,12 @@ flow_label:
 		key_control->thoff = (u16)nhoff;
 
 		if (skb_flow_dissector_uses_key(flow_dissector,
-						FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS)) {
+						FLOW_DISSECTOR_KEY_TIPC_ADDRS)) {
 			key_addrs = skb_flow_dissector_target(flow_dissector,
-							      FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS,
+							      FLOW_DISSECTOR_KEY_TIPC_ADDRS,
 							      target_container);
-			key_addrs->v4addrs.src = hdr->srcnode;
-			key_addrs->v4addrs.dst = 0;
-			key_control->addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
+			key_addrs->tipcaddrs.srcnode = hdr->srcnode;
+			key_control->addr_type = FLOW_DISSECTOR_KEY_TIPC_ADDRS;
 		}
 		return true;
 	}
@@ -408,6 +407,9 @@ static inline size_t flow_keys_hash_length(struct flow_keys *flow)
 	case FLOW_DISSECTOR_KEY_IPV6_ADDRS:
 		diff -= sizeof(flow->addrs.v6addrs);
 		break;
+	case FLOW_DISSECTOR_KEY_TIPC_ADDRS:
+		diff -= sizeof(flow->addrs.tipcaddrs);
+		break;
 	}
 	return (sizeof(*flow) - diff) / sizeof(u32);
 }
@@ -420,6 +422,8 @@ __be32 flow_get_u32_src(const struct flow_keys *flow)
 	case FLOW_DISSECTOR_KEY_IPV6_ADDRS:
 		return (__force __be32)ipv6_addr_hash(
 			&flow->addrs.v6addrs.src);
+	case FLOW_DISSECTOR_KEY_TIPC_ADDRS:
+		return flow->addrs.tipcaddrs.srcnode;
 	default:
 		return 0;
 	}
@@ -651,6 +655,10 @@ static const struct flow_dissector_key 
flow_keys_dissector_keys[] = {
.offset = offsetof(struct flow_keys, addrs.v4addrs),
},
{
+   .key_id = FLOW_DISSECTOR_KEY_TIPC_ADDRS,
+   .offset = offsetof(struct flow_keys, addrs.tipcaddrs),
+   },
+   {
.key_id = FLOW_DISSECTOR_KEY_PORTS,
.offset = offsetof(struct flow_keys, ports),
},
-- 
1.8.1


[PATCH v4 net-next 11/11] mpls: Add MPLS entropy label in flow_keys

2015-05-28 Thread Tom Herbert

In flow dissector if an MPLS header contains an entropy label this is
saved in the new keyid field of flow_keys. The entropy label is
then represented in the flow hash function input.
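
As a rough illustration of the lookup, assuming the RFC 6790 scheme in
which a label stack entry keeps its 20-bit label in the top bits and
label value 7 is the Entropy Label Indicator, followed by the entropy
label itself. The names below are illustrative and not the kernel's.
Unlike the hunk in this patch, which masks without shifting, the sketch
shifts the label down to its 20-bit value before comparing and storing.

```c
#include <stdbool.h>
#include <stdint.h>

#define LSE_LABEL_MASK  0xFFFFF000u  /* top 20 bits hold the label */
#define LSE_LABEL_SHIFT 12
#define LABEL_ELI       7u           /* Entropy Label Indicator (RFC 6790) */

/* Return the 20-bit label of one label stack entry (host byte order). */
static uint32_t lse_label(uint32_t lse)
{
	return (lse & LSE_LABEL_MASK) >> LSE_LABEL_SHIFT;
}

/*
 * lse[] holds two consecutive label stack entries in host byte order.
 * If the first one is the indicator, store the label of the second in
 * *entropy and return true; otherwise leave *entropy untouched.
 */
static bool mpls_entropy_label(const uint32_t lse[2], uint32_t *entropy)
{
	if (lse_label(lse[0]) != LABEL_ELI)
		return false;

	*entropy = lse_label(lse[1]);
	return true;
}
```

Only the label bits feed the result; the TC, S and TTL bits of both
entries are ignored.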

Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/net/flow_dissector.h |  1 +
 net/core/flow_dissector.c| 35 +++
 2 files changed, 36 insertions(+)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 5d4257b..09f4b76 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -118,6 +118,7 @@ enum flow_dissector_key_id {
FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */
FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_flow_tags */
FLOW_DISSECTOR_KEY_GRE_KEYID, /* struct flow_dissector_key_keyid */
+   FLOW_DISSECTOR_KEY_MPLS_ENTROPY, /* struct flow_dissector_key_keyid */
 
FLOW_DISSECTOR_KEY_MAX,
 };
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index ea318d5..aaebe52 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -15,6 +15,7 @@
 #include <linux/ppp_defs.h>
 #include <linux/stddef.h>
 #include <linux/if_ether.h>
+#include <linux/mpls.h>
 #include <net/flow_dissector.h>
 #include <scsi/fc/fc_fcoe.h>
 
@@ -288,6 +289,37 @@ ipv6:
 		}
 		return true;
 	}
+
+	case htons(ETH_P_MPLS_UC):
+	case htons(ETH_P_MPLS_MC): {
+		struct mpls_label *hdr, _hdr[2];
+mpls:
+		hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data,
+					   hlen, &_hdr);
+		if (!hdr)
+			return false;
+
+		if ((ntohl(hdr[0].entry) & MPLS_LS_LABEL_MASK) ==
+		     MPLS_LABEL_ENTROPY) {
+			if (skb_flow_dissector_uses_key(flow_dissector,
+							FLOW_DISSECTOR_KEY_MPLS_ENTROPY)) {
+				key_keyid = skb_flow_dissector_target(flow_dissector,
+								      FLOW_DISSECTOR_KEY_MPLS_ENTROPY,
+								      target_container);
+				key_keyid->keyid = ntohl(hdr[1].entry) &
+					MPLS_LS_LABEL_MASK;
+			}
+
+			key_basic->n_proto = proto;
+			key_basic->ip_proto = ip_proto;
+			key_control->thoff = (u16)nhoff;
+
+			return true;
+		}
+
+		return true;
+	}
+
 	case htons(ETH_P_FCOE):
 		key_control->thoff = (u16)(nhoff + FCOE_HEADER_LEN);
 		/* fall through */
@@ -357,6 +389,9 @@ ipv6:
case IPPROTO_IPV6:
proto = htons(ETH_P_IPV6);
goto ipv6;
+   case IPPROTO_MPLS:
+   proto = htons(ETH_P_MPLS_UC);
+   goto mpls;
default:
break;
}
-- 
1.8.1


Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM

2015-05-28 Thread Doug Ledford

On Thu, 2015-05-28 at 11:43 -0600, Jason Gunthorpe wrote:
 On Thu, May 28, 2015 at 07:21:11PM +0300, Or Gerlitz wrote:
 
  Anything else except for that (you said reworking of the network scripts
  and NetworkManager assumptions to make it work)??
 
 IPv6 becomes very broken: child interfaces will generate the same IPv6
 addresses for radv and link local, resulting in duplicate address
 scenarios.
 
 About the only thing that will work properly is statically assigned
 IPv4 addresses.
 
  I don't see why we should stop the whole RDMA containers support train just
  b/c we found out the IPoIB DHCP bug which was there for few years before
  this effort started.
 
 I don't think that is what Doug said.

Indeed.  There is no need to scrap things, but if the design as it
stands, and the intended means of creating objects for use in
containers, is going to result in an unworkable network, then we have to
re-evaluate how the container constructs are created, and that then has
possible consequences for how we would get from an incoming packet to
the proper container.

I'm not trying to stop the support train here, but at the same time,
if the train is headed for a bridge that's out

-- 
Doug Ledford dledf...@redhat.com
  GPG KeyID: 0E572FDD




Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM

2015-05-28 Thread Or Gerlitz

On Thu, May 28, 2015 at 9:22 PM, Doug Ledford dledf...@redhat.com wrote:

 I don't think that is what Doug said.

 Indeed.  There is no need to scrap things, but if the design as it
 stands, and the intended means of creating objects for use in
 containers, is going to result in an unworkable network, then we have to
 re-evaluate how the container constructs are created, and that then has
 possible consequences for how we would get from an incoming packet to
 the proper container.

To be precise, do we agree that the issue here isn't in the design as
it stands but rather in a problem we found in the intended way of
assigning IP addresses through DHCP for the containers?

 I'm not trying to stop the support train here, but at the same time,
 if the train is headed for a bridge that's out

So what's your concrete saying here? where should we go from here?

Or.

[PATCH net-next V5 02/11] net/mlx5_core: Set irq affinity hints

2015-05-28 Thread Amir Vadai

From: Saeed Mahameed sae...@mellanox.com

Preparation for upcoming ethernet driver.
- Move msix array from eq_table struct to priv since it's not related to
  eq_table
- Introduce irq_info struct to hold all irq information
- Move name from mlx5_eq to irq_info struct since it is an irq property.
- Set IRQ affinity hints

Signed-off-by: Achiad Shochat ach...@mellanox.com
Signed-off-by: Rana Shahout ra...@mellanox.com
Signed-off-by: Saeed Mahameed sae...@mellanox.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c   |  16 ++--
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 111 ++---
 include/linux/mlx5/driver.h|  11 ++-
 3 files changed, 117 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 3f511bd..516efc2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -339,7 +339,7 @@ static void init_eq_buf(struct mlx5_eq *eq)
 int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx,
 		       int nent, u64 mask, const char *name, struct mlx5_uar *uar)
 {
-	struct mlx5_eq_table *table = &dev->priv.eq_table;
+	struct mlx5_priv *priv = &dev->priv;
 	struct mlx5_create_eq_mbox_in *in;
 	struct mlx5_create_eq_mbox_out out;
 	int err;
@@ -377,14 +377,15 @@ int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx,
 		goto err_in;
 	}
 
-	snprintf(eq->name, MLX5_MAX_EQ_NAME, "%s@pci:%s",
+	snprintf(priv->irq_info[vecidx].name, MLX5_MAX_IRQ_NAME, "%s@pci:%s",
 		 name, pci_name(dev->pdev));
+
 	eq->eqn = out.eq_number;
 	eq->irqn = vecidx;
 	eq->dev = dev;
 	eq->doorbell = uar->map + MLX5_EQ_DOORBEL_OFFSET;
-	err = request_irq(table->msix_arr[vecidx].vector, mlx5_msix_handler, 0,
-			  eq->name, eq);
+	err = request_irq(priv->msix_arr[vecidx].vector, mlx5_msix_handler, 0,
+			  priv->irq_info[vecidx].name, eq);
 	if (err)
 		goto err_eq;
 
@@ -400,7 +401,7 @@ int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx,
 	return 0;
 
 err_irq:
-	free_irq(table->msix_arr[vecidx].vector, eq);
+	free_irq(priv->msix_arr[vecidx].vector, eq);
 
 err_eq:
 	mlx5_cmd_destroy_eq(dev, eq->eqn);
@@ -416,16 +417,15 @@ EXPORT_SYMBOL_GPL(mlx5_create_map_eq);
 
 int mlx5_destroy_unmap_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq)
 {
-	struct mlx5_eq_table *table = &dev->priv.eq_table;
 	int err;
 
 	mlx5_debug_eq_remove(dev, eq);
-	free_irq(table->msix_arr[eq->irqn].vector, eq);
+	free_irq(dev->priv.msix_arr[eq->irqn].vector, eq);
 	err = mlx5_cmd_destroy_eq(dev, eq->eqn);
 	if (err)
 		mlx5_core_warn(dev, "failed to destroy a previously created eq: eqn %d\n",
 			       eq->eqn);
-	synchronize_irq(table->msix_arr[eq->irqn].vector);
+	synchronize_irq(dev->priv.msix_arr[eq->irqn].vector);
 	mlx5_buf_free(dev, &eq->buf);
 
 	return err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 28425e5..55085b0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -38,6 +38,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/slab.h>
 #include <linux/io-mapping.h>
+#include <linux/interrupt.h>
 #include <linux/mlx5/driver.h>
 #include <linux/mlx5/cq.h>
 #include <linux/mlx5/qp.h>
@@ -208,7 +209,8 @@ static void release_bar(struct pci_dev *pdev)
 
 static int mlx5_enable_msix(struct mlx5_core_dev *dev)
 {
-	struct mlx5_eq_table *table = &dev->priv.eq_table;
+	struct mlx5_priv *priv = &dev->priv;
+	struct mlx5_eq_table *table = &priv->eq_table;
 	int num_eqs = 1 << dev->caps.gen.log_max_eq;
 	int nvec;
 	int i;
@@ -218,14 +220,16 @@ static int mlx5_enable_msix(struct mlx5_core_dev *dev)
 	if (nvec <= MLX5_EQ_VEC_COMP_BASE)
 		return -ENOMEM;
 
-	table->msix_arr = kzalloc(nvec * sizeof(*table->msix_arr), GFP_KERNEL);
-	if (!table->msix_arr)
-		return -ENOMEM;
+	priv->msix_arr = kcalloc(nvec, sizeof(*priv->msix_arr), GFP_KERNEL);
+
+	priv->irq_info = kcalloc(nvec, sizeof(*priv->irq_info), GFP_KERNEL);
+	if (!priv->msix_arr || !priv->irq_info)
+		goto err_free_msix;
 
 	for (i = 0; i < nvec; i++)
-		table->msix_arr[i].entry = i;
+		priv->msix_arr[i].entry = i;
 
-	nvec = pci_enable_msix_range(dev->pdev, table->msix_arr,
+	nvec = pci_enable_msix_range(dev->pdev, priv->msix_arr,
 				     MLX5_EQ_VEC_COMP_BASE + 1, nvec);
 	if (nvec < 0)
 		return nvec;
@@ -233,14 +237,20 @@ static int

[PATCH net-next V5 00/11] net/mlx5: ConnectX-4 100G Ethernet driver

2015-05-28 Thread Amir Vadai

Hi Dave,

This patchset extends the mlx5_core driver to support Ethernet
functionality. The Ethernet functionality in the mlx5 driver is
integrated into the core driver rather than provided as a separate
driver. The IB functionality remains in the mlx5_ib driver as before.

This functionality will enable the Ethernet capability of Mellanox's new
family of cards - ConnectX-4. Because backward compatibility is kept,
existing Connect-IB cards that use this driver keep working with the
modified driver, and no issues with current deployments should be seen.

Like the ConnectX-3 cards, ConnectX-4 is a VPI (Virtual Port Interface -
every port can be configured as Infiniband or Ethernet) card.
Unlike previous generations, the ConnectX-4 has a separate PCI function
per port.

The current code has a limitation that Infiniband and Ethernet port types
are mutually exclusive. When the driver is compiled with Ethernet
support, the Infiniband functionality is disabled and vice versa. To
control that we added the CONFIG_MLX5_CORE_EN config directive
which is 'n' by default, but can be changed by the user.

This limitation is short-lived and will be addressed soon.

As part of this patchset, mlx5_ifc.h was heavily modified [1]. This file
is now generated automatically from the device specification document.
Since this patch is too big for the mail server, it might be missing in
the mailing list, but could be pulled from an external git repository [2].

irq name selection is done at driver initialization and doesn't contain the
interface name as part of the irq name.
irq_balancer will still work thanks to an improvement introduced by Neil Horman
[3] to use sysfs instead of /proc/interrupts.

Patchset was applied on top of commit ed2dfd9 (tcp/dccp: warn user for
preferred ip_local_port_range)

[1] - Patch 4/11 (net/mlx5_core: HW data structs/types definitions preparation 
for mlx5 ehternet driver)
[2] - 
http://git.openfabrics.org/?p=~amirv/linux.git;a=shortlog;h=refs/heads/mlx5e_v1
[3] - kernel: da8d1c8 PCI/sysfs: add per pci device msi[x] irq listing (v5)
  irq_balancer: 32a7757 Complete rework of how we detect and classify irqs

Thanks to Achiad, Saeed, Yevheny, Or and the whole team for making this happen,
Amir

Changes from V4:
- Removed Patch 3/12: net/mlx5_core: Add EQ renaming mechanism
- Patch 12/12: net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet 
functionality
  - irq name is created on driver initialization, therefore it won't contain
    the network interface name in it. This won't affect irq_balancer thanks to
    patches introduced by Neil Horman to use sysfs instead of /proc/interrupts.

Changes from V3:
- PATCH 8/11: net/mlx5_core: Set/Query port MTU commands
  - Return value directly - no need for err.

Changes from V2:
- Improved changelogs and cover-letter
- Added CONFIG_MLX5_EN to disable/enable the Ethernet functionality
- Moved en.h and wq.[ch] into the patch with data-path related code

Changes from V1:
- Added patch 1/12 (net/mlx5_core,mlx5_ib: Do not use vmap() on coherent
  memory)

Changes from V0:
- Removed V0 Patch 1/11 (net/mlx5_core: Virtually extend work/completion queue
  buffers by one page) due to misuse of DMA API. Thanks Dave.
- Patch 1/11 (net/mlx5_core: Set irq affinity hints):
  - Use kcalloc instead of kzalloc
  - Fix build error when CONFIG_CPUMASK_OFFSTACK=n. Driver loading will fail
now if cpumask allocation is failing.
  - Using dev_to_node helper. Thanks, Ido.
- Patch 3/11 (net/mlx5_core: HW data structs/types definitions preparation for
  mlx5 ehternet driver)
  - Removed Mellanox internal comment at the head of the file. Thanks Joe
- Patch 6/11 (net/mlx5_core: Implement get/set port status)
  - Use direct return of function's result. Thanks Sergei.
- Added Patch 8/11 (net/mlx5_core: Set/Query port MTU commands)
- Patch 9/11 (net/mlx5: Ethernet Datapath files)
  - Use rq-wqe_sz instead of skb_end_offset. Thanks Ido.
  - Use dma_wmb() when possible instead of wmb(). Thanks Alex.
  - Fix checkpatch issues
- Patch 10/11 (net/mlx5: ethernet resources handling)
  - checkpatch issues
  - Added missing include
- Patch 11/11 (net/mlx5: Ethernet driver)
  - checkpatch issues
  - fixed typo
  - Modified use of affinity hint
  - Using dev_to_node helper. Thanks, Ido.
  - Use new hardware commands from Patch 8/11 (net/mlx5_core: Set/Query port
MTU commands) to get/set port MTU in hardware.
  - Removed NETIF_F_SG since hardware ring wraparound is not supported
  - Use dma_wmb() when possible instead of wmb(). Thanks Alex.

Amir Vadai (4):
  net/mlx5_core,mlx5_ib: Do not use vmap() on coherent memory
  net/mlx5: Ethernet Datapath files
  net/mlx5: Ethernet resource handling files
  net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet
functionality

Rana Shahout (2):
  net/mlx5_core: Implement get/set port status
  net/mlx5_core: Modify CQ moderation parameters

Saeed Mahameed (5):
  net/mlx5_core: Set irq affinity hints
  net/mlx5_core: HW

RE: [PATCH] net: qlcnic: clean up sysfs error codes

2015-05-28 Thread Rajesh Borundia

-Original Message-
From: dept_hsg_linux_nic_dev-boun...@qlclistserver.qlogic.com
[mailto:dept_hsg_linux_nic_dev-boun...@qlclistserver.qlogic.com] On
Behalf Of Vladimir Zapolskiy
Sent: Tuesday, May 26, 2015 6:20 AM
To: David Miller; Shahed Shaikh; Dept-GE Linux NIC Dev
Cc: netdev
Subject: [PATCH] net: qlcnic: clean up sysfs error codes

Replace confusing QL_STATUS_INVALID_PARAM == -1 == -EPERM with -EINVAL
and QLC_STATUS_UNSUPPORTED_CMD == -2 == -ENOENT with -EOPNOTSUPP. The
latter error code is arguable, but it is already used in the driver, so
let it be here as well.

Also remove always false (!buf) check on read(), the driver should not care if
userspace gets its EFAULT or not.
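To illustrate why the bare -1 and -2 are confusing, here is a minimal userspace sketch, assuming Linux errno numbering. The helper names (describe_sysfs_ret, old_invalid_param, old_unsupported_cmd) are hypothetical and not part of the driver; they only show that the driver-private codes collide with unrelated errno values.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: map a negative return value to the message
 * userspace would get from strerror(errno). */
static const char *describe_sysfs_ret(int ret)
{
	return ret < 0 ? strerror(-ret) : "success";
}

/* The old driver-private codes, as bare integers. */
static int old_invalid_param(void)
{
	return -1;	/* numerically the same as -EPERM */
}

static int old_unsupported_cmd(void)
{
	return -2;	/* numerically the same as -ENOENT */
}
```

With these values a failed write would be reported as a permission or missing-file problem, which is presumably why the patch switches to -EINVAL and -EOPNOTSUPP.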

Signed-off-by: Vladimir Zapolskiy v...@mleia.com
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic.h   |  3 -
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c  |  2 +-
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c | 77 +++---
-
 3 files changed, 36 insertions(+), 46 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
index f221126..055f376 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h
@@ -1326,9 +1326,6 @@ struct qlcnic_eswitch {  };


-/* Return codes for Error handling */
-#define QL_STATUS_INVALID_PARAM   -1
-
 #define MAX_BW		100 /* % of link speed */
 #define MIN_BW		1   /* % of link speed */
 #define MAX_VLAN_ID   4095
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index 367f397..2f6cc42 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -1031,7 +1031,7 @@ int qlcnic_init_pci_info(struct qlcnic_adapter
*adapter)
   pfn = pci_info[i].id;

   if (pfn >= ahw->max_vnic_func) {
-  ret = QL_STATUS_INVALID_PARAM;
+  ret = -EINVAL;
   dev_err(&adapter->pdev->dev, "%s: Invalid function
0x%x, max 0x%x\n",
   __func__, pfn, ahw->max_vnic_func);
   goto err_eswitch;
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
index 59a721f..05c28f2 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
@@ -24,8 +24,6 @@
 #include <linux/hwmon-sysfs.h>
 #endif

-#define QLC_STATUS_UNSUPPORTED_CMD	-2
-
 int qlcnicvf_config_bridged_mode(struct qlcnic_adapter *adapter, u32
enable)  {
   return -EOPNOTSUPP;
@@ -166,7 +164,7 @@ static int qlcnic_82xx_store_beacon(struct
qlcnic_adapter *adapter,
   u8 b_state, b_rate;

   if (len != sizeof(u16))
-  return QL_STATUS_INVALID_PARAM;
+  return -EINVAL;

   memcpy(&beacon, buf, sizeof(u16));
   err = qlcnic_validate_beacon(adapter, beacon, &b_state, &b_rate);
@@ -383,17 +381,17 @@ static int validate_pm_config(struct qlcnic_adapter
*adapter,
   dest_pci_func = pm_cfg[i].dest_npar;
   src_index = qlcnic_is_valid_nic_func(adapter, src_pci_func);
   if (src_index < 0)
-  return QL_STATUS_INVALID_PARAM;
+  return -EINVAL;

   dest_index = qlcnic_is_valid_nic_func(adapter,
dest_pci_func);
   if (dest_index < 0)
-  return QL_STATUS_INVALID_PARAM;
+  return -EINVAL;

   s_esw_id = adapter->npars[src_index].phy_port;
   d_esw_id = adapter->npars[dest_index].phy_port;

   if (s_esw_id != d_esw_id)
-  return QL_STATUS_INVALID_PARAM;
+  return -EINVAL;
   }

   return 0;
@@ -414,7 +412,7 @@ static ssize_t qlcnic_sysfs_write_pm_config(struct file
*filp,
   count   = size / sizeof(struct qlcnic_pm_func_cfg);
   rem = size % sizeof(struct qlcnic_pm_func_cfg);
   if (rem)
-  return QL_STATUS_INVALID_PARAM;
+  return -EINVAL;

   qlcnic_swap32_buffer((u32 *)buf, size / sizeof(u32));
   pm_cfg = (struct qlcnic_pm_func_cfg *)buf;
@@ -427,7 +425,7 @@ static ssize_t qlcnic_sysfs_write_pm_config(struct file *filp,
   action = !!pm_cfg[i].action;
   index = qlcnic_is_valid_nic_func(adapter, pci_func);
   if (index < 0)
-  return QL_STATUS_INVALID_PARAM;
+  return -EINVAL;

   id = adapter->npars[index].phy_port;
   ret = qlcnic_config_port_mirroring(adapter, id,
@@ -440,7 +438,7 @@ static ssize_t qlcnic_sysfs_write_pm_config(struct file *filp,
   pci_func = pm_cfg[i].pci_func;
   index = qlcnic_is_valid_nic_func(adapter, pci_func);
   if (index < 0)
-  return

Re: Drops in qdisc on ifb interface

2015-05-28 Thread Eric Dumazet

On Thu, 2015-05-28 at 13:31 -0400, jsulli...@opensourcedevel.com wrote:

 The overall product does but the video source feeds come over a different
 network via UDP. There are, however, RTMP quality control feeds coming across
 this connection.  There may also occasionally be test UDP source feeds on this
 connection but those are not production.  Thanks - John

This is important to know, because UDP won't benefit from GRO.

I was assuming your receiver had to handle ~88000 packets per second,
so I was doubting it could saturate one core,
but maybe your target is very different.




Re: Drops in qdisc on ifb interface

2015-05-28 Thread jsulli...@opensourcedevel.com


 On May 28, 2015 at 1:49 PM Eric Dumazet eric.duma...@gmail.com wrote:


 On Thu, 2015-05-28 at 13:31 -0400, jsulli...@opensourcedevel.com wrote:

  The overall product does but the video source feeds come over a different
  network via UDP. There are, however, RTMP quality control feeds coming
  across
  this connection. There may also occasionally be test UDP source feeds on
  this
  connection but those are not production. Thanks - John

 This is important to know, because UDP won't benefit from GRO.

 I was assuming your receiver had to handle ~88000 packets per second,
 so I was doubting it could saturate one core,
 but maybe your target is very different.



That PPS estimate seems accurate - the port speed and CIR on the shaped
connection is 1 Gbps.
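As a rough cross-check of the ~88000 figure, the arithmetic can be sketched as below. The frame sizes are assumptions and do not come from the thread: 1538 bytes is a full 1500-byte MTU frame plus Ethernet header, FCS, preamble and inter-frame gap. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Packets per second a link can carry for a given on-wire size per packet. */
static uint64_t packets_per_second(uint64_t link_bps, uint64_t wire_bytes)
{
	return link_bps / (wire_bytes * 8);
}

/* Inverse view: average on-wire bytes per packet implied by a packet rate. */
static uint64_t wire_bytes_per_packet(uint64_t link_bps, uint64_t pps)
{
	return link_bps / (pps * 8);
}
```

Under these assumptions, packets_per_second(1000000000, 1538) is about 81000, and 88000 packets per second at 1 Gbps corresponds to roughly 1420 bytes per packet, so the estimate is in the right range for near-MTU-sized traffic.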

I'm still mystified by why the GbE bottlenecks on IFB but the 10GbE does not.
 Thanks -  John

[PATCH 4/4] netfilter: nf_tables: add netdev table to filter from ingress

2015-05-28 Thread Pablo Neira Ayuso

This allows us to create netdev tables that contain ingress chains. Use
skb_header_pointer() as we may see shared sk_buffs at this stage.

This change provides access to the existing nf_tables features from the ingress
hook.
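The IPv4 sanity checks that the patch applies before filling in the packet info can be sketched in plain userspace C as follows. This is only an illustration working on a byte buffer: struct pkt_info and parse_ipv4() are hypothetical names, not the kernel's nft_pktinfo or any nf_tables API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
struct pkt_info {
	uint8_t  tprot;		/* transport protocol number */
	uint32_t thoff;		/* offset of the transport header */
	uint16_t fragoff;	/* fragment offset, in 8-byte units */
};

/* Returns 0 and fills *out if buf holds a sane IPv4 header, -1 otherwise. */
static int parse_ipv4(const uint8_t *buf, size_t buf_len, struct pkt_info *out)
{
	uint32_t version, ihl, tot_len, thoff;

	if (buf_len < 20)	/* shorter than a minimal IPv4 header */
		return -1;

	version = buf[0] >> 4;
	ihl = buf[0] & 0x0f;
	if (ihl < 5 || version != 4)
		return -1;

	tot_len = ((uint32_t)buf[2] << 8) | buf[3];	/* network byte order */
	thoff = ihl * 4;
	if (buf_len < tot_len || tot_len < thoff)
		return -1;

	out->tprot = buf[9];
	out->thoff = thoff;
	/* 0x1fff is the fragment offset mask (IP_OFFSET in the kernel). */
	out->fragoff = (uint16_t)((((uint32_t)buf[6] << 8) | buf[7]) & 0x1fff);
	return 0;
}
```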

Signed-off-by: Pablo Neira Ayuso pa...@netfilter.org
---
 include/net/netns/nftables.h |1 +
 net/netfilter/Kconfig|5 ++
 net/netfilter/Makefile   |1 +
 net/netfilter/nf_tables_netdev.c |  183 ++
 4 files changed, 190 insertions(+)
 create mode 100644 net/netfilter/nf_tables_netdev.c

diff --git a/include/net/netns/nftables.h b/include/net/netns/nftables.h
index eee608b..c807811 100644
--- a/include/net/netns/nftables.h
+++ b/include/net/netns/nftables.h
@@ -13,6 +13,7 @@ struct netns_nftables {
struct nft_af_info  *inet;
struct nft_af_info  *arp;
struct nft_af_info  *bridge;
+   struct nft_af_info  *netdev;
unsigned int base_seq;
u8  gencursor;
 };
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 9a89e7c..bd5aaeb 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -456,6 +456,11 @@ config NF_TABLES_INET
help
  This option enables support for a mixed IPv4/IPv6 inet table.
 
+config NF_TABLES_NETDEV
+   tristate "Netfilter nf_tables netdev tables support"
+   help
+ This option enables support for the netdev table.
+
 config NFT_EXTHDR
tristate "Netfilter nf_tables IPv6 exthdr module"
help
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index a87d8b8..70d026d 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -75,6 +75,7 @@ nf_tables-objs += nft_bitwise.o nft_byteorder.o nft_payload.o
 
 obj-$(CONFIG_NF_TABLES)+= nf_tables.o
 obj-$(CONFIG_NF_TABLES_INET)   += nf_tables_inet.o
+obj-$(CONFIG_NF_TABLES_NETDEV) += nf_tables_netdev.o
 obj-$(CONFIG_NFT_COMPAT)   += nft_compat.o
 obj-$(CONFIG_NFT_EXTHDR)   += nft_exthdr.o
 obj-$(CONFIG_NFT_META) += nft_meta.o
diff --git a/net/netfilter/nf_tables_netdev.c b/net/netfilter/nf_tables_netdev.c
new file mode 100644
index 000..04cb170
--- /dev/null
+++ b/net/netfilter/nf_tables_netdev.c
@@ -0,0 +1,183 @@
+/*
+ * Copyright (c) 2015 Pablo Neira Ayuso pa...@netfilter.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <net/netfilter/nf_tables.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <net/netfilter/nf_tables_ipv4.h>
+#include <net/netfilter/nf_tables_ipv6.h>
+
+static inline void
+nft_netdev_set_pktinfo_ipv4(struct nft_pktinfo *pkt,
+   const struct nf_hook_ops *ops, struct sk_buff *skb,
+   const struct nf_hook_state *state)
+{
+   struct iphdr *iph, _iph;
+   u32 len, thoff;
+
+   nft_set_pktinfo(pkt, ops, skb, state);
+
+   iph = skb_header_pointer(skb, skb_network_offset(skb), sizeof(*iph),
+&_iph);
+   if (!iph)
+   return;
+
+   iph = ip_hdr(skb);
+   if (iph->ihl < 5 || iph->version != 4)
+   return;
+
+   len = ntohs(iph->tot_len);
+   thoff = iph->ihl * 4;
+   if (skb->len < len)
+   return;
+   else if (len < thoff)
+   return;
+
+   pkt->tprot = iph->protocol;
+   pkt->xt.thoff = thoff;
+   pkt->xt.fragoff = ntohs(iph->frag_off) & IP_OFFSET;
+}
+
+static inline void
+__nft_netdev_set_pktinfo_ipv6(struct nft_pktinfo *pkt,
+ const struct nf_hook_ops *ops,
+ struct sk_buff *skb,
+ const struct nf_hook_state *state)
+{
+#if IS_ENABLED(CONFIG_IPV6)
+   struct ipv6hdr *ip6h, _ip6h;
+   unsigned int thoff = 0;
+   unsigned short frag_off;
+   int protohdr;
+   u32 pkt_len;
+
+   ip6h = skb_header_pointer(skb, skb_network_offset(skb), sizeof(*ip6h),
+ &_ip6h);
+   if (!ip6h)
+   return;
+
+   if (ip6h->version != 6)
+   return;
+
+   pkt_len = ntohs(ip6h->payload_len);
+   if (pkt_len + sizeof(*ip6h) > skb->len)
+   return;
+
+   protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, NULL);
+   if (protohdr < 0)
+return;
+
+   pkt->tprot = protohdr;
+   pkt->xt.thoff = thoff;
+   pkt->xt.fragoff = frag_off;
+#endif
+}
+
+static inline void nft_netdev_set_pktinfo_ipv6(struct nft_pktinfo *pkt,
+  const struct nf_hook_ops *ops,
+  struct sk_buff *skb,
+  const struct nf_hook_state 
*state)
+{
+   nft_set_pktinfo(pkt, ops,

Re: [PATCH v2] README: clarify redistribution requirements covering patents

2015-05-28 Thread Luis R. Rodriguez

On Tue, May 19, 2015 at 1:22 PM, Luis R. Rodriguez
mcg...@do-not-panic.com wrote:
 This v2 just changes licence to license as requested by Arend.

Please let me know if there is anything else needed.

 Luis

[PATCH v2] netevent: remove automatic variable in register_netevent_notifier()

2015-05-28 Thread Wang Long

Remove automatic variable 'err' in register_netevent_notifier() and
return the result of atomic_notifier_chain_register() directly.

Signed-off-by: Wang Long long.wangl...@huawei.com
---
 net/core/netevent.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/net/core/netevent.c b/net/core/netevent.c
index f17ccd2..8b3bc4f 100644
--- a/net/core/netevent.c
+++ b/net/core/netevent.c
@@ -31,10 +31,7 @@ static ATOMIC_NOTIFIER_HEAD(netevent_notif_chain);
  */
 int register_netevent_notifier(struct notifier_block *nb)
 {
-   int err;
-
-   err = atomic_notifier_chain_register(&netevent_notif_chain, nb);
-   return err;
+   return atomic_notifier_chain_register(&netevent_notif_chain, nb);
 }
 EXPORT_SYMBOL_GPL(register_netevent_notifier);
 
-- 
1.8.3.4


[PATCH 2/7] net: dsa: ar8xxx: add ethtool hw statistics support

2015-05-28 Thread Mathieu Olivari

MIB counters can now be reported through each switch port by using
ethtool -S.
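Some of the counters in the table below are two 32-bit words wide, so the stats callback has to stitch a low and a high word together. A minimal sketch of that step, using a hypothetical in-memory register file (fake_regs, fake_read) in place of the real switch access functions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the switch register space: 32-bit words
 * indexed by byte offset / 4. */
static uint32_t fake_regs[64];

static uint32_t fake_read(uint32_t reg)
{
	return fake_regs[reg / 4];
}

/* size is in 32-bit words, as in the MIB_DESC() table: 1 or 2. */
static uint64_t read_mib_counter(uint32_t reg, unsigned int size)
{
	uint64_t val = fake_read(reg);

	if (size == 2) {
		/* Widen before shifting so the high word is not lost. */
		uint64_t hi = fake_read(reg + 4);

		val |= hi << 32;
	}
	return val;
}
```

Reading the two halves separately is not atomic, so a real driver may need to consider the counter rolling over between the two reads; the sketch ignores that.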

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 drivers/net/dsa/ar8xxx.c | 106 +++
 drivers/net/dsa/ar8xxx.h |  47 +
 2 files changed, 146 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dsa/ar8xxx.c b/drivers/net/dsa/ar8xxx.c
index 4ce3ffc..2f0fa4d 100644
--- a/drivers/net/dsa/ar8xxx.c
+++ b/drivers/net/dsa/ar8xxx.c
@@ -22,6 +22,55 @@
 
 #include "ar8xxx.h"
 
+#define MIB_DESC(_s, _o, _n)   \
+   {   \
+   .size = (_s),   \
+   .offset = (_o), \
+   .name = (_n),   \
+   }
+
+static const struct ar8xxx_mib_desc ar8327_mib[] = {
+   MIB_DESC(1, 0x00, "RxBroad"),
+   MIB_DESC(1, 0x04, "RxPause"),
+   MIB_DESC(1, 0x08, "RxMulti"),
+   MIB_DESC(1, 0x0c, "RxFcsErr"),
+   MIB_DESC(1, 0x10, "RxAlignErr"),
+   MIB_DESC(1, 0x14, "RxRunt"),
+   MIB_DESC(1, 0x18, "RxFragment"),
+   MIB_DESC(1, 0x1c, "Rx64Byte"),
+   MIB_DESC(1, 0x20, "Rx128Byte"),
+   MIB_DESC(1, 0x24, "Rx256Byte"),
+   MIB_DESC(1, 0x28, "Rx512Byte"),
+   MIB_DESC(1, 0x2c, "Rx1024Byte"),
+   MIB_DESC(1, 0x30, "Rx1518Byte"),
+   MIB_DESC(1, 0x34, "RxMaxByte"),
+   MIB_DESC(1, 0x38, "RxTooLong"),
+   MIB_DESC(2, 0x3c, "RxGoodByte"),
+   MIB_DESC(2, 0x44, "RxBadByte"),
+   MIB_DESC(1, 0x4c, "RxOverFlow"),
+   MIB_DESC(1, 0x50, "Filtered"),
+   MIB_DESC(1, 0x54, "TxBroad"),
+   MIB_DESC(1, 0x58, "TxPause"),
+   MIB_DESC(1, 0x5c, "TxMulti"),
+   MIB_DESC(1, 0x60, "TxUnderRun"),
+   MIB_DESC(1, 0x64, "Tx64Byte"),
+   MIB_DESC(1, 0x68, "Tx128Byte"),
+   MIB_DESC(1, 0x6c, "Tx256Byte"),
+   MIB_DESC(1, 0x70, "Tx512Byte"),
+   MIB_DESC(1, 0x74, "Tx1024Byte"),
+   MIB_DESC(1, 0x78, "Tx1518Byte"),
+   MIB_DESC(1, 0x7c, "TxMaxByte"),
+   MIB_DESC(1, 0x80, "TxOverSize"),
+   MIB_DESC(2, 0x84, "TxByte"),
+   MIB_DESC(1, 0x8c, "TxCollision"),
+   MIB_DESC(1, 0x90, "TxAbortCol"),
+   MIB_DESC(1, 0x94, "TxMultiCol"),
+   MIB_DESC(1, 0x98, "TxSingleCol"),
+   MIB_DESC(1, 0x9c, "TxExcDefer"),
+   MIB_DESC(1, 0xa0, "TxDefer"),
+   MIB_DESC(1, 0xa4, "TxLateCol"),
+};
+
 u32
 ar8xxx_mii_read32(struct mii_bus *bus, int phy_id, int regnum)
 {
@@ -184,6 +233,10 @@ static int ar8xxx_setup(struct dsa_switch *ds)
if (ret < 0)
return ret;
 
+   /* Enable MIB counters */
+   ar8xxx_reg_set(ds, AR8327_REG_MIB, AR8327_MIB_CPU_KEEP);
+   ar8xxx_write(ds, AR8327_REG_MODULE_EN, AR8327_MODULE_EN_MIB);
+
/* Disable forwarding by default on all ports */
for (i = 0; i < AR8327_NUM_PORTS; i++)
ar8xxx_rmw(ds, AR8327_PORT_LOOKUP_CTRL(i),
@@ -228,6 +281,42 @@ ar8xxx_phy_write(struct dsa_switch *ds, int phy, int 
regnum, u16 val)
return mdiobus_write(bus, phy, regnum, val);
 }
 
+static void ar8xxx_get_strings(struct dsa_switch *ds, int phy, uint8_t *data)
+{
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(ar8327_mib); i++) {
+   strncpy(data + i * ETH_GSTRING_LEN, ar8327_mib[i].name,
+   ETH_GSTRING_LEN);
+   }
+}
+
+static void ar8xxx_get_ethtool_stats(struct dsa_switch *ds, int phy,
+uint64_t *data)
+{
+   const struct ar8xxx_mib_desc *mib;
+   uint32_t reg, i, port;
+   u64 hi;
+
+   port = phy_to_port(phy);
+
+   for (i = 0; i < ARRAY_SIZE(ar8327_mib); i++) {
+   mib = &ar8327_mib[i];
+   reg = AR8327_PORT_MIB_COUNTER(port) + mib->offset;
+
+   data[i] = ar8xxx_read(ds, reg);
+   if (mib->size == 2) {
+   hi = ar8xxx_read(ds, reg + 4);
+   data[i] |= hi << 32;
+   }
+   }
+}
+
+static int ar8xxx_get_sset_count(struct dsa_switch *ds)
+{
+   return ARRAY_SIZE(ar8327_mib);
+}
+
 static void ar8xxx_poll_link(struct dsa_switch *ds)
 {
int i = 0;
@@ -275,13 +364,16 @@ static void ar8xxx_poll_link(struct dsa_switch *ds)
 }
 
 static struct dsa_switch_driver ar8xxx_switch_driver = {
-   .tag_protocol   = DSA_TAG_PROTO_NONE,
-   .probe  = ar8xxx_probe,
-   .setup  = ar8xxx_setup,
-   .set_addr   = ar8xxx_set_addr,
-   .poll_link  = ar8xxx_poll_link,
-   .phy_read   = ar8xxx_phy_read,
-   .phy_write  = ar8xxx_phy_write,
+   .tag_protocol   = DSA_TAG_PROTO_NONE,
+   .probe  = ar8xxx_probe,
+   .setup  = ar8xxx_setup,
+   .set_addr   = ar8xxx_set_addr,
+   .poll_link  = ar8xxx_poll_link,
+   .phy_read   = ar8xxx_phy_read,
+   .phy_write  = ar8xxx_phy_write,
+   .get_strings= ar8xxx_get_strings,
+   .get_ethtool_stats  = ar8xxx_get_ethtool_stats,
+   .get_sset_count = ar8xxx_get_sset_count,
 };
 
 static int __init

[PATCH 6/7] net: dsa: ar8xxx: add support for second xMII interfaces through DT

2015-05-28 Thread Mathieu Olivari

This patch adds support for port6-specific options to the device tree.
They can be used to set up the second xMII interface and connect it to
one of the switch ports.
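The phy-mode handling amounts to matching a string from the device tree against a table of known mode names. A self-contained sketch of that lookup follows; mode_names and lookup_phy_mode() are hypothetical stand-ins for the kernel's phy_modes() table, and the list is deliberately shortened.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical, shortened mode table for this sketch. */
static const char *const mode_names[] = { "mii", "gmii", "sgmii", "rgmii" };
#define MODE_MAX (sizeof(mode_names) / sizeof(mode_names[0]))

/* Returns the table index for name, or -EINVAL if the name is unknown. */
static int lookup_phy_mode(const char *name)
{
	size_t mode;

	for (mode = 0; mode < MODE_MAX; mode++)
		if (!strcasecmp(name, mode_names[mode]))
			return (int)mode;

	return -EINVAL;
}
```

Note one deliberate difference: the sketch returns an error for an unknown string, whereas the patch as posted logs the problem and still passes the out-of-range value on to ar8xxx_set_pad_ctrl(), which then rejects it.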

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 drivers/net/dsa/ar8xxx.c | 50 
 1 file changed, 50 insertions(+)

diff --git a/drivers/net/dsa/ar8xxx.c b/drivers/net/dsa/ar8xxx.c
index 4044614..7559249 100644
--- a/drivers/net/dsa/ar8xxx.c
+++ b/drivers/net/dsa/ar8xxx.c
@@ -19,6 +19,7 @@
 #include <net/dsa.h>
 #include <linux/phy.h>
 #include <linux/of_net.h>
+#include <linux/of_platform.h>
 
 #include "ar8xxx.h"
 
@@ -260,6 +261,9 @@ static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int 
port, int mode)
ar8xxx_write(ds, AR8327_REG_PORT5_PAD_CTRL,
 AR8327_PORT_PAD_RGMII_RX_DELAY_EN);
break;
+   case PHY_INTERFACE_MODE_SGMII:
+   ar8xxx_write(ds, reg, AR8327_PORT_PAD_SGMII_EN);
+   break;
default:
pr_err("xMII mode %d not supported\n", mode);
return -EINVAL;
@@ -268,6 +272,48 @@ static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int 
port, int mode)
return 0;
 }
 
+static int ar8xxx_of_setup(struct dsa_switch *ds)
+{
+   struct device_node *dn = ds->pd->of_node;
+   const char *s_phymode;
+   int ret, mode;
+   u32 phy_id, ctrl;
+
+   /* If port6-phy-mode property exists, configure it accordingly */
+   if (!of_property_read_string(dn, "qca,port6-phy-mode", &s_phymode)) {
+   for (mode = 0; mode < PHY_INTERFACE_MODE_MAX; mode++)
+   if (!strcasecmp(s_phymode, phy_modes(mode)))
+   break;
+
+   if (mode == PHY_INTERFACE_MODE_MAX)
+   pr_err("Unknown phy-mode: \"%s\"\n", s_phymode);
+
+   ret = ar8xxx_set_pad_ctrl(ds, 6, mode);
+   if (ret < 0)
+   return ret;
+   }
+
+   /* If a phy ID is specified for PORT6 mac, connect them together */
+   if (!of_property_read_u32(dn, "qca,port6-phy-id", &phy_id)) {
+   ar8xxx_rmw(ds, AR8327_PORT_LOOKUP_CTRL(6),
+  AR8327_PORT_LOOKUP_MEMBER, BIT(phy_to_port(phy_id)));
+   ar8xxx_rmw(ds, AR8327_PORT_LOOKUP_CTRL(phy_to_port(phy_id)),
+  AR8327_PORT_LOOKUP_MEMBER, BIT(6));
+
+   /* We want the switch to be pass-through and act like a PHY on
+* these ports. So BC/MC/UC & IGMP frames need to be accepted
+*/
+   ctrl = BIT(phy_to_port(phy_id)) | BIT(6);
+   ar8xxx_reg_set(ds, AR8327_REG_GLOBAL_FW_CTRL1,
+  ctrl << AR8327_GLOBAL_FW_CTRL1_IGMP_DP_S |
+  ctrl << AR8327_GLOBAL_FW_CTRL1_BC_DP_S |
+  ctrl << AR8327_GLOBAL_FW_CTRL1_MC_DP_S |
+  ctrl << AR8327_GLOBAL_FW_CTRL1_UC_DP_S);
+   }
+
+   return 0;
+}
+
 static int ar8xxx_setup(struct dsa_switch *ds)
 {
struct ar8xxx_priv *priv = ds_to_priv(ds);
@@ -341,6 +387,10 @@ static int ar8xxx_setup(struct dsa_switch *ds)
}
}
 
+   ret = ar8xxx_of_setup(ds);
+   if (ret < 0)
+   return ret;
+
return 0;
 }
 
-- 
2.1.4


[PATCH 3/7] net: dsa: ar8xxx: add regmap support

2015-05-28 Thread Mathieu Olivari

All switch registers can now be dumped using regmap/debugfs.

# cat /sys/kernel/debug/regmap/mdiobus/registers
: 1302
0004: ...
...
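Which registers show up in that dump is governed by the table of readable ranges in the patch. The check can be sketched in standalone C as below; struct reg_range and reg_is_readable() are hypothetical names for illustration, only three of the windows are copied from the patch, and the alignment test approximates the effect of reg_stride = 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct reg_range {
	uint32_t min;
	uint32_t max;
};

/* A few of the windows from the patch's readable table. */
static const struct reg_range readable[] = {
	{ 0x0100, 0x0168 },	/* EEE control */
	{ 0x0200, 0x0270 },	/* Parser control */
	{ 0x1000, 0x10ac },	/* MIB - Port0 */
};

static bool reg_is_readable(uint32_t reg)
{
	size_t i;

	if (reg % 4)		/* registers are 4 bytes apart */
		return false;

	for (i = 0; i < sizeof(readable) / sizeof(readable[0]); i++)
		if (reg >= readable[i].min && reg <= readable[i].max)
			return true;

	return false;
}
```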

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 drivers/net/dsa/Kconfig  |  1 +
 drivers/net/dsa/ar8xxx.c | 60 
 drivers/net/dsa/ar8xxx.h |  5 
 3 files changed, 66 insertions(+)

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 2aae541..17fb296 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -68,6 +68,7 @@ config NET_DSA_BCM_SF2
 config NET_DSA_AR8XXX
tristate "Qualcomm Atheros AR8XXX Ethernet switch family support"
depends on NET_DSA
+   select REGMAP
---help---
  This enables support for the Qualcomm Atheros AR8XXX Ethernet
  switch chips.
diff --git a/drivers/net/dsa/ar8xxx.c b/drivers/net/dsa/ar8xxx.c
index 2f0fa4d..327abd4 100644
--- a/drivers/net/dsa/ar8xxx.c
+++ b/drivers/net/dsa/ar8xxx.c
@@ -176,6 +176,57 @@ static char *ar8xxx_probe(struct device *host_dev, int 
sw_addr)
}
 }
 
+static int ar8xxx_regmap_read(void *ctx, uint32_t reg, uint32_t *val)
+{
+   struct dsa_switch *ds = (struct dsa_switch *)ctx;
+
+   *val = ar8xxx_read(ds, reg);
+
+   return 0;
+}
+
+static int ar8xxx_regmap_write(void *ctx, uint32_t reg, uint32_t val)
+{
+   struct dsa_switch *ds = (struct dsa_switch *)ctx;
+
+   ar8xxx_write(ds, reg, val);
+
+   return 0;
+}
+
+static const struct regmap_range ar8xxx_readable_ranges[] = {
+   regmap_reg_range(0x0000, 0x00e4), /* Global control */
+   regmap_reg_range(0x0100, 0x0168), /* EEE control */
+   regmap_reg_range(0x0200, 0x0270), /* Parser control */
+   regmap_reg_range(0x0400, 0x0454), /* ACL */
+   regmap_reg_range(0x0600, 0x0718), /* Lookup */
+   regmap_reg_range(0x0800, 0x0b70), /* QM */
+   regmap_reg_range(0x0C00, 0x0c80), /* PKT */
+   regmap_reg_range(0x1000, 0x10ac), /* MIB - Port0 */
+   regmap_reg_range(0x1100, 0x11ac), /* MIB - Port1 */
+   regmap_reg_range(0x1200, 0x12ac), /* MIB - Port2 */
+   regmap_reg_range(0x1300, 0x13ac), /* MIB - Port3 */
+   regmap_reg_range(0x1400, 0x14ac), /* MIB - Port4 */
+   regmap_reg_range(0x1500, 0x15ac), /* MIB - Port5 */
+   regmap_reg_range(0x1600, 0x16ac), /* MIB - Port6 */
+
+};
+
+static struct regmap_access_table ar8xxx_readable_table = {
+   .yes_ranges = ar8xxx_readable_ranges,
+   .n_yes_ranges = ARRAY_SIZE(ar8xxx_readable_ranges),
+};
+
+struct regmap_config ar8xxx_regmap_config = {
+   .reg_bits = 16,
+   .val_bits = 32,
+   .reg_stride = 4,
+   .max_register = 0x16ac, /* end MIB - Port6 range */
+   .reg_read = ar8xxx_regmap_read,
+   .reg_write = ar8xxx_regmap_write,
+   .rd_table = &ar8xxx_readable_table,
+};
+
 static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int port, int mode)
 {
int reg;
@@ -219,9 +270,17 @@ static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int 
port, int mode)
 
 static int ar8xxx_setup(struct dsa_switch *ds)
 {
+   struct ar8xxx_priv *priv = ds_to_priv(ds);
struct net_device *netdev = ds->dst->pd->of_netdev;
int ret, i, phy_mode;
 
+   /* Start by setting up the register mapping */
+   priv->regmap = devm_regmap_init(ds->master_dev, NULL, ds,
+   &ar8xxx_regmap_config);
+
+   if (IS_ERR(priv->regmap))
+   pr_warn("regmap initialization failed");
+
/* Initialize CPU port pad mode (xMII type, delays...) */
phy_mode = of_get_phy_mode(netdev->dev.parent->of_node);
if (phy_mode < 0) {
@@ -365,6 +424,7 @@ static void ar8xxx_poll_link(struct dsa_switch *ds)
 
 static struct dsa_switch_driver ar8xxx_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_NONE,
+   .priv_size  = sizeof(struct ar8xxx_priv),
.probe  = ar8xxx_probe,
.setup  = ar8xxx_setup,
.set_addr   = ar8xxx_set_addr,
diff --git a/drivers/net/dsa/ar8xxx.h b/drivers/net/dsa/ar8xxx.h
index 7c7a125..98cc7ed 100644
--- a/drivers/net/dsa/ar8xxx.h
+++ b/drivers/net/dsa/ar8xxx.h
@@ -17,6 +17,11 @@
 #define __AR8XXX_H
 
 #include <linux/delay.h>
+#include <linux/regmap.h>
+
+struct ar8xxx_priv {
+   struct regmap *regmap;
+};
 
 struct ar8xxx_mib_desc {
unsigned int size;
-- 
2.1.4


[PATCH 7/7] Documentation: devicetree: add ar8xxx binding

2015-05-28 Thread Mathieu Olivari

Add device-tree binding for ar8xxx switch families.

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 .../devicetree/bindings/net/dsa/qca-ar8xxx.txt | 70 ++
 1 file changed, 70 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt

diff --git a/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt 
b/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
new file mode 100644
index 000..f4fd3f1
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
@@ -0,0 +1,70 @@
+* Qualcomm Atheros AR8xxx switch family
+
+Required properties:
+
+- compatible: should be "qca,ar8xxx"
+- dsa,mii-bus: phandle to the MDIO bus controller, see dsa/dsa.txt
+- dsa,ethernet: phandle to the CPU network interface controller, see 
dsa/dsa.txt
+- #size-cells: must be 0
+- #address-cells: must be 2, see dsa/dsa.txt
+
+Subnodes:
+
+The integrated switch subnode should be specified according to the binding
+described in dsa/dsa.txt.
+
+Optional properties:
+
+- qca,port6-phy-mode: if specified, the driver will configure Port 6 in the
+  given phy-mode. See Documentation/devicetree/bindings/net/ethernet.txt for
+  the list of valid phy-mode.
+
+- qca,port6-phy-id: if specified, the driver will connect Port 6 to the PHY
+  given as a parameter. In this case, Port6 and the corresponding PHY will be
+  isolated from the rest of the switch. From a system perspective, they will
+  act as a regular PHY.
+
+Example:
+
+	dsa@0 {
+		compatible = "qca,ar8xxx";
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		dsa,ethernet = <&ethernet0>;
+		dsa,mii-bus = <&mii_bus0>;
+
+		switch@0 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			reg = <0 0>;	/* MDIO address 0, switch 0 in tree */
+
+			qca,port6-phy-mode = "sgmii";
+			qca,port6-phy-id = <4>;
+
+			port@0 {
+				reg = <11>;
+				label = "cpu";
+			};
+
+			port@1 {
+				reg = <0>;
+				label = "lan1";
+			};
+
+			port@2 {
+				reg = <1>;
+				label = "lan2";
+			};
+
+			port@3 {
+				reg = <2>;
+				label = "lan3";
+			};
+
+			port@4 {
+				reg = <3>;
+				label = "lan4";
+			};
+		};
+	};
-- 
2.1.4


[PATCH 4/7] net: dsa: add QCA tag support

2015-05-28 Thread Mathieu Olivari

QCA tags are used on the QCA ar8xxx switch family. This change adds support
for encap/decap using the 2-byte header mode.
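The bit layout of the 2-byte header can be read off the mask definitions in tag_qca.c further down. The sketch below packs and unpacks such a header in standalone C. The patch body is truncated in this archive, so how the driver actually fills the fields is an assumption here, in particular that the low bits on transmit form a destination-port bitmap; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define QCA_HDR_VERSION			0x2
#define QCA_HDR_VERSION_S		14	/* version lives in bits 15:14 */
#define QCA_HDR_RECV_SOURCE_PORT_MASK	0x0007	/* bits 2:0 */
#define QCA_HDR_XMIT_FROM_CPU		(1u << 7)
#define QCA_HDR_XMIT_DP_BIT_MASK	0x007f	/* bits 6:0 */

/* Build a header for a frame the CPU sends towards switch port `port`
 * (assumed to be encoded as one bit in the destination bitmap). */
static uint16_t qca_build_xmit_hdr(unsigned int port)
{
	return (uint16_t)((QCA_HDR_VERSION << QCA_HDR_VERSION_S) |
			  QCA_HDR_XMIT_FROM_CPU |
			  ((1u << port) & QCA_HDR_XMIT_DP_BIT_MASK));
}

/* Return the source port of a received header, or -1 if the version
 * field does not match. */
static int qca_parse_recv_hdr(uint16_t hdr)
{
	if ((hdr >> QCA_HDR_VERSION_S) != QCA_HDR_VERSION)
		return -1;

	return hdr & QCA_HDR_RECV_SOURCE_PORT_MASK;
}
```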

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 include/net/dsa.h  |   1 +
 net/dsa/Kconfig|   3 +
 net/dsa/Makefile   |   1 +
 net/dsa/dsa.c  |   5 ++
 net/dsa/dsa_priv.h |   2 +
 net/dsa/slave.c|   5 ++
 net/dsa/tag_qca.c  | 158 +
 7 files changed, 175 insertions(+)
 create mode 100644 net/dsa/tag_qca.c

diff --git a/include/net/dsa.h b/include/net/dsa.h
index fbca63b..64ddf6f 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -26,6 +26,7 @@ enum dsa_tag_protocol {
DSA_TAG_PROTO_TRAILER,
DSA_TAG_PROTO_EDSA,
DSA_TAG_PROTO_BRCM,
+   DSA_TAG_PROTO_QCA,
 };
 
 #define DSA_MAX_SWITCHES   4
diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
index ff7736f..4f3cce1 100644
--- a/net/dsa/Kconfig
+++ b/net/dsa/Kconfig
@@ -26,6 +26,9 @@ config NET_DSA_HWMON
  via the hwmon sysfs interface and exposes the onboard sensors.
 
 # tagging formats
+config NET_DSA_TAG_QCA
+   bool
+
 config NET_DSA_TAG_BRCM
bool
 
diff --git a/net/dsa/Makefile b/net/dsa/Makefile
index da06ed1..9feb86c 100644
--- a/net/dsa/Makefile
+++ b/net/dsa/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_NET_DSA) += dsa_core.o
 dsa_core-y += dsa.o slave.o
 
 # tagging formats
+dsa_core-$(CONFIG_NET_DSA_TAG_QCA) += tag_qca.o
 dsa_core-$(CONFIG_NET_DSA_TAG_BRCM) += tag_brcm.o
 dsa_core-$(CONFIG_NET_DSA_TAG_DSA) += tag_dsa.o
 dsa_core-$(CONFIG_NET_DSA_TAG_EDSA) += tag_edsa.o
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index fffb9aa..6010a7d 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -249,6 +249,11 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, 
struct device *parent)
dst->rcv = brcm_netdev_ops.rcv;
break;
 #endif
+#ifdef CONFIG_NET_DSA_TAG_QCA
+   case DSA_TAG_PROTO_QCA:
+   dst->rcv = qca_netdev_ops.rcv;
+   break;
+#endif
case DSA_TAG_PROTO_NONE:
break;
default:
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index d5f1f9b..350c94b 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -74,5 +74,7 @@ extern const struct dsa_device_ops trailer_netdev_ops;
 /* tag_brcm.c */
 extern const struct dsa_device_ops brcm_netdev_ops;
 
+/* tag_qca.c */
+extern const struct dsa_device_ops qca_netdev_ops;
 
 #endif
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 04ffad3..cd8f552 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -925,6 +925,11 @@ int dsa_slave_create(struct dsa_switch *ds, struct device 
*parent,
p->xmit = brcm_netdev_ops.xmit;
break;
 #endif
+#ifdef CONFIG_NET_DSA_TAG_QCA
+   case DSA_TAG_PROTO_QCA:
+   p->xmit = qca_netdev_ops.xmit;
+   break;
+#endif
default:
p->xmit = dsa_slave_notag_xmit;
break;
diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
new file mode 100644
index 000..8f02196
--- /dev/null
+++ b/net/dsa/tag_qca.c
@@ -0,0 +1,158 @@
+/*
+ * Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/etherdevice.h>
+#include "dsa_priv.h"
+
+#define QCA_HDR_LEN	2
+#define QCA_HDR_VERSION	0x2
+
+#define QCA_HDR_RECV_VERSION_MASK	GENMASK(15, 14)
+#define QCA_HDR_RECV_VERSION_S		14
+#define QCA_HDR_RECV_PRIORITY_MASK	GENMASK(13, 11)
+#define QCA_HDR_RECV_PRIORITY_S		11
+#define QCA_HDR_RECV_TYPE_MASK		GENMASK(10, 6)
+#define QCA_HDR_RECV_TYPE_S		6
+#define QCA_HDR_RECV_FRAME_IS_TAGGED	BIT(3)
+#define QCA_HDR_RECV_SOURCE_PORT_MASK	GENMASK(2, 0)
+
+#define QCA_HDR_XMIT_VERSION_MASK	GENMASK(15, 14)
+#define QCA_HDR_XMIT_VERSION_S		14
+#define QCA_HDR_XMIT_PRIORITY_MASK	GENMASK(13, 11)
+#define QCA_HDR_XMIT_PRIORITY_S		11
+#define QCA_HDR_XMIT_CONTROL_MASK	GENMASK(10, 8)
+#define QCA_HDR_XMIT_CONTROL_S		8
+#define QCA_HDR_XMIT_FROM_CPU		BIT(7)
+#define QCA_HDR_XMIT_DP_BIT_MASK	GENMASK(6, 0)
+
+static inline int reg_to_port(int reg)
+{
+   if (reg < 5)
+   return reg + 1;
+
+   return -1;
+}
+
+static inline int port_to_reg(int port)
+{
+   if (port >= 1 && port <= 6)
+   return port - 1;
+
+   return -1;
+}
+
+static netdev_tx_t qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+

[PATCH 1/7] net: dsa: add new driver for ar8xxx family

2015-05-28 Thread Mathieu Olivari

This patch contains initial init & registration code for QCA8337. It
will detect a QCA8337 switch, if present and declared in DT/platform.

Each port will be represented through a standalone net_device interface,
as for other DSA switches. CPU can communicate with any of the ports by
setting an IP@ on ethN interface. Ports cannot communicate with each
other just yet.

Link status will be reported through polling, and we don't use any
encapsulation.
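The driver code below accesses 32-bit switch registers through an MDIO bus that transfers 16 bits at a time, so every access is split into a low and a high half. A minimal sketch of that split, using a hypothetical in-memory array (fake_mii) instead of a real mii_bus:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-bit register file standing in for the MDIO bus. */
static uint16_t fake_mii[32];

/* Read a 32-bit value stored as two consecutive 16-bit registers,
 * low half first. */
static uint32_t mii_read32(int regnum)
{
	uint16_t lo = fake_mii[regnum];
	uint16_t hi = fake_mii[regnum + 1];

	/* Widen hi before shifting so the upper bits are preserved. */
	return ((uint32_t)hi << 16) | lo;
}

/* Write a 32-bit value as two 16-bit halves, low half first. */
static void mii_write32(int regnum, uint32_t val)
{
	fake_mii[regnum] = (uint16_t)(val & 0xffff);
	fake_mii[regnum + 1] = (uint16_t)(val >> 16);
}
```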

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 drivers/net/dsa/Kconfig  |   7 ++
 drivers/net/dsa/Makefile |   1 +
 drivers/net/dsa/ar8xxx.c | 303 +++
 drivers/net/dsa/ar8xxx.h |  82 +
 net/dsa/dsa.c|   1 +
 5 files changed, 394 insertions(+)
 create mode 100644 drivers/net/dsa/ar8xxx.c
 create mode 100644 drivers/net/dsa/ar8xxx.h

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 7ad0a4d..2aae541 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -65,4 +65,11 @@ config NET_DSA_BCM_SF2
  This enables support for the Broadcom Starfighter 2 Ethernet
  switch chips.
 
+config NET_DSA_AR8XXX
+   tristate "Qualcomm Atheros AR8XXX Ethernet switch family support"
+   depends on NET_DSA
+   ---help---
+ This enables support for the Qualcomm Atheros AR8XXX Ethernet
+ switch chips.
+
 endmenu
diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index e2d51c4..7647687 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -14,3 +14,4 @@ ifdef CONFIG_NET_DSA_MV88E6171
 mv88e6xxx_drv-y += mv88e6171.o
 endif
 obj-$(CONFIG_NET_DSA_BCM_SF2)  += bcm_sf2.o
+obj-$(CONFIG_NET_DSA_AR8XXX)   += ar8xxx.o
diff --git a/drivers/net/dsa/ar8xxx.c b/drivers/net/dsa/ar8xxx.c
new file mode 100644
index 0000000..4ce3ffc
--- /dev/null
+++ b/drivers/net/dsa/ar8xxx.c
@@ -0,0 +1,303 @@
+/*
+ * Copyright (C) 2009 Felix Fietkau n...@openwrt.org
+ * Copyright (C) 2011-2012 Gabor Juhos juh...@openwrt.org
+ * Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/phy.h>
+#include <linux/netdevice.h>
+#include <net/dsa.h>
+#include <linux/phy.h>
+#include <linux/of_net.h>
+
+#include "ar8xxx.h"
+
+u32
+ar8xxx_mii_read32(struct mii_bus *bus, int phy_id, int regnum)
+{
+   u16 lo, hi;
+
+   lo = bus->read(bus, phy_id, regnum);
+   hi = bus->read(bus, phy_id, regnum + 1);
+
+   return (hi << 16) | lo;
+}
+
+void
+ar8xxx_mii_write32(struct mii_bus *bus, int phy_id, int regnum, u32 val)
+{
+   u16 lo, hi;
+
+   lo = val & 0xffff;
+   hi = (u16)(val >> 16);
+
+   bus->write(bus, phy_id, regnum, lo);
+   bus->write(bus, phy_id, regnum + 1, hi);
+}
+
+u32 ar8xxx_read(struct dsa_switch *ds, int reg)
+{
+   struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   u16 r1, r2, page;
+   u32 val;
+
+   split_addr((u32)reg, &r1, &r2, &page);
+
+   mutex_lock(&bus->mdio_lock);
+
+   bus->write(bus, 0x18, 0, page);
+   wait_for_page_switch();
+   val = ar8xxx_mii_read32(bus, 0x10 | r2, r1);
+
+   mutex_unlock(&bus->mdio_lock);
+
+   return val;
+}
+
+void ar8xxx_write(struct dsa_switch *ds, int reg, u32 val)
+{
+   struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   u16 r1, r2, page;
+
+   split_addr((u32)reg, &r1, &r2, &page);
+
+   mutex_lock(&bus->mdio_lock);
+
+   bus->write(bus, 0x18, 0, page);
+   wait_for_page_switch();
+   ar8xxx_mii_write32(bus, 0x10 | r2, r1, val);
+
+   mutex_unlock(&bus->mdio_lock);
+}
+
+u32
+ar8xxx_rmw(struct dsa_switch *ds, int reg, u32 mask, u32 val)
+{
+   struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+   u16 r1, r2, page;
+   u32 ret;
+
+   split_addr((u32)reg, &r1, &r2, &page);
+
+   mutex_lock(&bus->mdio_lock);
+
+   bus->write(bus, 0x18, 0, page);
+   wait_for_page_switch();
+
+   ret = ar8xxx_mii_read32(bus, 0x10 | r2, r1);
+   ret &= ~mask;
+   ret |= val;
+   ar8xxx_mii_write32(bus, 0x10 | r2, r1, ret);
+
+   mutex_unlock(&bus->mdio_lock);
+
+   return ret;
+}
+
+static char *ar8xxx_probe(struct device *host_dev, int sw_addr)
+{
+   struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
+   u32 phy_id;
+
+   if (!bus)
+   return NULL;
+
+   /* sw_addr is irrelevant as the switch occupies the MDIO bus from
+* addresses 0 to 4 (PHYs) and 16-23 (for MDIO 32bits protocol). So
+* we'll

Re: [PATCH] namespace: Remove no longer needed goto label in the function, ops_init

2015-05-28 Thread Eric W. Biederman

Nicholas Krause xerofo...@gmail.com writes:

 This removes the no longer needed goto label, cleanup in the
 function ops_init due to kfree being NULL pointer safe and
 therefore no need to avoid calling it the call to kzalloc
 fails inside this particular function.

Your proposed change pessimizes the error path without a description of
why that would be a benefit.

Further the subject on this patch is incorrect.  You don't remove any
gotos.

So I don't like this change as it is.  The description is wrong
and the change provides little to no real world benefit.

Eric
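
For readers following the objection, here is a minimal userspace sketch of the two-label error path being defended. It is an illustration only: demo_ops_init(), fake_alloc(), fake_assign() and counted_free() are hypothetical stand-ins, with free() playing the role of kfree(). The point it shows is that the allocation-failure path jumps past the free entirely, while the later failure path releases the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical switches and counters used only by this demo. */
static int alloc_should_fail;
static int assign_should_fail;
static int free_calls;

/* Stand-in for kzalloc(): may be forced to fail. */
static void *fake_alloc(size_t size)
{
	return alloc_should_fail ? NULL : calloc(1, size);
}

/* Stand-in for net_assign_generic(): may be forced to fail. */
static int fake_assign(void *data)
{
	(void)data;
	return assign_should_fail ? -EINVAL : 0;
}

/* free() wrapper that counts calls; free(NULL) is a no-op like kfree(NULL). */
static void counted_free(void *p)
{
	free_calls++;
	free(p);
}

/* Two-label form: "cleanup" frees, "out" only returns. */
static int demo_ops_init(size_t size)
{
	int err = -ENOMEM;
	void *data;

	assert(size > 0);

	data = fake_alloc(size);
	if (!data)
		goto out;	/* nothing allocated, so nothing to free */

	err = fake_assign(data);
	if (err)
		goto cleanup;

	/* The kernel would keep 'data' here; the demo frees it to avoid a leak. */
	free(data);
	return 0;

cleanup:
	counted_free(data);
out:
	return err;
}
```

Merging the two labels would still be correct, since freeing NULL is harmless, but the allocation-failure path would then make a call it does not need.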


 Signed-off-by: Nicholas Krause xerofo...@gmail.com
 ---
  net/core/net_namespace.c | 6 ++
  1 file changed, 2 insertions(+), 4 deletions(-)

 diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
 index 572af00..e8b5568 100644
 --- a/net/core/net_namespace.c
 +++ b/net/core/net_namespace.c
 @@ -102,7 +102,7 @@ static int ops_init(const struct pernet_operations *ops, 
 struct net *net)
  
   err = net_assign_generic(net, *ops->id, data);
   if (err)
 - goto cleanup;
 + goto out;
   }
   err = 0;
   if (ops->init)
 @@ -110,10 +110,8 @@ static int ops_init(const struct pernet_operations *ops, 
 struct net *net)
   if (!err)
   return 0;
  
 -cleanup:
 - kfree(data);
 -
  out:
 + kfree(data);
   return err;
  }

Re: linux-next: build failure after merge of most of the trees

2015-05-28 Thread Stephen Rothwell

Hi Eric,

On Thu, 28 May 2015 08:26:51 -0700 Eric Dumazet eric.duma...@gmail.com wrote:

 We were alerted of this problem thanks to kbuild test robot.
 
 This fix is not a definitive one I hope.

No, just something to allow me to get my tree to build so I could go to
bed :-)

 Golden rule is that vmalloc() users must include vmalloc.h themselves,
 not by an indirect include.

Yep, that is a special case of Rule 1.

 I sent one fix, and prepared others, but I prefer that each offender is
 fixed.

Yeah, thanks to Dave for that.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



Re: [PATCH v4 net-next 09/11] net: Add IPv6 flow label to flow_keys

2015-05-28 Thread Eric Dumazet

On Thu, 2015-05-28 at 11:19 -0700, Tom Herbert wrote:
 In flow_dissector set the flow label in flow_keys for IPv6. This also
 removes the shortcircuiting of flow dissection when a non-zero label
 is present, the flow label can be considered to provide additional
 entropy for a hash.
 
 Signed-off-by: Tom Herbert t...@herbertland.com
 ---
  include/net/flow_dissector.h |  4 +++-
  net/core/flow_dissector.c| 31 +++
  2 files changed, 14 insertions(+), 21 deletions(-)
 
 diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
 index 08480fb..14d8483 100644
 --- a/include/net/flow_dissector.h
 +++ b/include/net/flow_dissector.h
 @@ -28,7 +28,8 @@ struct flow_dissector_key_basic {
  };
  
  struct flow_dissector_key_tags {
 - u32 vlan_id:12;
 + u32 vlan_id:12,
 + flow_label:20;
  };
  
  /**
 @@ -111,6 +112,7 @@ enum flow_dissector_key_id {
   FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
   FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs 
 */
   FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */
 + FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_flow_tags */
  
   FLOW_DISSECTOR_KEY_MAX,
  };
 diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
 index 5c66cb2..ba089d9 100644
 --- a/net/core/flow_dissector.c
 +++ b/net/core/flow_dissector.c
 @@ -190,7 +190,7 @@ ip:
   case htons(ETH_P_IPV6): {
   const struct ipv6hdr *iph;
   struct ipv6hdr _iph;
 - __be32 flow_label;
 + u32 flow_label;

You change flow_label from __be32 to u32.

  
  ipv6:
   iph = __skb_header_pointer(skb, nhoff, sizeof(_iph), data, 
 hlen, &_iph);
 @@ -210,30 +210,17 @@ ipv6:
  
   memcpy(key_ipv6_addrs, &iph->saddr, 
 sizeof(*key_ipv6_addrs));
   key_control->addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
 - goto flow_label;
   }
 - break;
 -flow_label:
 +
   flow_label = ip6_flowlabel(iph);

But ip6_flowlabel() returns a __be32. This should not please sparse.


   if (flow_label) {
 - /* Awesome, IPv6 packet has a flow label so we can
 -  * use that to represent the ports without any
 -  * further dissection.
 -  */
 -
 - key_basic->n_proto = proto;
 - key_basic->ip_proto = ip_proto;
 - key_control->thoff = (u16)nhoff;
 -
   if (skb_flow_dissector_uses_key(flow_dissector,
 - 
 FLOW_DISSECTOR_KEY_PORTS)) {
 - key_ports = 
 skb_flow_dissector_target(flow_dissector,
 -   
 FLOW_DISSECTOR_KEY_PORTS,
 -   
 target_container);
 - key_ports->ports = flow_label;
 + FLOW_DISSECTOR_KEY_FLOW_LABEL)) {
 + key_tags = 
 skb_flow_dissector_target(flow_dissector,
 +  
 FLOW_DISSECTOR_KEY_FLOW_LABEL,
 +  
 target_container);
 + key_tags->flow_label = ntohl(flow_label);

Then you call ntohl() on u32 variable. This should also complain.

Have you run sparse ?

make C=2 CF=-D__CHECK_ENDIAN__ net/core/flow_dissector.o

Thanks.
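
The type mismatch pointed out above can be illustrated with a small userspace sketch. It is not the kernel code: demo_flowlabel_be() and demo_flowlabel_host() are hypothetical helpers, and plain uint32_t is used because userspace has no __be32 annotation for sparse to check. The idea is to keep the value in network byte order until the single point where it is converted with ntohl().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Low 20 bits of the first 32-bit word of an IPv6 header, in host order. */
#define DEMO_FLOWLABEL_MASK 0x000FFFFFu

/* Return the flow label still in network byte order (what a __be32 would hold). */
static uint32_t demo_flowlabel_be(const uint8_t *ipv6_hdr)
{
	uint32_t first_word;

	assert(ipv6_hdr != NULL);
	memcpy(&first_word, ipv6_hdr, sizeof(first_word));
	/* The mask is converted to network order so both operands agree. */
	return first_word & htonl(DEMO_FLOWLABEL_MASK);
}

/* Convert to host order exactly once, where the value is stored or compared. */
static uint32_t demo_flowlabel_host(const uint8_t *ipv6_hdr)
{
	return ntohl(demo_flowlabel_be(ipv6_hdr));
}
```

In the kernel the first helper would return __be32 and the second u32, which is the distinction sparse enforces when run with endian checking enabled.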



Re: pull request (net-next): ipsec-next 2015-05-28

2015-05-28 Thread David Miller

From: Steffen Klassert steffen.klass...@secunet.com
Date: Thu, 28 May 2015 08:25:47 +0200

 1) Remove xfrm_queue_purge as this is the same as skb_queue_purge.

 2) Optimize policy and state walk.

 3) Use a sane return code if afinfo registration fails.

 4) Only check for an acquire state if the state is not valid.

 5) Remove an unnecessary NULL check before xfrm_pol_hold
as it checks the input for NULL.

 6) Return directly if the xfrm hold queue is empty, avoid
taking a lock as there is nothing to do in this case.

 7) Optimize the inexact policy search and allow for matching
of policies with priority ~0U.

 All from Li RongQing.

 Please pull or let me know if there are problems.

Pulled, thanks Steffen.

Re: [PATCH] net: qlcnic: clean up sysfs error codes

2015-05-28 Thread David Miller

From: Vladimir Zapolskiy v...@mleia.com
Date: Tue, 26 May 2015 03:49:45 +0300

 Replace confusing QL_STATUS_INVALID_PARAM == -1 == -EPERM with -EINVAL
 and QLC_STATUS_UNSUPPORTED_CMD == -2 == -ENOENT with -EOPNOTSUPP, the
 latter error code is arguable, but it is already used in the driver,
 so let it be here as well.

 Also remove always false (!buf) check on read(), the driver should
 not care if userspace gets its EFAULT or not.

 Signed-off-by: Vladimir Zapolskiy v...@mleia.com

Qlogic folks, I'm waiting for your promised feedback.

Re: [PATCH] net: qlcnic: clean up sysfs error codes

2015-05-28 Thread Vladimir Zapolskiy

Hello David,

On 29.05.2015 02:28, David Miller wrote:
 From: Vladimir Zapolskiy v...@mleia.com
 Date: Tue, 26 May 2015 03:49:45 +0300

 Replace confusing QL_STATUS_INVALID_PARAM == -1 == -EPERM with -EINVAL
 and QLC_STATUS_UNSUPPORTED_CMD == -2 == -ENOENT with -EOPNOTSUPP, the
 latter error code is arguable, but it is already used in the driver,
 so let it be here as well.

 Also remove always false (!buf) check on read(), the driver should
 not care if userspace gets its EFAULT or not.

 Signed-off-by: Vladimir Zapolskiy v...@mleia.com

 Qlogic folks, I'm waiting for your promised feedback.

Rajesh reviewed and acked the change, thank you.

http://www.spinics.net/lists/netdev/msg331073.html

--
With best wishes,
Vladimir

Re: [PATCH net-next] neigh: Add missing rcu_assign_pointer

2015-05-28 Thread Ying Xue

On 05/28/2015 06:13 PM, Eric Dumazet wrote:
 This patch is not needed.
 
 You really should read Documentation/RCU , because it looks like you are
 quite confused.
 
 When we remove an element from a RCU protected list, all the objects in
 the chain are already ready to be caught by rcu readers.
 
 Therefore, no additional memory barrier is needed before doing *np =
 n->next;
 
 Please do not add spurious memory barriers. Like atomic operations, we
 want all of them being required and possibly documented.


Yes, you are right, thanks for your clear explanation :)
However, there are still three places where we use rcu_assign_pointer() to
remove a neigh entry from a RCU-protected list, and the three places are
neigh_forced_gc(), neigh_flush_dev(), and __neigh_for_each_release()
respectively. This means it's redundant for us to use rcu_assign_pointer() in
the three places, right?

Regards,
Ying


Re: [PATCH 3/7] net: dsa: ar8xxx: add regmap support

2015-05-28 Thread Florian Fainelli

Le 05/28/15 18:42, Mathieu Olivari a écrit :
 All switch registers can now be dumped using regmap/debugfs.
 
 # cat /sys/kernel/debug/regmap/mdiobus/registers
 : 1302
 0004: ...
 ...

ethtool has a register dump command, which should already be supported
by the current code in net/dsa/slave.c, is there a particular reason why
you use debugfs here instead?
-- 
Florian

Re: [PATCH 0/7] net: dsa: add QCA AR8xxx switch family support

2015-05-28 Thread Andrew Lunn

On Thu, May 28, 2015 at 06:42:15PM -0700, Mathieu Olivari wrote:
 This patch set adds initial support for AR8xxx switches using the DSA
 subsystem. It currently supports QCA8337 switch, and can be extended to
 other hardware in the same family.
 
 This switch was already discussed in the following thread:
 https://www.marc.info/?t=14260141744r=1w=2
 
 Below is a typical picture of a QCA8337 used in a standard home gateway
 configuration:
 
 +-----------+       +-----------+
 |           | SGMII |           |
 |       eth0+-------+           +------ 1000baseT MDI (WAN)
 |        wan|       |  7-port   +------ 1000baseT MDI (LAN1)
 |   CPU     |       |  ethernet +------ 1000baseT MDI (LAN2)
 |           | RGMII |  switch   +------ 1000baseT MDI (LAN3)
 |       eth1+-------+  w/5 PHYs +------ 1000baseT MDI (LAN4)
 |        lan|       |           |
 +-----------+       +-----------+
               |   MDIO     |
               \------------/
 
 The switch is connected to the CPU using 2 xMII interfaces. As DSA only
 supports one logical interface to the switch, we split the switch using
 device-tree information into 2 parts:
 *port 6 (one of the xMII switch port) will be dedicated to one
  particular Ethernet port. From a system perspective, it will be seen as
  a regular PHY.
 *port 0 (the other xMII port) will act as the switch master interface

FYI:

I have patches which allow DSA to use two cpu interfaces. Seems to
work on my DIR665 with a Marvell Switch.

I will post the patches as an RFC.

  Andrew

Re: [PATCH 1/7] net: dsa: add new driver for ar8xxx family

2015-05-28 Thread Andrew Lunn

 +static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int port, int mode)
 +{
 + int reg;
 +
 + switch (port) {
 + case 0:
 + reg = AR8327_REG_PORT0_PAD_CTRL;
 + break;
 + case 6:
 + reg = AR8327_REG_PORT6_PAD_CTRL;
 + break;
 + default:
 + pr_err("Can't set PAD_CTRL on port %d\n", port);
 + return -EINVAL;
 + }
 +
 + /* DSA only supports 1 CPU port for now, so we'll take the assumption
 +  * that P0 is connected to the CPU master_dev.
 +  */

I don't like this assumption. Hardware I have with Marvell switches
has the CPU connected to ports 5, or 6, or 0.

Calling dsa_upstream_port() will tell you which is the CPU port.

Andrew

[PATCH net-next] bpf: add missing rcu protection when releasing programs from prog_array

2015-05-28 Thread Alexei Starovoitov

Normally the program attachment place (like sockets, qdiscs) takes
care of rcu protection and calls bpf_prog_put() after a grace period.
The programs stored inside prog_array may not be attached anywhere,
so prog_array needs to take care of preserving rcu protection.
Otherwise bpf_tail_call() will race with bpf_prog_put().
To solve that introduce bpf_prog_put_rcu() helper function and use
it in 3 places where unattached program can decrement refcnt:
closing program fd, deleting/replacing program in prog_array.

Fixes: 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs")
Reported-by: Martin Schwidefsky schwidef...@de.ibm.com
Signed-off-by: Alexei Starovoitov a...@plumgrid.com
---
 include/linux/bpf.h   |6 +-
 kernel/bpf/arraymap.c |4 ++--
 kernel/bpf/syscall.c  |   19 ++-
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8821b9a8689e..5f520f5f087e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -123,7 +123,10 @@ struct bpf_prog_aux {
const struct bpf_verifier_ops *ops;
struct bpf_map **used_maps;
struct bpf_prog *prog;
-   struct work_struct work;
+   union {
+   struct work_struct work;
+   struct rcu_head rcu;
+   };
 };
 
 struct bpf_array {
@@ -153,6 +156,7 @@ void bpf_register_map_type(struct bpf_map_type_list *tl);
 
 struct bpf_prog *bpf_prog_get(u32 ufd);
 void bpf_prog_put(struct bpf_prog *prog);
+void bpf_prog_put_rcu(struct bpf_prog *prog);
 
 struct bpf_map *bpf_map_get(struct fd f);
 void bpf_map_put(struct bpf_map *map);
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 614bcd4c1d74..cb31229a6fa4 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -202,7 +202,7 @@ static int prog_array_map_update_elem(struct bpf_map *map, 
void *key,
 
old_prog = xchg(array->prog + index, prog);
if (old_prog)
-   bpf_prog_put(old_prog);
+   bpf_prog_put_rcu(old_prog);
 
return 0;
 }
@@ -218,7 +218,7 @@ static int prog_array_map_delete_elem(struct bpf_map *map, 
void *key)
 
old_prog = xchg(array->prog + index, NULL);
if (old_prog) {
-   bpf_prog_put(old_prog);
+   bpf_prog_put_rcu(old_prog);
return 0;
} else {
return -ENOENT;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 98a69bd83069..a1b14d197a4f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -432,6 +432,23 @@ static void free_used_maps(struct bpf_prog_aux *aux)
kfree(aux->used_maps);
 }
 
+static void __prog_put_rcu(struct rcu_head *rcu)
+{
+   struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
+
+   free_used_maps(aux);
+   bpf_prog_free(aux->prog);
+}
+
+/* version of bpf_prog_put() that is called after a grace period */
+void bpf_prog_put_rcu(struct bpf_prog *prog)
+{
+   if (atomic_dec_and_test(&prog->aux->refcnt)) {
+   prog->aux->prog = prog;
+   call_rcu(&prog->aux->rcu, __prog_put_rcu);
+   }
+}
+
 void bpf_prog_put(struct bpf_prog *prog)
 {
if (atomic_dec_and_test(&prog->aux->refcnt)) {
@@ -445,7 +462,7 @@ static int bpf_prog_release(struct inode *inode, struct 
file *filp)
 {
struct bpf_prog *prog = filp->private_data;
 
-   bpf_prog_put(prog);
+   bpf_prog_put_rcu(prog);
return 0;
 }
 
-- 
1.7.9.5
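
The deferred-release idea in the commit message can be sketched in plain userspace C. This is a single-threaded illustration under stated assumptions, not the kernel implementation: struct demo_head, demo_call_rcu() and demo_grace_period_end() are hypothetical stand-ins for struct rcu_head, call_rcu() and the end of a grace period, and the refcount is a plain int rather than an atomic. It shows that dropping the last reference only queues the free, so anyone still running the program is not left with a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_head: one queued callback. */
struct demo_head {
	struct demo_head *next;
	void (*func)(struct demo_head *head);
};

struct demo_prog {
	int refcnt;		/* plain int: the demo is single-threaded */
	struct demo_head rcu;	/* embedded head, as in bpf_prog_aux */
};

static struct demo_head *pending;	/* callbacks waiting for the grace period */
static int freed;			/* number of programs actually freed */

/* Stand-in for call_rcu(): queue the callback instead of running it now. */
static void demo_call_rcu(struct demo_head *head,
			  void (*func)(struct demo_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Stand-in for the end of a grace period: run everything queued so far. */
static void demo_grace_period_end(void)
{
	while (pending) {
		struct demo_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Recover the enclosing object from the embedded head, as container_of() does. */
static void demo_prog_free_cb(struct demo_head *head)
{
	struct demo_prog *prog =
		(struct demo_prog *)((char *)head - offsetof(struct demo_prog, rcu));

	free(prog);
	freed++;
}

static struct demo_prog *demo_prog_new(int refs)
{
	struct demo_prog *prog = calloc(1, sizeof(*prog));

	if (prog)
		prog->refcnt = refs;
	return prog;
}

/* Drop one reference; the last drop defers the free rather than doing it inline. */
static void demo_prog_put_deferred(struct demo_prog *prog)
{
	assert(prog->refcnt > 0);
	if (--prog->refcnt == 0)
		demo_call_rcu(&prog->rcu, demo_prog_free_cb);
}
```

Between the final put and demo_grace_period_end() the object is still valid memory, which is the window a concurrent tail call would need.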


Re: [PATCH] net: qlcnic: clean up sysfs error codes

2015-05-28 Thread David Miller

From: Vladimir Zapolskiy v...@mleia.com
Date: Fri, 29 May 2015 04:13:46 +0300

 Hello David,

 On 29.05.2015 02:28, David Miller wrote:
 From: Vladimir Zapolskiy v...@mleia.com
 Date: Tue, 26 May 2015 03:49:45 +0300

 Replace confusing QL_STATUS_INVALID_PARAM == -1 == -EPERM with -EINVAL
 and QLC_STATUS_UNSUPPORTED_CMD == -2 == -ENOENT with -EOPNOTSUPP, the
 latter error code is arguable, but it is already used in the driver,
 so let it be here as well.

 Also remove always false (!buf) check on read(), the driver should
 not care if userspace gets its EFAULT or not.

 Signed-off-by: Vladimir Zapolskiy v...@mleia.com

 Qlogic folks, I'm waiting for your promised feedback.

 Rajesh reviewed and acked the change, thank you.

 http://www.spinics.net/lists/netdev/msg331073.html

Thanks, I missed that, applied.

Re: [PATCH net-next 1/3] net: systemport: Pre-calculate and utilize cb->bd_addr

2015-05-28 Thread Petri Gynther

On Thu, May 28, 2015 at 3:24 PM, Florian Fainelli f.faine...@gmail.com wrote:
 There is a 1:1 mapping between the software maintained control block in
 priv->rx_cbs and the buffer address in priv->rx_bds, such that there is
 no need to keep computing the buffer address when refilling a control
 block.

 Signed-off-by: Florian Fainelli f.faine...@gmail.com

Reviewed-by: Petri Gynther pgynt...@google.com

 ---
  drivers/net/ethernet/broadcom/bcmsysport.c | 18 +-
  drivers/net/ethernet/broadcom/bcmsysport.h |  2 --
  2 files changed, 9 insertions(+), 11 deletions(-)

 diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c 
 b/drivers/net/ethernet/broadcom/bcmsysport.c
 index 084a50a555de..267330ccd595 100644
 --- a/drivers/net/ethernet/broadcom/bcmsysport.c
 +++ b/drivers/net/ethernet/broadcom/bcmsysport.c
 @@ -549,12 +549,7 @@ static int bcm_sysport_rx_refill(struct bcm_sysport_priv 
 *priv,
 }

 dma_unmap_addr_set(cb, dma_addr, mapping);
 -   dma_desc_set_addr(priv, priv->rx_bd_assign_ptr, mapping);
 -
 -   priv->rx_bd_assign_index++;
 -   priv->rx_bd_assign_index &= (priv->num_rx_bds - 1);
 -   priv->rx_bd_assign_ptr = priv->rx_bds +
 -   (priv->rx_bd_assign_index * DESC_SIZE);
 +   dma_desc_set_addr(priv, cb->bd_addr, mapping);

 netif_dbg(priv, rx_status, ndev, "RX refill\n");

 @@ -568,7 +563,7 @@ static int bcm_sysport_alloc_rx_bufs(struct 
 bcm_sysport_priv *priv)
 unsigned int i;

 for (i = 0; i < priv->num_rx_bds; i++) {
 -   cb = &priv->rx_cbs[priv->rx_bd_assign_index];
 +   cb = &priv->rx_cbs[i];
 if (cb->skb)
 continue;

 @@ -1330,14 +1325,14 @@ static inline int tdma_enable_set(struct 
 bcm_sysport_priv *priv,

  static int bcm_sysport_init_rx_ring(struct bcm_sysport_priv *priv)
  {
 +   struct bcm_sysport_cb *cb;
 u32 reg;
 int ret;
 +   int i;

 /* Initialize SW view of the RX ring */
 priv->num_rx_bds = NUM_RX_DESC;
 priv->rx_bds = priv->base + SYS_PORT_RDMA_OFFSET;
 -   priv->rx_bd_assign_ptr = priv->rx_bds;
 -   priv->rx_bd_assign_index = 0;
 priv->rx_c_index = 0;
 priv->rx_read_ptr = 0;
 priv->rx_cbs = kcalloc(priv->num_rx_bds, sizeof(struct 
 bcm_sysport_cb),
 @@ -1347,6 +1342,11 @@ static int bcm_sysport_init_rx_ring(struct 
 bcm_sysport_priv *priv)
 return -ENOMEM;
 }

 +   for (i = 0; i < priv->num_rx_bds; i++) {
 +   cb = priv->rx_cbs + i;
 +   cb->bd_addr = priv->rx_bds + i * DESC_SIZE;
 +   }
 +
 ret = bcm_sysport_alloc_rx_bufs(priv);
 if (ret) {
 netif_err(priv, hw, priv->netdev, "SKB allocation failed\n");
 diff --git a/drivers/net/ethernet/broadcom/bcmsysport.h 
 b/drivers/net/ethernet/broadcom/bcmsysport.h
 index 42a4b4a0bc14..f28bf545d7f4 100644
 --- a/drivers/net/ethernet/broadcom/bcmsysport.h
 +++ b/drivers/net/ethernet/broadcom/bcmsysport.h
 @@ -663,8 +663,6 @@ struct bcm_sysport_priv {

 /* Receive queue */
 void __iomem*rx_bds;
 -   void __iomem*rx_bd_assign_ptr;
 -   unsigned intrx_bd_assign_index;
 struct bcm_sysport_cb   *rx_cbs;
 unsigned intnum_rx_bds;
 unsigned intrx_read_ptr;
 --
 2.1.0
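
The 1:1 mapping described in the commit message can be sketched in userspace as follows. The names here (struct demo_cb, demo_init_ring(), demo_refill(), DEMO_DESC_SIZE) are hypothetical and the descriptor ring is an ordinary byte array rather than device memory. The sketch shows the address being computed once at ring setup, so the refill path needs no running index or pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_DESC_SIZE 8	/* hypothetical descriptor size in bytes */

struct demo_cb {
	uint8_t *bd_addr;	/* descriptor slot paired with this control block */
};

/* Pair control block i with descriptor i once, at ring setup time. */
static void demo_init_ring(struct demo_cb *cbs, uint8_t *bds, size_t num)
{
	size_t i;

	assert(cbs != NULL && bds != NULL);
	for (i = 0; i < num; i++)
		cbs[i].bd_addr = bds + i * DEMO_DESC_SIZE;
}

/* Refill path: write through the stored address, no index arithmetic. */
static void demo_refill(struct demo_cb *cb, uint8_t value)
{
	cb->bd_addr[0] = value;
}
```

A side effect of this layout is that the ring size no longer has to be a power of two for the refill path, since the wrap-around mask is gone from it.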


Re: [PATCH net v2] openvswitch: disable LRO

2015-05-28 Thread Pravin Shelar

On Thu, May 28, 2015 at 6:04 AM, Jiri Benc jb...@redhat.com wrote:
 Currently, openvswitch tries to disable LRO from the user space. This does
 not work correctly when the device added is a vlan interface, though.
 Instead of dealing with possibly complex stacked cross name space relations
 in the user space, do the same as bridging does and call dev_disable_lro in
 the kernel.

 Signed-off-by: Jiri Benc jb...@redhat.com

Looks good.
Acked-by: Pravin B Shelar pshe...@nicira.com

Re: [PATCH net-next] neigh: Add missing rcu_assign_pointer

2015-05-28 Thread Eric Dumazet

On Fri, 2015-05-29 at 09:21 +0800, Ying Xue wrote:
 On 05/28/2015 06:13 PM, Eric Dumazet wrote:
  This patch is not needed.
  
  You really should read Documentation/RCU , because it looks like you are
  quite confused.
  
  When we remove an element from a RCU protected list, all the objects in
  the chain are already ready to be caught by rcu readers.
  
  Therefore, no additional memory barrier is needed before doing *np =
  n->next;
  
  Please do not add spurious memory barriers. Like atomic operations, we
  want all of them being required and possibly documented.
 
 
 Yes, you are right, thanks for your clear explanation :)
 However, there are still three places where we use rcu_assign_pointer() to
 remove a neigh entry from a RCU-protected list, and the three places are
 neigh_forced_gc(), neigh_flush_dev(), and __neigh_for_each_release()
 respectively. This means it's redundant for us to use rcu_assign_pointer() in
 the three places, right?

I count 5 places of redundancy. 

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 3a74df750af4044eba0e7d88ae01ca9b4dac0e72..ac3b69183cc982e722d9683d6de7a39f66b50b64 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -141,9 +141,7 @@ static int neigh_forced_gc(struct neigh_table *tbl)
write_lock(&n->lock);
if (atomic_read(&n->refcnt) == 1 &&
!(n->nud_state & NUD_PERMANENT)) {
-   rcu_assign_pointer(*np,
-   rcu_dereference_protected(n->next,
- lockdep_is_held(&tbl->lock)));
+   *np = n->next;
n->dead = 1;
shrunk  = 1;
write_unlock(&n->lock);
@@ -210,9 +208,7 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct 
net_device *dev)
np = &n->next;
continue;
}
-   rcu_assign_pointer(*np,
-  rcu_dereference_protected(n->next,
-   lockdep_is_held(&tbl->lock)));
+   *np = n->next;
write_lock(&n->lock);
neigh_del_timer(n);
n->dead = 1;
@@ -380,10 +376,8 @@ static struct neigh_hash_table *neigh_hash_grow(struct 
neigh_table *tbl,
next = rcu_dereference_protected(n->next,
lockdep_is_held(&tbl->lock));
 
-   rcu_assign_pointer(n->next,
-  rcu_dereference_protected(
-   new_nht->hash_buckets[hash],
-   lockdep_is_held(&tbl->lock)));
+   n->next = new_nht->hash_buckets[hash];
+
rcu_assign_pointer(new_nht->hash_buckets[hash], n);
}
}
@@ -515,9 +509,7 @@ struct neighbour *__neigh_create(struct neigh_table *tbl, 
const void *pkey,
n->dead = 0;
if (want_ref)
neigh_hold(n);
-   rcu_assign_pointer(n->next,
-  rcu_dereference_protected(nht->hash_buckets[hash_val],
-  lockdep_is_held(&tbl->lock)));
+   n->next = nht->hash_buckets[hash_val];
rcu_assign_pointer(nht->hash_buckets[hash_val], n);
write_unlock_bh(&tbl->lock);
neigh_dbg(2, "neigh %p is created\n", n);
@@ -2381,9 +2373,7 @@ void __neigh_for_each_release(struct neigh_table *tbl,
write_lock(&n->lock);
release = cb(n);
if (release) {
-   rcu_assign_pointer(*np,
-   rcu_dereference_protected(n->next,
-   lockdep_is_held(&tbl->lock)));
+   *np = n->next;
n->dead = 1;
} else
np = &n->next;
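
The distinction behind this diff, publishing a new node versus unlinking an old one, can be sketched with C11 atomics. This is an analogy under stated assumptions, not the kernel primitives: demo_insert() and demo_unlink() are hypothetical, a release store stands in for rcu_assign_pointer(), a relaxed store stands in for the plain assignment, and the reader side and the deferred free are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_node {
	int key;
	struct demo_node *_Atomic next;
};

/* Insert at the head: fill in the node first, then publish it with a
 * release store so a concurrent reader cannot observe a half-built node.
 * The store to n->next can be relaxed because n is not yet reachable.
 */
static void demo_insert(struct demo_node *_Atomic *head,
			struct demo_node *n, int key)
{
	n->key = key;
	atomic_store_explicit(&n->next,
			      atomic_load_explicit(head, memory_order_relaxed),
			      memory_order_relaxed);
	atomic_store_explicit(head, n, memory_order_release);
}

/* Unlink n, whose predecessor link is *np. The successor was published
 * earlier, so no newly initialised memory becomes reachable and a relaxed
 * store is sufficient.
 */
static void demo_unlink(struct demo_node *_Atomic *np, struct demo_node *n)
{
	assert(atomic_load_explicit(np, memory_order_relaxed) == n);
	atomic_store_explicit(np,
			      atomic_load_explicit(&n->next, memory_order_relaxed),
			      memory_order_relaxed);
}
```

The two insertion sites in the diff follow the same shape as demo_insert(): the write to n->next becomes a plain store, and only the store into the hash bucket keeps rcu_assign_pointer().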




Re: [PATCH 3/7] net: dsa: ar8xxx: add regmap support

2015-05-28 Thread Andrew Lunn

 Fair enough, are there other global things besides counters that could
 deserve adding maybe some sort of global/master net_device to help query
 switch-wide information?

This was discussed a while back. I like the current abstraction, all
interfaces are real interfaces you can send and receive packets
over. This pseudo interface cannot be used for packet transfer, which
seems odd.

Having access to registers is for debugging, so debugfs seems like the
best option to me.

 Andrew

Re: [PATCH] netevent: remove automatic variable in register_netevent_notifier()

2015-05-28 Thread long.wanglong

On 2015/5/28 22:07, Sergei Shtylyov wrote:
 Hello.
 
 On 5/28/2015 1:00 PM, Wang Long wrote:
 
 Remove automatic variable 'err' in register_netevent_notifier() and
 return the return value of atomic_notifier_chain_register() directly.
 
s/return value/result/, in order to avoid tautology.
 
 Signed-off-by: Wang Long long.wangl...@huawei.com
 [...]
 
 WBR, Sergei
 
 
 

Thanks, I will fix that.

Best Regards
Wang Long


[PATCH 0/7] net: dsa: add QCA AR8xxx switch family support

2015-05-28 Thread Mathieu Olivari

This patch set adds initial support for AR8xxx switches using the DSA
subsystem. It currently supports QCA8337 switch, and can be extended to
other hardware in the same family.

This switch was already discussed in the following thread:
https://www.marc.info/?t=14260141744r=1w=2

Below is a typical picture of a QCA8337 used in a standard home gateway
configuration:

+-----------+       +-----------+
|           | SGMII |           |
|       eth0+-------+           +------ 1000baseT MDI (WAN)
|        wan|       |  7-port   +------ 1000baseT MDI (LAN1)
|   CPU     |       |  ethernet +------ 1000baseT MDI (LAN2)
|           | RGMII |  switch   +------ 1000baseT MDI (LAN3)
|       eth1+-------+  w/5 PHYs +------ 1000baseT MDI (LAN4)
|        lan|       |           |
+-----------+       +-----------+
              |   MDIO     |
              \------------/

The switch is connected to the CPU using 2 xMII interfaces. As DSA only
supports one logical interface to the switch, we split the switch using
device-tree information into 2 parts:
*port 6 (one of the xMII switch port) will be dedicated to one
 particular Ethernet port. From a system perspective, it will be seen as
 a regular PHY.
*port 0 (the other xMII port) will act as the switch master interface

When 2 xMII are used, the switch will therefore be seen as 2 devices: 1
PHY + 1 DSA switch. The configuration of this split is done using driver
specific options in device-tree.

The exact properties are detailed in the Documentation patch below.

Mathieu Olivari (7):
  net: dsa: add new driver for ar8xxx family
  net: dsa: ar8xxx: add ethtool hw statistics support
  net: dsa: ar8xxx: add regmap support
  net: dsa: add QCA tag support
  net: dsa: ar8xxx: enable QCA header support on AR8xxx
  net: dsa: ar8xxx: add support for second xMII interfaces through DT
  Documentation: devicetree: add ar8xxx binding

 .../devicetree/bindings/net/dsa/qca-ar8xxx.txt |  70 +++
 drivers/net/dsa/Kconfig|   9 +
 drivers/net/dsa/Makefile   |   1 +
 drivers/net/dsa/ar8xxx.c   | 530 +
 drivers/net/dsa/ar8xxx.h   | 157 ++
 include/net/dsa.h  |   1 +
 net/dsa/Kconfig|   3 +
 net/dsa/Makefile   |   1 +
 net/dsa/dsa.c  |   6 +
 net/dsa/dsa_priv.h |   2 +
 net/dsa/slave.c|   5 +
 net/dsa/tag_qca.c  | 159 +++
 12 files changed, 944 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
 create mode 100644 drivers/net/dsa/ar8xxx.c
 create mode 100644 drivers/net/dsa/ar8xxx.h
 create mode 100644 net/dsa/tag_qca.c

-- 
2.1.4


Re: [PATCH 7/7] Documentation: devicetree: add ar8xxx binding

2015-05-28 Thread Florian Fainelli

On 05/28/15 18:42, Mathieu Olivari wrote:
 Add device-tree binding for ar8xxx switch families.
 
 Signed-off-by: Mathieu Olivari math...@codeaurora.org
 ---
  .../devicetree/bindings/net/dsa/qca-ar8xxx.txt | 70 ++
  1 file changed, 70 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
 
 diff --git a/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt b/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
 new file mode 100644
 index 000..f4fd3f1
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
 @@ -0,0 +1,70 @@
 +* Qualcomm Atheros AR8xxx switch family
 +
 +Required properties:
 +
 +- compatible: should be "qca,ar8xxx"
 +- dsa,mii-bus: phandle to the MDIO bus controller, see dsa/dsa.txt
 +- dsa,ethernet: phandle to the CPU network interface controller, see dsa/dsa.txt
 +- #size-cells: must be 0
 +- #address-cells: must be 2, see dsa/dsa.txt
 +
 +Subnodes:
 +
 +The integrated switch subnode should be specified according to the binding
 +described in dsa/dsa.txt.
 +
 +Optional properties:
 +
 +- qca,port6-phy-mode: if specified, the driver will configure Port 6 in the
 +  given phy-mode. See Documentation/devicetree/bindings/net/ethernet.txt for
 +  the list of valid phy-mode.

Is there a reason why this is a custom property and not a standard
phy-mode property here such that you could utilize of_get_phy_mode()
with this directly?

 +
 +- qca,port6-phy-id: if specified, the driver will connect Port 6 to the PHY
 +  given as a parameter. In this case, Port6 and the corresponding PHY will be
 +  isolated from the rest of the switch. From a system perspective, they will
 +  act as a regular PHY.

Same here, is there a reason why this is not a phy-handle property to
a PHY node that sits on a (potentially different) MDIO bus?

 +
 +Example:
 +
 +	dsa@0 {
 +		compatible = "qca,ar8xxx";
 +		#address-cells = <2>;
 +		#size-cells = <0>;
 +
 +		dsa,ethernet = <&ethernet0>;
 +		dsa,mii-bus = <&mii_bus0>;
 +
 +		switch@0 {
 +			#address-cells = <1>;
 +			#size-cells = <0>;
 +			reg = <0 0>;	/* MDIO address 0, switch 0 in tree */
 +
 +			qca,port6-phy-mode = "sgmii";
 +			qca,port6-phy-id = <4>;
 +
 +			port@0 {
 +				reg = <11>;
 +				label = "cpu";
 +			};
 +
 +			port@1 {
 +				reg = <0>;
 +				label = "lan1";
 +			};
 +
 +			port@2 {
 +				reg = <1>;
 +				label = "lan2";
 +			};
 +
 +			port@3 {
 +				reg = <2>;
 +				label = "lan3";
 +			};
 +
 +			port@4 {
 +				reg = <3>;
 +				label = "lan4";
 +			};
 +		};
 +	};
 


-- 
Florian

Re: [V5 PATCH 2/5] arm64 : Introduce support for ACPI _CCA object

2015-05-28 Thread Mark Salter

On Wed, 2015-05-20 at 17:09 -0500, Suravee Suthikulpanit wrote:
 From http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf,
 section 6.2.17 _CCA states that ARM platforms require ACPI _CCA
 object to be specified for DMA-capable devices. Therefore, this patch
 specifies ACPI_CCA_REQUIRED in arm64 Kconfig.
 
 In addition, to handle the case when _CCA is missing, arm64 would assign
 dummy_dma_ops to disable DMA capability of the device.
 
 Acked-by: Catalin Marinas catalin.mari...@arm.com
 Signed-off-by: Mark Salter msal...@redhat.com
 Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
 ---
  arch/arm64/Kconfig   |  1 +
  arch/arm64/include/asm/dma-mapping.h | 18 ++-
  arch/arm64/mm/dma-mapping.c  | 92 
 
  3 files changed, 109 insertions(+), 2 deletions(-)
 
 diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
 index 4269dba..95307b4 100644
 --- a/arch/arm64/Kconfig
 +++ b/arch/arm64/Kconfig
 @@ -1,5 +1,6 @@
  config ARM64
   def_bool y
 + select ACPI_CCA_REQUIRED if ACPI
   select ACPI_GENERIC_GSI if ACPI
   select ACPI_REDUCED_HARDWARE_ONLY if ACPI
   select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
 diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
 index 9437e3d..f0d6d0b 100644
 --- a/arch/arm64/include/asm/dma-mapping.h
 +++ b/arch/arm64/include/asm/dma-mapping.h
 @@ -18,6 +18,7 @@
  
  #ifdef __KERNEL__
  
 +#include <linux/acpi.h>
  #include <linux/types.h>
  #include <linux/vmalloc.h>
  

^^^ This hunk causes build issues with a couple of drivers:

drivers/scsi/megaraid/megaraid_sas_fp.c:69:0: warning: "FALSE" redefined [enabled by default]
 #define FALSE 0
 ^
In file included from include/acpi/acpi.h:58:0,
                 from include/linux/acpi.h:37,
                 from ./arch/arm64/include/asm/dma-mapping.h:21,
                 from include/linux/dma-mapping.h:86,
                 from ./arch/arm64/include/asm/pci.h:7,
                 from include/linux/pci.h:1460,
                 from drivers/scsi/megaraid/megaraid_sas_fp.c:37:
include/acpi/actypes.h:433:0: note: this is the location of the previous definition
 #define FALSE   (1 == 0)
 ^


In file included from include/acpi/acpi.h:58:0,
                 from include/linux/acpi.h:37,
                 from ./arch/arm64/include/asm/dma-mapping.h:21,
                 from include/linux/dma-mapping.h:86,
                 from include/scsi/scsi_cmnd.h:4,
                 from drivers/scsi/ufs/ufshcd.h:60,
                 from drivers/scsi/ufs/ufshcd.c:43:
include/acpi/actypes.h:433:41: error: expected identifier before ‘(’ token
 #define FALSE   (1 == 0)
 ^
drivers/scsi/ufs/unipro.h:203:2: note: in expansion of macro ‘FALSE’
  FALSE = 0,
  ^

This happens because the ACPI definitions of TRUE and FALSE conflict
with the local definitions in megaraid and with an enum declaration in ufs.
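
To make the two failure modes concrete, here is a minimal userspace sketch.
It is not taken from either driver: the DEMO_* names and both functions are
hypothetical, and the guard and the prefixed enumerators are only one
possible way to avoid the clash, not a fix anyone has proposed in this
thread.

```c
#include <assert.h>

/* Stand-ins for the ACPI definitions in include/acpi/actypes.h. */
#define FALSE (1 == 0)
#define TRUE  (1 == 1)

/*
 * megaraid-style local fallback. Without the #ifndef guard this would
 * redefine FALSE with a different replacement list and trigger the
 * "redefined" warning quoted above.
 */
#ifndef FALSE
#define FALSE 0
#endif

/*
 * ufs-style enum. Writing `FALSE = 0` here would not compile, because
 * the preprocessor expands it to `(1 == 0) = 0`. Prefixed enumerators
 * (hypothetical names) are not touched by the macros.
 */
enum demo_bool {
	DEMO_FALSE = 0,
	DEMO_TRUE,
};

/* Returns 1 when the enum values agree with the macro values. */
int demo_values_agree(void)
{
	return DEMO_FALSE == FALSE && DEMO_TRUE == TRUE;
}

/* Aborts if the enum and the macros ever disagree. */
void demo_self_check(void)
{
	assert(demo_values_agree());
}
```

Both macros still evaluate to the integers 0 and 1, so code that only
compares against them keeps working; only the places that use FALSE or TRUE
as an identifier are affected.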


 @@ -28,13 +29,23 @@
  
  #define DMA_ERROR_CODE   (~(dma_addr_t)0)
  extern struct dma_map_ops *dma_ops;
 +extern struct dma_map_ops dummy_dma_ops;
  
  static inline struct dma_map_ops *__generic_dma_ops(struct device *dev)
  {
 -	if (unlikely(!dev) || !dev->archdata.dma_ops)
 +	if (unlikely(!dev))
  		return dma_ops;
 -	else
 +	else if (dev->archdata.dma_ops)
  		return dev->archdata.dma_ops;
 +	else if (acpi_disabled)
 +		return dma_ops;
 +
 +	/*
 +	 * When ACPI is enabled, if arch_set_dma_ops is not called,
 +	 * we will disable device DMA capability by setting it
 +	 * to dummy_dma_ops.
 +	 */
 +	return &dummy_dma_ops;
  }
  
  static inline struct dma_map_ops *get_dma_ops(struct device *dev)
 @@ -48,6 +59,9 @@ static inline struct dma_map_ops *get_dma_ops(struct device *dev)
  static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
  				      struct iommu_ops *iommu, bool coherent)
  {
 +	if (!acpi_disabled && !dev->archdata.dma_ops)
 +		dev->archdata.dma_ops = dma_ops;
 +
  	dev->archdata.dma_coherent = coherent;
  }
  #define arch_setup_dma_ops   arch_setup_dma_ops
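
For readers following the fallback order in the hunk above, a simplified
userspace model may help. This is only a sketch: the struct layouts,
pick_dma_ops(), setup_dma_ops(), default_ops and dummy_ops are hypothetical
stand-ins for the kernel's types, __generic_dma_ops(), arch_setup_dma_ops(),
dma_ops and dummy_dma_ops, and acpi_disabled is passed as a parameter
instead of being a global.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; not the real kernel definitions. */
struct dma_map_ops { const char *name; };
struct device { struct dma_map_ops *dma_ops; }; /* models dev->archdata.dma_ops */

struct dma_map_ops default_ops = { "default" };
struct dma_map_ops dummy_ops = { "dummy" };

/* Models the selection order of __generic_dma_ops() in the patch. */
struct dma_map_ops *pick_dma_ops(const struct device *dev, bool acpi_disabled)
{
	if (!dev)
		return &default_ops;	/* no device: global default */
	if (dev->dma_ops)
		return dev->dma_ops;	/* per-device ops already set */
	if (acpi_disabled)
		return &default_ops;	/* DT boot: keep the old behaviour */
	return &dummy_ops;		/* ACPI boot, setup never called */
}

/* Models the lines added to arch_setup_dma_ops() in the patch. */
void setup_dma_ops(struct device *dev, bool acpi_disabled)
{
	assert(dev != NULL);
	if (!acpi_disabled && !dev->dma_ops)
		dev->dma_ops = &default_ops;
}
```

In this model a device on an ACPI system only gets working ops after
setup_dma_ops() has run for it; until then every lookup lands on the dummy
ops.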
 diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
 index ef7d112..6e6d6ad 100644
 --- a/arch/arm64/mm/dma-mapping.c
 +++ b/arch/arm64/mm/dma-mapping.c
 @@ -415,6 +415,98 @@ out:
   return -ENOMEM;
  }
  
 +/********************************************
 + * The following APIs are for dummy DMA ops *
 + ********************************************/
 +
 +static void *__dummy_alloc(struct device *dev, size_t size,
 +dma_addr_t *dma_handle, gfp_t flags,
 +struct dma_attrs *attrs)
 +{
 + return NULL;
 +}
 +
 +static void __dummy_free(struct device *dev, size_t size,
 +  void *vaddr, dma_addr_t dma_handle,
 +  struct dma_attrs

[PATCH 0/4] Netfilter updates for net-next

2015-05-28 Thread Pablo Neira Ayuso

Hi David,

The following patchset contains Netfilter updates for net-next, they are:

1) Default CONFIG_NETFILTER_INGRESS to y for easier compile-testing of all
   options.

2) Allow binding a table to a net_device. This introduces the internal
   NFT_AF_NEEDS_DEV flag to perform a mandatory check for this binding.
   This is required by the next patch.

3) Add the 'netdev' table family. This new table allows you to create ingress
   filter basechains, which provides access to the existing nf_tables features
   from ingress.

4) Kill the unused argument from compat_find_calc_{match,target} in ip_tables
   and ip6_tables, from Florian Westphal.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks!



The following changes since commit 76d7c457659dfc05d5a23cd0b21fea333d1788cd:

  Merge branch 'icmp_frag' (2015-05-19 00:15:50 -0400)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git master

for you to fetch changes up to ed6c4136f1571bd6ab362afc3410905a8a69ca42:

  netfilter: nf_tables: add netdev table to filter from ingress (2015-05-26 18:41:23 +0200)


Florian Westphal (1):
  netfilter: remove unused comefrom hookmask argument

Pablo Neira Ayuso (3):
  netfilter: default CONFIG_NETFILTER_INGRESS to y
  netfilter: nf_tables: allow to bind table to net_device
  netfilter: nf_tables: add netdev table to filter from ingress

 include/net/netfilter/nf_tables.h        |    8 ++
 include/net/netns/nftables.h             |    1 +
 include/uapi/linux/netfilter/nf_tables.h |    2 +
 net/ipv4/netfilter/ip_tables.c           |    4 +-
 net/ipv6/netfilter/ip6_tables.c          |    4 +-
 net/netfilter/Kconfig                    |    6 +
 net/netfilter/Makefile                   |    1 +
 net/netfilter/nf_tables_api.c            |   46 +++-
 net/netfilter/nf_tables_netdev.c         |  183 ++
 9 files changed, 244 insertions(+), 11 deletions(-)
 create mode 100644 net/netfilter/nf_tables_netdev.c