Re: [PATCH v4 net-next 05/11] net: Add full IPv6 addresses to flow_keys
On Thu, May 28, 2015 at 2:44 PM, Eric Dumazet eric.duma...@gmail.com wrote:
> On Thu, 2015-05-28 at 11:19 -0700, Tom Herbert wrote:
>> @@ -566,11 +640,15 @@ static const struct flow_dissector_key flow_keys_dissector_keys[] = {
>>  	},
>>  	{
>>  		.key_id = FLOW_DISSECTOR_KEY_IPV4_ADDRS,
>> -		.offset = offsetof(struct flow_keys, addrs),
>> +		.offset = offsetof(struct flow_keys, addrs.v4addrs),
>> +	},
>> +	{
>> +		.key_id = FLOW_DISSECTOR_KEY_IPV6_ADDRS,
>> +		.offset = offsetof(struct flow_keys, addrs.v6addrs),
>>  	},
>>  	{
>>  		.key_id = FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS,
>> -		.offset = offsetof(struct flow_keys, addrs),
>> +		.offset = offsetof(struct flow_keys, addrs.v4addrs),
>
> Shouldn't it be offsetof(struct flow_keys, addrs.v6addrs), ?

This is to hash 128 bit IP addresses into 32 bit values which fit in the v4addrs area. This completely goes away in 07 patch in this set.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
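The reply above says that a 128 bit address is hashed down to a 32 bit value that fits in the v4addrs area. The following is only a userspace sketch of that idea, assuming a plain XOR fold of the four 32-bit words; the name fold_v6_addr is hypothetical, and the hash actually used by the patch set may differ.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: fold a 128-bit IPv6
 * address, given as four 32-bit words, into a single 32-bit value by
 * XOR-ing the words together. */
static uint32_t fold_v6_addr(const uint32_t addr[4])
{
	return addr[0] ^ addr[1] ^ addr[2] ^ addr[3];
}
```

The folded value has the same size as an IPv4 address, which is why it can be stored at the v4addrs offset.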
[PATCH net-next 0/2] hv_netvsc: Implement NUMA aware memory allocation
Allocate both receive buffer and send buffer from the NUMA node assigned to the primary channel.

K. Y. Srinivasan (2):
  hv_netvsc: Allocate the receive buffer from the correct NUMA node
  hv_netvsc: Allocate the sendbuf in a NUMA aware way

 drivers/net/hyperv/netvsc.c | 11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

--
1.7.4.1
[PATCH V2 net-next 1/2] hv_netvsc: Allocate the receive buffer from the correct NUMA node
Allocate the receive buffer from the NUMA node assigned to the primary channel.

Signed-off-by: K. Y. Srinivasan k...@microsoft.com
---
V2: Specify the tree for this patch.

 drivers/net/hyperv/netvsc.c |    7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index b024968..d187965 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -227,13 +227,18 @@ static int netvsc_init_buf(struct hv_device *device)
 	struct netvsc_device *net_device;
 	struct nvsp_message *init_packet;
 	struct net_device *ndev;
+	int node;

 	net_device = get_outbound_net_device(device);
 	if (!net_device)
 		return -ENODEV;
 	ndev = net_device->ndev;

-	net_device->recv_buf = vzalloc(net_device->recv_buf_size);
+	node = cpu_to_node(device->channel->target_cpu);
+	net_device->recv_buf = vzalloc_node(net_device->recv_buf_size, node);
+	if (!net_device->recv_buf)
+		net_device->recv_buf = vzalloc(net_device->recv_buf_size);
+
 	if (!net_device->recv_buf) {
 		netdev_err(ndev, "unable to allocate receive buffer of size %d\n",
 			net_device->recv_buf_size);
--
1.7.4.1
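The allocation pattern in the patch above (try the preferred NUMA node first, then fall back to an unconstrained allocation) can be modelled in userspace roughly as follows. This is a sketch under stated assumptions: alloc_zeroed_on_node is a hypothetical stand-in for a node-local allocator that only succeeds on node 0, and calloc stands in for the unconstrained allocation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node-local allocator, for illustration only: pretends
 * that only node 0 has free memory and fails for any other node. */
static void *alloc_zeroed_on_node(size_t size, int node)
{
	if (node != 0)
		return NULL;
	return calloc(1, size);
}

/* Try the preferred node first; if that fails, fall back to an
 * allocation with no placement constraint. *used_fallback reports
 * which path produced the buffer. */
static void *alloc_buf_prefer_node(size_t size, int node, int *used_fallback)
{
	void *buf = alloc_zeroed_on_node(size, node);

	*used_fallback = 0;
	if (!buf) {
		buf = calloc(1, size);
		*used_fallback = 1;
	}
	return buf;
}
```

The caller still has to check the final result for NULL, as the patch does after the fallback.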
[PATCH net-next 2/2] hv_netvsc: Allocate the sendbuf in a NUMA aware way
Allocate the send buffer in a NUMA aware way.

Signed-off-by: K. Y. Srinivasan k...@microsoft.com
---
 drivers/net/hyperv/netvsc.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index d187965..06de98a 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -326,7 +326,9 @@ static int netvsc_init_buf(struct hv_device *device)

 	/* Now setup the send buffer. */
-	net_device->send_buf = vzalloc(net_device->send_buf_size);
+	net_device->send_buf = vzalloc_node(net_device->send_buf_size, node);
+	if (!net_device->send_buf)
+		net_device->send_buf = vzalloc(net_device->send_buf_size);
 	if (!net_device->send_buf) {
 		netdev_err(ndev, "unable to allocate send buffer of size %d\n",
 			net_device->send_buf_size);
--
1.7.4.1
Re: iproute2: missing patches in branch net-next
On Thu, 28 May 2015 18:32:32 +0200 Daniel Borkmann dan...@iogearbox.net wrote:
> On 05/28/2015 06:19 PM, Stephen Hemminger wrote:
>> On Thu, 28 May 2015 13:31:08 +0200 Nicolas Dichtel nicolas.dich...@6wind.com wrote:
>>> Hi Stephen,
>>>
>>> some patches that were recently included in iproute2 branch net-next are not
>>> visible anymore on kernel.org. It seems that the branch has been overridden
>>> (note the forced update when I've fetched it):
>>>
>>> $ git fetch
>>> remote: Counting objects: 65, done.
>>> remote: Compressing objects: 100% (65/65), done.
>>> remote: Total 65 (delta 58), reused 0 (delta 0)
>>> Unpacking objects: 100% (65/65), done.
>>> From git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2
>>>  + aacee2695a90...eb9d6e794b52 net-next -> origin/net-next (forced update)
>>>    f043759dd492..c52827e9077f master -> origin/master
>>>
>>> The following patches are lost:
>>> aacee2695a90 tc: gred: Add support for TCA_GRED_LIMIT attribute
>>> b6ec53e3008a xfrmmonitor: allows to monitor in several netns
>>> 449b824ad196 ipmonitor: allows to monitor in several netns
>>> 3b0006f8183e ipmonitor: introduce print_headers
>>> 0628cddd9d5c libnetlink: introduce rtnl_listen_filter_t
>>> 2503247d58c3 man: update ip monitor page
>>> 6fc1f8add30b iplink_bond: add support for ad_actor and port_key options
>>> df1c7d9138ea codel: add ce_threshold support to codel & fc_codel
>>> 30eb304ecd1d tc: add support for Flower classifier
>>> 1a4dda7103bc ss: add support for bytes_acked & bytes_received
>>> 908755dc49df iproute2: GENEVE support
>>> f9b004020a89 Merge branch 'master' into net-next
>>> 8f42ceaf2491 Update kernels for net-next
>>>
>>> Regards,
>>> Nicolas
>>
>> Ah found it was botched merge. The commits were still there locally.
>> Should be fixed now, but had to force back to known good state on net-next branch.
>
> Okay, but now the iproute2 -next patches from last days are gone, right?
> I noticed the tc man page bits applied from yesterday are not in the
> -next tree anymore. Do you re-push those on top of the current restored
> state?
>
> Thanks,
> Daniel

I will go back and recreate what is missing.
Sorry for the confusion.
Re: iproute2: missing patches in branch net-next
On 05/29/2015 01:12 AM, Stephen Hemminger wrote:
...
> I will go back and recreate what is missing.
> Sorry for the confusion.

Great thanks, no problem.
[PATCH 6/9] ssb: drop unneeded goto
From: Julia Lawall julia.law...@lip6.fr

Delete jump to a label on the next line, when that label is not used elsewhere.

A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier l;
@@

-if (...) goto l;
-l:
// </smpl>

Also drop the unneeded err variable.

Signed-off-by: Julia Lawall julia.law...@lip6.fr
---
 drivers/ssb/pci.c |    8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/ssb/pci.c b/drivers/ssb/pci.c
index 0f28c08..d6ca4d3 100644
--- a/drivers/ssb/pci.c
+++ b/drivers/ssb/pci.c
@@ -1173,17 +1173,11 @@ void ssb_pci_exit(struct ssb_bus *bus)
 int ssb_pci_init(struct ssb_bus *bus)
 {
 	struct pci_dev *pdev;
-	int err;

 	if (bus->bustype != SSB_BUSTYPE_PCI)
 		return 0;

 	pdev = bus->host_pci;
 	mutex_init(&bus->sprom_mutex);
-	err = device_create_file(&pdev->dev, &dev_attr_ssb_sprom);
-	if (err)
-		goto out;
-
-out:
-	return err;
+	return device_create_file(&pdev->dev, &dev_attr_ssb_sprom);
 }
[RFC 1/3] net: dsa: add basic support for VLAN ndo
This patch adds the ndo_vlan_rx_add_vid, ndo_vlan_rx_kill_vid, and ndo_bridge_setlink wrapper operations, used to create and remove VLAN entries in a DSA switch VLAN database.

The switch drivers have to implement the port_vlan_add, port_vlan_kill, and port_bridge_setlink functions, in order to support VLANs.

Signed-off-by: Vivien Didelot vivien.dide...@savoirfairelinux.com
---
 include/net/dsa.h |  9 +++
 net/dsa/slave.c   | 76 +--
 2 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index fbca63b..cf02357 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -19,6 +19,7 @@
 #include <linux/phy.h>
 #include <linux/phy_fixed.h>
 #include <linux/ethtool.h>
+#include <uapi/linux/if_bridge.h>

 enum dsa_tag_protocol {
 	DSA_TAG_PROTO_NONE = 0,
@@ -302,6 +303,14 @@ struct dsa_switch_driver {
 				const unsigned char *addr, u16 vid);
 	int	(*fdb_getnext)(struct dsa_switch *ds, int port,
 			       unsigned char *addr, bool *is_static);
+
+	/*
+	 * VLAN support
+	 */
+	int	(*port_vlan_add)(struct dsa_switch *ds, int port, u16 vid);
+	int	(*port_vlan_kill)(struct dsa_switch *ds, int port, u16 vid);
+	int	(*port_bridge_setlink)(struct dsa_switch *ds, int port,
+				       struct bridge_vlan_info *vinfo);
 };

 void register_switch_driver(struct dsa_switch_driver *type);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 827cda56..72c3ff0 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -412,6 +412,71 @@ static netdev_tx_t dsa_slave_notag_xmit(struct sk_buff *skb,
 	return NETDEV_TX_OK;
 }

+static int dsa_slave_vlan_rx_add_vid(struct net_device *dev,
+				     __be16 proto, u16 vid)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+
+	if (!ds->drv->port_vlan_add)
+		return -EOPNOTSUPP;
+
+	netdev_dbg(dev, "adding to VLAN %d\n", vid);
+
+	return ds->drv->port_vlan_add(ds, p->port, vid);
+}
+
+static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev,
+				      __be16 proto, u16 vid)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+
+	if (!ds->drv->port_vlan_kill)
+		return -EOPNOTSUPP;
+
+	netdev_dbg(dev, "removing from VLAN %d\n", vid);
+
+	return ds->drv->port_vlan_kill(ds, p->port, vid);
+}
+
+static int dsa_slave_bridge_setlink(struct net_device *dev,
+				    struct nlmsghdr *nlh, u16 flags)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_switch *ds = p->parent;
+	struct nlattr *afspec;
+	struct nlattr *attr;
+	struct bridge_vlan_info *vinfo = NULL;
+	int rem;
+
+	if (!ds->drv->port_bridge_setlink)
+		return -EOPNOTSUPP;
+
+	afspec = nlmsg_find_attr(nlh, sizeof(struct ifinfomsg), IFLA_AF_SPEC);
+	if (!afspec)
+		return -EINVAL;
+
+	nla_for_each_nested(attr, afspec, rem) {
+		if (nla_type(attr) != IFLA_BRIDGE_VLAN_INFO)
+			continue;
+
+		if (nla_len(attr) != sizeof(struct bridge_vlan_info))
+			return -EINVAL;
+
+		vinfo = nla_data(attr);
+	}
+
+	if (!vinfo)
+		return -EINVAL;
+
+	netdev_dbg(dev, "setting link to VLAN %d%s%s\n", vinfo->vid,
+		   vinfo->flags & BRIDGE_VLAN_INFO_UNTAGGED ? " untagged" : "",
+		   vinfo->flags & BRIDGE_VLAN_INFO_PVID ? " (default)" : "");
+
+	return ds->drv->port_bridge_setlink(ds, p->port, vinfo);
+}
+
 /* ethtool operations ***/
 static int
@@ -673,6 +738,9 @@ static const struct net_device_ops dsa_slave_netdev_ops = {
 	.ndo_fdb_dump		= dsa_slave_fdb_dump,
 	.ndo_do_ioctl		= dsa_slave_ioctl,
 	.ndo_get_iflink		= dsa_slave_get_iflink,
+	.ndo_vlan_rx_add_vid	= dsa_slave_vlan_rx_add_vid,
+	.ndo_vlan_rx_kill_vid	= dsa_slave_vlan_rx_kill_vid,
+	.ndo_bridge_setlink	= dsa_slave_bridge_setlink,
 };

 static const struct swdev_ops dsa_slave_swdev_ops = {
@@ -854,7 +922,9 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
 	if (slave_dev == NULL)
 		return -ENOMEM;

-	slave_dev->features = master->vlan_features;
+	slave_dev->features = master->vlan_features |
+			      NETIF_F_VLAN_FEATURES |
+			      NETIF_F_HW_SWITCH_OFFLOAD;
 	slave_dev->ethtool_ops = &dsa_slave_ethtool_ops;
 	eth_hw_addr_inherit(slave_dev, master);
 	slave_dev->tx_queue_len = 0;
@@ -863,7 +933,9 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
 	SET_NETDEV_DEV(slave_dev, parent);
[RFC 2/3] net: dsa: mv88e6xxx: add support for VTU operations
This commit implements the port_vlan_add, port_vlan_kill, and port_bridge_setlink dsa_switch_driver functions to access the VTU, and thus add support for adding, removing VLANs, and joining ports to them.

Signed-off-by: Vivien Didelot vivien.dide...@savoirfairelinux.com
---
 drivers/net/dsa/mv88e6xxx.c | 309
 drivers/net/dsa/mv88e6xxx.h |  28
 2 files changed, 337 insertions(+)

diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
index cf309aa9..2f4c99f 100644
--- a/drivers/net/dsa/mv88e6xxx.c
+++ b/drivers/net/dsa/mv88e6xxx.c
@@ -2,6 +2,9 @@
  * net/dsa/mv88e6xxx.c - Marvell 88e6xxx switch chip support
  * Copyright (c) 2008 Marvell Semiconductor
  *
+ * Copyright (c) 2015 CMC Electronics, Inc.
+ * Added support for 802.1q VTU operations
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
@@ -1241,6 +1244,312 @@ static void mv88e6xxx_bridge_work(struct work_struct *work)
 	}
 }

+static int _mv88e6xxx_vtu_wait(struct dsa_switch *ds)
+{
+	return _mv88e6xxx_wait(ds, REG_GLOBAL, GLOBAL_VTU_OP,
+			       GLOBAL_VTU_OP_BUSY);
+}
+
+static int _mv88e6xxx_vtu_cmd(struct dsa_switch *ds, u16 op)
+{
+	int ret;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_OP, op);
+	if (ret < 0)
+		return ret;
+
+	return _mv88e6xxx_vtu_wait(ds);
+}
+
+static int _mv88e6xxx_stu_loadpurge(struct dsa_switch *ds, u8 sid, bool valid)
+{
+	int ret, data;
+
+	ret = _mv88e6xxx_vtu_wait(ds);
+	if (ret < 0)
+		return ret;
+
+	data = sid & GLOBAL_VTU_SID_MASK;
+	if (valid)
+		data |= GLOBAL_VTU_VID_VALID;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID, data);
+	if (ret < 0)
+		return ret;
+
+	/* Unused (yet) data registers */
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3, 0);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_4_7, 0);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_8_11, 0);
+	if (ret < 0)
+		return ret;
+
+	return _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_STU_LOAD_PURGE);
+}
+
+static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds, u16 vid,
+				  struct mv88e6xxx_vtu_entry *entry)
+{
+	int ret, i;
+
+	ret = _mv88e6xxx_vtu_wait(ds);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID,
+				   vid & GLOBAL_VTU_VID_MASK);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_VTU_GET_NEXT);
+	if (ret < 0)
+		return ret;
+
+	ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_VID);
+	if (ret < 0)
+		return ret;
+
+	entry->vid = ret & GLOBAL_VTU_VID_MASK;
+	entry->valid = !!(ret & GLOBAL_VTU_VID_VALID);
+
+	if (entry->valid) {
+		/* Ports 0-3, offsets 0, 4, 8, 12 */
+		ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3);
+		if (ret < 0)
+			return ret;
+
+		for (i = 0; i < 4; ++i)
+			entry->tags[i] = (ret >> (i * 4)) & 3;
+
+		/* Ports 4-6, offsets 0, 4, 8 */
+		ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_DATA_4_7);
+		if (ret < 0)
+			return ret;
+
+		for (i = 4; i < 7; ++i)
+			entry->tags[i] = (ret >> ((i - 4) * 4)) & 3;
+
+		ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_FID);
+		if (ret < 0)
+			return ret;
+
+		entry->fid = ret & GLOBAL_VTU_FID_MASK;
+
+		ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_SID);
+		if (ret < 0)
+			return ret;
+
+		entry->sid = ret & GLOBAL_VTU_SID_MASK;
+	}
+
+	return 0;
+}
+
+static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
+				    struct mv88e6xxx_vtu_entry *entry)
+{
+	u16 data = 0;
+	int ret, i;
+
+	ret = _mv88e6xxx_vtu_wait(ds);
+	if (ret < 0)
+		return ret;
+
+	if (entry->valid) {
+		/* Set Data Register, ports 0-3, offsets 0, 4, 8, 12 */
+		for (data = i = 0; i < 4; ++i)
+			data |= entry->tags[i] << (i * 4);
+		ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_DATA_0_3,
+					   data);
+		if (ret < 0)
+			return ret;
+
+		/* Set
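The comments in the patch above describe the data register layout: each port of a four-port group has a 2-bit tag stored at bit offsets 0, 4, 8 and 12. A standalone sketch of that packing and unpacking follows. The function names are hypothetical, and the extra masking on the pack side is an addition for safety that the patch itself does not show.

```c
#include <stdint.h>

/* Pack the 2-bit tags of four ports into one 16-bit register value:
 * port n of the group is stored at bit offset n * 4. */
static uint16_t vtu_pack_tags(const uint8_t tags[4])
{
	uint16_t data = 0;
	int i;

	for (i = 0; i < 4; ++i)
		data |= (uint16_t)((tags[i] & 3) << (i * 4));
	return data;
}

/* Inverse operation: extract each port's 2-bit tag from the value. */
static void vtu_unpack_tags(uint16_t data, uint8_t tags[4])
{
	int i;

	for (i = 0; i < 4; ++i)
		tags[i] = (data >> (i * 4)) & 3;
}
```

Only two of every four bits carry the tag; the sketch leaves the remaining bits zero.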
[PATCH 1/1] hv_netvsc: Allocate the receive buffer from the correct NUMA node
Allocate the receive buffer from the NUMA node assigned to the primary channel.

Signed-off-by: K. Y. Srinivasan k...@microsoft.com
---
 drivers/net/hyperv/netvsc.c |    7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index b024968..d187965 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -227,13 +227,18 @@ static int netvsc_init_buf(struct hv_device *device)
 	struct netvsc_device *net_device;
 	struct nvsp_message *init_packet;
 	struct net_device *ndev;
+	int node;

 	net_device = get_outbound_net_device(device);
 	if (!net_device)
 		return -ENODEV;
 	ndev = net_device->ndev;

-	net_device->recv_buf = vzalloc(net_device->recv_buf_size);
+	node = cpu_to_node(device->channel->target_cpu);
+	net_device->recv_buf = vzalloc_node(net_device->recv_buf_size, node);
+	if (!net_device->recv_buf)
+		net_device->recv_buf = vzalloc(net_device->recv_buf_size);
+
 	if (!net_device->recv_buf) {
 		netdev_err(ndev, "unable to allocate receive buffer of size %d\n",
 			net_device->recv_buf_size);
--
1.7.4.1
[PATCH net 3/3] bna: fix soft lock-up during firmware initialization failure
A bug in the driver initialization causes a soft lockup if the firmware initialization timeout is reached. The polling function bfa_ioc_poll_fwinit() incorrectly calls bfa_nw_iocpf_timeout() when the timeout is reached. The problem is that bfa_nw_iocpf_timeout() in turn calls bfa_ioc_poll_fwinit() again, and so on. bfa_ioc_poll_fwinit() should send the timeout event to iocpf directly, and the same should be done if the firmware download into HW fails.

Cc: Rasesh Mody rasesh.m...@qlogic.com
Signed-off-by: Ivan Vecera ivec...@redhat.com
---
 drivers/net/ethernet/brocade/bna/bfa_ioc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/brocade/bna/bfa_ioc.c b/drivers/net/ethernet/brocade/bna/bfa_ioc.c
index 594a2ab..68f3c13 100644
--- a/drivers/net/ethernet/brocade/bna/bfa_ioc.c
+++ b/drivers/net/ethernet/brocade/bna/bfa_ioc.c
@@ -2414,7 +2414,7 @@ bfa_ioc_boot(struct bfa_ioc *ioc, enum bfi_fwboot_type boot_type,
 	if (status == BFA_STATUS_OK)
 		bfa_ioc_lpu_start(ioc);
 	else
-		bfa_nw_iocpf_timeout(ioc);
+		bfa_fsm_send_event(&ioc->iocpf, IOCPF_E_TIMEOUT);

 	return status;
 }
@@ -3029,7 +3029,7 @@ bfa_ioc_poll_fwinit(struct bfa_ioc *ioc)
 	}

 	if (ioc->iocpf.poll_time >= BFA_IOC_TOV) {
-		bfa_nw_iocpf_timeout(ioc);
+		bfa_fsm_send_event(&ioc->iocpf, IOCPF_E_TIMEOUT);
 	} else {
 		ioc->iocpf.poll_time += BFA_IOC_POLL_TOV;
 		mod_timer(&ioc->iocpf_timer, jiffies +
--
2.3.6
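The shape of the fix can be shown with a small standalone model. This is not the bna driver code: the types, the constants MODEL_TOV and MODEL_POLL_TOV, and all function names are hypothetical. The point it illustrates is that, on timeout, the poll routine posts an event to the state machine and stops, instead of calling back into a handler that would poll again.

```c
/* Hypothetical model of a polled firmware-init step. */
enum model_state { MODEL_POLLING, MODEL_FAILED };
enum model_event { MODEL_EV_TIMEOUT };

#define MODEL_TOV	3000	/* overall timeout, arbitrary units */
#define MODEL_POLL_TOV	200	/* time added per poll */

struct ioc_model {
	enum model_state state;
	unsigned int poll_time;	/* accumulated polling time */
	unsigned int polls;	/* how many times the poll routine ran */
};

/* The state machine reacts to the event; it never re-enters the poll. */
static void model_send_event(struct ioc_model *ioc, enum model_event ev)
{
	if (ev == MODEL_EV_TIMEOUT)
		ioc->state = MODEL_FAILED;
}

/* Returns 1 if the caller should re-arm the timer and poll again,
 * 0 once the timeout has been reported. */
static int model_poll_fwinit(struct ioc_model *ioc)
{
	ioc->polls++;
	if (ioc->poll_time >= MODEL_TOV) {
		model_send_event(ioc, MODEL_EV_TIMEOUT);
		return 0;
	}
	ioc->poll_time += MODEL_POLL_TOV;
	return 1;
}
```

With this structure the number of polls is bounded by the timeout, so the loop described in the commit message cannot occur.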
[PATCH net 2/3] bna: remove unreasonable iocpf timer start
The driver starts the iocpf timer prior to the bnad_ioceth_enable() call, which is unreasonable. This piece of code probably originates from the Brocade/Qlogic out-of-box driver and was carried over during the initial import into upstream. That driver uses only one timer and a queue to implement multiple timers, and this timer is started at this place. The upstream driver uses multiple timers instead.

Cc: Rasesh Mody rasesh.m...@qlogic.com
Signed-off-by: Ivan Vecera ivec...@redhat.com
---
 drivers/net/ethernet/brocade/bna/bnad.c | 4
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/ethernet/brocade/bna/bnad.c b/drivers/net/ethernet/brocade/bna/bnad.c
index 37072a8..caae6cb 100644
--- a/drivers/net/ethernet/brocade/bna/bnad.c
+++ b/drivers/net/ethernet/brocade/bna/bnad.c
@@ -3701,10 +3701,6 @@ bnad_pci_probe(struct pci_dev *pdev,
 	setup_timer(&bnad->bna.ioceth.ioc.sem_timer, bnad_iocpf_sem_timeout,
 				((unsigned long)bnad));

-	/* Now start the timer before calling IOC */
-	mod_timer(&bnad->bna.ioceth.ioc.iocpf_timer,
-		  jiffies + msecs_to_jiffies(BNA_IOC_TIMER_FREQ));
-
 	/*
 	 * Start the chip
 	 * If the call back comes with error, we bail out.
--
2.3.6
[RFC 3/3] net: dsa: mv88e6352: add support for VLAN
This commit adds support for the VTU operations to the mv88e6352 driver.

Signed-off-by: Vivien Didelot vivien.dide...@savoirfairelinux.com
---
 drivers/net/dsa/mv88e6352.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
index 8b0d54f..8396a2e 100644
--- a/drivers/net/dsa/mv88e6352.c
+++ b/drivers/net/dsa/mv88e6352.c
@@ -554,6 +554,9 @@ struct dsa_switch_driver mv88e6352_switch_driver = {
 	.fdb_add		= mv88e6xxx_port_fdb_add,
 	.fdb_del		= mv88e6xxx_port_fdb_del,
 	.fdb_getnext		= mv88e6xxx_port_fdb_getnext,
+	.port_vlan_add		= mv88e6xxx_port_vlan_add,
+	.port_vlan_kill		= mv88e6xxx_port_vlan_kill,
+	.port_bridge_setlink	= mv88e6xxx_port_bridge_setlink,
 };

 MODULE_ALIAS("platform:mv88e6352");
--
2.4.1
[RFC 0/3] DSA and Marvell 88E6352 802.1q support
This RFC is based on v4.1-rc3. It is meant to give a first look at the commits that implement the necessary NDOs between DSA and the Marvell 88E6352 switch driver.

With this support, I am able to create VLANs with (un)tagged ports, setting their default VID, from a bridge.

To create a bridge containing all switch ports, with a VLAN ID 400, swp2 and swp3 untagged (pvid), and swp4 tagged, the userspace commands look like this:

    ip link add name br0 type bridge
    [...]
    ip link set dev swp2 up master br0
    [...]
    bridge vlan add vid 400 pvid untagged dev swp2
    bridge vlan add vid 400 pvid untagged dev swp3
    bridge vlan add vid 400 dev swp4
    [...]
    ip link add link br0 name br0.400 type vlan id 400
    [...]
    bridge vlan add dev br0 vid 400 self

The code is currently being rebased onto the latest net-next/master. It seems the way to go now is through the switchdev attr getter/setter...

Vivien Didelot (3):
  net: dsa: add basic support for VLAN ndo
  net: dsa: mv88e6xxx: add support for VTU operations
  net: dsa: mv88e6352: add support for VLAN

 drivers/net/dsa/mv88e6352.c |   3 +
 drivers/net/dsa/mv88e6xxx.c | 309
 drivers/net/dsa/mv88e6xxx.h |  28
 include/net/dsa.h           |   9 ++
 net/dsa/slave.c             |  76 ++-
 5 files changed, 423 insertions(+), 2 deletions(-)

--
2.4.1
RE: [PATCH 1/1] hv_netvsc: Allocate the receive buffer from the correct NUMA node
> -----Original Message-----
> From: K. Y. Srinivasan [mailto:k...@microsoft.com]
> Sent: Thursday, May 28, 2015 2:56 PM
> To: da...@davemloft.net; netdev@vger.kernel.org; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; o...@aepfle.de; a...@canonical.com; jasow...@redhat.com
> Cc: KY Srinivasan
> Subject: [PATCH 1/1] hv_netvsc: Allocate the receive buffer from the correct NUMA node
>
> Allocate the receive buffer from the NUMA node assigned to the primary channel.
>
> Signed-off-by: K. Y. Srinivasan k...@microsoft.com
> ---
>  drivers/net/hyperv/netvsc.c |    7 ++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
> index b024968..d187965 100644
> --- a/drivers/net/hyperv/netvsc.c
> +++ b/drivers/net/hyperv/netvsc.c
> @@ -227,13 +227,18 @@ static int netvsc_init_buf(struct hv_device *device)
>  	struct netvsc_device *net_device;
>  	struct nvsp_message *init_packet;
>  	struct net_device *ndev;
> +	int node;
>
>  	net_device = get_outbound_net_device(device);
>  	if (!net_device)
>  		return -ENODEV;
>  	ndev = net_device->ndev;
>
> -	net_device->recv_buf = vzalloc(net_device->recv_buf_size);
> +	node = cpu_to_node(device->channel->target_cpu);
> +	net_device->recv_buf = vzalloc_node(net_device->recv_buf_size, node);
> +	if (!net_device->recv_buf)
> +		net_device->recv_buf = vzalloc(net_device->recv_buf_size);
> +
>  	if (!net_device->recv_buf) {
>  		netdev_err(ndev, "unable to allocate receive buffer of size %d\n",
>  			net_device->recv_buf_size);
> --
> 1.7.4.1

David,

Please drop this patch; I am going to resend this with another patch.

Regards,

K. Y
[PATCH 5/9] wl1251: drop unneeded goto
From: Julia Lawall julia.law...@lip6.fr

Delete jump to a label on the next line, when that label is not used elsewhere.

A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier l;
@@

-if (...) goto l;
-l:
// </smpl>

Signed-off-by: Julia Lawall julia.law...@lip6.fr
---
 drivers/net/wireless/ti/wl1251/acx.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/wireless/ti/wl1251/acx.c b/drivers/net/wireless/ti/wl1251/acx.c
index 5695628..d6fbdda 100644
--- a/drivers/net/wireless/ti/wl1251/acx.c
+++ b/drivers/net/wireless/ti/wl1251/acx.c
@@ -53,10 +53,7 @@ int wl1251_acx_station_id(struct wl1251 *wl)
 		mac->mac[i] = wl->mac_addr[ETH_ALEN - 1 - i];

 	ret = wl1251_cmd_configure(wl, DOT11_STATION_ID, mac, sizeof(*mac));
-	if (ret < 0)
-		goto out;

-out:
 	kfree(mac);
 	return ret;
 }
Re: Possible issue in iproute2 package
Hi Jose,

thanks for your report!

On 05/28/2015 11:12 PM, Guzman Mosqueda, Jose R wrote:
...
> We're using iproute2 in a GNU-Linux project and I'm analyzing the code to try to find possible issues/gaps/risks.
> Since I'm not too familiar with the package yet I have a question about a particular piece of code that could result in a memory corruption:
>
> Version: 4.0.0
> File: misc/ss.c
> Function: static void tcp_show_info(...)
> Line: ~1903
>
> Description: There is a memory allocation for a s.cong_alg variable:
>
>     s.cong_alg = malloc(strlen(cong_attr + 1));
>
> The length is calculated about next position of the starting character. But next line there is a copy of the whole content:
>
>     strcpy(s.cong_alg, cong_attr);
>
> I think there is a mistake and it should be something like:
>
>     s.cong_alg = malloc(strlen(cong_attr) + 1);
>
> Is this the case? Is it a real bug?
> Also I don't see any checking for the value returned by the malloc call, what if it returns a NULL pointer?

Cc'ing Vadim for ...

commit 8250bc9ff4e55a3ef397ed8c7612f1392d164295
Author: Vadim Kochan vadi...@gmail.com
Date: Tue Jan 20 16:14:24 2015 +0200

    ss: Unify inet sockets output

    Signed-off-by: Vadim Kochan vadi...@gmail.com

> Also I found something similar about line 1903:
>
>     s.cong_alg = malloc(strlen(cong_attr + 1));
>     strcpy(s.cong_alg, cong_attr);
>
> And another possible issue that I found:
>
> File: tc/tc_util.c
> Function: void print_rate(char *buf, int len, __u64 rate)
> Line: ~264
>
> In the case that user inputs a high value for rate, the for loop will exit in the condition meaning that variable i get the value of 5 which will be an invalid index for the units array due to that array has only 5 elements.
>
> I hope you can help me by checking these issues and tell me whether they are real issues or not since you know much better the code.
> Also I don't know if you have already this reported, I didn't find a list of issues for this package. Can you tell me where is such list?
>
> I really appreciate any help on this.
>
> Thanks in advance.
> Jose G.
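For the second report quoted above (a loop counter that can run one past the end of a five-element units array), one possible guard is to stop the loop at the last valid index. The sketch below is not the tc/tc_util.c code: the function name, the array name and the simplified scaling rule are assumptions made for illustration.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Five unit prefixes, so valid indexes are 0..4. */
static const char *const rate_units[] = { "", "K", "M", "G", "T" };

/* Scale *rate down by kilo and return the index of the matching unit.
 * The loop bound is ARRAY_SIZE - 1, so the returned index can always
 * be used to index rate_units[], however large the input is. */
static size_t pick_rate_unit(unsigned long long *rate, unsigned long kilo)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(rate_units) - 1; i++) {
		if (*rate < kilo)
			break;
		*rate /= kilo;
	}
	return i;
}
```

A very large rate then ends up expressed in the largest unit instead of reading past the array.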
[PATCH net-next 0/3] net: systemport: misc improvements
Hi David,

These patches are highly inspired by changes from Petri on bcmgenet, last patch is a misc fix that I had pending for a while, but is not a candidate for 'net' at this point.

Thanks!

Florian Fainelli (3):
  net: systemport: Pre-calculate and utilize cb->bd_addr
  net: systemport: rewrite bcm_sysport_rx_refill
  net: systemport: Add a check for oversized packets

 drivers/net/ethernet/broadcom/bcmsysport.c | 107 -
 drivers/net/ethernet/broadcom/bcmsysport.h |   2 -
 2 files changed, 58 insertions(+), 51 deletions(-)

--
2.1.0
RE: Request for advice on where to put Root Complex fix up code for downstream device
| From: Casey Leedom [lee...@chelsio.com]
| Sent: Thursday, May 07, 2015 4:31 PM
|
| | From: Bjorn Helgaas [bhelg...@google.com]
| | Sent: Thursday, May 07, 2015 4:04 PM
| |
| | There are a lot of fixups in drivers/pci/quirks.c. For things that have to
| | be worked around either before a driver claims the device or if there is no
| | driver at all, the fixup *has* to go in drivers/pci/quirks.c
| |
| | But for things like this, where the problem can only occur after a driver
| | claims the device, I think it makes more sense to put the fixup in the
| | driver itself. The only wrinkle here is that the fixup has to be done on a
| | separate device, not the device claimed by the driver. But I think it
| | probably still makes sense to put this fixup in the driver.
| ...
| One complication to doing this in cxgb4 is that it attaches to Physical
| Function 4 of our T5 chip. Meanwhile, a completely separate storage
| driver, csiostor, connects to PF5 and PF6 and there's no
| requirement at all that cxgb4 be loaded. So if we go down the road of
| putting the fixup code in the cxgb4 driver, we'll also need to duplicate
| that code in the csiostor driver.

I never heard back on this issue of needing to put the Root Complex fixup code in two different drivers -- cxgb4 and csiostor -- if we don't go down the path of using a PCI Quirk. I'm happy doing either and have verified both solutions locally. I'd just like to get a judgement call on this. It comes down to adding ~30 lines to

    drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
    drivers/scsi/csiostor/csio_init.c

or ~30 lines to

    drivers/pci/quirks.c

| | Can you include a pointer to the relevant part of the spec?
|
| Sure:
|
|     2.2.9. Completion Rules
|     ...
|     Completion headers must supply the same values for
|     the Attribute as were supplied in the 20 header of
|     the corresponding Request, except as explicitly
|     allowed when IDO is used (see Section 2.2.6.4).
|     ...
|     2.3.2. Completion Handling Rules
|     ...
|     If a received Completion matches the Transaction ID
|     of an outstanding Request, but in some other way
|     does not match the corresponding Request (e.g., a
|     problem with Attributes, Traffic Class, Byte Count,
|     Lower Address, etc), it is strongly recommended for
|     the Receiver to handle the Completion as a Malformed
|     TLP. However, if the Completion is otherwise properly
|     formed, it is permitted[22] for the Receiver to
|     handle the Completion as an Unexpected Completion.

| | Can you use pci_upstream_bridge() here? There are a couple places where we
| | want to find the Root Port, so we might factor that out someday. It'll be
| | easier to find all those places if they use with pci_upstream_bridge().
|
| It looks like pci_upstream_bridge() just traverses one link upstream toward the
| Root Complex? Or am I misunderstanding that function?
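On the last question in the quoted exchange: if the helper does move only one link upstream per call, as the question suggests, then finding the Root Port means calling it in a loop. The following is a userspace model of that walk, not kernel code. struct model_dev and both functions are hypothetical stand-ins and do not use the real PCI API.

```c
#include <stddef.h>

/* Hypothetical device node: 'upstream' points at the bridge directly
 * above this device, or is NULL for a device with nothing above it. */
struct model_dev {
	struct model_dev *upstream;
};

/* One step toward the Root Complex. */
static struct model_dev *model_upstream_bridge(struct model_dev *dev)
{
	return dev->upstream;
}

/* Walk upstream until there is no further bridge; the last bridge
 * seen is the topmost port. Returns NULL if dev has no bridge above. */
static struct model_dev *model_find_root_port(struct model_dev *dev)
{
	struct model_dev *bridge = model_upstream_bridge(dev);
	struct model_dev *root = NULL;

	while (bridge) {
		root = bridge;
		bridge = model_upstream_bridge(bridge);
	}
	return root;
}
```

Factoring the loop into one helper matches the suggestion in the quoted text that the Root Port lookup might be shared someday.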
[PATCH 2/9] ipv6: drop unneeded goto
From: Julia Lawall julia.law...@lip6.fr

Delete jump to a label on the next line, when that label is not used elsewhere.

A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r@
identifier l;
@@

-if (...) goto l;
-l:
// </smpl>

Also remove the unnecessary ret variable.

Signed-off-by: Julia Lawall julia.law...@lip6.fr
---
 net/ipv6/raw.c |    8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 484a5c1..ca4700c 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1327,13 +1327,7 @@ static struct inet_protosw rawv6_protosw = {

 int __init rawv6_init(void)
 {
-	int ret;
-
-	ret = inet6_register_protosw(&rawv6_protosw);
-	if (ret)
-		goto out;
-out:
-	return ret;
+	return inet6_register_protosw(&rawv6_protosw);
 }

 void rawv6_exit(void)
Possible issue in iproute2 package
Hi all,

I'm Jose Guzman from a security team at Intel. We're using iproute2 in a GNU/Linux project and I'm analyzing the code to find possible issues, gaps and risks. Since I'm not too familiar with the package yet, I have a question about a particular piece of code that could result in memory corruption:

Version: 4.0.0
File: misc/ss.c
Function: static void tcp_show_info(...)
Line: ~1903

Description: There is a memory allocation for the s.cong_alg variable:

s.cong_alg = malloc(strlen(cong_attr + 1));

The length is measured starting from the second character of the string, so for a non-empty string the allocation is one byte shorter than the string itself, with no room for the terminating NUL. But the next line copies the whole content:

strcpy(s.cong_alg, cong_attr);

I think there is a mistake and it should be something like:

s.cong_alg = malloc(strlen(cong_attr) + 1);

Is this the case? Is it a real bug? Also, I don't see any check of the value returned by the malloc call. What if it returns a NULL pointer?

Also I found something similar about line 1903:

s.cong_alg = malloc(strlen(cong_attr + 1));
strcpy(s.cong_alg, cong_attr);

And another possible issue that I found:

File: tc/tc_util.c
Function: void print_rate(char *buf, int len, __u64 rate)
Line: ~264

If the user inputs a high value for rate, the for loop exits on its condition with the variable i equal to 5, which is an invalid index for the units array because that array has only 5 elements.

I hope you can help me by checking these issues and telling me whether they are real issues or not, since you know the code much better. Also, I don't know if these have already been reported; I didn't find a list of issues for this package. Can you tell me where such a list is?

I really appreciate any help on this. Thanks in advance.

Jose G.
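To make the arithmetic in the report concrete, here is a minimal sketch in plain C. It is not taken from iproute2: copy_string() and buggy_alloc_size() are hypothetical helpers written only for illustration. The first shows the allocation the report proposes, with the malloc() result checked; the second returns the size the reported expression would request.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of iproute2: returns a heap copy of src,
 * or NULL if src is NULL or the allocation fails. The caller frees it. */
char *copy_string(const char *src)
{
	size_t len;
	char *dst;

	if (src == NULL)
		return NULL;

	len = strlen(src);     /* length without the terminating NUL */
	dst = malloc(len + 1); /* the +1 belongs outside strlen() */
	if (dst == NULL)
		return NULL;

	memcpy(dst, src, len + 1); /* copies the NUL as well */
	return dst;
}

/* Size requested by the expression quoted in the report,
 * malloc(strlen(src + 1)). Only meaningful for a non-empty string. */
size_t buggy_alloc_size(const char *src)
{
	assert(src != NULL && src[0] != '\0');
	return strlen(src + 1); /* strlen(src) - 1 */
}
```

For a five-character name such as "cubic", strcpy() writes six bytes, while the reported expression asks for four.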
[PATCH 0/9] drop unneeded goto
These patches drop gotos that jump to a label that is at the next instruction, in the case that the label is not used elsewhere in the function. The complete semantic patch that performs this transformation is as follows:

// <smpl>
@r@
position p;
identifier l;
@@

if (...) goto l@p;
l:

@script:ocaml s@
p << r.p;
nm;
@@

nm := (List.hd p).current_element

@ok exists@
identifier s.nm,l;
position p != r.p;
@@

nm(...) {
<+... goto l@p; ...+>
}

@depends on !ok@
identifier s.nm;
position r.p;
identifier l;
@@

nm(...) {
...
- if(...) goto l@p;
l:
...
}
// </smpl>
Re: linux-next: build failure after merge of most of the trees
On Thu, 2015-05-28 at 14:35 -0700, David Miller wrote:
Bogus chunk in my local tree, didn't make it into the final commit I pushed out.

Thanks for taking care of this before me!
Re: [PATCH net v2] switchdev: don't abort hardware ipv4 fib offload on failure to program fib entry in hardware
On Thu, May 28, 2015 at 08:40:11AM -0700, Scott Feldman wrote:
On Thu, May 28, 2015 at 2:42 AM, Jiri Pirko j...@resnulli.us wrote:
Mon, May 18, 2015 at 10:19:16PM CEST, da...@davemloft.net wrote:
From: Roopa Prabhu ro...@cumulusnetworks.com Date: Sun, 17 May 2015 16:42:05 -0700

On most systems where you can offload routes to hardware, doing routing in software is not an option (the cpu limitations make routing impossible in software).

You absolutely do not get to determine this policy, none of us do. What matters is that by default the damn switch device being there is 100% transparent to the user. And the way to achieve that default is to do software routes as a fallback. I am not going to entertain changes of this nature which fail route loading by default just because we've exceeded a device's HW capacity to offload. I thought I was _really_ clear about this at netdev 0.1

I certainly agree that by default, transparency 1:1 sw:hw mapping is what we need for fib. The current code is a good start! I see a couple of issues regarding switchdev_fib_ipv4_abort:

1) If the user adds an entry and switchdev_fib_ipv4_add fails, abort is executed and an error is returned. I would expect that the route entry should be added in this case. The next attempt at adding the same entry will be successful. The current behaviour breaks the transparency you are referring to.

2) When switchdev_fib_ipv4_abort happens to be executed, the offload is disabled for good (until reboot). That is certainly not nice, although I understand that is the easiest solution for now.

I believe that we all agree that the 1:1 transparency, although it is a default, may not be optimal for real-life usage. HW resources are limited and the user does not know them. The danger of hitting _abort and screwing up the whole system is huge, unacceptable. So here, there are a couple of more or less simple things that I suggest to do in order to move a little bit forward:

1) Introduce a system-wide option to switch _abort to just plain fail. When HW does not have capacity, do not flush and fall back to sw, but rather just fail to add the entry. This would not break anything. Userspace has to be prepared that entry add could fail.

2) Introduce a way to propagate resources to userspace. The driver knows about resources used/available/potentially_available. Switchdev infra could be extended in order to propagate the info to the user.

3) Introduce a couple of flags for entry add that would alter the default behaviour. Something like: NLM_F_SKIP_KERNEL NLM_F_SKIP_OFFLOAD. Again, this does not break the current users. On the other hand, this gives new users leverage to instruct the kernel where the entry should be added to (or not added to).

Any thoughts? Objections?

I don't like these. Breaks transparency and forces the user in a position of having to know hardware failure modes (unique to each hardware device). I presented an option d) which avoids these issues; was it not understood?

I actually really like the way Jiri succinctly covered the different cases to move us forward from what we have today (Thanks, Jiri!). I completely agree with you on both of your problem statements and the idea that what we have is fine for the short term. I see definite room to improve the user experience available via upstream kernels.

Option 1 has appeal since userspace applications that control FDB, FIB, etc entries could work without modification (when in this mode the kernel could choose to ignore any NLM_F_* flags Jiri proposed), but I agree that a system-wide (or maybe offload-device-wide?) configuration option needs to exist as this should not be the default behavior.

Option 2 could also work as userspace applications could query for space availability before attempting to add a route. This could be nice during bootup as then apps could periodically double check that their view of the world is accurate.

Option 3 also has appeal since it allows fine-grained control from userspace applications: less-used routes (or routes that could be summarized) could be combined in userspace if needed.

The great part about all suggestions is that when combined they can provide a great user experience, but doing all 3 at once is probably too aggressive. My vote would be to see if we can work together on a combination of Option 1 and 3 together as they seem to provide a great first start to this...

If an application tried to add a route (called A) to the route table in the kernel and code to support Option 1 existed (similar to what Roopa posted to start this series) then the kernel could fail to add route A. If the user noted that some other route (called B) was lower priority for _any_ reason, the user could delete route B from the kernel and hardware and add route A to hardware and kernel. Then the
Re: linux-next: build failure after merge of most of the trees
From: Joe Perches j...@perches.com
Date: Thu, 28 May 2015 11:51:15 -0700

On Thu, 2015-05-28 at 11:42 -0700, David Miller wrote:
I've applied the following to net-next, thanks for your report.
[PATCH] treewide: Add missing vmalloc.h inclusion.
All of these files were only building on non-x86 because of the indirect inclusion of vmalloc.h by, of all things, net/inet_hashtables.h

[]

diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c

[]

@@ -444,6 +444,7 @@ static int skcipher_recvmsg(struct kiocb *unused, struct socket *sock,
 		err = skcipher_wait_for_data(sk, flags);
 		if (err)
 			goto unlock;
+used = ctx->used;

huh?

Bogus chunk in my local tree, didn't make it into the final commit I pushed out. But thanks for noticing.
Re: [PATCH v4 net-next 05/11] net: Add full IPv6 addresses to flow_keys
On Thu, 2015-05-28 at 11:19 -0700, Tom Herbert wrote:

@@ -566,11 +640,15 @@ static const struct flow_dissector_key flow_keys_dissector_keys[] = {
 	},
 	{
 		.key_id = FLOW_DISSECTOR_KEY_IPV4_ADDRS,
-		.offset = offsetof(struct flow_keys, addrs),
+		.offset = offsetof(struct flow_keys, addrs.v4addrs),
+	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+		.offset = offsetof(struct flow_keys, addrs.v6addrs),
 	},
 	{
 		.key_id = FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS,
-		.offset = offsetof(struct flow_keys, addrs),
+		.offset = offsetof(struct flow_keys, addrs.v4addrs),

Shouldn't it be offsetof(struct flow_keys, addrs.v6addrs), ?
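As background for the question, the patch description says the IPv4 and IPv6 addresses live in a union. The sketch below uses simplified, hypothetical stand-in structures (the *_sketch names and field layout are assumptions, not the real struct flow_keys) to show what offsetof() yields for two members of the same union: both start at the union's own address, so the two expressions evaluate to the same number and differ only in which width of data they signal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real kernel structures may differ. */
struct v4addrs_sketch { uint32_t src; uint32_t dst; };
struct v6addrs_sketch { uint8_t src[16]; uint8_t dst[16]; };

struct flow_keys_sketch {
	uint16_t n_proto;
	uint8_t ip_proto;
	union {
		struct v4addrs_sketch v4addrs; /* two 32-bit values */
		struct v6addrs_sketch v6addrs; /* two 128-bit values */
	} addrs;
};

/* Every member of a union begins at the union's address, so this
 * holds on any conforming compiler. */
static_assert(offsetof(struct flow_keys_sketch, addrs.v4addrs) ==
	      offsetof(struct flow_keys_sketch, addrs.v6addrs),
	      "union members share one offset");

size_t v4_offset(void)
{
	return offsetof(struct flow_keys_sketch, addrs.v4addrs);
}

size_t v6_offset(void)
{
	return offsetof(struct flow_keys_sketch, addrs.v6addrs);
}
```

Under that assumption, the choice between the two spellings documents whether the dissector writes two 32-bit values or two full addresses at that location; it does not move the write.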
[PATCH net-next 1/3] net: systemport: Pre-calculate and utilize cb-bd_addr
There is a 1:1 mapping between the software maintained control block in priv-rx_cbs and the buffer address in priv-rx_bds, such that there is no need to keep computing the buffer address when refiling a control block. Signed-off-by: Florian Fainelli f.faine...@gmail.com --- drivers/net/ethernet/broadcom/bcmsysport.c | 18 +- drivers/net/ethernet/broadcom/bcmsysport.h | 2 -- 2 files changed, 9 insertions(+), 11 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c index 084a50a555de..267330ccd595 100644 --- a/drivers/net/ethernet/broadcom/bcmsysport.c +++ b/drivers/net/ethernet/broadcom/bcmsysport.c @@ -549,12 +549,7 @@ static int bcm_sysport_rx_refill(struct bcm_sysport_priv *priv, } dma_unmap_addr_set(cb, dma_addr, mapping); - dma_desc_set_addr(priv, priv-rx_bd_assign_ptr, mapping); - - priv-rx_bd_assign_index++; - priv-rx_bd_assign_index = (priv-num_rx_bds - 1); - priv-rx_bd_assign_ptr = priv-rx_bds + - (priv-rx_bd_assign_index * DESC_SIZE); + dma_desc_set_addr(priv, cb-bd_addr, mapping); netif_dbg(priv, rx_status, ndev, RX refill\n); @@ -568,7 +563,7 @@ static int bcm_sysport_alloc_rx_bufs(struct bcm_sysport_priv *priv) unsigned int i; for (i = 0; i priv-num_rx_bds; i++) { - cb = priv-rx_cbs[priv-rx_bd_assign_index]; + cb = priv-rx_cbs[i]; if (cb-skb) continue; @@ -1330,14 +1325,14 @@ static inline int tdma_enable_set(struct bcm_sysport_priv *priv, static int bcm_sysport_init_rx_ring(struct bcm_sysport_priv *priv) { + struct bcm_sysport_cb *cb; u32 reg; int ret; + int i; /* Initialize SW view of the RX ring */ priv-num_rx_bds = NUM_RX_DESC; priv-rx_bds = priv-base + SYS_PORT_RDMA_OFFSET; - priv-rx_bd_assign_ptr = priv-rx_bds; - priv-rx_bd_assign_index = 0; priv-rx_c_index = 0; priv-rx_read_ptr = 0; priv-rx_cbs = kcalloc(priv-num_rx_bds, sizeof(struct bcm_sysport_cb), @@ -1347,6 +1342,11 @@ static int bcm_sysport_init_rx_ring(struct bcm_sysport_priv *priv) return -ENOMEM; } + for (i = 0; i 
priv-num_rx_bds; i++) { + cb = priv-rx_cbs + i; + cb-bd_addr = priv-rx_bds + i * DESC_SIZE; + } + ret = bcm_sysport_alloc_rx_bufs(priv); if (ret) { netif_err(priv, hw, priv-netdev, SKB allocation failed\n); diff --git a/drivers/net/ethernet/broadcom/bcmsysport.h b/drivers/net/ethernet/broadcom/bcmsysport.h index 42a4b4a0bc14..f28bf545d7f4 100644 --- a/drivers/net/ethernet/broadcom/bcmsysport.h +++ b/drivers/net/ethernet/broadcom/bcmsysport.h @@ -663,8 +663,6 @@ struct bcm_sysport_priv { /* Receive queue */ void __iomem*rx_bds; - void __iomem*rx_bd_assign_ptr; - unsigned intrx_bd_assign_index; struct bcm_sysport_cb *rx_cbs; unsigned intnum_rx_bds; unsigned intrx_read_ptr; -- 2.1.0 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-next 2/3] net: systemport: rewrite bcm_sysport_rx_refill
Currently, bcm_sysport_desc_rx() calls bcm_sysport_rx_refill() at the end of Rx packet processing loop, after the current Rx packet has already been passed to napi_gro_receive(). However, bcm_sysport_rx_refill() might fail to allocate a new Rx skb, thus leaving a hole on the Rx queue where no valid Rx buffer exists. To eliminate this situation: 1. Rewrite bcm_sysport_rx_refill() to retain the current Rx skb on the Rx queue if a new replacement Rx skb can't be allocated and DMA-mapped. In this case, the data on the current Rx skb is effectively dropped. 2. Modify bcm_sysport_desc_rx() to call bcm_sysport_rx_refill() at the top of Rx packet processing loop, so that the new replacement Rx skb is already in place before the current Rx skb is processed. This is loosely inspired from d6707bec5986 (net: bcmgenet: rewrite bcmgenet_rx_refill()) Signed-off-by: Florian Fainelli f.faine...@gmail.com --- drivers/net/ethernet/broadcom/bcmsysport.c | 81 +++--- 1 file changed, 41 insertions(+), 40 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c index 267330ccd595..d777b0db9e63 100644 --- a/drivers/net/ethernet/broadcom/bcmsysport.c +++ b/drivers/net/ethernet/broadcom/bcmsysport.c @@ -524,62 +524,70 @@ static void bcm_sysport_free_cb(struct bcm_sysport_cb *cb) dma_unmap_addr_set(cb, dma_addr, 0); } -static int bcm_sysport_rx_refill(struct bcm_sysport_priv *priv, -struct bcm_sysport_cb *cb) +static struct sk_buff *bcm_sysport_rx_refill(struct bcm_sysport_priv *priv, +struct bcm_sysport_cb *cb) { struct device *kdev = priv-pdev-dev; struct net_device *ndev = priv-netdev; + struct sk_buff *skb, *rx_skb; dma_addr_t mapping; - int ret; - cb-skb = netdev_alloc_skb(priv-netdev, RX_BUF_LENGTH); - if (!cb-skb) { + /* Allocate a new SKB for a new packet */ + skb = netdev_alloc_skb(priv-netdev, RX_BUF_LENGTH); + if (!skb) { + priv-mib.alloc_rx_buff_failed++; netif_err(priv, rx_err, ndev, SKB alloc failed\n); - return -ENOMEM; + 
return NULL; } - mapping = dma_map_single(kdev, cb-skb-data, + mapping = dma_map_single(kdev, skb-data, RX_BUF_LENGTH, DMA_FROM_DEVICE); - ret = dma_mapping_error(kdev, mapping); - if (ret) { + if (dma_mapping_error(kdev, mapping)) { priv-mib.rx_dma_failed++; - bcm_sysport_free_cb(cb); + dev_kfree_skb_any(skb); netif_err(priv, rx_err, ndev, DMA mapping failure\n); - return ret; + return NULL; } + /* Grab the current SKB on the ring */ + rx_skb = cb-skb; + if (likely(rx_skb)) + dma_unmap_single(kdev, dma_unmap_addr(cb, dma_addr), +RX_BUF_LENGTH, DMA_FROM_DEVICE); + + /* Put the new SKB on the ring */ + cb-skb = skb; dma_unmap_addr_set(cb, dma_addr, mapping); dma_desc_set_addr(priv, cb-bd_addr, mapping); netif_dbg(priv, rx_status, ndev, RX refill\n); - return 0; + /* Return the current SKB to the caller */ + return rx_skb; } static int bcm_sysport_alloc_rx_bufs(struct bcm_sysport_priv *priv) { struct bcm_sysport_cb *cb; - int ret = 0; + struct sk_buff *skb; unsigned int i; for (i = 0; i priv-num_rx_bds; i++) { cb = priv-rx_cbs[i]; - if (cb-skb) - continue; - - ret = bcm_sysport_rx_refill(priv, cb); - if (ret) - break; + skb = bcm_sysport_rx_refill(priv, cb); + if (skb) + dev_kfree_skb(skb); + if (!cb-skb) + return -ENOMEM; } - return ret; + return 0; } /* Poll the hardware for up to budget packets to process */ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv, unsigned int budget) { - struct device *kdev = priv-pdev-dev; struct net_device *ndev = priv-netdev; unsigned int processed = 0, to_process; struct bcm_sysport_cb *cb; @@ -587,7 +595,6 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv, unsigned int p_index; u16 len, status; struct bcm_rsb *rsb; - int ret; /* Determine how much we should process since last call */ p_index = rdma_readl(priv, RDMA_PROD_INDEX); @@ -605,13 +612,8 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv, while ((processed to_process) (processed budget)) { cb = 
priv-rx_cbs[priv-rx_read_ptr]; - skb = cb-skb; -
[PATCH net-next 3/3] net: systemport: Add a check for oversized packets
Occasionally we may get oversized packets from the hardware which exceed the nominal 2KiB buffer size we allocate SKBs with. Add an early check which drops the packet to avoid invoking skb_over_panic() and move on to processing the next packet.

Signed-off-by: Florian Fainelli f.faine...@gmail.com
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index d777b0db9e63..909ad7a0d480 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -638,6 +638,14 @@ static unsigned int bcm_sysport_desc_rx(struct bcm_sysport_priv *priv,
 			  p_index, priv->rx_c_index, priv->rx_read_ptr,
 			  len, status);
 
+		if (unlikely(len > RX_BUF_LENGTH)) {
+			netif_err(priv, rx_status, ndev, "oversized packet\n");
+			ndev->stats.rx_length_errors++;
+			ndev->stats.rx_errors++;
+			dev_kfree_skb_any(skb);
+			goto next;
+		}
+
 		if (unlikely(!(status & DESC_EOP) || !(status & DESC_SOP))) {
 			netif_err(priv, rx_status, ndev, "fragmented packet!\n");
 			ndev->stats.rx_dropped++;
-- 
2.1.0
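The early check described above can be sketched outside the driver as follows. This is only an illustration: rx_len_ok(), struct rx_stats_sketch and RX_BUF_LEN_SKETCH are hypothetical names, the 2048-byte limit is taken from the "2KiB" in the commit text, and the comparison is assumed to be a strict greater-than.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed buffer size, from the "2KiB" mentioned in the commit text. */
#define RX_BUF_LEN_SKETCH 2048u

/* Hypothetical stand-in for the two counters the patch bumps. */
struct rx_stats_sketch {
	unsigned long rx_length_errors;
	unsigned long rx_errors;
};

/* Returns true when a packet of len bytes fits the receive buffer.
 * Otherwise it counts the error and returns false, so the caller can
 * drop the packet before any copy could overrun the buffer. */
bool rx_len_ok(size_t len, struct rx_stats_sketch *stats)
{
	assert(stats != NULL);

	if (len > RX_BUF_LEN_SKETCH) {
		stats->rx_length_errors++;
		stats->rx_errors++;
		return false;
	}
	return true;
}
```

Doing the length test before touching the payload is what keeps an oversized descriptor from reaching the code path that would otherwise panic.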
[PATCH net 0/3] bna: misc bugfixes
These patches fix several bugs found during device initialization debugging.

Cc: Rasesh Mody rasesh.m...@qlogic.com

Ivan Vecera (3):
  bna: fix firmware loading on big-endian machines
  bna: remove unreasonable iocpf timer start
  bna: fix soft lock-up during firmware initialization failure

 drivers/net/ethernet/brocade/bna/bfa_ioc.c   | 4 ++--
 drivers/net/ethernet/brocade/bna/bnad.c      | 4 ----
 drivers/net/ethernet/brocade/bna/cna_fwimg.c | 7 +++++++
 3 files changed, 9 insertions(+), 6 deletions(-)

-- 
2.3.6
[PATCH net 1/3] bna: fix firmware loading on big-endian machines
Firmware required by bna is stored in appropriate files as sequence of LE32 integers. After loading by request_firmware() they need to be byte-swapped on big-endian arches. Without this conversion the NIC is unusable on big-endian machines.

Cc: Rasesh Mody rasesh.m...@qlogic.com
Signed-off-by: Ivan Vecera ivec...@redhat.com
---
 drivers/net/ethernet/brocade/bna/cna_fwimg.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/brocade/bna/cna_fwimg.c b/drivers/net/ethernet/brocade/bna/cna_fwimg.c
index ebf462d..badea36 100644
--- a/drivers/net/ethernet/brocade/bna/cna_fwimg.c
+++ b/drivers/net/ethernet/brocade/bna/cna_fwimg.c
@@ -30,6 +30,7 @@ cna_read_firmware(struct pci_dev *pdev, u32 **bfi_image,
 			u32 *bfi_image_size, char *fw_name)
 {
 	const struct firmware *fw;
+	u32 n;
 
 	if (request_firmware(&fw, fw_name, &pdev->dev)) {
 		pr_alert("Can't locate firmware %s\n", fw_name);
@@ -40,6 +41,12 @@ cna_read_firmware(struct pci_dev *pdev, u32 **bfi_image,
 	*bfi_image_size = fw->size/sizeof(u32);
 	bfi_fw = fw;
 
+	/* Convert loaded firmware to host order as it is stored in file
+	 * as sequence of LE32 integers.
+	 */
+	for (n = 0; n < *bfi_image_size; n++)
+		le32_to_cpus(*bfi_image + n);
+
 	return *bfi_image;
 error:
 	return NULL;
-- 
2.3.6
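For readers without the kernel byte-order helpers at hand, the conversion can be sketched in portable C. This is not the driver code: load_le32() and le32_image_to_host() are hypothetical names, and the sketch converts into a separate output array rather than in place as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble one little-endian 32-bit value from four raw bytes. The
 * result is correct on both little- and big-endian hosts because it
 * never reinterprets memory as a wider type. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Convert nwords LE32 words stored in raw into host-order words in out. */
void le32_image_to_host(const uint8_t *raw, uint32_t *out, size_t nwords)
{
	size_t n;

	assert(raw != NULL && out != NULL);

	for (n = 0; n < nwords; n++)
		out[n] = load_le32(raw + 4 * n);
}
```

On a little-endian host this loop is effectively a copy, which is why the missing conversion only showed up on big-endian machines.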
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, 2015-05-28 at 22:05 +0300, Or Gerlitz wrote: On Thu, May 28, 2015 at 9:22 PM, Doug Ledford dledf...@redhat.com wrote: I don't think that is what Doug said. Indeed. There is no need to scrap things, but if the design as it stands, and the intended means of creating objects for use in containers, is going to result in an unworkable network, then we have to re-evaluate how the container constructs are created, and that then has possible consequences for how we would get from an incoming packet to the proper container. To be precise, do we agree that the issue here isn't in the design as it stands but rather in a problem we found in the intended way of assigning IP addresses through DHCP for the containers? No, I would say the problem *is* in the design. But the problem is the selected means of identifying the netdev to get to the namespace (and the proposed means of creating non-default namespace devices to exist in the container), not the namespace design itself. I'm not trying to stop the support train here, but at the same time, if the train is headed for a bridge that's out So what's your concrete saying here? where should we go from here? This excerpt is from the commit log of patch 3/12: The IB device and port, together with the P_Key and the IP address should be enough to uniquely identify the ULP net device. The problem here is that this is wrong. If we allow more than one device per pkey with the same GUID, then DHCP breaks, which is bad in and of itself, but it also breaks ipv6 link local addressing. Which means that this hunk in patch 4/12: +#if IS_ENABLED(CONFIG_IPV6) + case AF_INET6: + if (ipv6_chk_addr(net, addr_in6-sin6_addr, dev, 1)) + return true; + + break; +#endif can now be tricked into returning true for incorrect devices. Where do we go from here? First, I'm inclined to say we should modify the add_child portion of IPoIB to refuse to add links to a PKey if that GUID is already present on that PKey. 
You could then use different PKeys on the default GUID for separate namespaces. If you need separate namespaces on the same PKey, then enable alias GUIDs for use on the local adapter and require one GUID per namespace on the same PKey. Then I'm inclined to say that we should map for namespaces using device, port, guid/gid, pkey. And in this situation, since a unique guid/gid on any given pkey maps to a unique dhcp identifier and a unique ipv6 lladdr, this becomes freely interchangeable with device, port, pkey, address mappings that this patchset was built around. -- Doug Ledford dledf...@redhat.com GPG KeyID: 0E572FDD signature.asc Description: This is a digitally signed message part
Re: Drops in qdisc on ifb interface
On Thu, 2015-05-28 at 12:33 -0400, jsulli...@opensourcedevel.com wrote:
Our initial testing has been single flow but the ultimate purpose is processing real time video in a complex application which ingests associated meta data, post to consumer facing cloud, does reporting back - so lots of different traffics with very different demands - a perfect tc environment.

Wait, do you really plan on using TCP for real time video?
Connection tracking and soft lockups with certain field values
Hi, I am currently playing with SYNPROXY target to optimize SYN filtering performance and by occasion found that TCP SYN packets containing port 0 can result in a soft lockup when conntrack is enabled just by itself, given high packet ratio (I`ve reached 450kpps so far with 60b packets on a /32-/32 flood with enabled flow control at the media level and middle-level E3 Xeon on receiver side). Same flood with port 0 going just well, producing same ceil numbers but without visible lockups in kernel log. I`ve tested the issue on a broad range of 3.x kernels and all of them are seemingly affected. Fast and dirty grep revealed special conditions for port 0 only for protocol-specific helpers, but there are none of them. Please find both same captures and traceback below. [ 152.001957] ixgbe :01:00.0 eth8: NIC Link is Up 10 Gbps, Flow Control: RX/TX [ 157.326410] sched: RT throttling activated [ 180.038105] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [rcuos/0:9] [ 180.038128] Modules linked in: xt_CT iptable_raw ipt_SYNPROXY nf_synproxy_core nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_conntrack nf_conntrack iptable_filter ip_tables x_tables tun openvswitch(O) libcrc32c nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc bridge stp llc w83627ehf hwmon_vid jc42 loop fuse dm_crypt joydev hid_generic usbhid x86_pkg_temp_thermal intel_powerclamp ast coretemp igb ttm drm_kms_helper kvm_intel(O) drm iTCO_wdt iTCO_vendor_support sg pcspkr kvm(O) syscopyarea sysfillrect sysimgblt video thermal tpm_tis i2c_algo_bit ipmi_si ipmi_msghandler tpm i2c_i801 8250_fintek fan battery shpchp button ie31200_edac edac_core xhci_pci lpc_ich mfd_core xhci_hcd processor crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper [ 180.038157] ablk_helper cryptd mpt2sas raid_class ehci_pci ixgbe(O) ehci_hcd vxlan ip6_udp_tunnel udp_tunnel usbcore ptp usb_common pps_core dca dm_mirror dm_region_hash dm_log dm_mod [ 180.038165] CPU: 0 
PID: 9 Comm: rcuos/0 Tainted: G O 3.18.10-default #5 [ 180.038166] Hardware name: Supermicro X10SL7-F/X10SL7-F, BIOS 2.00 04/24/2014 [ 180.038167] task: 880409624c20 ti: 88040963 task.ti: 88040963 [ 180.038168] RIP: 0010:[8155b972] [8155b972] dev_gro_receive+0x182/0x370 [ 180.038172] RSP: 0018:88041fc03d48 EFLAGS: 0296 [ 180.038173] RAX: 81cf6920 RBX: 000180200020 RCX: 67632533 [ 180.038174] RDX: 88040558 RSI: 8800d8392100 RDI: 8804054dc440 [ 180.038175] RBP: 88041fc03d98 R08: 002e R09: [ 180.038175] R10: 8800d8392100 R11: ea001019de00 R12: 88041fc03cb8 [ 180.038176] R13: 8165b83d R14: 88041fc03d98 R15: 8800d8392100 [ 180.038177] FS: () GS:88041fc0() knlGS: [ 180.038178] CS: 0010 DS: ES: CR0: 80050033 [ 180.038179] CR2: 7fed089d21b0 CR3: 01c15000 CR4: 001407f0 [ 180.038180] DR0: DR1: DR2: [ 180.038180] DR3: DR6: fffe0ff0 DR7: 0400 [ 180.038181] Stack: [ 180.038181] 0280 000e 000b 81cf8860 [ 180.038183] 88041fc03d98 8804054dc640 8800d8392100 8804054dc440 [ 180.038184] 000c 880407023fb0 88041fc03dc8 8155c0d0 [ 180.038185] Call Trace: [ 180.038186] IRQ [ 180.038189] [8155c0d0] napi_gro_receive+0x30/0x100 [ 180.038196] [a012cc39] ixgbe_clean_rx_irq+0x8d9/0x1030 [ixgbe] [ 180.038200] [a012e588] ixgbe_poll+0x478/0x690 [ixgbe] [ 180.038203] [8150da90] ? show_no_turbo+0x90/0x90 [ 180.038204] [8155bd99] net_rx_action+0x149/0x250 [ 180.038208] [8106629f] __do_softirq+0xdf/0x260 [ 180.038210] [8165c4fc] do_softirq_own_stack+0x1c/0x30 [ 180.038211] EOI [ 180.038213] [810664d5] do_softirq+0x65/0x70 [ 180.038214] [81066574] __local_bh_enable_ip+0x94/0xa0 [ 180.038216] [810bfa15] rcu_nocb_kthread+0x155/0x580 [ 180.038219] [8109fcc0] ? finish_wait+0x80/0x80 [ 180.038220] [810bf8c0] ? rcu_eqs_exit_common.isra.60+0xe0/0xe0 [ 180.038222] [8107fad9] kthread+0xc9/0xe0 [ 180.038224] [8107fa10] ? kthread_create_on_node+0x1a0/0x1a0 [ 180.038226] [8165a798] ret_from_fork+0x58/0x90 [ 180.038227] [8107fa10] ? 
kthread_create_on_node+0x1a0/0x1a0 [ 180.038228] Code: 00 48 8b 05 91 8a 79 00 48 89 45 c8 4c 8b 75 c8 49 81 fe e0 43 cf 81 49 8d 46 e0 0f 84 b6 01 00 00 66 44 39 28 74 0a 48 8b 40 20 eb db 0f 1f 40 00 48 83 78 10 00 74 ef 48 8b 93 d8 00 00 00 48 0-80.pcap Description: application/vnd.tcpdump.pcap 80-80.pcap Description: application/vnd.tcpdump.pcap
Re: [PATCH net-next 4/4] net/mlx4_core: Make sure there are no pending async events when freeing CQ
Hello.

On 05/28/2015 06:41 PM, Or Gerlitz wrote:

From: Matan Barak mat...@mellanox.com

When freeing a CQ, we need to make sure there are no asynchronous events (on the ASYNC EQ) that could relate to this CQ before freeing it. This is done by introducing synchronize_irq.

Signed-off-by: Matan Barak mat...@mellanox.com
Signed-off-by: Ido Shamay i...@mellanox.com
Signed-off-by: Or Gerlitz ogerl...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx4/cq.c | 4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c b/drivers/net/ethernet/mellanox/mlx4/cq.c
index 7431cd4..1fc1dc5 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -369,6 +369,10 @@ void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq)
 		mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn);
 
 	synchronize_irq(priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq);
+	if (priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq !=
+	    priv->eq_table.eq[MLX4_EQ_ASYNC].irq)
+		synchronize_irq(priv->eq_table.eq[MLX4_EQ_ASYNC].irq);
+

I think one empty line was enough.

 
 	spin_lock_irq(&cq_table->lock);
 	radix_tree_delete(&cq_table->tree, cq->cqn);

WBR, Sergei
Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
On 05/28/2015 01:40 AM, Steffen Klassert wrote:
On Thu, May 28, 2015 at 12:18:51AM -0700, Alexander Duyck wrote:
On 05/27/2015 10:36 PM, Steffen Klassert wrote:
On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:

This change makes it so that we use icmpv6_send to report PMTU issues back into tunnels in the case that the resulting packet is larger than the MTU of the outgoing interface. Previously xfrm_local_error was being used in this case, however this was resulting in no changes, I suspect due to the fact that the tunnel itself was being kept out of the loop. This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the behavior seen if the socket was orphaned. Instead of requiring the socket to be orphaned this patch simply defaults to using icmpv6_send in the case that the frame came through a tunnel.

We can use icmpv6_send() just in the case that the packet was already transmitted by a tunnel device, otherwise we get the bug back that I mentioned in my other mail. Not sure if we have something to know that the packet traversed a tunnel device. That's what I asked in the thread 'Looking for a lost patch'.

Okay I will try to do some more digging. From what I can tell right now it looks like my ping attempts are getting hung up on the xfrm_local_error in __xfrm6_output. I wonder if we couldn't somehow make use of the skb->cb to store a pointer to the tunnel that could be checked to determine if we are going through a VTI or not.

Maybe it is as easy as the patch below, could you please test it?

Subject: [PATCH RFC] vti6: Add pmtu handling to vti6_xmit.

We currently rely on the PMTU discovery of xfrm. However if a packet is locally sent, the PMTU mechanism of xfrm tries to do local socket notification, which might not work for applications like ping that don't check for this. So add pmtu handling to vti6_xmit to report MTU changes immediately.

Signed-off-by: Steffen Klassert steffen.klass...@secunet.com
---
 net/ipv6/ip6_vti.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index ff3bd86..13cb771 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -434,6 +434,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	struct dst_entry *dst = skb_dst(skb);
 	struct net_device *tdev;
 	struct xfrm_state *x;
+	int mtu;
 	int err = -1;
 
 	if (!dst)
@@ -468,6 +469,15 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	skb_dst_set(skb, dst);
 	skb->dev = skb_dst(skb)->dev;
 
+	mtu = dst_mtu(dst);
+	if (!skb->ignore_df && skb->len > mtu) {
+		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
+
+		icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+
+		return -EMSGSIZE;
+	}
+
 	err = dst_output(skb);
 	if (net_xmit_eval(err) == 0) {
 		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);

That seems to be working for me. I'm able to ping and while the first packet fails the second one and all that follow make it through correctly after the pmtu update.

- Alex
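The decision the RFC patch adds to vti6_xmit can be sketched in isolation as below. This is a user-space illustration, not kernel code: struct pkt_sketch and check_pmtu() are hypothetical, and the condition is assumed to read "not ignore_df and length greater than mtu", since the operators are missing from the archived diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two skb fields the check reads. */
struct pkt_sketch {
	size_t len;     /* packet length in bytes */
	bool ignore_df; /* true if the packet may be fragmented anyway */
};

/* Returns 0 when the packet may be handed to the output path, or
 * -EMSGSIZE when it is larger than mtu and the sender has to be told
 * (the patch does that with update_pmtu() and icmpv6_send()). */
int check_pmtu(const struct pkt_sketch *pkt, size_t mtu)
{
	assert(pkt != NULL);

	if (!pkt->ignore_df && pkt->len > mtu)
		return -EMSGSIZE;
	return 0;
}
```

This matches the behaviour Alex reports: the first oversized packet is rejected and triggers the PMTU update, and later packets sized to the new MTU pass the check.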
[PATCH v4 net-next 05/11] net: Add full IPv6 addresses to flow_keys
This patch adds full IPv6 addresses into flow_keys and uses them as
input to the flow hash function. The implementation supports either
IPv4 or IPv6 addresses in a union, and a selector is used to determine
how many words to input to jhash2.

We also add flow_get_u32_dst and flow_get_u32_src functions which are
used to get a u32 representation of the source and destination
addresses. For IPv6, ipv6_addr_hash is called. These functions retain
getting the legacy values of src and dst in flow_keys.

With this patch, Ethertype and IP protocol are now included in the
flow hash input.

Signed-off-by: Tom Herbert <t...@herbertland.com>
---
 drivers/net/bonding/bond_main.c                |   9 +-
 drivers/net/ethernet/cisco/enic/enic_clsf.c    |   8 +-
 drivers/net/ethernet/cisco/enic/enic_ethtool.c |   4 +-
 include/net/flow_dissector.h                   |  52 +++
 include/net/ip.h                               |  19 +++-
 include/net/ipv6.h                             |  21 -
 net/core/flow_dissector.c                      | 116 +
 net/ethernet/eth.c                             |   2 +-
 net/sched/cls_flow.c                           |  14 ++-
 net/sched/cls_flower.c                         |  11 +--
 10 files changed, 193 insertions(+), 63 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2268438..19eb990 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3059,8 +3059,7 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
 		if (unlikely(!pskb_may_pull(skb, noff + sizeof(*iph))))
 			return false;
 		iph = ip_hdr(skb);
-		fk->addrs.src = iph->saddr;
-		fk->addrs.dst = iph->daddr;
+		iph_to_flow_copy_v4addrs(fk, iph);
 		noff += iph->ihl << 2;
 		if (!ip_is_fragment(iph))
 			proto = iph->protocol;
@@ -3068,8 +3067,7 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
 		if (unlikely(!pskb_may_pull(skb, noff + sizeof(*iph6))))
 			return false;
 		iph6 = ipv6_hdr(skb);
-		fk->addrs.src = (__force __be32)ipv6_addr_hash(&iph6->saddr);
-		fk->addrs.dst = (__force __be32)ipv6_addr_hash(&iph6->daddr);
+		iph_to_flow_copy_v6addrs(fk, iph6);
 		noff += sizeof(*iph6);
 		proto = iph6->nexthdr;
 	} else {
@@ -3103,7 +3101,8 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
 		hash = bond_eth_hash(skb);
 	else
 		hash = (__force u32)flow.ports.ports;
-	hash ^= (__force u32)flow.addrs.dst ^ (__force u32)flow.addrs.src;
+	hash ^= (__force u32)flow_get_u32_dst(&flow) ^
+		(__force u32)flow_get_u32_src(&flow);
 	hash ^= (hash >> 16);
 	hash ^= (hash >> 8);
 
diff --git a/drivers/net/ethernet/cisco/enic/enic_clsf.c b/drivers/net/ethernet/cisco/enic/enic_clsf.c
index a31b57a..d106186 100644
--- a/drivers/net/ethernet/cisco/enic/enic_clsf.c
+++ b/drivers/net/ethernet/cisco/enic/enic_clsf.c
@@ -33,8 +33,8 @@ int enic_addfltr_5t(struct enic *enic, struct flow_keys *keys, u16 rq)
 		return -EPROTONOSUPPORT;
 	};
 	data.type = FILTER_IPV4_5TUPLE;
-	data.u.ipv4.src_addr = ntohl(keys->addrs.src);
-	data.u.ipv4.dst_addr = ntohl(keys->addrs.dst);
+	data.u.ipv4.src_addr = ntohl(keys->addrs.v4addrs.src);
+	data.u.ipv4.dst_addr = ntohl(keys->addrs.v4addrs.dst);
 	data.u.ipv4.src_port = ntohs(keys->ports.src);
 	data.u.ipv4.dst_port = ntohs(keys->ports.dst);
 	data.u.ipv4.flags = FILTER_FIELDS_IPV4_5TUPLE;
@@ -158,8 +158,8 @@ static struct enic_rfs_fltr_node *htbl_key_search(struct hlist_head *h,
 	struct enic_rfs_fltr_node *tpos;
 
 	hlist_for_each_entry(tpos, h, node)
-		if (tpos->keys.addrs.src == k->addrs.src &&
-		    tpos->keys.addrs.dst == k->addrs.dst &&
+		if (tpos->keys.addrs.v4addrs.src == k->addrs.v4addrs.src &&
+		    tpos->keys.addrs.v4addrs.dst == k->addrs.v4addrs.dst &&
 		    tpos->keys.ports.ports == k->ports.ports &&
 		    tpos->keys.basic.ip_proto == k->basic.ip_proto &&
 		    tpos->keys.basic.n_proto == k->basic.n_proto)
diff --git a/drivers/net/ethernet/cisco/enic/enic_ethtool.c b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
index 117c096..73874b2 100644
--- a/drivers/net/ethernet/cisco/enic/enic_ethtool.c
+++ b/drivers/net/ethernet/cisco/enic/enic_ethtool.c
@@ -346,10 +346,10 @@ static int enic_grxclsrule(struct enic *enic, struct ethtool_rxnfc *cmd)
 		break;
 	}
 
-	fsp->h_u.tcp_ip4_spec.ip4src = n->keys.addrs.src;
+	fsp->h_u.tcp_ip4_spec.ip4src = flow_get_u32_src(&n->keys);
 	fsp->m_u.tcp_ip4_spec.ip4src = (__u32)~0;
-	fsp->h_u.tcp_ip4_spec.ip4dst =
[PATCH v4 net-next 09/11] net: Add IPv6 flow label to flow_keys
In flow_dissector, set the flow label in flow_keys for IPv6. This also
removes the short-circuiting of flow dissection when a non-zero label is
present; the flow label can be considered to provide additional entropy
for a hash.

Signed-off-by: Tom Herbert <t...@herbertland.com>
---
 include/net/flow_dissector.h |  4 +++-
 net/core/flow_dissector.c    | 31 +++
 2 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 08480fb..14d8483 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -28,7 +28,8 @@ struct flow_dissector_key_basic {
 };
 
 struct flow_dissector_key_tags {
-	u32	vlan_id:12;
+	u32	vlan_id:12,
+		flow_label:20;
 };
 
 /**
@@ -111,6 +112,7 @@ enum flow_dissector_key_id {
 	FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
 	FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs */
 	FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */
+	FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_flow_tags */
 
 	FLOW_DISSECTOR_KEY_MAX,
 };
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 5c66cb2..ba089d9 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -190,7 +190,7 @@ ip:
 	case htons(ETH_P_IPV6): {
 		const struct ipv6hdr *iph;
 		struct ipv6hdr _iph;
-		__be32 flow_label;
+		u32 flow_label;
 
 ipv6:
 		iph = __skb_header_pointer(skb, nhoff, sizeof(_iph), data, hlen, &_iph);
@@ -210,30 +210,17 @@ ipv6:
 
 			memcpy(key_ipv6_addrs, &iph->saddr, sizeof(*key_ipv6_addrs));
 			key_control->addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
-			goto flow_label;
 		}
-		break;
-flow_label:
+
 		flow_label = ip6_flowlabel(iph);
 		if (flow_label) {
-			/* Awesome, IPv6 packet has a flow label so we can
-			 * use that to represent the ports without any
-			 * further dissection.
-			 */
-
-			key_basic->n_proto = proto;
-			key_basic->ip_proto = ip_proto;
-			key_control->thoff = (u16)nhoff;
-
 			if (skb_flow_dissector_uses_key(flow_dissector,
-				FLOW_DISSECTOR_KEY_PORTS)) {
-				key_ports = skb_flow_dissector_target(flow_dissector,
-								      FLOW_DISSECTOR_KEY_PORTS,
-								      target_container);
-				key_ports->ports = flow_label;
+				FLOW_DISSECTOR_KEY_FLOW_LABEL)) {
+				key_tags = skb_flow_dissector_target(flow_dissector,
+								     FLOW_DISSECTOR_KEY_FLOW_LABEL,
+								     target_container);
+				key_tags->flow_label = ntohl(flow_label);
 			}
-
-			return true;
 		}
 
 		break;
@@ -659,6 +646,10 @@ static const struct flow_dissector_key flow_keys_dissector_keys[] = {
 		.key_id = FLOW_DISSECTOR_KEY_VLANID,
 		.offset = offsetof(struct flow_keys, tags),
 	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_FLOW_LABEL,
+		.offset = offsetof(struct flow_keys, tags),
+	},
 };
 
 static const struct flow_dissector_key flow_keys_buf_dissector_keys[] = {
-- 
1.8.1
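For readers unfamiliar with where the 20-bit flow label sits, here is a small userspace sketch of the extraction this patch relies on. It assumes the standard IPv6 header layout (4-bit version, 8-bit traffic class, 20-bit flow label in the first 32-bit word); the sketch_* names are hypothetical and only mimic ip6_flowlabel() and the new flow_label:20 bitfield.

```c
#include <stdint.h>

/* Low 20 bits of the first IPv6 header word, in host byte order. */
#define SKETCH_FLOWLABEL_MASK 0x000FFFFFu

/* Mimics the tags key after this patch: 12 bits VLAN ID, 20 bits label. */
struct sketch_key_tags {
	unsigned int vlan_id:12,
		     flow_label:20;
};

/* Read the first four header bytes as big-endian and mask out the label. */
static uint32_t sketch_ip6_flowlabel(const uint8_t hdr[4])
{
	uint32_t word = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
			((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];

	return word & SKETCH_FLOWLABEL_MASK;
}

/* Store a non-zero label in the tags key; dissection would then continue. */
static void sketch_set_flow_label(struct sketch_key_tags *tags,
				  const uint8_t hdr[4])
{
	uint32_t label = sketch_ip6_flowlabel(hdr);

	if (label)
		tags->flow_label = label;
}
```

Unlike the removed short circuit, nothing returns early here: the label is just one more hash input alongside ports and addresses.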
[PATCH v4 net-next 03/11] net: Remove superfluous setting of key_basic
key_basic is set twice in __skb_flow_dissect, which seems unnecessary.
Remove the second assignment.

Acked-by: Jiri Pirko <j...@resnulli.us>
Signed-off-by: Tom Herbert <t...@herbertland.com>
---
 net/core/flow_dissector.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 7f69916..0763795 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -343,12 +343,6 @@ flow_label:
 		break;
 	}
 
-	/* It is ensured by skb_flow_dissector_init() that basic key will
-	 * be always present.
-	 */
-	key_basic = skb_flow_dissector_target(flow_dissector,
-					      FLOW_DISSECTOR_KEY_BASIC,
-					      target_container);
 	key_basic->n_proto = proto;
 	key_basic->ip_proto = ip_proto;
 	key_basic->thoff = (u16) nhoff;
-- 
1.8.1
[PATCH v4 net-next 00/11] net: Increase inputs to flow_keys hashing
This patch set adds new fields to the flow_keys structure and hashes
over these fields to get a better flow hash. In particular, these
patches now include hashing over the full IPv6 addresses in order to
defend against address spoofing that always results in the same hash.
The new input also includes the Ethertype, L4 protocol, VLAN, flow
label, GRE keyid, and MPLS entropy label.

In order to increase hash inputs, we switch to using jhash2, which
operates on an array of u32's. jhash2 operates on multiples of three
words. The data in the hash is constructed for that, and there are two
variants for IPv4 and IPv6 addressing. For IPv4 addresses, jhash is
performed over six u32's and for IPv6 it is done over twelve.

flow_keys can store either IPv4 or IPv6 addresses (the addr_proto field
is a selector). ipv6_addr_hash is no longer used to convert addresses
for setting in flow table. For legacy uses of flow keys outside of
flow_dissector, the flow_get_u32_src and flow_get_u32_dst functions have
been added to get u32 representations of addresses in flow_keys.

For flow labels we also eliminate the short circuit in flow_dissector
for non-zero flow label. The flow label is now considered additional
input to ports.

Testing: Ran netperf TCP_RR for 200 flows using IPv4 and IPv6, comparing
before the patches and with the patches. Did not detect any performance
degradation.

v2:
  - Took out MPLS entropy label. Will add this later.

v3:
  - Ensure hash start offset is a four byte boundary. Add BUILD_BUG_ON
    to check for this.
  - Fixes sparse error in GRE to get entropy from keyid.

v4:
  - Rebase to Jiri changes to generalize flow dissection
  - Support TIPC as its own address
  - Bring back MPLS entropy label dissection
  - Remove FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS

v5:
  - Minor fixes from feedback

Tom Herbert (11):
  net: Simplify GRE case in flow_dissector
  mpls: Add definition for IPPROTO_MPLS
  net: Remove superfluous setting of key_basic
  net: Get skb hash over flow_keys structure
  net: Add full IPv6 addresses to flow_keys
  net: Add keys for TIPC address
  net: Get rid of IPv6 hash addresses flow keys
  net: Add VLAN ID to flow_keys
  net: Add IPv6 flow label to flow_keys
  net: Add GRE keyid in flow_keys
  mpls: Add MPLS entropy label in flow_keys

 drivers/net/bonding/bond_main.c                |   9 +-
 drivers/net/ethernet/cisco/enic/enic_clsf.c    |   8 +-
 drivers/net/ethernet/cisco/enic/enic_ethtool.c |   4 +-
 include/linux/skbuff.h                         |   2 +-
 include/net/flow_dissector.h                   |  97 ++--
 include/net/ip.h                               |  21 +-
 include/net/ipv6.h                             |  23 +-
 include/uapi/linux/in.h                        |   2 +
 net/core/flow_dissector.c                      | 329 ++---
 net/ethernet/eth.c                             |   2 +-
 net/sched/cls_flow.c                           |  14 +-
 net/sched/cls_flower.c                         |  13 +-
 12 files changed, 388 insertions(+), 136 deletions(-)

-- 
1.8.1
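To make the hashing scheme in the cover letter concrete, the following is a rough userspace sketch, not the kernel implementation. Everything named sketch_* is hypothetical, the key layout is simplified (so the word counts are 4 and 10 rather than the 6 and 12 quoted above), and the mixing function is a word-wise FNV-1a used purely as a stand-in for jhash2. It shows three ideas from the series: a control field kept outside the hashed region, a compile-time check that the hash start offset is u32-aligned, and an address-type selector that decides how many words are hashed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum sketch_addr_type { SKETCH_ADDR_NONE = 0, SKETCH_ADDR_V4, SKETCH_ADDR_V6 };

/* Simplified, hypothetical key layout; all fields are u32 to avoid padding. */
struct sketch_flow_keys {
	uint32_t addr_type;          /* control data: selector, not hashed */
	uint32_t n_proto_ip_proto;   /* hashed region starts here */
	uint32_t ports;
	union {
		struct { uint32_t src, dst; } v4;
		struct { uint32_t src[4], dst[4]; } v6;
	} addrs;                     /* last, so its hashed length can vary */
};

#define SKETCH_HASH_START offsetof(struct sketch_flow_keys, n_proto_ip_proto)

/* Userspace analogue of the BUILD_BUG_ON alignment check from v3. */
_Static_assert(SKETCH_HASH_START % sizeof(uint32_t) == 0,
	       "hash region must start on a four byte boundary");

/* Number of u32 words to hash, chosen by the address-type selector. */
static size_t sketch_hash_words(const struct sketch_flow_keys *k)
{
	size_t bytes = offsetof(struct sketch_flow_keys, addrs) - SKETCH_HASH_START;

	if (k->addr_type == SKETCH_ADDR_V4)
		bytes += sizeof(k->addrs.v4);
	else if (k->addr_type == SKETCH_ADDR_V6)
		bytes += sizeof(k->addrs.v6);
	return bytes / sizeof(uint32_t);
}

/* Stand-in mixer (word-wise FNV-1a); the series itself uses jhash2. */
static uint32_t sketch_mix_words(const uint32_t *w, size_t n, uint32_t seed)
{
	uint32_t h = 2166136261u ^ seed;

	for (size_t i = 0; i < n; i++) {
		h ^= w[i];
		h *= 16777619u;
	}
	return h;
}

static uint32_t sketch_flow_hash(const struct sketch_flow_keys *k, uint32_t seed)
{
	uint32_t words[16];
	size_t n = sketch_hash_words(k);

	/* Copy only the hashed region; the control field is skipped. */
	memcpy(words, (const unsigned char *)k + SKETCH_HASH_START,
	       n * sizeof(uint32_t));
	return sketch_mix_words(words, n, seed);
}
```

Because the address union is placed last, an IPv4 key never feeds the unused part of the union into the hash, while an IPv6 key contributes every address word instead of a 32-bit fold of the address.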
[PATCH v4 net-next 02/11] mpls: Add definition for IPPROTO_MPLS
Add uapi define for MPLS over IP.

Acked-by: Jiri Pirko <j...@resnulli.us>
Signed-off-by: Tom Herbert <t...@herbertland.com>
---
 include/uapi/linux/in.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 589ced0..641338b 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -69,6 +69,8 @@ enum {
 #define IPPROTO_SCTP		IPPROTO_SCTP
   IPPROTO_UDPLITE = 136,	/* UDP-Lite (RFC 3828)			*/
 #define IPPROTO_UDPLITE		IPPROTO_UDPLITE
+  IPPROTO_MPLS = 137,		/* MPLS in IP (RFC 4023)		*/
+#define IPPROTO_MPLS		IPPROTO_MPLS
   IPPROTO_RAW = 255,		/* Raw IP packets			*/
 #define IPPROTO_RAW		IPPROTO_RAW
   IPPROTO_MAX
-- 
1.8.1
[PATCH v4 net-next 04/11] net: Get skb hash over flow_keys structure
This patch changes flow hashing to use jhash2 over the flow_keys
structure instead of just doing jhash_3words over src, dst, and ports.
This method will allow us to take more input into the hashing function
so that we can include full IPv6 addresses, VLAN, flow labels etc.
without needing to resort to xor'ing, which makes for a poor hash.

Acked-by: Jiri Pirko <j...@resnulli.us>
Signed-off-by: Tom Herbert <t...@herbertland.com>
---
 include/linux/skbuff.h       |  2 +-
 include/net/flow_dissector.h | 21 ++---
 include/net/ip.h             |  2 ++
 include/net/ipv6.h           |  2 ++
 net/core/flow_dissector.c    | 54 +---
 net/sched/cls_flower.c       |  2 ++
 6 files changed, 66 insertions(+), 17 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6b41c15..cc612fc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1943,7 +1943,7 @@ static inline void skb_probe_transport_header(struct sk_buff *skb,
 	if (skb_transport_header_was_set(skb))
 		return;
 	else if (skb_flow_dissect_flow_keys(skb, &keys))
-		skb_set_transport_header(skb, keys.basic.thoff);
+		skb_set_transport_header(skb, keys.control.thoff);
 	else
 		skb_set_transport_header(skb, offset_hint);
 }
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index bac9c14..cba6a10 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -7,15 +7,24 @@
 #include <uapi/linux/if_ether.h>
 
 /**
+ * struct flow_dissector_key_control:
+ * @thoff: Transport header offset
+ */
+struct flow_dissector_key_control {
+	u16	thoff;
+	u16	padding;
+};
+
+/**
  * struct flow_dissector_key_basic:
  * @thoff: Transport header offset
  * @n_proto: Network header protocol (eg. IPv4/IPv6)
  * @ip_proto: Transport header protocol (eg. TCP/UDP)
  */
 struct flow_dissector_key_basic {
-	u16	thoff;
 	__be16	n_proto;
 	u8	ip_proto;
+	u8	padding;
 };
 
 /**
@@ -70,6 +79,7 @@ struct flow_dissector_key_eth_addrs {
 };
 
 enum flow_dissector_key_id {
+	FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */
 	FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */
 	FLOW_DISSECTOR_KEY_IPV4_ADDRS, /* struct flow_dissector_key_addrs */
 	FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS, /* struct flow_dissector_key_addrs */
@@ -109,11 +119,16 @@ static inline bool skb_flow_dissect(const struct sk_buff *skb,
 }
 
 struct flow_keys {
-	struct flow_dissector_key_addrs addrs;
-	struct flow_dissector_key_ports ports;
+	struct flow_dissector_key_control control;
+#define FLOW_KEYS_HASH_START_FIELD basic
 	struct flow_dissector_key_basic basic;
+	struct flow_dissector_key_ports ports;
+	struct flow_dissector_key_addrs addrs;
 };
 
+#define FLOW_KEYS_HASH_OFFSET		\
+	offsetof(struct flow_keys, FLOW_KEYS_HASH_START_FIELD)
+
 extern struct flow_dissector flow_keys_dissector;
 extern struct flow_dissector flow_keys_buf_dissector;
 
diff --git a/include/net/ip.h b/include/net/ip.h
index 9b976cf..16cfc87 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -360,6 +360,8 @@ static inline void inet_set_txhash(struct sock *sk)
 	struct inet_sock *inet = inet_sk(sk);
 	struct flow_keys keys;
 
+	memset(&keys, 0, sizeof(keys));
+
 	keys.addrs.src = inet->inet_saddr;
 	keys.addrs.dst = inet->inet_daddr;
 	keys.ports.src = inet->inet_sport;
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 35d485c..474ca46 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -699,6 +699,8 @@ static inline void ip6_set_txhash(struct sock *sk)
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct flow_keys keys;
 
+	memset(&keys, 0, sizeof(keys));
+
 	keys.addrs.src = (__force __be32)ipv6_addr_hash(&np->saddr);
 	keys.addrs.dst = (__force __be32)ipv6_addr_hash(&sk->sk_v6_daddr);
 	keys.ports.src = inet->inet_sport;
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 0763795..55b5f29 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -57,10 +57,12 @@ void skb_flow_dissector_init(struct flow_dissector *flow_dissector,
 		flow_dissector->offset[key->key_id] = key->offset;
 	}
 
-	/* Ensure that the dissector always includes basic key. That way
-	 * we are able to avoid handling lack of it in fast path.
+	/* Ensure that the dissector always includes control and basic key.
+	 * That way we are able to avoid handling lack of these in fast path.
 	 */
 	BUG_ON(!skb_flow_dissector_uses_key(flow_dissector,
+					    FLOW_DISSECTOR_KEY_CONTROL));
+	BUG_ON(!skb_flow_dissector_uses_key(flow_dissector,
 					    FLOW_DISSECTOR_KEY_BASIC));
 }
[PATCH v4 net-next 01/11] net: Simplify GRE case in flow_dissector
Do break when we see routing flag or a non-zero version number in GRE
header.

Acked-by: Jiri Pirko <j...@resnulli.us>
Signed-off-by: Tom Herbert <t...@herbertland.com>
---
 net/core/flow_dissector.c | 44 ++--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 1f2d893..7f69916 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -308,30 +308,30 @@ flow_label:
 			 * Only look inside GRE if version zero and no
 			 * routing
 			 */
-			if (!(hdr->flags & (GRE_VERSION|GRE_ROUTING))) {
-				proto = hdr->proto;
+			if (hdr->flags & (GRE_VERSION | GRE_ROUTING))
+				break;
+
+			proto = hdr->proto;
+			nhoff += 4;
+			if (hdr->flags & GRE_CSUM)
 				nhoff += 4;
-				if (hdr->flags & GRE_CSUM)
-					nhoff += 4;
-				if (hdr->flags & GRE_KEY)
-					nhoff += 4;
-				if (hdr->flags & GRE_SEQ)
-					nhoff += 4;
-				if (proto == htons(ETH_P_TEB)) {
-					const struct ethhdr *eth;
-					struct ethhdr _eth;
-
-					eth = __skb_header_pointer(skb, nhoff,
-								   sizeof(_eth),
-								   data, hlen, &_eth);
-					if (!eth)
-						return false;
-					proto = eth->h_proto;
-					nhoff += sizeof(*eth);
-				}
-				goto again;
+			if (hdr->flags & GRE_KEY)
+				nhoff += 4;
+			if (hdr->flags & GRE_SEQ)
+				nhoff += 4;
+			if (proto == htons(ETH_P_TEB)) {
+				const struct ethhdr *eth;
+				struct ethhdr _eth;
+
+				eth = __skb_header_pointer(skb, nhoff,
+							   sizeof(_eth),
+							   data, hlen, &_eth);
+				if (!eth)
+					return false;
+				proto = eth->h_proto;
+				nhoff += sizeof(*eth);
 			}
-			break;
+			goto again;
 		}
 		case IPPROTO_IPIP:
 			proto = htons(ETH_P_IP);
-- 
1.8.1
[PATCH v4 net-next 10/11] net: Add GRE keyid in flow_keys
In flow dissector, if a GRE header contains a keyid, it is saved in the
new keyid field of flow_keys. The GRE keyid is then represented in the
flow hash function input.

Signed-off-by: Tom Herbert <t...@herbertland.com>
---
 include/net/flow_dissector.h |  6 ++
 net/core/flow_dissector.c    | 24 +++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 14d8483..5d4257b 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -32,6 +32,10 @@ struct flow_dissector_key_tags {
 		flow_label:20;
 };
 
+struct flow_dissector_key_keyid {
+	u32 keyid;
+};
+
 /**
  * struct flow_dissector_key_ipv4_addrs:
  * @src: source ip address
@@ -113,6 +117,7 @@ enum flow_dissector_key_id {
 	FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs */
 	FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */
 	FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_flow_tags */
+	FLOW_DISSECTOR_KEY_GRE_KEYID, /* struct flow_dissector_key_keyid */
 
 	FLOW_DISSECTOR_KEY_MAX,
 };
@@ -150,6 +155,7 @@ struct flow_keys {
 #define FLOW_KEYS_HASH_START_FIELD basic
 	struct flow_dissector_key_basic basic;
 	struct flow_dissector_key_tags tags;
+	struct flow_dissector_key_keyid keyid;
 	struct flow_dissector_key_ports ports;
 	struct flow_dissector_key_addrs addrs;
 };
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index ba089d9..ea318d5 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -127,6 +127,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 	struct flow_dissector_key_addrs *key_addrs;
 	struct flow_dissector_key_ports *key_ports;
 	struct flow_dissector_key_tags *key_tags;
+	struct flow_dissector_key_keyid *key_keyid;
 	u8 ip_proto;
 
 	if (!data) {
@@ -315,8 +316,25 @@ ipv6:
 			nhoff += 4;
 			if (hdr->flags & GRE_CSUM)
 				nhoff += 4;
-			if (hdr->flags & GRE_KEY)
+			if (hdr->flags & GRE_KEY) {
+				const __be32 *keyid;
+				__be32 _keyid;
+
+				keyid = __skb_header_pointer(skb, nhoff, sizeof(_keyid),
+							     data, hlen, &_keyid);
+
+				if (!keyid)
+					return false;
+
+				if (skb_flow_dissector_uses_key(flow_dissector,
+								FLOW_DISSECTOR_KEY_GRE_KEYID)) {
+					key_keyid = skb_flow_dissector_target(flow_dissector,
+									      FLOW_DISSECTOR_KEY_GRE_KEYID,
+									      target_container);
+					key_keyid->keyid = ntohl(*keyid);
+				}
 				nhoff += 4;
+			}
 			if (hdr->flags & GRE_SEQ)
 				nhoff += 4;
 			if (proto == htons(ETH_P_TEB)) {
@@ -650,6 +668,10 @@ static const struct flow_dissector_key flow_keys_dissector_keys[] = {
 		.key_id = FLOW_DISSECTOR_KEY_FLOW_LABEL,
 		.offset = offsetof(struct flow_keys, tags),
 	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_GRE_KEYID,
+		.offset = offsetof(struct flow_keys, keyid),
+	},
 };
 
 static const struct flow_dissector_key flow_keys_buf_dissector_keys[] = {
-- 
1.8.1
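As an illustration of the GRE walk that this patch extends, here is a self-contained userspace sketch. It assumes the usual GRE version 0 layout (2 bytes of flags, 2 bytes of protocol, then optional 4-byte checksum, key and sequence fields in that order). The SK_GRE_* constants are host-order copies of the flag bits and sk_parse_gre is a hypothetical helper, not a kernel function; unlike the kernel, it reports both "not dissectable" and "truncated" simply as false.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* GRE flag bits, host byte order (assumed values for this sketch). */
#define SK_GRE_CSUM    0x8000u
#define SK_GRE_ROUTING 0x4000u
#define SK_GRE_KEY     0x2000u
#define SK_GRE_SEQ     0x1000u
#define SK_GRE_VERSION 0x0007u

struct sk_gre_info {
	uint16_t proto;    /* encapsulated protocol (EtherType) */
	bool has_key;      /* true when the key field was present */
	uint32_t keyid;    /* key in host byte order, 0 if absent */
	size_t hdr_len;    /* offset of the encapsulated payload */
};

static uint16_t sk_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t sk_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Walk the optional fields in order: checksum, key, sequence number. */
static bool sk_parse_gre(const uint8_t *buf, size_t len, struct sk_gre_info *out)
{
	size_t off = 4;
	uint16_t flags;

	if (len < 4)
		return false;
	flags = sk_be16(buf);

	/* Only look inside GRE if version zero and no routing. */
	if (flags & (SK_GRE_VERSION | SK_GRE_ROUTING))
		return false;

	out->proto = sk_be16(buf + 2);
	out->has_key = false;
	out->keyid = 0;

	if (flags & SK_GRE_CSUM)
		off += 4;
	if (flags & SK_GRE_KEY) {
		if (len < off + 4)
			return false;   /* key field truncated */
		out->keyid = sk_be32(buf + off);
		out->has_key = true;
		off += 4;
	}
	if (flags & SK_GRE_SEQ)
		off += 4;

	out->hdr_len = off;
	return true;
}
```

The key is read at whatever offset the preceding optional fields leave it, which is why the patch fetches it inside the GRE_KEY branch before advancing nhoff.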
Re: linux-next: build failure after merge of most of the trees
From: Stephen Rothwell <s...@canb.auug.org.au>
Date: Thu, 28 May 2015 22:06:07 +1000

> Ouch :-(

The only thing I will say on this matter is that the _only_ way this
problem will go away is if someone does the work necessary to get rid
of that implicit vmalloc.h include that happens on all x86 platform
builds.

So if you want this to stop happening, work on that.

I've applied the following to net-next, thanks for your report.

[PATCH] treewide: Add missing vmalloc.h inclusion.

All of these files were only building on non-x86 because of the
indirect inclusion of vmalloc.h by, of all things, net/inet_hashtables.h

None of this got caught during build testing, because on x86 there is
an implicit vmalloc.h include via one of the arch asm/ headers.

This fixes all of these

Reported-by: Stephen Rothwell <s...@canb.auug.org.au>
Signed-off-by: David S. Miller <da...@davemloft.net>
---
 crypto/algif_skcipher.c                | 1 +
 drivers/scsi/qla2xxx/tcm_qla2xxx.c     | 1 +
 drivers/target/iscsi/iscsi_target.c    | 1 +
 drivers/target/target_core_file.c      | 1 +
 drivers/target/target_core_pr.c        | 1 +
 drivers/target/target_core_transport.c | 1 +
 drivers/target/target_core_user.c      | 1 +
 drivers/vhost/scsi.c                   | 1 +
 8 files changed, 8 insertions(+)

diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 37110fd..4d1c315 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -444,6 +444,7 @@ static int skcipher_recvmsg(struct kiocb *unused, struct socket *sock,
 			err = skcipher_wait_for_data(sk, flags);
 			if (err)
 				goto unlock;
+			used = ctx->used;
 		}
 
 		used = min_t(unsigned long, used, iov_iter_count(&msg->msg_iter));
diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
index 73f9fee..54c986a 100644
--- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
+++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
@@ -27,6 +27,7 @@
 #include <linux/moduleparam.h>
 #include <generated/utsrelease.h>
 #include <linux/utsname.h>
+#include <linux/vmalloc.h>
 #include <linux/init.h>
 #include <linux/list.h>
 #include <linux/slab.h>
diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index aebde32..f2ce95c 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -21,6 +21,7 @@
 #include <linux/crypto.h>
 #include <linux/completion.h>
 #include <linux/module.h>
+#include <linux/vmalloc.h>
 #include <linux/idr.h>
 #include <asm/unaligned.h>
 #include <scsi/scsi_device.h>
diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
index d836de2..5f8b119 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -30,6 +30,7 @@
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/module.h>
+#include <linux/vmalloc.h>
 #include <linux/falloc.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_host.h>
diff --git a/drivers/target/target_core_pr.c b/drivers/target/target_core_pr.c
index 283cf78..06cd53e 100644
--- a/drivers/target/target_core_pr.c
+++ b/drivers/target/target_core_pr.c
@@ -27,6 +27,7 @@
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/list.h>
+#include <linux/vmalloc.h>
 #include <linux/file.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 0adc0f6..c99d2ea 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -34,6 +34,7 @@
 #include <linux/cdrom.h>
 #include <linux/module.h>
 #include <linux/ratelimit.h>
+#include <linux/vmalloc.h>
 #include <asm/unaligned.h>
 #include <net/sock.h>
 #include <net/tcp.h>
diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c
index 1a1bcf7..ca43a10 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -21,6 +21,7 @@
 #include <linux/idr.h>
 #include <linux/timer.h>
 #include <linux/parser.h>
+#include <linux/vmalloc.h>
 #include <scsi/scsi.h>
 #include <scsi/scsi_host.h>
 #include <linux/uio_driver.h>
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index dc78d87..16b45ca 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -35,6 +35,7 @@
 #include <linux/compat.h>
 #include <linux/eventfd.h>
 #include <linux/fs.h>
+#include <linux/vmalloc.h>
 #include <linux/miscdevice.h>
 #include <asm/unaligned.h>
 #include <scsi/scsi.h>
-- 
2.1.0
Re: linux-next: build failure after merge of most of the trees
On Thu, 2015-05-28 at 11:42 -0700, David Miller wrote:
> I've applied the following to net-next, thanks for your report.
>
> [PATCH] treewide: Add missing vmalloc.h inclusion.
>
> All of these files were only building on non-x86 because of the
> indirect inclusion of vmalloc.h by, of all things, net/inet_hashtables.h
[]
> diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
[]
> @@ -444,6 +444,7 @@ static int skcipher_recvmsg(struct kiocb *unused, struct socket *sock,
>  			err = skcipher_wait_for_data(sk, flags);
>  			if (err)
>  				goto unlock;
> +			used = ctx->used;

huh?
Re: [PATCH net-next] bpf: allow BPF programs access skb->skb_iif and skb->dev->ifindex fields
On 05/28/2015 12:30 AM, Alexei Starovoitov wrote:
> classic BPF already exposes skb->dev->ifindex via SKF_AD_IFINDEX extension.
> Allow eBPF program to access it as well. Note that classic aborts execution
> of the program if 'skb->dev == NULL' (which is inconvenient for program
> writers), whereas eBPF returns zero in such case.

That's better, yep.

> Also expose the 'skb_iif' field, since programs triggered by redirected
> packet need to know the original interface index.
> Summary:
> __skb->ifindex         -> skb->dev->ifindex
> __skb->ingress_ifindex -> skb->skb_iif
>
> Signed-off-by: Alexei Starovoitov <a...@plumgrid.com>

Acked-by: Daniel Borkmann <dan...@iogearbox.net>
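The field mapping summarized in the quoted patch description can be modelled in a few lines of plain C. This is only an illustrative sketch of the described eBPF semantics; the sketch_* structures and functions are hypothetical and are not the kernel's sk_buff, net_device or BPF context-access code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal stand-ins for net_device and sk_buff. */
struct sketch_net_device {
	int ifindex;
};

struct sketch_skb {
	const struct sketch_net_device *dev;   /* may be NULL */
	int skb_iif;                           /* original ingress interface */
};

/* __skb->ifindex: skb->dev->ifindex, or zero when skb->dev is NULL. */
static uint32_t sketch_load_ifindex(const struct sketch_skb *skb)
{
	return skb->dev ? (uint32_t)skb->dev->ifindex : 0;
}

/* __skb->ingress_ifindex: a plain read of skb->skb_iif. */
static uint32_t sketch_load_ingress_ifindex(const struct sketch_skb *skb)
{
	return (uint32_t)skb->skb_iif;
}
```

Returning zero for a missing device lets a program treat "no device" as an ordinary value instead of having its execution aborted, which is the convenience the description points out.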
[PATCH net-next V5 10/11] net/mlx5: Ethernet resource handling files
This patch contains the resource handling files:
- flow_table.c: This file contains the code to handle the low level API
  to configure hardware flow table. It is separated from
  en_flow_table.c, because it will be used in the future by Raw
  Ethernet QP in mlx5_ib too.
- en_flow_table.[ch]: Ethernet flow steering handling. The flow table
  object contains a mapping between flow specs and TIRs. This mechanism
  will also be used to configure the e-switch in the future, when SR-IOV
  support is added.
- transobj.[ch]: Low level functions to create/modify/destroy the
  transport objects: RQ/SQ/TIR/TIS
- vport.[ch]: Handle attributes of a virtual port (vPort) in the
  embedded switch. Currently this switch is a passthrough, until SR-IOV
  support is added.

Signed-off-by: Amir Vadai <am...@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/en_flow_table.c    | 858 +
 .../net/ethernet/mellanox/mlx5/core/flow_table.c   | 422 ++
 drivers/net/ethernet/mellanox/mlx5/core/transobj.c | 169
 drivers/net/ethernet/mellanox/mlx5/core/transobj.h |  47 ++
 drivers/net/ethernet/mellanox/mlx5/core/vport.c    |  84 ++
 drivers/net/ethernet/mellanox/mlx5/core/vport.h    |  41 +
 include/linux/mlx5/flow_table.h                    |  54 ++
 7 files changed, 1675 insertions(+)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/flow_table.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/transobj.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/transobj.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/vport.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/vport.h
 create mode 100644 include/linux/mlx5/flow_table.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c b/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
new file mode 100644
index 0000000..6feebda
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_flow_table.c
@@ -0,0 +1,858 @@
+/*
+ * Copyright (c) 2015, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/list.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/tcp.h>
+#include <linux/mlx5/flow_table.h>
+#include "en.h"
+
+enum {
+	MLX5E_FULLMATCH = 0,
+	MLX5E_ALLMULTI  = 1,
+	MLX5E_PROMISC   = 2,
+};
+
+enum {
+	MLX5E_UC        = 0,
+	MLX5E_MC_IPV4   = 1,
+	MLX5E_MC_IPV6   = 2,
+	MLX5E_MC_OTHER  = 3,
+};
+
+enum {
+	MLX5E_ACTION_NONE = 0,
+	MLX5E_ACTION_ADD  = 1,
+	MLX5E_ACTION_DEL  = 2,
+};
+
+struct mlx5e_eth_addr_hash_node {
+	struct hlist_node          hlist;
+	u8                         action;
+	struct mlx5e_eth_addr_info ai;
+};
+
+static inline int mlx5e_hash_eth_addr(u8 *addr)
+{
+	return addr[5];
+}
+
+static void mlx5e_add_eth_addr_to_hash(struct hlist_head *hash, u8 *addr)
+{
+	struct mlx5e_eth_addr_hash_node *hn;
+	int ix = mlx5e_hash_eth_addr(addr);
+	int found = 0;
+
+	hlist_for_each_entry(hn, &hash[ix], hlist)
+		if (ether_addr_equal_64bits(hn->ai.addr, addr)) {
+			found = 1;
+			break;
+		}
+
+	if (found) {
+		hn->action = MLX5E_ACTION_NONE;
+		return;
+	}
+
+	hn = kzalloc(sizeof(*hn), GFP_ATOMIC);
+	if (!hn)
+		return;
+
+	ether_addr_copy(hn->ai.addr, addr);
+
[PATCH net-next V5 05/11] net/mlx5_core: Implement access functions of ptys register fields
From: Saeed Mahameed <sae...@mellanox.com>

Those registers will be used by the ethtool to set/get settings.

Signed-off-by: Rana Shahout <ra...@mellanox.com>
Signed-off-by: Saeed Mahameed <sae...@mellanox.com>
Signed-off-by: Amir Vadai <am...@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 77 ++
 include/linux/mlx5/driver.h                    | 14 +
 2 files changed, 91 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 49e90f2..6e2d99c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -102,3 +102,80 @@ int mlx5_set_port_caps(struct mlx5_core_dev *dev, u8 port_num, u32 caps)
 	return err;
 }
 EXPORT_SYMBOL_GPL(mlx5_set_port_caps);
+
+int mlx5_query_port_ptys(struct mlx5_core_dev *dev, u32 *ptys,
+			 int ptys_size, int proto_mask)
+{
+	u32 in[MLX5_ST_SZ_DW(ptys_reg)];
+	int err;
+
+	memset(in, 0, sizeof(in));
+	MLX5_SET(ptys_reg, in, local_port, 1);
+	MLX5_SET(ptys_reg, in, proto_mask, proto_mask);
+
+	err = mlx5_core_access_reg(dev, in, sizeof(in), ptys,
+				   ptys_size, MLX5_REG_PTYS, 0, 0);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_ptys);
+
+int mlx5_query_port_proto_cap(struct mlx5_core_dev *dev,
+			      u32 *proto_cap, int proto_mask)
+{
+	u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+	int err;
+
+	err = mlx5_query_port_ptys(dev, out, sizeof(out), proto_mask);
+	if (err)
+		return err;
+
+	if (proto_mask == MLX5_PTYS_EN)
+		*proto_cap = MLX5_GET(ptys_reg, out, eth_proto_capability);
+	else
+		*proto_cap = MLX5_GET(ptys_reg, out, ib_proto_capability);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_proto_cap);
+
+int mlx5_query_port_proto_admin(struct mlx5_core_dev *dev,
+				u32 *proto_admin, int proto_mask)
+{
+	u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+	int err;
+
+	err = mlx5_query_port_ptys(dev, out, sizeof(out), proto_mask);
+	if (err)
+		return err;
+
+	if (proto_mask == MLX5_PTYS_EN)
+		*proto_admin = MLX5_GET(ptys_reg, out, eth_proto_admin);
+	else
+		*proto_admin = MLX5_GET(ptys_reg, out, ib_proto_admin);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_proto_admin);
+
+int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 proto_admin,
+			int proto_mask)
+{
+	u32 in[MLX5_ST_SZ_DW(ptys_reg)];
+	u32 out[MLX5_ST_SZ_DW(ptys_reg)];
+	int err;
+
+	memset(in, 0, sizeof(in));
+
+	MLX5_SET(ptys_reg, in, local_port, 1);
+	MLX5_SET(ptys_reg, in, proto_mask, proto_mask);
+	if (proto_mask == MLX5_PTYS_EN)
+		MLX5_SET(ptys_reg, in, eth_proto_admin, proto_admin);
+	else
+		MLX5_SET(ptys_reg, in, ib_proto_admin, proto_admin);
+
+	err = mlx5_core_access_reg(dev, in, sizeof(in), out,
+				   sizeof(out), MLX5_REG_PTYS, 0, 1);
+	return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_set_port_proto);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 6b91991..266d549 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -504,6 +504,11 @@ enum {
 	MLX5_COMP_EQ_SIZE = 1024,
 };
 
+enum {
+	MLX5_PTYS_IB = 1 << 0,
+	MLX5_PTYS_EN = 1 << 2,
+};
+
 struct mlx5_db_pgdir {
 	struct list_head	list;
 	DECLARE_BITMAP(bitmap, MLX5_DB_PER_PAGE);
@@ -686,7 +691,16 @@ void mlx5_qp_debugfs_cleanup(struct mlx5_core_dev *dev);
 int mlx5_core_access_reg(struct mlx5_core_dev *dev, void *data_in,
 			 int size_in, void *data_out, int size_out,
 			 u16 reg_num, int arg, int write);
+
 int mlx5_set_port_caps(struct mlx5_core_dev *dev, u8 port_num, u32 caps);
+int mlx5_query_port_ptys(struct mlx5_core_dev *dev, u32 *ptys,
+			 int ptys_size, int proto_mask);
+int mlx5_query_port_proto_cap(struct mlx5_core_dev *dev,
+			      u32 *proto_cap, int proto_mask);
+int mlx5_query_port_proto_admin(struct mlx5_core_dev *dev,
+				u32 *proto_admin, int proto_mask);
+int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 proto_admin,
+			int proto_mask);
 
 int mlx5_debug_eq_add(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
 void mlx5_debug_eq_remove(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
-- 
1.9.3
of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
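For readers unfamiliar with the MLX5_SET()/MLX5_GET() style used in this patch, here is a minimal sketch of the general idea: a register image is an array of big-endian 32-bit words, and each named field is a (word, shift, width) triple. Everything below is an assumption made for illustration. The reg_field type, the DEMO_* positions and the reg_set()/reg_get() helpers are hypothetical and do not reflect the real PTYS layout or the real macros, which are generated from the mlx5 interface definitions.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical field descriptor: 32-bit word index, bit shift, bit width. */
struct reg_field {
	unsigned int dw;
	unsigned int shift;
	unsigned int width;
};

/* Made-up positions for illustration only; NOT the real PTYS layout. */
static const struct reg_field DEMO_LOCAL_PORT      = { 0, 16, 8 };
static const struct reg_field DEMO_PROTO_MASK      = { 0, 0, 3 };
static const struct reg_field DEMO_ETH_PROTO_ADMIN = { 3, 0, 32 };

/* Mask with the low 'width' bits set; avoids shifting by 32. */
static uint32_t field_mask(unsigned int width)
{
	assert(width >= 1 && width <= 32);
	return width == 32 ? 0xffffffffu : (1u << width) - 1u;
}

/* Write 'val' into field 'f' of a register stored as big-endian words. */
void reg_set(uint32_t *reg, struct reg_field f, uint32_t val)
{
	uint32_t mask = field_mask(f.width) << f.shift;
	uint32_t host = ntohl(reg[f.dw]);

	host = (host & ~mask) | ((val << f.shift) & mask);
	reg[f.dw] = htonl(host);
}

/* Read field 'f' back in host byte order. */
uint32_t reg_get(const uint32_t *reg, struct reg_field f)
{
	return (ntohl(reg[f.dw]) >> f.shift) & field_mask(f.width);
}
```

A caller would zero the buffer first, as the patch does with memset(), and then set only the fields it cares about before handing the buffer to the register access routine.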
[PATCH net-next V5 01/11] net/mlx5_core,mlx5_ib: Do not use vmap() on coherent memory
As David Daney pointed out for the mlx4_core driver [1], mlx5_core is also misusing the DMA-API. This patch removes the code that vmap()s memory allocated by dma_alloc_coherent(). After this patch, users of this driver might fail to allocate resources on memory-fragmented systems. This will be fixed later on. [1] - https://patchwork.ozlabs.org/patch/458531/ CC: David Daney david.da...@cavium.com Signed-off-by: Amir Vadai am...@mellanox.com --- drivers/infiniband/hw/mlx5/cq.c | 3 +- drivers/infiniband/hw/mlx5/qp.c | 2 +- drivers/infiniband/hw/mlx5/srq.c| 2 +- drivers/net/ethernet/mellanox/mlx5/core/alloc.c | 96 + drivers/net/ethernet/mellanox/mlx5/core/eq.c| 3 +- include/linux/mlx5/driver.h | 9 +-- 6 files changed, 22 insertions(+), 93 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index 2ee6b10..4e88b18 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -590,8 +590,7 @@ static int alloc_cq_buf(struct mlx5_ib_dev *dev, struct mlx5_ib_cq_buf *buf, { int err; - err = mlx5_buf_alloc(dev->mdev, nent * cqe_size, -PAGE_SIZE * 2, &buf->buf); + err = mlx5_buf_alloc(dev->mdev, nent * cqe_size, &buf->buf); if (err) return err; diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c index d35f62d..426eb88 100644 --- a/drivers/infiniband/hw/mlx5/qp.c +++ b/drivers/infiniband/hw/mlx5/qp.c @@ -768,7 +768,7 @@ static int create_kernel_qp(struct mlx5_ib_dev *dev, qp->sq.offset = qp->rq.wqe_cnt << qp->rq.wqe_shift; qp->buf_size = err + (qp->rq.wqe_cnt << qp->rq.wqe_shift); - err = mlx5_buf_alloc(dev->mdev, qp->buf_size, PAGE_SIZE * 2, &qp->buf); + err = mlx5_buf_alloc(dev->mdev, qp->buf_size, &qp->buf); if (err) { mlx5_ib_dbg(dev, "err %d\n", err); goto err_uuar; diff --git a/drivers/infiniband/hw/mlx5/srq.c b/drivers/infiniband/hw/mlx5/srq.c index 02d77a2..4242e1d 100644 --- a/drivers/infiniband/hw/mlx5/srq.c +++ b/drivers/infiniband/hw/mlx5/srq.c @@ -165,7 +165,7 @@ static int create_srq_kernel(struct
mlx5_ib_dev *dev, struct mlx5_ib_srq *srq, return err; } - if (mlx5_buf_alloc(dev-mdev, buf_size, PAGE_SIZE * 2, srq-buf)) { + if (mlx5_buf_alloc(dev-mdev, buf_size, srq-buf)) { mlx5_ib_dbg(dev, buf alloc failed\n); err = -ENOMEM; goto err_db; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c index ac0f7bf..0715b49 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/alloc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/alloc.c @@ -42,95 +42,36 @@ #include mlx5_core.h /* Handling for queue buffers -- we allocate a bunch of memory and - * register it in a memory region at HCA virtual address 0. If the - * requested size is max_direct, we split the allocation into - * multiple pages, so we don't require too much contiguous memory. + * register it in a memory region at HCA virtual address 0. */ -int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, int max_direct, - struct mlx5_buf *buf) +int mlx5_buf_alloc(struct mlx5_core_dev *dev, int size, struct mlx5_buf *buf) { dma_addr_t t; buf-size = size; - if (size = max_direct) { - buf-nbufs= 1; - buf-npages = 1; - buf-page_shift = (u8)get_order(size) + PAGE_SHIFT; - buf-direct.buf = dma_zalloc_coherent(dev-pdev-dev, - size, t, GFP_KERNEL); - if (!buf-direct.buf) - return -ENOMEM; - - buf-direct.map = t; - - while (t ((1 buf-page_shift) - 1)) { - --buf-page_shift; - buf-npages *= 2; - } - } else { - int i; - - buf-direct.buf = NULL; - buf-nbufs = (size + PAGE_SIZE - 1) / PAGE_SIZE; - buf-npages = buf-nbufs; - buf-page_shift = PAGE_SHIFT; - buf-page_list = kcalloc(buf-nbufs, sizeof(*buf-page_list), - GFP_KERNEL); - if (!buf-page_list) - return -ENOMEM; - - for (i = 0; i buf-nbufs; i++) { - buf-page_list[i].buf = - dma_zalloc_coherent(dev-pdev-dev, PAGE_SIZE, - t, GFP_KERNEL); - if (!buf-page_list[i].buf) - goto err_free; - - buf-page_list[i].map = t; - } - - if (BITS_PER_LONG == 64) { - struct
[PATCH net-next V5 07/11] net/mlx5_core: Modify CQ moderation parameters
From: Rana Shahout ra...@mellanox.com

Introduce mlx5_core_modify_cq_moderation() to be used by the netdev, to set hardware coalescing.

Signed-off-by: Rana Shahout ra...@mellanox.com
Signed-off-by: Saeed Mahameed sae...@mellanox.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx5/core/cq.c | 18 ++
 include/linux/mlx5/cq.h                      | 3 +++
 2 files changed, 21 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cq.c b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
index eb0cf81..04ab7e4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cq.c
@@ -219,6 +219,24 @@ int mlx5_core_modify_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
 }
 EXPORT_SYMBOL(mlx5_core_modify_cq);
 
+int mlx5_core_modify_cq_moderation(struct mlx5_core_dev *dev,
+				   struct mlx5_core_cq *cq,
+				   u16 cq_period,
+				   u16 cq_max_count)
+{
+	struct mlx5_modify_cq_mbox_in in;
+
+	memset(&in, 0, sizeof(in));
+
+	in.cqn              = cpu_to_be32(cq->cqn);
+	in.ctx.cq_period    = cpu_to_be16(cq_period);
+	in.ctx.cq_max_count = cpu_to_be16(cq_max_count);
+	in.field_select     = cpu_to_be32(MLX5_CQ_MODIFY_PERIOD |
+					  MLX5_CQ_MODIFY_COUNT);
+
+	return mlx5_core_modify_cq(dev, cq, &in, sizeof(in));
+}
+
 int mlx5_init_cq_table(struct mlx5_core_dev *dev)
 {
 	struct mlx5_cq_table *table = &dev->priv.cq_table;
diff --git a/include/linux/mlx5/cq.h b/include/linux/mlx5/cq.h
index 2695ced..abc4767 100644
--- a/include/linux/mlx5/cq.h
+++ b/include/linux/mlx5/cq.h
@@ -169,6 +169,9 @@ int mlx5_core_query_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
 		       struct mlx5_query_cq_mbox_out *out);
 int mlx5_core_modify_cq(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq,
 			struct mlx5_modify_cq_mbox_in *in, int in_sz);
+int mlx5_core_modify_cq_moderation(struct mlx5_core_dev *dev,
+				   struct mlx5_core_cq *cq, u16 cq_period,
+				   u16 cq_max_count);
 int mlx5_debug_cq_add(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq);
 void mlx5_debug_cq_remove(struct mlx5_core_dev *dev, struct mlx5_core_cq *cq);
--
1.9.3
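The pattern in this patch (zero the mailbox, fill in only the fields being changed in device byte order, and flag them in a field-select mask) can be sketched in userspace as below. This is only an illustration under stated assumptions: demo_modify_cq_in is a hypothetical, heavily simplified stand-in for the real mailbox structure, the DEMO_CQ_MODIFY_* bit values are assumed for the demo, and htonl()/htons() stand in for cpu_to_be32()/cpu_to_be16().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed demo values; the real field_select bits live in the mlx5 headers. */
enum {
	DEMO_CQ_MODIFY_PERIOD = 1 << 0,
	DEMO_CQ_MODIFY_COUNT  = 1 << 1,
};

/* Hypothetical, simplified stand-in for the modify-CQ input mailbox. */
struct demo_modify_cq_in {
	uint32_t cqn;          /* big endian */
	uint32_t field_select; /* big endian: which fields the device applies */
	uint16_t cq_period;    /* big endian */
	uint16_t cq_max_count; /* big endian */
};

/* Fill a mailbox that changes only the two moderation parameters. */
void demo_fill_cq_moderation(struct demo_modify_cq_in *in, uint32_t cqn,
			     uint16_t period, uint16_t max_count)
{
	assert(in != NULL);

	/* Everything not explicitly set stays zero and unselected. */
	memset(in, 0, sizeof(*in));

	in->cqn          = htonl(cqn);
	in->cq_period    = htons(period);
	in->cq_max_count = htons(max_count);
	in->field_select = htonl(DEMO_CQ_MODIFY_PERIOD | DEMO_CQ_MODIFY_COUNT);
}
```

The point of the select mask is that the same command can modify other CQ attributes without the caller having to read back and resend the ones it does not want to touch.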
[PATCH net-next V5 04/11] net/mlx5_core: New device capabilities handling
From: Saeed Mahameed sae...@mellanox.com - Query all supported types of dev caps on driver load. - Store the Cap data outbox per cap type into driver private data. - Introduce new Macros to access/dump stored caps (using the auto generated data types). - Obsolete SW representation of dev caps (no need for SW copy for each cap). - Modify IB driver to use new macros for checking caps. Signed-off-by: Saeed Mahameed sae...@mellanox.com Signed-off-by: Amir Vadai am...@mellanox.com --- drivers/infiniband/hw/mlx5/cq.c| 8 +- drivers/infiniband/hw/mlx5/mad.c | 2 +- drivers/infiniband/hw/mlx5/main.c | 113 --- drivers/infiniband/hw/mlx5/mlx5_ib.h | 6 +- drivers/infiniband/hw/mlx5/mr.c| 3 +- drivers/infiniband/hw/mlx5/odp.c | 47 +++ drivers/infiniband/hw/mlx5/qp.c| 84 +-- drivers/infiniband/hw/mlx5/srq.c | 7 +- drivers/net/ethernet/mellanox/mlx5/core/eq.c | 4 +- drivers/net/ethernet/mellanox/mlx5/core/fw.c | 90 +++- drivers/net/ethernet/mellanox/mlx5/core/main.c | 154 +++-- .../net/ethernet/mellanox/mlx5/core/mlx5_core.h| 10 +- drivers/net/ethernet/mellanox/mlx5/core/uar.c | 7 +- include/linux/mlx5/device.h| 66 - include/linux/mlx5/driver.h| 58 +--- 15 files changed, 310 insertions(+), 349 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c index 4e88b18..e2bea9a 100644 --- a/drivers/infiniband/hw/mlx5/cq.c +++ b/drivers/infiniband/hw/mlx5/cq.c @@ -753,7 +753,7 @@ struct ib_cq *mlx5_ib_create_cq(struct ib_device *ibdev, int entries, return ERR_PTR(-EINVAL); entries = roundup_pow_of_two(entries + 1); - if (entries dev-mdev-caps.gen.max_cqes) + if (entries (1 MLX5_CAP_GEN(dev-mdev, log_max_cq_sz))) return ERR_PTR(-EINVAL); cq = kzalloc(sizeof(*cq), GFP_KERNEL); @@ -920,7 +920,7 @@ int mlx5_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period) int err; u32 fsel; - if (!(dev-mdev-caps.gen.flags MLX5_DEV_CAP_FLAG_CQ_MODER)) + if (!MLX5_CAP_GEN(dev-mdev, cq_moderation)) return -ENOSYS; in = kzalloc(sizeof(*in), GFP_KERNEL); @@ -1075,7 
+1075,7 @@ int mlx5_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata) int uninitialized_var(cqe_size); unsigned long flags; - if (!(dev-mdev-caps.gen.flags MLX5_DEV_CAP_FLAG_RESIZE_CQ)) { + if (!MLX5_CAP_GEN(dev-mdev, cq_resize)) { pr_info(Firmware does not support resize CQ\n); return -ENOSYS; } @@ -1084,7 +1084,7 @@ int mlx5_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata) return -EINVAL; entries = roundup_pow_of_two(entries + 1); - if (entries dev-mdev-caps.gen.max_cqes + 1) + if (entries (1 MLX5_CAP_GEN(dev-mdev, log_max_cq_sz)) + 1) return -EINVAL; if (entries == ibcq-cqe + 1) diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c index 9cf9a37..f2d9e70 100644 --- a/drivers/infiniband/hw/mlx5/mad.c +++ b/drivers/infiniband/hw/mlx5/mad.c @@ -129,7 +129,7 @@ int mlx5_query_ext_port_caps(struct mlx5_ib_dev *dev, u8 port) packet_error = be16_to_cpu(out_mad-status); - dev-mdev-caps.gen.ext_port_cap[port - 1] = (!err !packet_error) ? + dev-mdev-port_caps[port - 1].ext_port_cap = (!err !packet_error) ? 
MLX_EXT_PORT_CAP_FLAG_EXTENDED_PORT_INFO : 0; out: diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 57c9809..9075649 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -66,15 +66,13 @@ static int mlx5_ib_query_device(struct ib_device *ibdev, struct ib_device_attr *props) { struct mlx5_ib_dev *dev = to_mdev(ibdev); + struct mlx5_core_dev *mdev = dev-mdev; struct ib_smp *in_mad = NULL; struct ib_smp *out_mad = NULL; - struct mlx5_general_caps *gen; int err = -ENOMEM; int max_rq_sg; int max_sq_sg; - u64 flags; - gen = dev-mdev-caps.gen; in_mad = kzalloc(sizeof(*in_mad), GFP_KERNEL); out_mad = kmalloc(sizeof(*out_mad), GFP_KERNEL); if (!in_mad || !out_mad) @@ -96,18 +94,18 @@ static int mlx5_ib_query_device(struct ib_device *ibdev, IB_DEVICE_PORT_ACTIVE_EVENT | IB_DEVICE_SYS_IMAGE_GUID| IB_DEVICE_RC_RNR_NAK_GEN; - flags = gen-flags; - if (flags MLX5_DEV_CAP_FLAG_BAD_PKEY_CNTR) + + if (MLX5_CAP_GEN(mdev, pkv)) props-device_cap_flags |=
[PATCH net-next V5 11/11] net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality
This is the Ethernet part of the driver for the Mellanox ConnectX(R)-4 Single/Dual-Port Adapter supporting 100Gb/s with VPI. The driver extends the existing mlx5 driver with Ethernet functionality. This patch contains the driver entry points but does not include the transmit and receive routines (see the previous patch in the series). It also adds the option MLX5_CORE_EN to Kconfig to enable/disable the Ethernet functionality. Currently, Kconfig is programmed to make Ethernet and Infiniband functionality mutually exclusive. Also changed MLX5_INFINIBAND to be dependent on MLX5_CORE instead of selecting it, since MLX5_CORE could be selected without MLX5_INFINIBAND being selected. Signed-off-by: Amir Vadai am...@mellanox.com --- drivers/infiniband/hw/mlx5/Kconfig |4 +- drivers/net/ethernet/mellanox/mlx5/core/Kconfig| 14 +- drivers/net/ethernet/mellanox/mlx5/core/Makefile |3 + drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 19 - drivers/net/ethernet/mellanox/mlx5/core/en.h | 520 ++ .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 679 +++ drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 1899 drivers/net/ethernet/mellanox/mlx5/core/main.c | 74 +- .../net/ethernet/mellanox/mlx5/core/mlx5_core.h|9 +- include/linux/mlx5/device.h| 19 + include/linux/mlx5/driver.h|1 + 11 files changed, 3213 insertions(+), 28 deletions(-) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en.h create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_main.c diff --git a/drivers/infiniband/hw/mlx5/Kconfig b/drivers/infiniband/hw/mlx5/Kconfig index 10df386..bce263b 100644 --- a/drivers/infiniband/hw/mlx5/Kconfig +++ b/drivers/infiniband/hw/mlx5/Kconfig @@ -1,8 +1,6 @@ config MLX5_INFINIBAND tristate "Mellanox Connect-IB HCA support" - depends on NETDEVICES && ETHERNET && PCI - select NET_VENDOR_MELLANOX - select MLX5_CORE + depends on NETDEVICES && ETHERNET && PCI && MLX5_CORE ---help--- This driver provides low-level InfiniBand
support for Mellanox Connect-IB PCI Express host channel adapters (HCAs). diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig index 8ff57e8..0d7aef0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig +++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig @@ -3,6 +3,18 @@ # config MLX5_CORE - tristate + tristate Mellanox Technologies ConnectX-4 and Connect-IB core driver depends on PCI default n + ---help--- + Core driver for low level functionality of the ConnectX-4 and + Connect-IB cards by Mellanox Technologies. + +config MLX5_CORE_EN + bool Mellanox Technologies ConnectX-4 Ethernet support + depends on MLX5_INFINIBAND=n NETDEVICES ETHERNET PCI MLX5_CORE + default n + ---help--- + Ethernet support in Mellanox Technologies ConnectX-4 NIC. + Ethernet and Infiniband support in ConnectX-4 are currently mutually + exclusive. diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 105780b..87e9e60 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -3,3 +3,6 @@ obj-$(CONFIG_MLX5_CORE) += mlx5_core.o mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \ health.o mcg.o cq.o srq.o alloc.o qp.o port.o mr.o pd.o \ mad.o +mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o flow_table.o vport.o transobj.o \ + en_main.o en_flow_table.o en_ethtool.o en_tx.o en_rx.o \ + en_txrx.o diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 2f22cd2..75ff58d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -75,25 +75,6 @@ enum { MLX5_CMD_DELIVERY_STAT_CMD_DESCR_ERR= 0x10, }; -enum { - MLX5_CMD_STAT_OK= 0x0, - MLX5_CMD_STAT_INT_ERR = 0x1, - MLX5_CMD_STAT_BAD_OP_ERR= 0x2, - MLX5_CMD_STAT_BAD_PARAM_ERR = 0x3, - MLX5_CMD_STAT_BAD_SYS_STATE_ERR = 0x4, - 
MLX5_CMD_STAT_BAD_RES_ERR = 0x5, - MLX5_CMD_STAT_RES_BUSY = 0x6, - MLX5_CMD_STAT_LIM_ERR = 0x8, - MLX5_CMD_STAT_BAD_RES_STATE_ERR = 0x9, - MLX5_CMD_STAT_IX_ERR= 0xa, - MLX5_CMD_STAT_NO_RES_ERR= 0xf, - MLX5_CMD_STAT_BAD_INP_LEN_ERR = 0x50, - MLX5_CMD_STAT_BAD_OUTP_LEN_ERR = 0x51, -
[PATCH net-next V5 06/11] net/mlx5_core: Implement get/set port status
From: Rana Shahout ra...@mellanox.com

Implement get/set port status low level functions to be exposed by the netdev.

Signed-off-by: Rana Shahout ra...@mellanox.com
Signed-off-by: Saeed Mahameed sae...@mellanox.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 32 ++
 include/linux/mlx5/driver.h                    | 8 +++
 2 files changed, 40 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 6e2d99c..742a6fb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -179,3 +179,35 @@ int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 proto_admin,
 	return err;
 }
 EXPORT_SYMBOL_GPL(mlx5_set_port_proto);
+
+int mlx5_set_port_status(struct mlx5_core_dev *dev,
+			 enum mlx5_port_status status)
+{
+	u32 in[MLX5_ST_SZ_DW(paos_reg)];
+	u32 out[MLX5_ST_SZ_DW(paos_reg)];
+
+	memset(in, 0, sizeof(in));
+
+	MLX5_SET(paos_reg, in, admin_status, status);
+	MLX5_SET(paos_reg, in, ase, 1);
+
+	return mlx5_core_access_reg(dev, in, sizeof(in), out,
+				    sizeof(out), MLX5_REG_PAOS, 0, 1);
+}
+
+int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 *status)
+{
+	u32 in[MLX5_ST_SZ_DW(paos_reg)];
+	u32 out[MLX5_ST_SZ_DW(paos_reg)];
+	int err;
+
+	memset(in, 0, sizeof(in));
+
+	err = mlx5_core_access_reg(dev, in, sizeof(in), out,
+				   sizeof(out), MLX5_REG_PAOS, 0, 0);
+	if (err)
+		return err;
+
+	*status = MLX5_GET(paos_reg, out, oper_status);
+	return err;
+}
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 266d549..6438444 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -149,6 +149,11 @@ enum mlx5_dev_event {
 	MLX5_DEV_EVENT_CLIENT_REREG,
 };
 
+enum mlx5_port_status {
+	MLX5_PORT_UP   = 1 << 1,
+	MLX5_PORT_DOWN = 1 << 2,
+};
+
 struct mlx5_uuar_info {
 	struct mlx5_uar	*uars;
 	int		num_uars;
@@ -701,6 +706,9 @@ int mlx5_query_port_proto_admin(struct mlx5_core_dev *dev,
 				u32 *proto_admin, int proto_mask);
 int mlx5_set_port_proto(struct mlx5_core_dev *dev, u32 proto_admin,
 			int proto_mask);
+int mlx5_set_port_status(struct mlx5_core_dev *dev,
+			 enum mlx5_port_status status);
+int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 *status);
 
 int mlx5_debug_eq_add(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
 void mlx5_debug_eq_remove(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
--
1.9.3
[PATCH net-next V5 08/11] net/mlx5_core: Set/Query port MTU commands
From: Saeed Mahameed sae...@mellanox.com

Introduce set/query low level functions to access MTU in hardware. To be used by the netdev.

Signed-off-by: Saeed Mahameed sae...@mellanox.com
Signed-off-by: Amir Vadai am...@mellanox.com
---
 drivers/net/ethernet/mellanox/mlx5/core/port.c | 53 ++
 include/linux/mlx5/driver.h                    | 4 ++
 2 files changed, 57 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c
index 742a6fb..7d3d0f9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c
@@ -211,3 +211,56 @@ int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 *status)
 	*status = MLX5_GET(paos_reg, out, oper_status);
 	return err;
 }
+
+static int mlx5_query_port_mtu(struct mlx5_core_dev *dev,
+			       int *admin_mtu, int *max_mtu, int *oper_mtu)
+{
+	u32 in[MLX5_ST_SZ_DW(pmtu_reg)];
+	u32 out[MLX5_ST_SZ_DW(pmtu_reg)];
+	int err;
+
+	memset(in, 0, sizeof(in));
+
+	MLX5_SET(pmtu_reg, in, local_port, 1);
+
+	err = mlx5_core_access_reg(dev, in, sizeof(in), out,
+				   sizeof(out), MLX5_REG_PMTU, 0, 0);
+	if (err)
+		return err;
+
+	if (max_mtu)
+		*max_mtu = MLX5_GET(pmtu_reg, out, max_mtu);
+	if (oper_mtu)
+		*oper_mtu = MLX5_GET(pmtu_reg, out, oper_mtu);
+	if (admin_mtu)
+		*admin_mtu = MLX5_GET(pmtu_reg, out, admin_mtu);
+
+	return 0;
+}
+
+int mlx5_set_port_mtu(struct mlx5_core_dev *dev, int mtu)
+{
+	u32 in[MLX5_ST_SZ_DW(pmtu_reg)];
+	u32 out[MLX5_ST_SZ_DW(pmtu_reg)];
+
+	memset(in, 0, sizeof(in));
+
+	MLX5_SET(pmtu_reg, in, admin_mtu, mtu);
+	MLX5_SET(pmtu_reg, in, local_port, 1);
+
+	return mlx5_core_access_reg(dev, in, sizeof(in), out, sizeof(out),
+				    MLX5_REG_PMTU, 0, 1);
+}
+EXPORT_SYMBOL_GPL(mlx5_set_port_mtu);
+
+int mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, int *max_mtu)
+{
+	return mlx5_query_port_mtu(dev, NULL, max_mtu, NULL);
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_max_mtu);
+
+int mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, int *oper_mtu)
+{
+	return mlx5_query_port_mtu(dev, NULL, NULL, oper_mtu);
+}
+EXPORT_SYMBOL_GPL(mlx5_query_port_oper_mtu);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 6438444..5173847 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -710,6 +710,10 @@ int mlx5_set_port_status(struct mlx5_core_dev *dev,
 			 enum mlx5_port_status status);
 int mlx5_query_port_status(struct mlx5_core_dev *dev, u8 *status);
 
+int mlx5_set_port_mtu(struct mlx5_core_dev *dev, int mtu);
+int mlx5_query_port_max_mtu(struct mlx5_core_dev *dev, int *max_mtu);
+int mlx5_query_port_oper_mtu(struct mlx5_core_dev *dev, int *oper_mtu);
+
 int mlx5_debug_eq_add(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
 void mlx5_debug_eq_remove(struct mlx5_core_dev *dev, struct mlx5_eq *eq);
 int mlx5_core_eq_query(struct mlx5_core_dev *dev, struct mlx5_eq *eq,
--
1.9.3
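The structure used in this patch, one static query helper that fills only the non-NULL output pointers plus thin exported wrappers, can be sketched outside the kernel as follows. All names here (demo_pmtu, demo_read_pmtu, demo_query_*) are hypothetical, and the "hardware" is just a struct passed in by the caller; this is a sketch of the calling convention, not of the register access itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of what a PMTU register read might return. */
struct demo_pmtu {
	int max_mtu;
	int admin_mtu;
	int oper_mtu;
};

/* Stand-in for the register access: 0 on success, negative on failure. */
static int demo_read_pmtu(const struct demo_pmtu *hw, struct demo_pmtu *out)
{
	if (!hw)
		return -1;
	*out = *hw;
	return 0;
}

/* One query, three optional outputs; callers pass NULL for what they skip. */
static int demo_query_mtu(const struct demo_pmtu *hw,
			  int *admin_mtu, int *max_mtu, int *oper_mtu)
{
	struct demo_pmtu out;
	int err = demo_read_pmtu(hw, &out);

	if (err)
		return err; /* outputs are left untouched on error */

	if (max_mtu)
		*max_mtu = out.max_mtu;
	if (oper_mtu)
		*oper_mtu = out.oper_mtu;
	if (admin_mtu)
		*admin_mtu = out.admin_mtu;
	return 0;
}

/* Thin wrappers exposing a single value each. */
int demo_query_max_mtu(const struct demo_pmtu *hw, int *max_mtu)
{
	assert(max_mtu != NULL);
	return demo_query_mtu(hw, NULL, max_mtu, NULL);
}

int demo_query_oper_mtu(const struct demo_pmtu *hw, int *oper_mtu)
{
	assert(oper_mtu != NULL);
	return demo_query_mtu(hw, NULL, NULL, oper_mtu);
}
```

Keeping the three-output helper static means the exported surface stays small, and adding an admin-MTU wrapper later would be a three-line change.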
[PATCH net-next V5 09/11] net/mlx5: Ethernet Datapath files
en_[rt]x.c contains the data path related code specific to tx or rx.
en_txrx.c contains data path code which is common for both the rx and tx; this is mainly napi related code.

Below are the objects that are being used by the hardware and the driver in the data path:

Channel - one channel per IRQ. Every channel object contains:
    RQ  - describes the rx queue.
    TIR - One TIR (Transport Interface Receive) object per flow type. TIR contains attributes for a type of rx flow (e.g. IPv4, IPv6 etc). A flow is defined in the Flow Table. Currently TIR describes the RSS hash parameters, if they exist, and the LRO attributes.
    SQ  - describes a tx queue. There is one SQ (Send Queue) per TC (traffic class).
    TIS - There is one TIS (Transport Interface Send) per TC. It describes the TC and may later be extended to describe more transport properties.

Both RQ and SQ inherit from the object WQ (work queue). This common code to describe the layout of CQEs and WQEs in memory is in the files wq.[ch].

For every channel there is one NAPI context that is used for RX and for TX.

The driver uses netdev_alloc_skb() to allocate skbs.
Signed-off-by: Amir Vadai am...@mellanox.com --- drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 249 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 344 ++ drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 107 +++ drivers/net/ethernet/mellanox/mlx5/core/wq.c | 183 drivers/net/ethernet/mellanox/mlx5/core/wq.h | 171 +++ 5 files changed, 1054 insertions(+) create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/wq.c create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/wq.h diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c new file mode 100644 index 000..ce1317c --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -0,0 +1,249 @@ +/* + * Copyright (c) 2015, Mellanox Technologies. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + *copyright notice, this list of conditions and the following + *disclaimer. + * + * - Redistributions in binary form must reproduce the above + *copyright notice, this list of conditions and the following + *disclaimer in the documentation and/or other materials + *provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + */ + +#include linux/ip.h +#include linux/ipv6.h +#include linux/tcp.h +#include en.h + +static inline int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, +struct mlx5e_rx_wqe *wqe, u16 ix) +{ + struct sk_buff *skb; + dma_addr_t dma_addr; + + skb = netdev_alloc_skb(rq-netdev, rq-wqe_sz); + if (unlikely(!skb)) + return -ENOMEM; + + skb_reserve(skb, MLX5E_NET_IP_ALIGN); + + dma_addr = dma_map_single(rq-pdev, + /* hw start padding */ + skb-data - MLX5E_NET_IP_ALIGN, + /* hw end padding */ + rq-wqe_sz, + DMA_FROM_DEVICE); + + if (unlikely(dma_mapping_error(rq-pdev, dma_addr))) + goto err_free_skb; + + *((dma_addr_t *)skb-cb) = dma_addr; + wqe-data.addr = cpu_to_be64(dma_addr + MLX5E_NET_IP_ALIGN); + + rq-skb[ix] = skb; + + return 0; + +err_free_skb: + dev_kfree_skb(skb); + + return -ENOMEM; +} + +bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq) +{ + struct mlx5_wq_ll *wq = rq-wq; + + if (unlikely(!test_bit(MLX5E_RQ_STATE_POST_WQES_ENABLE, rq-state))) + return false; + + while
Re: [PATCH net-next] openvswitch: include datapath actions with sampled-packet upcall to userspace
On Wed, May 27, 2015 at 10:57 PM, Pravin Shelar pshe...@nicira.com wrote: On Wed, May 27, 2015 at 9:16 PM, Jesse Gross je...@nicira.com wrote: On Wed, May 27, 2015 at 7:46 PM, Pravin Shelar pshe...@nicira.com wrote: On Wed, May 27, 2015 at 2:10 PM, Jesse Gross je...@nicira.com wrote: On Fri, May 22, 2015 at 10:53 AM, Pravin Shelar pshe...@nicira.com wrote: On Wed, May 20, 2015 at 12:32 PM, Neil McKee neil.mc...@inmon.com wrote: diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c index b491c1c..ee5760d 100644 --- a/net/openvswitch/actions.c +++ b/net/openvswitch/actions.c @@ -608,7 +608,8 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port) } static int output_userspace(struct datapath *dp, struct sk_buff *skb, - struct sw_flow_key *key, const struct nlattr *attr) + struct sw_flow_key *key, const struct nlattr *attr, + const struct nlattr *actions, int actions_len) { struct ovs_tunnel_info info; struct dp_upcall_info upcall; @@ -619,6 +620,8 @@ static int output_userspace(struct datapath *dp, struct sk_buff *skb, upcall.userdata = NULL; upcall.portid = 0; upcall.egress_tun_info = NULL; + upcall.actions = actions; + upcall.actions_len = actions_len; Rather than unconditionally passing actions to the upcall, there should be attribute in ovs_userspace_attr to request the actions list. Why? It seems simpler to just always pass the actions and I'm not sure that this is really performance critical (which is the only reason that comes to mind to not always pass this). This is only required for sFlow sampling so I do not think we should send it on every upcall. But what is the downside? This increases memory allocation in atomic context but if you think this makes code complicated then I am fine without the attribute. OK, I see. My guess is that there are only likely to be a significant set of actions for sampling use cases anyways so if this is a real problem then a flag is probably not going to make much of a difference. 
One possibility is to retry with a smaller size if allocation fails and not include the actions in that case. Userspace is already going to have to handle the case where actions are omitted for existing kernels.
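A rough userspace sketch of the retry idea discussed above: try to allocate room for the base upcall plus the actions, and if that fails, fall back to the base size and mark the actions as omitted. This is not the openvswitch code. demo_alloc(), demo_alloc_limit and demo_upcall_buf are hypothetical names, and the size limit only simulates an allocation failing under memory pressure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Demo knob: allocations above this size fail, simulating memory pressure. */
size_t demo_alloc_limit = (size_t)-1;

/* Hypothetical stand-in for an allocation that may fail in atomic context. */
static void *demo_alloc(size_t len)
{
	return len > demo_alloc_limit ? NULL : malloc(len);
}

struct demo_upcall_buf {
	void *data;
	size_t len;
	bool has_actions; /* false when the optional actions were dropped */
};

/* Returns 0 on success (with or without actions), -1 if even the base fails. */
int demo_upcall_alloc(struct demo_upcall_buf *buf, size_t base_len,
		      size_t actions_len)
{
	assert(buf != NULL && base_len > 0);

	/* First attempt: room for everything. */
	buf->data = demo_alloc(base_len + actions_len);
	if (buf->data) {
		buf->len = base_len + actions_len;
		buf->has_actions = actions_len > 0;
		return 0;
	}

	/* Fallback: drop the optional actions and retry with the smaller size. */
	buf->has_actions = false;
	buf->data = demo_alloc(base_len);
	buf->len = buf->data ? base_len : 0;
	return buf->data ? 0 : -1;
}
```

Since userspace already has to cope with upcalls that carry no actions (older kernels never send them), the fallback path does not introduce a new case for the receiver.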
Re: Drops in qdisc on ifb interface
On May 28, 2015 at 1:17 PM Eric Dumazet eric.duma...@gmail.com wrote:
> On Thu, 2015-05-28 at 12:33 -0400, jsulli...@opensourcedevel.com wrote:
>> Our initial testing has been single flow but the ultimate purpose is processing real time video in a complex application which ingests associated meta data, post to consumer facing cloud, does reporting back - so lots of different traffics with very different demands - a perfect tc environment.
>
> Wait, do you really plan using TCP for real time video ?

The overall product does but the video source feeds come over a different network via UDP. There are, however, RTMP quality control feeds coming across this connection. There may also occasionally be test UDP source feeds on this connection but those are not production.

Thanks - John
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, May 28, 2015 at 07:21:11PM +0300, Or Gerlitz wrote:
> Anything else except for that (you said reworking of the network scripts and NetworkManager assumptions to make it work)??

IPv6 becomes very broken, child interfaces will generate the same IPv6 addresses for radv and link local resulting in duplicate address scenarios. About the only thing that will work properly is statically assigned IPv4 addresses.

> I don't see why we should stop the whole RDMA containers support train just b/c we found out the IPoIB DHCP bug which was there for few years before this effort started.

I don't think that is what Doug said.

Jason
[PATCH v4 net-next 08/11] net: Add VLAN ID to flow_keys
In flow_dissector set vlan_id in flow_keys when VLAN is found.

Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/net/flow_dissector.h | 6 ++
 net/core/flow_dissector.c    | 14 ++
 2 files changed, 20 insertions(+)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 59f00f9..08480fb 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -27,6 +27,10 @@ struct flow_dissector_key_basic {
 	u8	padding;
 };
 
+struct flow_dissector_key_tags {
+	u32	vlan_id:12;
+};
+
 /**
  * struct flow_dissector_key_ipv4_addrs:
  * @src: source ip address
@@ -106,6 +110,7 @@ enum flow_dissector_key_id {
 	FLOW_DISSECTOR_KEY_PORTS, /* struct flow_dissector_key_ports */
 	FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
 	FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs */
+	FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */
 
 	FLOW_DISSECTOR_KEY_MAX,
 };
@@ -142,6 +147,7 @@ struct flow_keys {
 	struct flow_dissector_key_control control;
 #define FLOW_KEYS_HASH_START_FIELD basic
 	struct flow_dissector_key_basic basic;
+	struct flow_dissector_key_tags tags;
 	struct flow_dissector_key_ports ports;
 	struct flow_dissector_key_addrs addrs;
 };
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 5348a46..5c66cb2 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -126,6 +126,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 	struct flow_dissector_key_basic *key_basic;
 	struct flow_dissector_key_addrs *key_addrs;
 	struct flow_dissector_key_ports *key_ports;
+	struct flow_dissector_key_tags *key_tags;
 	u8 ip_proto;
 
 	if (!data) {
@@ -246,6 +247,15 @@ flow_label:
 		if (!vlan)
 			return false;
 
+		if (skb_flow_dissector_uses_key(flow_dissector,
+						FLOW_DISSECTOR_KEY_VLANID)) {
+			key_tags = skb_flow_dissector_target(flow_dissector,
+							     FLOW_DISSECTOR_KEY_VLANID,
+							     target_container);
+
+			key_tags->vlan_id = skb_vlan_tag_get_id(skb);
+		}
+
 		proto = vlan->h_vlan_encapsulated_proto;
 		nhoff += sizeof(*vlan);
 		goto again;
@@ -645,6 +655,10 @@ static const struct flow_dissector_key flow_keys_dissector_keys[] = {
 		.key_id = FLOW_DISSECTOR_KEY_PORTS,
 		.offset = offsetof(struct flow_keys, ports),
 	},
+	{
+		.key_id = FLOW_DISSECTOR_KEY_VLANID,
+		.offset = offsetof(struct flow_keys, tags),
+	},
 };
 
 static const struct flow_dissector_key flow_keys_buf_dissector_keys[] = {
--
1.8.1
[PATCH v4 net-next 07/11] net: Get rid of IPv6 hash addresses flow keys
We don't need to return the IPv6 address hash as part of flow keys. In general, using the IPv6 address hash is risky in a hash value since the underlying use of xor provides no entropy. If someone really needs the hash value they can get it from the full IPv6 addresses in flow keys (e.g. from flow_get_u32_src). Signed-off-by: Tom Herbert t...@herbertland.com --- include/net/flow_dissector.h | 1 - net/core/flow_dissector.c| 17 - 2 files changed, 18 deletions(-) diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h index 3ee606a..59f00f9 100644 --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -103,7 +103,6 @@ enum flow_dissector_key_id { FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */ FLOW_DISSECTOR_KEY_IPV4_ADDRS, /* struct flow_dissector_key_ipv4_addrs */ FLOW_DISSECTOR_KEY_IPV6_ADDRS, /* struct flow_dissector_key_ipv6_addrs */ - FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS, /* struct flow_dissector_key_addrs */ FLOW_DISSECTOR_KEY_PORTS, /* struct flow_dissector_key_ports */ FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */ FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs */ diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 91861c3..5348a46 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -200,19 +200,6 @@ ipv6: nhoff += sizeof(struct ipv6hdr); if (skb_flow_dissector_uses_key(flow_dissector, - FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS)) { - key_addrs = skb_flow_dissector_target(flow_dissector, - FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS, - target_container); - - key_addrs-v4addrs.src = - (__force __be32)ipv6_addr_hash(iph-saddr); - key_addrs-v4addrs.dst = - (__force __be32)ipv6_addr_hash(iph-daddr); - key_control-addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS; - goto flow_label; - } - if (skb_flow_dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_IPV6_ADDRS)) { struct flow_dissector_key_ipv6_addrs *key_ipv6_addrs; @@ -651,10 +638,6 @@ static 
const struct flow_dissector_key flow_keys_dissector_keys[] = { .offset = offsetof(struct flow_keys, addrs.v6addrs), }, { - .key_id = FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS, - .offset = offsetof(struct flow_keys, addrs.v4addrs), - }, - { .key_id = FLOW_DISSECTOR_KEY_TIPC_ADDRS, .offset = offsetof(struct flow_keys, addrs.tipcaddrs), }, -- 1.8.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 net-next 06/11] net: Add keys for TIPC address
Add a new flow key for TIPC addresses. Signed-off-by: Tom Herbert t...@herbertland.com --- include/net/flow_dissector.h | 10 ++ net/core/flow_dissector.c| 18 +- 2 files changed, 23 insertions(+), 5 deletions(-) diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h index 306d461..3ee606a 100644 --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -50,6 +50,14 @@ struct flow_dissector_key_ipv6_addrs { }; /** + * struct flow_dissector_key_tipc_addrs: + * @srcnode: source node address + */ +struct flow_dissector_key_tipc_addrs { + __be32 srcnode; +}; + +/** * struct flow_dissector_key_addrs: * @v4addrs: IPv4 addresses * @v6addrs: IPv6 addresses @@ -58,6 +66,7 @@ struct flow_dissector_key_addrs { union { struct flow_dissector_key_ipv4_addrs v4addrs; struct flow_dissector_key_ipv6_addrs v6addrs; + struct flow_dissector_key_tipc_addrs tipcaddrs; }; }; @@ -97,6 +106,7 @@ enum flow_dissector_key_id { FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS, /* struct flow_dissector_key_addrs */ FLOW_DISSECTOR_KEY_PORTS, /* struct flow_dissector_key_ports */ FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */ + FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs */ FLOW_DISSECTOR_KEY_MAX, }; diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index ca9d224..91861c3 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -294,13 +294,12 @@ flow_label: key_control-thoff = (u16)nhoff; if (skb_flow_dissector_uses_key(flow_dissector, - FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS)) { + FLOW_DISSECTOR_KEY_TIPC_ADDRS)) { key_addrs = skb_flow_dissector_target(flow_dissector, - FLOW_DISSECTOR_KEY_IPV6_HASH_ADDRS, + FLOW_DISSECTOR_KEY_TIPC_ADDRS, target_container); - key_addrs-v4addrs.src = hdr-srcnode; - key_addrs-v4addrs.dst = 0; - key_control-addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS; + key_addrs-tipcaddrs.srcnode = hdr-srcnode; + key_control-addr_type = FLOW_DISSECTOR_KEY_TIPC_ADDRS; } return true; } @@ 
-408,6 +407,9 @@ static inline size_t flow_keys_hash_length(struct flow_keys *flow) case FLOW_DISSECTOR_KEY_IPV6_ADDRS: diff -= sizeof(flow-addrs.v6addrs); break; + case FLOW_DISSECTOR_KEY_TIPC_ADDRS: + diff -= sizeof(flow-addrs.tipcaddrs); + break; } return (sizeof(*flow) - diff) / sizeof(u32); } @@ -420,6 +422,8 @@ __be32 flow_get_u32_src(const struct flow_keys *flow) case FLOW_DISSECTOR_KEY_IPV6_ADDRS: return (__force __be32)ipv6_addr_hash( flow-addrs.v6addrs.src); + case FLOW_DISSECTOR_KEY_TIPC_ADDRS: + return flow-addrs.tipcaddrs.srcnode; default: return 0; } @@ -651,6 +655,10 @@ static const struct flow_dissector_key flow_keys_dissector_keys[] = { .offset = offsetof(struct flow_keys, addrs.v4addrs), }, { + .key_id = FLOW_DISSECTOR_KEY_TIPC_ADDRS, + .offset = offsetof(struct flow_keys, addrs.tipcaddrs), + }, + { .key_id = FLOW_DISSECTOR_KEY_PORTS, .offset = offsetof(struct flow_keys, ports), }, -- 1.8.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 net-next 11/11] mpls: Add MPLS entropy label in flow_keys
In flow dissector if an MPLS header contains an entropy label this is saved in the new keyid field of flow_keys. The entropy label is then represented in the flow hash function input. Signed-off-by: Tom Herbert t...@herbertland.com --- include/net/flow_dissector.h | 1 + net/core/flow_dissector.c| 35 +++ 2 files changed, 36 insertions(+) diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h index 5d4257b..09f4b76 100644 --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -118,6 +118,7 @@ enum flow_dissector_key_id { FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */ FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_flow_tags */ FLOW_DISSECTOR_KEY_GRE_KEYID, /* struct flow_dissector_key_keyid */ + FLOW_DISSECTOR_KEY_MPLS_ENTROPY, /* struct flow_dissector_key_keyid */ FLOW_DISSECTOR_KEY_MAX, }; diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index ea318d5..aaebe52 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -15,6 +15,7 @@ #include linux/ppp_defs.h #include linux/stddef.h #include linux/if_ether.h +#include linux/mpls.h #include net/flow_dissector.h #include scsi/fc/fc_fcoe.h @@ -288,6 +289,37 @@ ipv6: } return true; } + + case htons(ETH_P_MPLS_UC): + case htons(ETH_P_MPLS_MC): { + struct mpls_label *hdr, _hdr[2]; +mpls: + hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data, + hlen, _hdr); + if (!hdr) + return false; + + if ((ntohl(hdr[0].entry) MPLS_LS_LABEL_MASK) == +MPLS_LABEL_ENTROPY) { + if (skb_flow_dissector_uses_key(flow_dissector, + FLOW_DISSECTOR_KEY_MPLS_ENTROPY)) { + key_keyid = skb_flow_dissector_target(flow_dissector, + FLOW_DISSECTOR_KEY_MPLS_ENTROPY, + target_container); + key_keyid-keyid = ntohl(hdr[1].entry) + MPLS_LS_LABEL_MASK; + } + + key_basic-n_proto = proto; + key_basic-ip_proto = ip_proto; + key_control-thoff = (u16)nhoff; + + return true; + } + + return true; + } + case htons(ETH_P_FCOE): key_control-thoff = (u16)(nhoff + 
FCOE_HEADER_LEN); /* fall through */ @@ -357,6 +389,9 @@ ipv6: case IPPROTO_IPV6: proto = htons(ETH_P_IPV6); goto ipv6; + case IPPROTO_MPLS: + proto = htons(ETH_P_MPLS_UC); + goto mpls; default: break; } -- 1.8.1 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, 2015-05-28 at 11:43 -0600, Jason Gunthorpe wrote: On Thu, May 28, 2015 at 07:21:11PM +0300, Or Gerlitz wrote:

Anything else except for that (you said reworking of the network scripts and NetworkManager assumptions to make it work)??

IPv6 becomes very broken, child interfaces will generate the same IPv6 addresses for radv and link local resulting in duplicate address scenarios. About the only thing that will work properly is statically assigned IPv4 addresses.

I don't see why we should stop the whole RDMA containers support train just b/c we found out the IPoIB DHCP bug which was there for a few years before this effort started.

I don't think that is what Doug said.

Indeed. There is no need to scrap things, but if the design as it stands, and the intended means of creating objects for use in containers, is going to result in an unworkable network, then we have to re-evaluate how the container constructs are created, and that then has possible consequences for how we would get from an incoming packet to the proper container. I'm not trying to stop the support train here, but at the same time, if the train is headed for a bridge that's out

-- Doug Ledford dledf...@redhat.com GPG KeyID: 0E572FDD
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, May 28, 2015 at 9:22 PM, Doug Ledford dledf...@redhat.com wrote:

I don't think that is what Doug said.

Indeed. There is no need to scrap things, but if the design as it stands, and the intended means of creating objects for use in containers, is going to result in an unworkable network, then we have to re-evaluate how the container constructs are created, and that then has possible consequences for how we would get from an incoming packet to the proper container.

To be precise, do we agree that the issue here isn't in the design as it stands but rather in a problem we found in the intended way of assigning IP addresses through DHCP for the containers?

I'm not trying to stop the support train here, but at the same time, if the train is headed for a bridge that's out

So what are you concretely saying here? Where should we go from here?

Or.
[PATCH net-next V5 02/11] net/mlx5_core: Set irq affinity hints
From: Saeed Mahameed sae...@mellanox.com Preparation for upcoming ethernet driver. - Move msix array from eq_table struct to priv since its not related to eq_table - Intorduce irq_info struct to hold all irq information - Move name from mlx5_eq to irq_info struct since it is irq property. - Set IRQ affinity hints Signed-off-by: Achiad Shochat ach...@mellanox.com Signed-off-by: Rana Shahout ra...@mellanox.com Signed-off-by: Saeed Mahameed sae...@mellanox.com Signed-off-by: Amir Vadai am...@mellanox.com --- drivers/net/ethernet/mellanox/mlx5/core/eq.c | 16 ++-- drivers/net/ethernet/mellanox/mlx5/core/main.c | 111 ++--- include/linux/mlx5/driver.h| 11 ++- 3 files changed, 117 insertions(+), 21 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c index 3f511bd..516efc2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c @@ -339,7 +339,7 @@ static void init_eq_buf(struct mlx5_eq *eq) int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx, int nent, u64 mask, const char *name, struct mlx5_uar *uar) { - struct mlx5_eq_table *table = dev-priv.eq_table; + struct mlx5_priv *priv = dev-priv; struct mlx5_create_eq_mbox_in *in; struct mlx5_create_eq_mbox_out out; int err; @@ -377,14 +377,15 @@ int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx, goto err_in; } - snprintf(eq-name, MLX5_MAX_EQ_NAME, %s@pci:%s, + snprintf(priv-irq_info[vecidx].name, MLX5_MAX_IRQ_NAME, %s@pci:%s, name, pci_name(dev-pdev)); + eq-eqn = out.eq_number; eq-irqn = vecidx; eq-dev = dev; eq-doorbell = uar-map + MLX5_EQ_DOORBEL_OFFSET; - err = request_irq(table-msix_arr[vecidx].vector, mlx5_msix_handler, 0, - eq-name, eq); + err = request_irq(priv-msix_arr[vecidx].vector, mlx5_msix_handler, 0, + priv-irq_info[vecidx].name, eq); if (err) goto err_eq; @@ -400,7 +401,7 @@ int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq 
*eq, u8 vecidx, return 0; err_irq: - free_irq(table-msix_arr[vecidx].vector, eq); + free_irq(priv-msix_arr[vecidx].vector, eq); err_eq: mlx5_cmd_destroy_eq(dev, eq-eqn); @@ -416,16 +417,15 @@ EXPORT_SYMBOL_GPL(mlx5_create_map_eq); int mlx5_destroy_unmap_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq) { - struct mlx5_eq_table *table = dev-priv.eq_table; int err; mlx5_debug_eq_remove(dev, eq); - free_irq(table-msix_arr[eq-irqn].vector, eq); + free_irq(dev-priv.msix_arr[eq-irqn].vector, eq); err = mlx5_cmd_destroy_eq(dev, eq-eqn); if (err) mlx5_core_warn(dev, failed to destroy a previously created eq: eqn %d\n, eq-eqn); - synchronize_irq(table-msix_arr[eq-irqn].vector); + synchronize_irq(dev-priv.msix_arr[eq-irqn].vector); mlx5_buf_free(dev, eq-buf); return err; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c index 28425e5..55085b0 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c @@ -38,6 +38,7 @@ #include linux/dma-mapping.h #include linux/slab.h #include linux/io-mapping.h +#include linux/interrupt.h #include linux/mlx5/driver.h #include linux/mlx5/cq.h #include linux/mlx5/qp.h @@ -208,7 +209,8 @@ static void release_bar(struct pci_dev *pdev) static int mlx5_enable_msix(struct mlx5_core_dev *dev) { - struct mlx5_eq_table *table = dev-priv.eq_table; + struct mlx5_priv *priv = dev-priv; + struct mlx5_eq_table *table = priv-eq_table; int num_eqs = 1 dev-caps.gen.log_max_eq; int nvec; int i; @@ -218,14 +220,16 @@ static int mlx5_enable_msix(struct mlx5_core_dev *dev) if (nvec = MLX5_EQ_VEC_COMP_BASE) return -ENOMEM; - table-msix_arr = kzalloc(nvec * sizeof(*table-msix_arr), GFP_KERNEL); - if (!table-msix_arr) - return -ENOMEM; + priv-msix_arr = kcalloc(nvec, sizeof(*priv-msix_arr), GFP_KERNEL); + + priv-irq_info = kcalloc(nvec, sizeof(*priv-irq_info), GFP_KERNEL); + if (!priv-msix_arr || !priv-irq_info) + goto err_free_msix; for (i = 0; i nvec; 
i++) - table-msix_arr[i].entry = i; + priv-msix_arr[i].entry = i; - nvec = pci_enable_msix_range(dev-pdev, table-msix_arr, + nvec = pci_enable_msix_range(dev-pdev, priv-msix_arr, MLX5_EQ_VEC_COMP_BASE + 1, nvec); if (nvec 0) return nvec; @@ -233,14 +237,20 @@ static int
[PATCH net-next V5 00/11] net/mlx5: ConnectX-4 100G Ethernet driver
Hi Dave,

This patchset extends the mlx5_core driver to support Ethernet functionality. The Ethernet functionality in the mlx5 driver is integrated into the core driver and not as a separate driver. The IB functionality remains in the mlx5_ib driver as before. This functionality will enable the Ethernet capability of Mellanox's new family of cards - ConnectX-4.

Because backward compatibility is kept, existing Connect-IB cards that use this driver keep working with the modified driver, and no issues with current deployments should be seen.

Like the ConnectX-3 cards, ConnectX-4 is a VPI (Virtual Port Interface - every port can be configured as Infiniband or Ethernet) card. Unlike previous generations, the ConnectX-4 has a separate PCI function per port.

The current code has a limitation that Infiniband and Ethernet port types are mutually exclusive. When the driver is compiled with Ethernet support, the Infiniband functionality is disabled and vice versa. To control that we added the CONFIG_MLX5_CORE_EN config directive which is 'n' by default, but can be changed by the user. This limitation is short-lived and will be addressed soon.

As part of this patchset, mlx5_ifc.h was heavily modified [1]. This file is now generated automatically from the device specification document. Since this patch is too big for the mail server, it might be missing in the mailing list, but could be pulled from an external git repository [2].

irq name selection is done at driver initialization and doesn't contain the interface name as part of the irq name. irq_balancer will still work thanks to an improvement introduced by Neil Horman [3] to use sysfs instead of /proc/interrupts.
Patchset was applied on top of commit ed2dfd9 (tcp/dccp: warn user for preferred ip_local_port_range) [1] - Patch 4/11 (net/mlx5_core: HW data structs/types definitions preparation for mlx5 ethernet driver) [2] - http://git.openfabrics.org/?p=~amirv/linux.git;a=shortlog;h=refs/heads/mlx5e_v1 [3] - kernel: da8d1c8 PCI/sysfs: add per pci device msi[x] irq listing (v5) irq_balancer: 32a7757 Complete rework of how we detect and classify irqs Thanks to Achiad, Saeed, Yevheny, Or and the whole team for making this happen, Amir Changes from V4: - Removed Patch 3/12: net/mlx5_core: Add EQ renaming mechanism - Patch 12/12: net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality - irq name is created on driver initialization, therefore it won't contain the network interface name in it. This won't affect irq_balancer thanks to patches introduced by Neil Horman to use sysfs instead of /proc/interrupts. Changes from V3: - PATCH 8/11: net/mlx5_core: Set/Query port MTU commands - Return value directly - no need for err. Changes from V2: - Improved changelogs and cover-letter - Added CONFIG_MLX5_EN to disable/enable the Ethernet functionality - Moved en.h and wq.[ch] into the patch with data-path related code Changes from V1: - Added patch 1/12 (net/mlx5_core,mlx5_ib: Do not use vmap() on coherent memory) Changes from V0: - Removed V0 Patch 1/11 (net/mlx5_core: Virtually extend work/completion queue buffers by one page) due to misuse of DMA API. Thanks Dave. - Patch 1/11 (net/mlx5_core: Set irq affinity hints): - Use kcalloc instead of kzalloc - Fix build error when CONFIG_CPUMASK_OFFSTACK=n. Driver loading will fail now if cpumask allocation is failing. - Using dev_to_node helper. Thanks, Ido. - Patch 3/11 (net/mlx5_core: HW data structs/types definitions preparation for mlx5 ethernet driver) - Removed Mellanox internal comment at the head of the file. Thanks Joe - Patch 6/11 (net/mlx5_core: Implement get/set port status) - Use direct return of function's result.
Thanks Sergei. - Added Patch 8/11 (net/mlx5_core: Set/Query port MTU commands) - Patch 9/11 (net/mlx5: Ethernet Datapath files) - Use rq-wqe_sz instead of skb_end_offset. Thanks Ido. - Use dma_wmb() when possible instead of wmb(). Thanks Alex. - Fix checkpatch issues - Patch 10/11 (net/mlx5: ethernet resources handling) - checkpatch issues - Added missing include - Patch 11/11 (net/mlx5: Ethernet driver) - checkpatch issues - fixed typo - Modified use of affinity hint - Using dev_to_node helper. Thanks, Ido. - Use new hardware commands from Patch 8/11 (net/mlx5_core: Set/Query port MTU commands) to get/set port MTU in hardware. - Removed NETIF_F_SG since hardware ring wraparound is not supported - Use dma_wmb() when possible instead of wmb(). Thanks Alex. Amir Vadai (4): net/mlx5_core,mlx5_ib: Do not use vmap() on coherent memory net/mlx5: Ethernet Datapath files net/mlx5: Ethernet resource handling files net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality Rana Shahout (2): net/mlx5_core: Implement get/set port status net/mlx5_core: Modify CQ moderation parameters Saeed Mahameed (5): net/mlx5_core: Set irq affinity hints net/mlx5_core: HW
RE: [PATCH] net: qlcnic: clean up sysfs error codes
-Original Message- From: dept_hsg_linux_nic_dev-boun...@qlclistserver.qlogic.com [mailto:dept_hsg_linux_nic_dev-boun...@qlclistserver.qlogic.com] On Behalf Of Vladimir Zapolskiy Sent: Tuesday, May 26, 2015 6:20 AM To: David Miller; Shahed Shaikh; Dept-GE Linux NIC Dev Cc: netdev Subject: [PATCH] net: qlcnic: clean up sysfs error codes Replace confusing QL_STATUS_INVALID_PARAM == -1 == -EPERM with - EINVAL and QLC_STATUS_UNSUPPORTED_CMD == -2 == -ENOENT with - EOPNOTSUPP, the latter error code is arguable, but it is already used in the driver, so let it be here as well. Also remove always false (!buf) check on read(), the driver should not care if userspace gets its EFAULT or not. Signed-off-by: Vladimir Zapolskiy v...@mleia.com --- drivers/net/ethernet/qlogic/qlcnic/qlcnic.h | 3 - drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 2 +- drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c | 77 +++--- - 3 files changed, 36 insertions(+), 46 deletions(-) diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h index f221126..055f376 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic.h @@ -1326,9 +1326,6 @@ struct qlcnic_eswitch { }; -/* Return codes for Error handling */ -#define QL_STATUS_INVALID_PARAM -1 - #define MAX_BW100 /* % of link speed */ #define MIN_BW1 /* % of link speed */ #define MAX_VLAN_ID 4095 diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c index 367f397..2f6cc42 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c @@ -1031,7 +1031,7 @@ int qlcnic_init_pci_info(struct qlcnic_adapter *adapter) pfn = pci_info[i].id; if (pfn = ahw-max_vnic_func) { - ret = QL_STATUS_INVALID_PARAM; + ret = -EINVAL; dev_err(adapter-pdev-dev, %s: Invalid function 0x%x, max 0x%x\n, __func__, pfn, ahw-max_vnic_func); goto err_eswitch; diff --git 
a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c index 59a721f..05c28f2 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c @@ -24,8 +24,6 @@ #include linux/hwmon-sysfs.h #endif -#define QLC_STATUS_UNSUPPORTED_CMD-2 - int qlcnicvf_config_bridged_mode(struct qlcnic_adapter *adapter, u32 enable) { return -EOPNOTSUPP; @@ -166,7 +164,7 @@ static int qlcnic_82xx_store_beacon(struct qlcnic_adapter *adapter, u8 b_state, b_rate; if (len != sizeof(u16)) - return QL_STATUS_INVALID_PARAM; + return -EINVAL; memcpy(beacon, buf, sizeof(u16)); err = qlcnic_validate_beacon(adapter, beacon, b_state, b_rate); @@ -383,17 +381,17 @@ static int validate_pm_config(struct qlcnic_adapter *adapter, dest_pci_func = pm_cfg[i].dest_npar; src_index = qlcnic_is_valid_nic_func(adapter, src_pci_func); if (src_index 0) - return QL_STATUS_INVALID_PARAM; + return -EINVAL; dest_index = qlcnic_is_valid_nic_func(adapter, dest_pci_func); if (dest_index 0) - return QL_STATUS_INVALID_PARAM; + return -EINVAL; s_esw_id = adapter-npars[src_index].phy_port; d_esw_id = adapter-npars[dest_index].phy_port; if (s_esw_id != d_esw_id) - return QL_STATUS_INVALID_PARAM; + return -EINVAL; } return 0; @@ -414,7 +412,7 @@ static ssize_t qlcnic_sysfs_write_pm_config(struct file *filp, count = size / sizeof(struct qlcnic_pm_func_cfg); rem = size % sizeof(struct qlcnic_pm_func_cfg); if (rem) - return QL_STATUS_INVALID_PARAM; + return -EINVAL; qlcnic_swap32_buffer((u32 *)buf, size / sizeof(u32)); pm_cfg = (struct qlcnic_pm_func_cfg *)buf; @@ -427,7 +425,7 @@ static ssize_t qlcnic_sysfs_write_pm_config(struct file *filp, action = !!pm_cfg[i].action; index = qlcnic_is_valid_nic_func(adapter, pci_func); if (index 0) - return QL_STATUS_INVALID_PARAM; + return -EINVAL; id = adapter-npars[index].phy_port; ret = qlcnic_config_port_mirroring(adapter, id, @@ -440,7 +438,7 @@ static ssize_t 
qlcnic_sysfs_write_pm_config(struct file *filp, pci_func = pm_cfg[i].pci_func; index = qlcnic_is_valid_nic_func(adapter, pci_func); if (index 0) - return
Re: Drops in qdisc on ifb interface
On Thu, 2015-05-28 at 13:31 -0400, jsulli...@opensourcedevel.com wrote:

The overall product does but the video source feeds come over a different network via UDP. There are, however, RTMP quality control feeds coming across this connection. There may also occasionally be test UDP source feeds on this connection but those are not production. Thanks - John

This is important to know, because UDP won't benefit from GRO. I was assuming your receiver had to handle ~88000 packets per second, so I was doubting it could saturate one core, but maybe your target is very different.
Re: Drops in qdisc on ifb interface
On May 28, 2015 at 1:49 PM Eric Dumazet eric.duma...@gmail.com wrote:

On Thu, 2015-05-28 at 13:31 -0400, jsulli...@opensourcedevel.com wrote:

The overall product does but the video source feeds come over a different network via UDP. There are, however, RTMP quality control feeds coming across this connection. There may also occasionally be test UDP source feeds on this connection but those are not production. Thanks - John

This is important to know, because UDP won't benefit from GRO. I was assuming your receiver had to handle ~88000 packets per second, so I was doubting it could saturate one core, but maybe your target is very different.

That PPS estimate seems accurate - the port speed and CIR on the shaped connection are 1 Gbps. I'm still mystified by why the GbE bottlenecks on IFB but the 10GbE does not. Thanks - John
[PATCH 4/4] netfilter: nf_tables: add netdev table to filter from ingress
This allows us to create netdev tables that contain ingress chains. Use skb_header_pointer() as we may see shared sk_buffs at this stage. This change provides access to the existing nf_tables features from the ingress hook. Signed-off-by: Pablo Neira Ayuso pa...@netfilter.org --- include/net/netns/nftables.h |1 + net/netfilter/Kconfig|5 ++ net/netfilter/Makefile |1 + net/netfilter/nf_tables_netdev.c | 183 ++ 4 files changed, 190 insertions(+) create mode 100644 net/netfilter/nf_tables_netdev.c diff --git a/include/net/netns/nftables.h b/include/net/netns/nftables.h index eee608b..c807811 100644 --- a/include/net/netns/nftables.h +++ b/include/net/netns/nftables.h @@ -13,6 +13,7 @@ struct netns_nftables { struct nft_af_info *inet; struct nft_af_info *arp; struct nft_af_info *bridge; + struct nft_af_info *netdev; unsigned intbase_seq; u8 gencursor; }; diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig index 9a89e7c..bd5aaeb 100644 --- a/net/netfilter/Kconfig +++ b/net/netfilter/Kconfig @@ -456,6 +456,11 @@ config NF_TABLES_INET help This option enables support for a mixed IPv4/IPv6 inet table. +config NF_TABLES_NETDEV + tristate Netfilter nf_tables netdev tables support + help + This option enables support for the netdev table. 
+ config NFT_EXTHDR tristate Netfilter nf_tables IPv6 exthdr module help diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile index a87d8b8..70d026d 100644 --- a/net/netfilter/Makefile +++ b/net/netfilter/Makefile @@ -75,6 +75,7 @@ nf_tables-objs += nft_bitwise.o nft_byteorder.o nft_payload.o obj-$(CONFIG_NF_TABLES)+= nf_tables.o obj-$(CONFIG_NF_TABLES_INET) += nf_tables_inet.o +obj-$(CONFIG_NF_TABLES_NETDEV) += nf_tables_netdev.o obj-$(CONFIG_NFT_COMPAT) += nft_compat.o obj-$(CONFIG_NFT_EXTHDR) += nft_exthdr.o obj-$(CONFIG_NFT_META) += nft_meta.o diff --git a/net/netfilter/nf_tables_netdev.c b/net/netfilter/nf_tables_netdev.c new file mode 100644 index 000..04cb170 --- /dev/null +++ b/net/netfilter/nf_tables_netdev.c @@ -0,0 +1,183 @@ +/* + * Copyright (c) 2015 Pablo Neira Ayuso pa...@netfilter.org + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ */ + +#include linux/init.h +#include linux/module.h +#include net/netfilter/nf_tables.h +#include linux/ip.h +#include linux/ipv6.h +#include net/netfilter/nf_tables_ipv4.h +#include net/netfilter/nf_tables_ipv6.h + +static inline void +nft_netdev_set_pktinfo_ipv4(struct nft_pktinfo *pkt, + const struct nf_hook_ops *ops, struct sk_buff *skb, + const struct nf_hook_state *state) +{ + struct iphdr *iph, _iph; + u32 len, thoff; + + nft_set_pktinfo(pkt, ops, skb, state); + + iph = skb_header_pointer(skb, skb_network_offset(skb), sizeof(*iph), +_iph); + if (!iph) + return; + + iph = ip_hdr(skb); + if (iph-ihl 5 || iph-version != 4) + return; + + len = ntohs(iph-tot_len); + thoff = iph-ihl * 4; + if (skb-len len) + return; + else if (len thoff) + return; + + pkt-tprot = iph-protocol; + pkt-xt.thoff = thoff; + pkt-xt.fragoff = ntohs(iph-frag_off) IP_OFFSET; +} + +static inline void +__nft_netdev_set_pktinfo_ipv6(struct nft_pktinfo *pkt, + const struct nf_hook_ops *ops, + struct sk_buff *skb, + const struct nf_hook_state *state) +{ +#if IS_ENABLED(CONFIG_IPV6) + struct ipv6hdr *ip6h, _ip6h; + unsigned int thoff = 0; + unsigned short frag_off; + int protohdr; + u32 pkt_len; + + ip6h = skb_header_pointer(skb, skb_network_offset(skb), sizeof(*ip6h), + _ip6h); + if (!ip6h) + return; + + if (ip6h-version != 6) + return; + + pkt_len = ntohs(ip6h-payload_len); + if (pkt_len + sizeof(*ip6h) skb-len) + return; + + protohdr = ipv6_find_hdr(pkt-skb, thoff, -1, frag_off, NULL); + if (protohdr 0) +return; + + pkt-tprot = protohdr; + pkt-xt.thoff = thoff; + pkt-xt.fragoff = frag_off; +#endif +} + +static inline void nft_netdev_set_pktinfo_ipv6(struct nft_pktinfo *pkt, + const struct nf_hook_ops *ops, + struct sk_buff *skb, + const struct nf_hook_state *state) +{ + nft_set_pktinfo(pkt, ops,
Re: [PATCH v2] README: clarify redistribution requirements covering patents
On Tue, May 19, 2015 at 1:22 PM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: This v2 just changes licence to license as requested by Arend. Please let me know if there is anything else needed. Luis
[PATCH v2] netevent: remove automatic variable in register_netevent_notifier()
Remove automatic variable 'err' in register_netevent_notifier() and
return the result of atomic_notifier_chain_register() directly.

Signed-off-by: Wang Long long.wangl...@huawei.com
---
 net/core/netevent.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/net/core/netevent.c b/net/core/netevent.c
index f17ccd2..8b3bc4f 100644
--- a/net/core/netevent.c
+++ b/net/core/netevent.c
@@ -31,10 +31,7 @@ static ATOMIC_NOTIFIER_HEAD(netevent_notif_chain);
  */
 int register_netevent_notifier(struct notifier_block *nb)
 {
-	int err;
-
-	err = atomic_notifier_chain_register(&netevent_notif_chain, nb);
-	return err;
+	return atomic_notifier_chain_register(&netevent_notif_chain, nb);
 }
 EXPORT_SYMBOL_GPL(register_netevent_notifier);
 
-- 
1.8.3.4
[PATCH 2/7] net: dsa: ar8xxx: add ethtool hw statistics support
MIB counters can now be reported through each switch port by using
ethtool -S.

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 drivers/net/dsa/ar8xxx.c | 106 +++
 drivers/net/dsa/ar8xxx.h |  47 +
 2 files changed, 146 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dsa/ar8xxx.c b/drivers/net/dsa/ar8xxx.c
index 4ce3ffc..2f0fa4d 100644
--- a/drivers/net/dsa/ar8xxx.c
+++ b/drivers/net/dsa/ar8xxx.c
@@ -22,6 +22,55 @@
 
 #include "ar8xxx.h"
 
+#define MIB_DESC(_s, _o, _n)	\
+	{			\
+		.size = (_s),	\
+		.offset = (_o),	\
+		.name = (_n),	\
+	}
+
+static const struct ar8xxx_mib_desc ar8327_mib[] = {
+	MIB_DESC(1, 0x00, "RxBroad"),
+	MIB_DESC(1, 0x04, "RxPause"),
+	MIB_DESC(1, 0x08, "RxMulti"),
+	MIB_DESC(1, 0x0c, "RxFcsErr"),
+	MIB_DESC(1, 0x10, "RxAlignErr"),
+	MIB_DESC(1, 0x14, "RxRunt"),
+	MIB_DESC(1, 0x18, "RxFragment"),
+	MIB_DESC(1, 0x1c, "Rx64Byte"),
+	MIB_DESC(1, 0x20, "Rx128Byte"),
+	MIB_DESC(1, 0x24, "Rx256Byte"),
+	MIB_DESC(1, 0x28, "Rx512Byte"),
+	MIB_DESC(1, 0x2c, "Rx1024Byte"),
+	MIB_DESC(1, 0x30, "Rx1518Byte"),
+	MIB_DESC(1, 0x34, "RxMaxByte"),
+	MIB_DESC(1, 0x38, "RxTooLong"),
+	MIB_DESC(2, 0x3c, "RxGoodByte"),
+	MIB_DESC(2, 0x44, "RxBadByte"),
+	MIB_DESC(1, 0x4c, "RxOverFlow"),
+	MIB_DESC(1, 0x50, "Filtered"),
+	MIB_DESC(1, 0x54, "TxBroad"),
+	MIB_DESC(1, 0x58, "TxPause"),
+	MIB_DESC(1, 0x5c, "TxMulti"),
+	MIB_DESC(1, 0x60, "TxUnderRun"),
+	MIB_DESC(1, 0x64, "Tx64Byte"),
+	MIB_DESC(1, 0x68, "Tx128Byte"),
+	MIB_DESC(1, 0x6c, "Tx256Byte"),
+	MIB_DESC(1, 0x70, "Tx512Byte"),
+	MIB_DESC(1, 0x74, "Tx1024Byte"),
+	MIB_DESC(1, 0x78, "Tx1518Byte"),
+	MIB_DESC(1, 0x7c, "TxMaxByte"),
+	MIB_DESC(1, 0x80, "TxOverSize"),
+	MIB_DESC(2, 0x84, "TxByte"),
+	MIB_DESC(1, 0x8c, "TxCollision"),
+	MIB_DESC(1, 0x90, "TxAbortCol"),
+	MIB_DESC(1, 0x94, "TxMultiCol"),
+	MIB_DESC(1, 0x98, "TxSingleCol"),
+	MIB_DESC(1, 0x9c, "TxExcDefer"),
+	MIB_DESC(1, 0xa0, "TxDefer"),
+	MIB_DESC(1, 0xa4, "TxLateCol"),
+};
+
 u32
 ar8xxx_mii_read32(struct mii_bus *bus, int phy_id, int regnum)
 {
@@ -184,6 +233,10 @@ static int ar8xxx_setup(struct dsa_switch *ds)
 	if (ret < 0)
 		return ret;
 
+	/* Enable MIB counters */
+	ar8xxx_reg_set(ds, AR8327_REG_MIB, AR8327_MIB_CPU_KEEP);
+	ar8xxx_write(ds, AR8327_REG_MODULE_EN, AR8327_MODULE_EN_MIB);
+
 	/* Disable forwarding by default on all ports */
 	for (i = 0; i < AR8327_NUM_PORTS; i++)
 		ar8xxx_rmw(ds, AR8327_PORT_LOOKUP_CTRL(i),
@@ -228,6 +281,42 @@ ar8xxx_phy_write(struct dsa_switch *ds, int phy, int regnum, u16 val)
 	return mdiobus_write(bus, phy, regnum, val);
 }
 
+static void ar8xxx_get_strings(struct dsa_switch *ds, int phy, uint8_t *data)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(ar8327_mib); i++) {
+		strncpy(data + i * ETH_GSTRING_LEN, ar8327_mib[i].name,
+			ETH_GSTRING_LEN);
+	}
+}
+
+static void ar8xxx_get_ethtool_stats(struct dsa_switch *ds, int phy,
+				     uint64_t *data)
+{
+	const struct ar8xxx_mib_desc *mib;
+	uint32_t reg, i, port;
+	u64 hi;
+
+	port = phy_to_port(phy);
+
+	for (i = 0; i < ARRAY_SIZE(ar8327_mib); i++) {
+		mib = &ar8327_mib[i];
+		reg = AR8327_PORT_MIB_COUNTER(port) + mib->offset;
+
+		data[i] = ar8xxx_read(ds, reg);
+		if (mib->size == 2) {
+			hi = ar8xxx_read(ds, reg + 4);
+			data[i] |= hi << 32;
+		}
+	}
+}
+
+static int ar8xxx_get_sset_count(struct dsa_switch *ds)
+{
+	return ARRAY_SIZE(ar8327_mib);
+}
+
 static void ar8xxx_poll_link(struct dsa_switch *ds)
 {
 	int i = 0;
@@ -275,13 +364,16 @@ static void ar8xxx_poll_link(struct dsa_switch *ds)
 }
 
 static struct dsa_switch_driver ar8xxx_switch_driver = {
-	.tag_protocol	= DSA_TAG_PROTO_NONE,
-	.probe		= ar8xxx_probe,
-	.setup		= ar8xxx_setup,
-	.set_addr	= ar8xxx_set_addr,
-	.poll_link	= ar8xxx_poll_link,
-	.phy_read	= ar8xxx_phy_read,
-	.phy_write	= ar8xxx_phy_write,
+	.tag_protocol		= DSA_TAG_PROTO_NONE,
+	.probe			= ar8xxx_probe,
+	.setup			= ar8xxx_setup,
+	.set_addr		= ar8xxx_set_addr,
+	.poll_link		= ar8xxx_poll_link,
+	.phy_read		= ar8xxx_phy_read,
+	.phy_write		= ar8xxx_phy_write,
+	.get_strings		= ar8xxx_get_strings,
+	.get_ethtool_stats	= ar8xxx_get_ethtool_stats,
+	.get_sset_count		= ar8xxx_get_sset_count,
 };
 
 static int __init
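The stats loop in the patch above assembles counters of size 2 from two consecutive 32-bit registers, low word first. A minimal userspace sketch of that step follows. It assumes a hypothetical callback type `reg_read_fn` in place of ar8xxx_read() and a fake register array in place of the switch, so it shows the arithmetic only, not the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register-read callback standing in for ar8xxx_read(). */
typedef uint32_t (*reg_read_fn)(void *ctx, uint32_t reg);

/* Read a counter that is `size` 32-bit words wide (1 or 2), low word first. */
static uint64_t mib_read_counter(reg_read_fn rd, void *ctx,
				 uint32_t reg, unsigned int size)
{
	uint64_t val;

	assert(size == 1 || size == 2);

	val = rd(ctx, reg);
	if (size == 2) {
		uint64_t hi = rd(ctx, reg + 4);

		val |= hi << 32;	/* hi is 64 bits wide, so the shift is safe */
	}
	return val;
}

/* Fake register file, indexed by byte offset / 4; for illustration only. */
static uint32_t fake_read(void *ctx, uint32_t reg)
{
	const uint32_t *regs = ctx;

	return regs[reg / 4];
}
```

The high word is held in a 64-bit variable before shifting, as in the patch (`u64 hi`); shifting a 32-bit value left by 32 would be undefined in C.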
[PATCH 6/7] net: dsa: ar8xxx: add support for second xMII interfaces through DT
This patch is adding support for port6 specific options to device tree.
They can be used to setup the second xMII interface, and connect it to
one of the switch port.

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 drivers/net/dsa/ar8xxx.c | 50
 1 file changed, 50 insertions(+)

diff --git a/drivers/net/dsa/ar8xxx.c b/drivers/net/dsa/ar8xxx.c
index 4044614..7559249 100644
--- a/drivers/net/dsa/ar8xxx.c
+++ b/drivers/net/dsa/ar8xxx.c
@@ -19,6 +19,7 @@
 #include <net/dsa.h>
 #include <linux/phy.h>
 #include <linux/of_net.h>
+#include <linux/of_platform.h>
 
 #include "ar8xxx.h"
 
@@ -260,6 +261,9 @@ static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int port, int mode)
 		ar8xxx_write(ds, AR8327_REG_PORT5_PAD_CTRL,
 			     AR8327_PORT_PAD_RGMII_RX_DELAY_EN);
 		break;
+	case PHY_INTERFACE_MODE_SGMII:
+		ar8xxx_write(ds, reg, AR8327_PORT_PAD_SGMII_EN);
+		break;
 	default:
 		pr_err("xMII mode %d not supported\n", mode);
 		return -EINVAL;
@@ -268,6 +272,48 @@ static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int port, int mode)
 	return 0;
 }
 
+static int ar8xxx_of_setup(struct dsa_switch *ds)
+{
+	struct device_node *dn = ds->pd->of_node;
+	const char *s_phymode;
+	int ret, mode;
+	u32 phy_id, ctrl;
+
+	/* If port6-phy-mode property exists, configure it accordingly */
+	if (!of_property_read_string(dn, "qca,port6-phy-mode", &s_phymode)) {
+		for (mode = 0; mode < PHY_INTERFACE_MODE_MAX; mode++)
+			if (!strcasecmp(s_phymode, phy_modes(mode)))
+				break;
+
+		if (mode == PHY_INTERFACE_MODE_MAX)
+			pr_err("Unknown phy-mode: \"%s\"\n", s_phymode);
+
+		ret = ar8xxx_set_pad_ctrl(ds, 6, mode);
+		if (ret < 0)
+			return ret;
+	}
+
+	/* If a phy ID is specified for PORT6 mac, connect them together */
+	if (!of_property_read_u32(dn, "qca,port6-phy-id", &phy_id)) {
+		ar8xxx_rmw(ds, AR8327_PORT_LOOKUP_CTRL(6),
+			   AR8327_PORT_LOOKUP_MEMBER, BIT(phy_to_port(phy_id)));
+		ar8xxx_rmw(ds, AR8327_PORT_LOOKUP_CTRL(phy_to_port(phy_id)),
+			   AR8327_PORT_LOOKUP_MEMBER, BIT(6));
+
+		/* We want the switch to be pass-through and act like a PHY on
+		 * these ports. So BC/MC/UC & IGMP frames need to be accepted
+		 */
+		ctrl = BIT(phy_to_port(phy_id)) | BIT(6);
+		ar8xxx_reg_set(ds, AR8327_REG_GLOBAL_FW_CTRL1,
+			       ctrl << AR8327_GLOBAL_FW_CTRL1_IGMP_DP_S |
+			       ctrl << AR8327_GLOBAL_FW_CTRL1_BC_DP_S |
+			       ctrl << AR8327_GLOBAL_FW_CTRL1_MC_DP_S |
+			       ctrl << AR8327_GLOBAL_FW_CTRL1_UC_DP_S);
+	}
+
+	return 0;
+}
+
 static int ar8xxx_setup(struct dsa_switch *ds)
 {
 	struct ar8xxx_priv *priv = ds_to_priv(ds);
@@ -341,6 +387,10 @@ static int ar8xxx_setup(struct dsa_switch *ds)
 		}
 	}
 
+	ret = ar8xxx_of_setup(ds);
+	if (ret < 0)
+		return ret;
+
 	return 0;
 }
 
-- 
2.1.4
[PATCH 3/7] net: dsa: ar8xxx: add regmap support
All switch registers can now be dumped using regmap/debugfs.

# cat /sys/kernel/debug/regmap/mdiobus/registers
: 1302
0004: ...
...

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 drivers/net/dsa/Kconfig  |  1 +
 drivers/net/dsa/ar8xxx.c | 60
 drivers/net/dsa/ar8xxx.h |  5
 3 files changed, 66 insertions(+)

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 2aae541..17fb296 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -68,6 +68,7 @@ config NET_DSA_BCM_SF2
 config NET_DSA_AR8XXX
 	tristate "Qualcomm Atheros AR8XXX Ethernet switch family support"
 	depends on NET_DSA
+	select REGMAP
 	---help---
 	  This enables support for the Qualcomm Atheros AR8XXX Ethernet
 	  switch chips.
diff --git a/drivers/net/dsa/ar8xxx.c b/drivers/net/dsa/ar8xxx.c
index 2f0fa4d..327abd4 100644
--- a/drivers/net/dsa/ar8xxx.c
+++ b/drivers/net/dsa/ar8xxx.c
@@ -176,6 +176,57 @@ static char *ar8xxx_probe(struct device *host_dev, int sw_addr)
 	}
 }
 
+static int ar8xxx_regmap_read(void *ctx, uint32_t reg, uint32_t *val)
+{
+	struct dsa_switch *ds = (struct dsa_switch *)ctx;
+
+	*val = ar8xxx_read(ds, reg);
+
+	return 0;
+}
+
+static int ar8xxx_regmap_write(void *ctx, uint32_t reg, uint32_t val)
+{
+	struct dsa_switch *ds = (struct dsa_switch *)ctx;
+
+	ar8xxx_write(ds, reg, val);
+
+	return 0;
+}
+
+static const struct regmap_range ar8xxx_readable_ranges[] = {
+	regmap_reg_range(0x0000, 0x00e4), /* Global control */
+	regmap_reg_range(0x0100, 0x0168), /* EEE control */
+	regmap_reg_range(0x0200, 0x0270), /* Parser control */
+	regmap_reg_range(0x0400, 0x0454), /* ACL */
+	regmap_reg_range(0x0600, 0x0718), /* Lookup */
+	regmap_reg_range(0x0800, 0x0b70), /* QM */
+	regmap_reg_range(0x0C00, 0x0c80), /* PKT */
+	regmap_reg_range(0x1000, 0x10ac), /* MIB - Port0 */
+	regmap_reg_range(0x1100, 0x11ac), /* MIB - Port1 */
+	regmap_reg_range(0x1200, 0x12ac), /* MIB - Port2 */
+	regmap_reg_range(0x1300, 0x13ac), /* MIB - Port3 */
+	regmap_reg_range(0x1400, 0x14ac), /* MIB - Port4 */
+	regmap_reg_range(0x1500, 0x15ac), /* MIB - Port5 */
+	regmap_reg_range(0x1600, 0x16ac), /* MIB - Port6 */
+
+};
+
+static struct regmap_access_table ar8xxx_readable_table = {
+	.yes_ranges = ar8xxx_readable_ranges,
+	.n_yes_ranges = ARRAY_SIZE(ar8xxx_readable_ranges),
+};
+
+struct regmap_config ar8xxx_regmap_config = {
+	.reg_bits = 16,
+	.val_bits = 32,
+	.reg_stride = 4,
+	.max_register = 0x16ac, /* end MIB - Port6 range */
+	.reg_read = ar8xxx_regmap_read,
+	.reg_write = ar8xxx_regmap_write,
+	.rd_table = &ar8xxx_readable_table,
+};
+
 static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int port, int mode)
 {
 	int reg;
@@ -219,9 +270,17 @@ static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int port, int mode)
 
 static int ar8xxx_setup(struct dsa_switch *ds)
 {
+	struct ar8xxx_priv *priv = ds_to_priv(ds);
 	struct net_device *netdev = ds->dst->pd->of_netdev;
 	int ret, i, phy_mode;
 
+	/* Start by setting up the register mapping */
+	priv->regmap = devm_regmap_init(ds->master_dev, NULL, ds,
+					&ar8xxx_regmap_config);
+
+	if (IS_ERR(priv->regmap))
+		pr_warn("regmap initialization failed");
+
 	/* Initialize CPU port pad mode (xMII type, delays...) */
 	phy_mode = of_get_phy_mode(netdev->dev.parent->of_node);
 	if (phy_mode < 0) {
@@ -365,6 +424,7 @@ static void ar8xxx_poll_link(struct dsa_switch *ds)
 
 static struct dsa_switch_driver ar8xxx_switch_driver = {
 	.tag_protocol		= DSA_TAG_PROTO_NONE,
+	.priv_size		= sizeof(struct ar8xxx_priv),
 	.probe			= ar8xxx_probe,
 	.setup			= ar8xxx_setup,
 	.set_addr		= ar8xxx_set_addr,
diff --git a/drivers/net/dsa/ar8xxx.h b/drivers/net/dsa/ar8xxx.h
index 7c7a125..98cc7ed 100644
--- a/drivers/net/dsa/ar8xxx.h
+++ b/drivers/net/dsa/ar8xxx.h
@@ -17,6 +17,11 @@
 #define __AR8XXX_H
 
 #include <linux/delay.h>
+#include <linux/regmap.h>
+
+struct ar8xxx_priv {
+	struct regmap *regmap;
+};
 
 struct ar8xxx_mib_desc {
 	unsigned int size;
-- 
2.1.4
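The readable-range table in the patch above tells regmap which register addresses may be read, so that a debugfs dump skips the holes between blocks. The sketch below is not regmap's implementation; it is a small userspace illustration of the effect of a "yes ranges" table combined with a register stride of 4. `struct reg_range` and `reg_readable()` are hypothetical names, and only four of the ranges are copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct reg_range {
	uint32_t min, max;	/* inclusive bounds */
};

/* A subset of the ranges listed in the patch. */
static const struct reg_range readable[] = {
	{ 0x0000, 0x00e4 },	/* Global control */
	{ 0x0100, 0x0168 },	/* EEE control */
	{ 0x1000, 0x10ac },	/* MIB - Port0 */
	{ 0x1600, 0x16ac },	/* MIB - Port6 */
};

/* True when reg is 4-byte aligned and falls inside one of the ranges. */
static bool reg_readable(uint32_t reg)
{
	size_t i;

	if (reg % 4)		/* stride of 4: unaligned addresses are invalid */
		return false;

	for (i = 0; i < sizeof(readable) / sizeof(readable[0]); i++) {
		if (reg >= readable[i].min && reg <= readable[i].max)
			return true;
	}
	return false;
}
```

With such a table, 0x00e4 is readable while 0x00e8 (in the gap before the EEE block) is not.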
[PATCH 7/7] Documentation: devicetree: add ar8xxx binding
Add device-tree binding for ar8xxx switch families.

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 .../devicetree/bindings/net/dsa/qca-ar8xxx.txt | 70 ++
 1 file changed, 70 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt

diff --git a/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt b/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
new file mode 100644
index 000..f4fd3f1
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
@@ -0,0 +1,70 @@
+* Qualcomm Atheros AR8xxx switch family
+
+Required properties:
+
+- compatible: should be "qca,ar8xxx"
+- dsa,mii-bus: phandle to the MDIO bus controller, see dsa/dsa.txt
+- dsa,ethernet: phandle to the CPU network interface controller, see dsa/dsa.txt
+- #size-cells: must be 0
+- #address-cells: must be 2, see dsa/dsa.txt
+
+Subnodes:
+
+The integrated switch subnode should be specified according to the binding
+described in dsa/dsa.txt.
+
+Optional properties:
+
+- qca,port6-phy-mode: if specified, the driver will configure Port 6 in the
+  given phy-mode. See Documentation/devicetree/bindings/net/ethernet.txt for
+  the list of valid phy-mode.
+
+- qca,port6-phy-id: if specified, the driver will connect Port 6 to the PHY
+  given as a parameter. In this case, Port6 and the corresponding PHY will be
+  isolated from the rest of the switch. From a system perspective, they will
+  act as a regular PHY.
+
+Example:
+
+	dsa@0 {
+		compatible = "qca,ar8xxx";
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		dsa,ethernet = <&ethernet0>;
+		dsa,mii-bus = <&mii_bus0>;
+
+		switch@0 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			reg = <0 0>;	/* MDIO address 0, switch 0 in tree */
+
+			qca,port6-phy-mode = "sgmii";
+			qca,port6-phy-id = <4>;
+
+			port@0 {
+				reg = <11>;
+				label = "cpu";
+			};
+
+			port@1 {
+				reg = <0>;
+				label = "lan1";
+			};
+
+			port@2 {
+				reg = <1>;
+				label = "lan2";
+			};
+
+			port@3 {
+				reg = <2>;
+				label = "lan3";
+			};
+
+			port@4 {
+				reg = <3>;
+				label = "lan4";
+			};
+		};
+	};
-- 
2.1.4
[PATCH 4/7] net: dsa: add QCA tag support
QCA tags are used on QCA ar8xxx switch family. This change adds support
for encap/decap using 2 bytes header mode.

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 include/net/dsa.h  |   1 +
 net/dsa/Kconfig    |   3 +
 net/dsa/Makefile   |   1 +
 net/dsa/dsa.c      |   5 ++
 net/dsa/dsa_priv.h |   2 +
 net/dsa/slave.c    |   5 ++
 net/dsa/tag_qca.c  | 158 +
 7 files changed, 175 insertions(+)
 create mode 100644 net/dsa/tag_qca.c

diff --git a/include/net/dsa.h b/include/net/dsa.h
index fbca63b..64ddf6f 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -26,6 +26,7 @@ enum dsa_tag_protocol {
 	DSA_TAG_PROTO_TRAILER,
 	DSA_TAG_PROTO_EDSA,
 	DSA_TAG_PROTO_BRCM,
+	DSA_TAG_PROTO_QCA,
 };
 
 #define DSA_MAX_SWITCHES	4
diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
index ff7736f..4f3cce1 100644
--- a/net/dsa/Kconfig
+++ b/net/dsa/Kconfig
@@ -26,6 +26,9 @@ config NET_DSA_HWMON
 	  via the hwmon sysfs interface and exposes the onboard sensors.
 
 # tagging formats
+config NET_DSA_TAG_QCA
+	bool
+
 config NET_DSA_TAG_BRCM
 	bool
 
diff --git a/net/dsa/Makefile b/net/dsa/Makefile
index da06ed1..9feb86c 100644
--- a/net/dsa/Makefile
+++ b/net/dsa/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_NET_DSA) += dsa_core.o
 dsa_core-y += dsa.o slave.o
 
 # tagging formats
+dsa_core-$(CONFIG_NET_DSA_TAG_QCA) += tag_qca.o
 dsa_core-$(CONFIG_NET_DSA_TAG_BRCM) += tag_brcm.o
 dsa_core-$(CONFIG_NET_DSA_TAG_DSA) += tag_dsa.o
 dsa_core-$(CONFIG_NET_DSA_TAG_EDSA) += tag_edsa.o
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index fffb9aa..6010a7d 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -249,6 +249,11 @@ static int dsa_switch_setup_one(struct dsa_switch *ds, struct device *parent)
 		dst->rcv = brcm_netdev_ops.rcv;
 		break;
 #endif
+#ifdef CONFIG_NET_DSA_TAG_QCA
+	case DSA_TAG_PROTO_QCA:
+		dst->rcv = qca_netdev_ops.rcv;
+		break;
+#endif
 	case DSA_TAG_PROTO_NONE:
 		break;
 	default:
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index d5f1f9b..350c94b 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -74,5 +74,7 @@ extern const struct dsa_device_ops trailer_netdev_ops;
 /* tag_brcm.c */
 extern const struct dsa_device_ops brcm_netdev_ops;
 
+/* tag_qca.c */
+extern const struct dsa_device_ops qca_netdev_ops;
 
 #endif
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 04ffad3..cd8f552 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -925,6 +925,11 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
 		p->xmit = brcm_netdev_ops.xmit;
 		break;
 #endif
+#ifdef CONFIG_NET_DSA_TAG_QCA
+	case DSA_TAG_PROTO_QCA:
+		p->xmit = qca_netdev_ops.xmit;
+		break;
+#endif
 	default:
 		p->xmit	= dsa_slave_notag_xmit;
 		break;
diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
new file mode 100644
index 000..8f02196
--- /dev/null
+++ b/net/dsa/tag_qca.c
@@ -0,0 +1,158 @@
+/*
+ * Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/etherdevice.h>
+#include "dsa_priv.h"
+
+#define QCA_HDR_LEN	2
+#define QCA_HDR_VERSION	0x2
+
+#define QCA_HDR_RECV_VERSION_MASK	GENMASK(15, 14)
+#define QCA_HDR_RECV_VERSION_S		14
+#define QCA_HDR_RECV_PRIORITY_MASK	GENMASK(13, 11)
+#define QCA_HDR_RECV_PRIORITY_S		11
+#define QCA_HDR_RECV_TYPE_MASK		GENMASK(10, 6)
+#define QCA_HDR_RECV_TYPE_S		6
+#define QCA_HDR_RECV_FRAME_IS_TAGGED	BIT(3)
+#define QCA_HDR_RECV_SOURCE_PORT_MASK	GENMASK(2, 0)
+
+#define QCA_HDR_XMIT_VERSION_MASK	GENMASK(15, 14)
+#define QCA_HDR_XMIT_VERSION_S		14
+#define QCA_HDR_XMIT_PRIORITY_MASK	GENMASK(13, 11)
+#define QCA_HDR_XMIT_PRIORITY_S		11
+#define QCA_HDR_XMIT_CONTROL_MASK	GENMASK(10, 8)
+#define QCA_HDR_XMIT_CONTROL_S		8
+#define QCA_HDR_XMIT_FROM_CPU		BIT(7)
+#define QCA_HDR_XMIT_DP_BIT_MASK	GENMASK(6, 0)
+
+static inline int reg_to_port(int reg)
+{
+	if (reg < 5)
+		return reg + 1;
+
+	return -1;
+}
+
+static inline int port_to_reg(int port)
+{
+	if (port >= 1 && port <= 6)
+		return port - 1;
+
+	return -1;
+}
+
+static netdev_tx_t qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+
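The body of qca_tag_xmit() is cut off in this archive, so the following is not the patch's code. It is a hedged userspace sketch of how the field definitions above could be used to pack and unpack the 16-bit header. The helper names `qca_hdr_build()` and `qca_hdr_source_port()` are hypothetical, and reading the DP_BIT field as a bitmap of destination ports is an assumption based only on the macro name.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the QCA_HDR_* macros; GENMASK()/BIT() expanded by hand. */
#define HDR_VERSION		0x2u
#define HDR_VERSION_S		14
#define HDR_XMIT_FROM_CPU	(1u << 7)	/* BIT(7) */
#define HDR_XMIT_DP_MASK	0x007fu		/* GENMASK(6, 0) */
#define HDR_RECV_VERSION_MASK	0xc000u		/* GENMASK(15, 14) */
#define HDR_RECV_SRC_PORT_MASK	0x0007u		/* GENMASK(2, 0) */

/* Hypothetical helper: header for a frame sent by the CPU to one port. */
static uint16_t qca_hdr_build(unsigned int port)
{
	uint32_t hdr;

	assert(port < 7);	/* the DP field is 7 bits wide */

	hdr = HDR_VERSION << HDR_VERSION_S;
	hdr |= HDR_XMIT_FROM_CPU;
	hdr |= (1u << port) & HDR_XMIT_DP_MASK;
	return (uint16_t)hdr;
}

/* Hypothetical helper: source port of a received header, or -1 when the
 * version field does not match. */
static int qca_hdr_source_port(uint16_t hdr)
{
	if (((hdr & HDR_RECV_VERSION_MASK) >> HDR_VERSION_S) != HDR_VERSION)
		return -1;

	return (int)(hdr & HDR_RECV_SRC_PORT_MASK);
}
```

In the kernel the header would additionally have to be converted to network byte order before being written into the frame; that step is omitted here.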
[PATCH 1/7] net: dsa: add new driver for ar8xxx family
This patch contains initial init registration code for QCA8337. It
will detect a QCA8337 switch, if present and declared in DT/platform.

Each port will be represented through a standalone net_device interface,
as for other DSA switches. CPU can communicate with any of the ports by
setting an IP@ on ethN interface. Ports cannot communicate with each
other just yet.

Link status will be reported through polling, and we don't use any
encapsulation.

Signed-off-by: Mathieu Olivari math...@codeaurora.org
---
 drivers/net/dsa/Kconfig  |   7 ++
 drivers/net/dsa/Makefile |   1 +
 drivers/net/dsa/ar8xxx.c | 303 +++
 drivers/net/dsa/ar8xxx.h |  82 +
 net/dsa/dsa.c            |   1 +
 5 files changed, 394 insertions(+)
 create mode 100644 drivers/net/dsa/ar8xxx.c
 create mode 100644 drivers/net/dsa/ar8xxx.h

diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
index 7ad0a4d..2aae541 100644
--- a/drivers/net/dsa/Kconfig
+++ b/drivers/net/dsa/Kconfig
@@ -65,4 +65,11 @@ config NET_DSA_BCM_SF2
 	  This enables support for the Broadcom Starfighter 2 Ethernet
 	  switch chips.
 
+config NET_DSA_AR8XXX
+	tristate "Qualcomm Atheros AR8XXX Ethernet switch family support"
+	depends on NET_DSA
+	---help---
+	  This enables support for the Qualcomm Atheros AR8XXX Ethernet
+	  switch chips.
+
 endmenu
diff --git a/drivers/net/dsa/Makefile b/drivers/net/dsa/Makefile
index e2d51c4..7647687 100644
--- a/drivers/net/dsa/Makefile
+++ b/drivers/net/dsa/Makefile
@@ -14,3 +14,4 @@ ifdef CONFIG_NET_DSA_MV88E6171
 mv88e6xxx_drv-y += mv88e6171.o
 endif
 obj-$(CONFIG_NET_DSA_BCM_SF2)	+= bcm_sf2.o
+obj-$(CONFIG_NET_DSA_AR8XXX)	+= ar8xxx.o
diff --git a/drivers/net/dsa/ar8xxx.c b/drivers/net/dsa/ar8xxx.c
new file mode 100644
index 000..4ce3ffc
--- /dev/null
+++ b/drivers/net/dsa/ar8xxx.c
@@ -0,0 +1,303 @@
+/*
+ * Copyright (C) 2009 Felix Fietkau n...@openwrt.org
+ * Copyright (C) 2011-2012 Gabor Juhos juh...@openwrt.org
+ * Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/phy.h>
+#include <linux/netdevice.h>
+#include <net/dsa.h>
+#include <linux/phy.h>
+#include <linux/of_net.h>
+
+#include "ar8xxx.h"
+
+u32
+ar8xxx_mii_read32(struct mii_bus *bus, int phy_id, int regnum)
+{
+	u16 lo, hi;
+
+	lo = bus->read(bus, phy_id, regnum);
+	hi = bus->read(bus, phy_id, regnum + 1);
+
+	return (hi << 16) | lo;
+}
+
+void
+ar8xxx_mii_write32(struct mii_bus *bus, int phy_id, int regnum, u32 val)
+{
+	u16 lo, hi;
+
+	lo = val & 0xffff;
+	hi = (u16)(val >> 16);
+
+	bus->write(bus, phy_id, regnum, lo);
+	bus->write(bus, phy_id, regnum + 1, hi);
+}
+
+u32 ar8xxx_read(struct dsa_switch *ds, int reg)
+{
+	struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+	u16 r1, r2, page;
+	u32 val;
+
+	split_addr((u32)reg, &r1, &r2, &page);
+
+	mutex_lock(&bus->mdio_lock);
+
+	bus->write(bus, 0x18, 0, page);
+	wait_for_page_switch();
+	val = ar8xxx_mii_read32(bus, 0x10 | r2, r1);
+
+	mutex_unlock(&bus->mdio_lock);
+
+	return val;
+}
+
+void ar8xxx_write(struct dsa_switch *ds, int reg, u32 val)
+{
+	struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+	u16 r1, r2, page;
+
+	split_addr((u32)reg, &r1, &r2, &page);
+
+	mutex_lock(&bus->mdio_lock);
+
+	bus->write(bus, 0x18, 0, page);
+	wait_for_page_switch();
+	ar8xxx_mii_write32(bus, 0x10 | r2, r1, val);
+
+	mutex_unlock(&bus->mdio_lock);
+}
+
+u32
+ar8xxx_rmw(struct dsa_switch *ds, int reg, u32 mask, u32 val)
+{
+	struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
+	u16 r1, r2, page;
+	u32 ret;
+
+	split_addr((u32)reg, &r1, &r2, &page);
+
+	mutex_lock(&bus->mdio_lock);
+
+	bus->write(bus, 0x18, 0, page);
+	wait_for_page_switch();
+
+	ret = ar8xxx_mii_read32(bus, 0x10 | r2, r1);
+	ret &= ~mask;
+	ret |= val;
+	ar8xxx_mii_write32(bus, 0x10 | r2, r1, ret);
+
+	mutex_unlock(&bus->mdio_lock);
+
+	return ret;
+}
+
+static char *ar8xxx_probe(struct device *host_dev, int sw_addr)
+{
+	struct mii_bus *bus = dsa_host_dev_to_mii_bus(host_dev);
+	u32 phy_id;
+
+	if (!bus)
+		return NULL;
+
+	/* sw_addr is irrelevant as the switch occupies the MDIO bus from
+	 * addresses 0 to 4 (PHYs) and 16-23 (for MDIO 32bits protocol). So
+	 * we'll
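The accessors above move 32-bit register values over a 16-bit bus as two halves (low half at regnum, high half at regnum + 1), and ar8xxx_rmw() builds a read-modify-write on top of them. The sketch below reproduces only that arithmetic in userspace. The array `bus_regs` is a hypothetical stand-in for the MDIO bus, and the paging, locking and split_addr() steps are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a bank of 16-bit MDIO registers. */
static uint16_t bus_regs[32];

static uint32_t mii_read32(unsigned int regnum)
{
	uint16_t lo, hi;

	assert(regnum + 1 < 32);
	lo = bus_regs[regnum];
	hi = bus_regs[regnum + 1];

	/* Cast before shifting so the high half never lands in the sign bit
	 * of a promoted int. */
	return ((uint32_t)hi << 16) | lo;
}

static void mii_write32(unsigned int regnum, uint32_t val)
{
	assert(regnum + 1 < 32);
	bus_regs[regnum] = (uint16_t)(val & 0xffff);
	bus_regs[regnum + 1] = (uint16_t)(val >> 16);
}

/* Clear the bits in mask, set the bits in val, write back, return result. */
static uint32_t rmw32(unsigned int regnum, uint32_t mask, uint32_t val)
{
	uint32_t ret = mii_read32(regnum);

	ret &= ~mask;
	ret |= val;
	mii_write32(regnum, ret);
	return ret;
}
```

In the driver the whole sequence runs under the bus mdio_lock so that the page selection and the two half-word accesses cannot be interleaved with another access; the sketch is single-threaded and has no equivalent.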
Re: [PATCH] namespace: Remove no longer needed goto label in the function, ops_init
Nicholas Krause xerofo...@gmail.com writes:

> This removes the no longer needed goto label, cleanup in the function
> ops_init due to kfree being NULL pointer safe and therefore no need
> to avoid calling it the call to kzalloc fails inside this particular
> function.

Your proposed change pessimizes the error path without a description of
why that would be a benefit.

Further the subject on this patch is incorrect. You don't remove any
gotos.

So I don't like this change as is. The description is wrong and the
change provides little to no real world benefit.

Eric

> Signed-off-by: Nicholas Krause xerofo...@gmail.com
> ---
>  net/core/net_namespace.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
> index 572af00..e8b5568 100644
> --- a/net/core/net_namespace.c
> +++ b/net/core/net_namespace.c
> @@ -102,7 +102,7 @@ static int ops_init(const struct pernet_operations *ops, struct net *net)
>  
>  		err = net_assign_generic(net, *ops->id, data);
>  		if (err)
> -			goto cleanup;
> +			goto out;
>  	}
>  	err = 0;
>  	if (ops->init)
> @@ -110,10 +110,8 @@ static int ops_init(const struct pernet_operations *ops, struct net *net)
>  	if (!err)
>  		return 0;
>  
> -cleanup:
> -	kfree(data);
> -
>  out:
> +	kfree(data);
>  	return err;
>  }
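The pattern under discussion, a single exit label that frees a pointer which may still be NULL, can be shown in userspace, where free(NULL) is a no-op in the same way kfree(NULL) is. This is only an illustration of the two variants being compared, not the kernel function: `init_with_single_exit()` and its parameters are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Allocates `size` bytes (if size != 0) and optionally fails afterwards.
 * All error paths share one label; the label frees `data` even when the
 * allocation itself was what failed and `data` is still NULL. */
static int init_with_single_exit(size_t size, int fail_later, void **out)
{
	void *data = NULL;
	int err;

	assert(out != NULL);

	if (size) {
		data = calloc(1, size);
		if (!data) {
			err = -ENOMEM;
			goto out;	/* reaches free(NULL): harmless, but a wasted call */
		}
	}

	if (fail_later) {
		err = -EINVAL;
		goto out;
	}

	*out = data;
	return 0;

out:
	free(data);
	return err;
}
```

The merged label is correct because freeing NULL is defined to do nothing. The objection in the reply above is about cost and clarity: the allocation-failure path now makes a call it did not need to make.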
Re: linux-next: build failure after merge of most of the trees
Hi Eric,

On Thu, 28 May 2015 08:26:51 -0700 Eric Dumazet eric.duma...@gmail.com wrote:
>
> We were alerted of this problem thanks to kbuild test robot.
>
> This fix is not a definitive one I hope.

No, just something to allow me to get my tree to build so I could go to
bed :-)

> Golden rule is that vmalloc() users must include vmalloc.h themselves,
> not by an indirect include.

Yep, that is a special case of Rule 1.

> I sent one fix, and prepared others, but I prefer that each offender is
> fixed.

Yeah, thanks to Dave for that.

-- 
Cheers,
Stephen Rothwell    s...@canb.auug.org.au
Re: [PATCH v4 net-next 09/11] net: Add IPv6 flow label to flow_keys
On Thu, 2015-05-28 at 11:19 -0700, Tom Herbert wrote:
> In flow_dissector set the flow label in flow_keys for IPv6. This also
> removes the shortcircuiting of flow dissection when a non-zero label
> is present, the flow label can be considered to provide additional
> entropy for a hash.
>
> Signed-off-by: Tom Herbert t...@herbertland.com
> ---
>  include/net/flow_dissector.h |  4 +++-
>  net/core/flow_dissector.c    | 31 +++
>  2 files changed, 14 insertions(+), 21 deletions(-)
>
> diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
> index 08480fb..14d8483 100644
> --- a/include/net/flow_dissector.h
> +++ b/include/net/flow_dissector.h
> @@ -28,7 +28,8 @@ struct flow_dissector_key_basic {
>  };
>  
>  struct flow_dissector_key_tags {
> -	u32	vlan_id:12;
> +	u32	vlan_id:12,
> +		flow_label:20;
>  };
>  
>  /**
> @@ -111,6 +112,7 @@ enum flow_dissector_key_id {
>  	FLOW_DISSECTOR_KEY_ETH_ADDRS, /* struct flow_dissector_key_eth_addrs */
>  	FLOW_DISSECTOR_KEY_TIPC_ADDRS, /* struct flow_dissector_key_tipc_addrs */
>  	FLOW_DISSECTOR_KEY_VLANID, /* struct flow_dissector_key_flow_tags */
> +	FLOW_DISSECTOR_KEY_FLOW_LABEL, /* struct flow_dissector_key_flow_tags */
>  
>  	FLOW_DISSECTOR_KEY_MAX,
>  };
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 5c66cb2..ba089d9 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -190,7 +190,7 @@ ip:
>  	case htons(ETH_P_IPV6): {
>  		const struct ipv6hdr *iph;
>  		struct ipv6hdr _iph;
> -		__be32 flow_label;
> +		u32 flow_label;

You change flow_label from __be32 to u32.

>  
>  ipv6:
>  		iph = __skb_header_pointer(skb, nhoff, sizeof(_iph), data, hlen, &_iph);
> @@ -210,30 +210,17 @@ ipv6:
>  
>  			memcpy(key_ipv6_addrs, &iph->saddr, sizeof(*key_ipv6_addrs));
>  			key_control->addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
> -			goto flow_label;
>  		}
> -		break;
> -flow_label:
> +
>  		flow_label = ip6_flowlabel(iph);

But ip6_flowlabel() returns a __be32. This should not please sparse.

>  		if (flow_label) {
> -			/* Awesome, IPv6 packet has a flow label so we can
> -			 * use that to represent the ports without any
> -			 * further dissection.
> -			 */
> -
> -			key_basic->n_proto = proto;
> -			key_basic->ip_proto = ip_proto;
> -			key_control->thoff = (u16)nhoff;
> -
>  			if (skb_flow_dissector_uses_key(flow_dissector,
> -							FLOW_DISSECTOR_KEY_PORTS)) {
> -				key_ports = skb_flow_dissector_target(flow_dissector,
> -								      FLOW_DISSECTOR_KEY_PORTS,
> -								      target_container);
> -				key_ports->ports = flow_label;
> +				FLOW_DISSECTOR_KEY_FLOW_LABEL)) {
> +				key_tags = skb_flow_dissector_target(flow_dissector,
> +					FLOW_DISSECTOR_KEY_FLOW_LABEL,
> +					target_container);
> +				key_tags->flow_label = ntohl(flow_label);

Then you call ntohl() on u32 variable. This should also complain.

Have you run sparse ?

make C=2 CF=-D__CHECK_ENDIAN__ net/core/flow_dissector.o

Thanks.
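The review above turns on byte order: the value read from the header is big-endian (__be32 in kernel terms), and only the result of ntohl() is a host-order integer that fits the 20-bit flow_label field. A userspace sketch of the conversion is given below. `flow_label_host()` and `FLOWLABEL_MASK_HOST` are hypothetical names; unlike the kernel's ip6_flowlabel(), this helper converts first and masks in host order, which yields the same 20 bits.

```c
#include <arpa/inet.h>	/* htonl(), ntohl() */
#include <assert.h>
#include <stdint.h>

#define FLOWLABEL_MASK_HOST 0x000fffffu	/* low 20 bits, host byte order */

/* first_word_be: the first 32 bits of an IPv6 header exactly as they sit in
 * the packet (version, traffic class, flow label), in network byte order. */
static uint32_t flow_label_host(uint32_t first_word_be)
{
	uint32_t word = ntohl(first_word_be);

	assert((word >> 28) == 6);	/* sanity check: IPv6 version nibble */
	return word & FLOWLABEL_MASK_HOST;
}
```

Keeping the network-order and host-order values in distinctly named (or, in the kernel, distinctly typed) variables is what allows a checker such as sparse to flag a missing or doubled conversion.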
Re: pull request (net-next): ipsec-next 2015-05-28
From: Steffen Klassert steffen.klass...@secunet.com
Date: Thu, 28 May 2015 08:25:47 +0200

> 1) Remove xfrm_queue_purge as this is the same as skb_queue_purge.
>
> 2) Optimize policy and state walk.
>
> 3) Use a sane return code if afinfo registration fails.
>
> 4) Only check for an acquire state if the state is not valid.
>
> 5) Remove an unnecessary NULL check before xfrm_pol_hold
>    as it checks the input for NULL.
>
> 6) Return directly if the xfrm hold queue is empty, avoid
>    to take a lock as it is nothing to do in this case.
>
> 7) Optimize the inexact policy search and allow for matching
>    of policies with priority ~0U.
>
> All from Li RongQing.
>
> Please pull or let me know if there are problems.

Pulled, thanks Steffen.
Re: [PATCH] net: qlcnic: clean up sysfs error codes
From: Vladimir Zapolskiy v...@mleia.com
Date: Tue, 26 May 2015 03:49:45 +0300

> Replace confusing QL_STATUS_INVALID_PARAM == -1 == -EPERM with -EINVAL
> and QLC_STATUS_UNSUPPORTED_CMD == -2 == -ENOENT with -EOPNOTSUPP, the
> latter error code is arguable, but it is already used in the driver,
> so let it be here as well.
>
> Also remove always false (!buf) check on read(), the driver should
> not care if userspace gets its EFAULT or not.
>
> Signed-off-by: Vladimir Zapolskiy v...@mleia.com

Qlogic folks, I'm waiting for your promised feedback.
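The commit message quoted above points out that the driver-private codes -1 and -2 collide with -EPERM and -ENOENT once they reach userspace through sysfs. The sketch below makes that collision visible and shows one possible translation. The patch itself replaces the constants at their points of use rather than mapping them; `legacy_to_errno()` and the LEGACY_* names are hypothetical, and the numeric errno values are those of Linux.

```c
#include <assert.h>
#include <errno.h>

/* Driver-private status codes as described in the commit message. */
#define LEGACY_INVALID_PARAM	(-1)	/* numerically the same as -EPERM */
#define LEGACY_UNSUPPORTED_CMD	(-2)	/* numerically the same as -ENOENT */

/* Translate a legacy status into the errno the caller actually means. */
static int legacy_to_errno(int status)
{
	switch (status) {
	case LEGACY_INVALID_PARAM:
		return -EINVAL;
	case LEGACY_UNSUPPORTED_CMD:
		return -EOPNOTSUPP;
	default:
		return status;	/* 0 and real errno values pass through */
	}
}
```

A userspace tool that receives -1 from such a handler would print "Operation not permitted" even though the real problem was an invalid parameter, which is the confusion the patch removes.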
Re: [PATCH] net: qlcnic: clean up sysfs error codes
Hello David,

On 29.05.2015 02:28, David Miller wrote:
> From: Vladimir Zapolskiy v...@mleia.com
> Date: Tue, 26 May 2015 03:49:45 +0300
>
>> Replace confusing QL_STATUS_INVALID_PARAM == -1 == -EPERM with -EINVAL
>> and QLC_STATUS_UNSUPPORTED_CMD == -2 == -ENOENT with -EOPNOTSUPP, the
>> latter error code is arguable, but it is already used in the driver,
>> so let it be here as well.
>>
>> Also remove always false (!buf) check on read(), the driver should
>> not care if userspace gets its EFAULT or not.
>>
>> Signed-off-by: Vladimir Zapolskiy v...@mleia.com
>
> Qlogic folks, I'm waiting for your promised feedback.

Rajesh reviewed and acked the change, thank you.

http://www.spinics.net/lists/netdev/msg331073.html

--
With best wishes,
Vladimir
Re: [PATCH net-next] neigh: Add missing rcu_assign_pointer
On 05/28/2015 06:13 PM, Eric Dumazet wrote:
> This patch is not needed.
>
> You really should read Documentation/RCU , because it looks like you are
> quite confused.
>
> When we remove an element from a RCU protected list, all the objects in
> the chain are already ready to be caught by rcu readers.
>
> Therefore, no additional memory barrier is needed before doing
> *np = n->next;
>
> Please do not add spurious memory barriers. Like atomic operations,
> we want all of them being required and possibly documented.

Yes, you are right, thanks for your clear explanation :)

However, there are still three places where we use rcu_assign_pointer()
to remove a neigh entry from a RCU-protected list, and the three places
are neigh_forced_gc(), neigh_flush_dev(), and __neigh_for_each_release()
respectively.

This means it's redundant for us to use rcu_assign_pointer() in the
three places, right?

Regards,
Ying
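The distinction drawn above, that publishing a new element needs ordering while unlinking an existing one does not, can be approximated with C11 atomics. This is only an analogy under stated assumptions: a release store stands in for rcu_assign_pointer(), a relaxed store for the plain assignment, and `struct node`, `publish()` and `unlink_key()` are hypothetical names. Writers are assumed to be serialised by a lock that is not shown, and freeing a removed node still requires waiting for readers to finish.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
	int key;
	struct node *_Atomic next;
};

/* Insert at the head. The node's fields are written first, then the node is
 * made reachable with a release store so that a reader which sees the new
 * pointer also sees the initialised fields. */
static void publish(struct node *_Atomic *head, struct node *n, int key)
{
	assert(head != NULL && n != NULL);

	n->key = key;
	atomic_store_explicit(&n->next,
			      atomic_load_explicit(head, memory_order_relaxed),
			      memory_order_relaxed);
	atomic_store_explicit(head, n, memory_order_release);
}

/* Unlink the first node with the given key. The successor was already
 * published earlier, so redirecting *np to it exposes nothing new and a
 * relaxed store is enough. */
static struct node *unlink_key(struct node *_Atomic *head, int key)
{
	struct node *_Atomic *np = head;
	struct node *n;

	while ((n = atomic_load_explicit(np, memory_order_relaxed)) != NULL) {
		if (n->key == key) {
			atomic_store_explicit(np,
				atomic_load_explicit(&n->next, memory_order_relaxed),
				memory_order_relaxed);
			return n;	/* must not be freed while readers may hold it */
		}
		np = &n->next;
	}
	return NULL;
}
```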
Re: [PATCH 3/7] net: dsa: ar8xxx: add regmap support
On 05/28/15 18:42, Mathieu Olivari wrote:
> All switch registers can now be dumped using regmap/debugfs.
>
> # cat /sys/kernel/debug/regmap/mdiobus/registers
> : 1302
> 0004: ...
> ...

ethtool has a register dump command, which should already be supported
by the current code in net/dsa/slave.c, is there a particular reason why
you use debugfs here instead?

--
Florian
Re: [PATCH 0/7] net: dsa: add QCA AR8xxx switch family support
On Thu, May 28, 2015 at 06:42:15PM -0700, Mathieu Olivari wrote:
> This patch set adds initial support for AR8xxx switches using the DSA
> subsystem. It currently supports QCA8337 switch, and can be extended to
> other hardware in the same family.
>
> This switch was already discussed in the following thread:
> https://www.marc.info/?t=14260141744r=1w=2
>
> Below is a typical picture of a QCA8337 used in a standard home gateway
> configuration:
>
>   +-----------+       +--------------------+
>   |           | SGMII |                    |
>   |      eth0 +-------+                    +------ 1000baseT MDI (WAN)
>   |       wan |       |  7-port            +------ 1000baseT MDI (LAN1)
>   |   CPU     |       |  ethernet          +------ 1000baseT MDI (LAN2)
>   |           | RGMII |  switch            +------ 1000baseT MDI (LAN3)
>   |      eth1 +-------+  w/5 PHYs          +------ 1000baseT MDI (LAN4)
>   |       lan |       |                    |
>   +-----------+       +--------------------+
>                 | MDIO |
>                 \------/
>
> The switch is connected to the CPU using 2 xMII interfaces. As DSA only
> supports one logical interface to the switch, we split the switch using
> device-tree information into 2 parts:
>  * port 6 (one of the xMII switch port) will be dedicated to one
>    particular Ethernet port. From a system perspective, it will be seen
>    as a regular PHY.
>  * port 0 (the other xMII port) will act as the switch master interface

FYI: I have patches which allow DSA to use two cpu interfaces. Seems
to work on my DIR665 with a Marvell Switch. I will post the patches as
an RFC.

   Andrew
Re: [PATCH 1/7] net: dsa: add new driver for ar8xxx family
> +static int ar8xxx_set_pad_ctrl(struct dsa_switch *ds, int port, int mode)
> +{
> +	int reg;
> +
> +	switch (port) {
> +	case 0:
> +		reg = AR8327_REG_PORT0_PAD_CTRL;
> +		break;
> +	case 6:
> +		reg = AR8327_REG_PORT6_PAD_CTRL;
> +		break;
> +	default:
> +		pr_err("Can't set PAD_CTRL on port %d\n", port);
> +		return -EINVAL;
> +	}
> +
> +	/* DSA only supports 1 CPU port for now, so we'll take the assumption
> +	 * that P0 is connected to the CPU master_dev.
> +	 */

I don't like this assumption. Hardware I have with Marvell switches has
the CPU connected to ports 5, or 6, or 0. Calling dsa_upstream_port()
will tell you which is the CPU port.

Andrew
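A minimal user-space sketch of the suggestion above, assuming a made-up switch model rather than the real DSA structures. `struct sw_model`, `model_upstream_port()` and both register offsets are hypothetical placeholders; in the kernel, dsa_upstream_port() would supply the port number instead.

```c
#include <errno.h>

/* Hypothetical register offsets, for illustration only. */
#define MODEL_REG_PORT0_PAD_CTRL 0x04
#define MODEL_REG_PORT6_PAD_CTRL 0x0c

/* Hypothetical stand-in for the per-switch state DSA keeps. */
struct sw_model {
	int cpu_port;	/* port wired to the CPU, as described by the platform */
};

/* Plays the role that dsa_upstream_port() plays in the kernel. */
static int model_upstream_port(const struct sw_model *sw)
{
	return sw->cpu_port;
}

/* Map an xMII-capable port to its pad control register, or -EINVAL. */
static int model_pad_ctrl_reg(int port)
{
	switch (port) {
	case 0:
		return MODEL_REG_PORT0_PAD_CTRL;
	case 6:
		return MODEL_REG_PORT6_PAD_CTRL;
	default:
		return -EINVAL;
	}
}

/* Pick the register for whichever port really is the CPU port,
 * instead of assuming it is always port 0.
 */
static int model_cpu_pad_ctrl_reg(const struct sw_model *sw)
{
	return model_pad_ctrl_reg(model_upstream_port(sw));
}
```

With this shape, a board whose CPU sits on port 6 gets the port 6 register, and a board wired to a port without a pad control register gets an error rather than a silent misconfiguration.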
[PATCH net-next] bpf: add missing rcu protection when releasing programs from prog_array
Normally the program attachment place (like sockets, qdiscs) takes care of rcu
protection and calls bpf_prog_put() after a grace period. The programs stored
inside prog_array may not be attached anywhere, so prog_array needs to take
care of preserving rcu protection. Otherwise bpf_tail_call() will race with
bpf_prog_put(). To solve that introduce bpf_prog_put_rcu() helper function and
use it in 3 places where unattached program can decrement refcnt:
closing program fd, deleting/replacing program in prog_array.

Fixes: 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs")
Reported-by: Martin Schwidefsky schwidef...@de.ibm.com
Signed-off-by: Alexei Starovoitov a...@plumgrid.com
---
 include/linux/bpf.h   |    6 +-
 kernel/bpf/arraymap.c |    4 ++--
 kernel/bpf/syscall.c  |   19 ++-
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8821b9a8689e..5f520f5f087e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -123,7 +123,10 @@ struct bpf_prog_aux {
 	const struct bpf_verifier_ops *ops;
 	struct bpf_map **used_maps;
 	struct bpf_prog *prog;
-	struct work_struct work;
+	union {
+		struct work_struct work;
+		struct rcu_head rcu;
+	};
 };
 
 struct bpf_array {
@@ -153,6 +156,7 @@ void bpf_register_map_type(struct bpf_map_type_list *tl);
 
 struct bpf_prog *bpf_prog_get(u32 ufd);
 void bpf_prog_put(struct bpf_prog *prog);
+void bpf_prog_put_rcu(struct bpf_prog *prog);
 
 struct bpf_map *bpf_map_get(struct fd f);
 void bpf_map_put(struct bpf_map *map);
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 614bcd4c1d74..cb31229a6fa4 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -202,7 +202,7 @@ static int prog_array_map_update_elem(struct bpf_map *map, void *key,
 
 	old_prog = xchg(array->prog + index, prog);
 	if (old_prog)
-		bpf_prog_put(old_prog);
+		bpf_prog_put_rcu(old_prog);
 
 	return 0;
 }
@@ -218,7 +218,7 @@ static int prog_array_map_delete_elem(struct bpf_map *map, void *key)
 
 	old_prog = xchg(array->prog + index, NULL);
 	if (old_prog) {
-		bpf_prog_put(old_prog);
+		bpf_prog_put_rcu(old_prog);
 		return 0;
 	} else {
 		return -ENOENT;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 98a69bd83069..a1b14d197a4f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -432,6 +432,23 @@ static void free_used_maps(struct bpf_prog_aux *aux)
 	kfree(aux->used_maps);
 }
 
+static void __prog_put_rcu(struct rcu_head *rcu)
+{
+	struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
+
+	free_used_maps(aux);
+	bpf_prog_free(aux->prog);
+}
+
+/* version of bpf_prog_put() that is called after a grace period */
+void bpf_prog_put_rcu(struct bpf_prog *prog)
+{
+	if (atomic_dec_and_test(&prog->aux->refcnt)) {
+		prog->aux->prog = prog;
+		call_rcu(&prog->aux->rcu, __prog_put_rcu);
+	}
+}
+
 void bpf_prog_put(struct bpf_prog *prog)
 {
 	if (atomic_dec_and_test(&prog->aux->refcnt)) {
@@ -445,7 +462,7 @@ static int bpf_prog_release(struct inode *inode, struct file *filp)
 {
 	struct bpf_prog *prog = filp->private_data;
 
-	bpf_prog_put(prog);
+	bpf_prog_put_rcu(prog);
 	return 0;
 }
 
-- 
1.7.9.5
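The deferral pattern in this patch (final reference drop queues a callback, the callback recovers the object from its embedded head) can be modelled in user space. The sketch below is only an analogy under stated assumptions: it is single-threaded, uses a plain int refcount, and replaces call_rcu() and the grace period with a hypothetical pending list that the caller drains by hand. All `model_*` names and `struct cb_head` are made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

/* Same idea as the kernel's container_of(): step back from a member
 * to the structure that embeds it.
 */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct prog_model {
	int refcnt;		/* plain int: this model is single-threaded */
	int *freed_flag;	/* set to 1 when the object is finally freed */
	struct cb_head rcu;	/* embedded callback head, like aux->rcu */
};

static struct cb_head *pending;	/* callbacks waiting for the "grace period" */

/* Queue a callback instead of running it now (models call_rcu()). */
static void model_call_deferred(struct cb_head *head,
				void (*func)(struct cb_head *))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Run every queued callback (models the end of a grace period). */
static void model_grace_period_elapsed(void)
{
	while (pending) {
		struct cb_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Deferred destructor: recover the object from its embedded head. */
static void model_prog_free_cb(struct cb_head *head)
{
	struct prog_model *p = model_container_of(head, struct prog_model, rcu);

	*p->freed_flag = 1;
	free(p);
}

/* Drop a reference; the final put defers the free instead of doing it. */
static void model_prog_put_deferred(struct prog_model *p)
{
	if (--p->refcnt == 0)
		model_call_deferred(&p->rcu, model_prog_free_cb);
}

static struct prog_model *model_prog_new(int *freed_flag)
{
	struct prog_model *p = calloc(1, sizeof(*p));

	if (p) {
		p->refcnt = 1;
		p->freed_flag = freed_flag;
	}
	return p;
}
```

The point the model makes is that after the last put the object is still intact until the queued callback runs, which is what a concurrent bpf_tail_call() reader relies on in the real code.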
Re: [PATCH] net: qlcnic: clean up sysfs error codes
From: Vladimir Zapolskiy v...@mleia.com
Date: Fri, 29 May 2015 04:13:46 +0300

> Hello David,
>
> On 29.05.2015 02:28, David Miller wrote:
>> From: Vladimir Zapolskiy v...@mleia.com
>> Date: Tue, 26 May 2015 03:49:45 +0300
>>
>>> Replace confusing QL_STATUS_INVALID_PARAM == -1 == -EPERM with -EINVAL
>>> and QLC_STATUS_UNSUPPORTED_CMD == -2 == -ENOENT with -EOPNOTSUPP, the
>>> latter error code is arguable, but it is already used in the driver,
>>> so let it be here as well.
>>>
>>> Also remove always false (!buf) check on read(), the driver should
>>> not care if userspace gets its EFAULT or not.
>>>
>>> Signed-off-by: Vladimir Zapolskiy v...@mleia.com
>>
>> Qlogic folks, I'm waiting for your promised feedback.
>
> Rajesh reviewed and acked the change, thank you.
>
> http://www.spinics.net/lists/netdev/msg331073.html

Thanks, I missed that, applied.
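Why the raw -1 and -2 values are confusing can be shown with a few lines of C. This is an illustration only, not the patch itself: the patch replaces the values at their source, whereas the hypothetical `model_status_to_errno()` below translates them, and the `MODEL_*` names are placeholders for the driver's private codes.

```c
#include <errno.h>

/* Driver-private status codes with the values quoted in the commit message. */
#define MODEL_STATUS_INVALID_PARAM   (-1)
#define MODEL_STATUS_UNSUPPORTED_CMD (-2)

/* Translate a driver-private status into a conventional negative errno. */
static int model_status_to_errno(int status)
{
	switch (status) {
	case MODEL_STATUS_INVALID_PARAM:
		return -EINVAL;
	case MODEL_STATUS_UNSUPPORTED_CMD:
		return -EOPNOTSUPP;
	default:
		return status;	/* 0 or an already meaningful errno */
	}
}
```

On Linux EPERM is 1 and ENOENT is 2, so a sysfs handler that returns the private codes unchanged makes userspace see "Operation not permitted" or "No such file or directory" for what is really a bad parameter or an unsupported command.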
Re: [PATCH net-next 1/3] net: systemport: Pre-calculate and utilize cb->bd_addr
On Thu, May 28, 2015 at 3:24 PM, Florian Fainelli f.faine...@gmail.com wrote:
> There is a 1:1 mapping between the software maintained control block in
> priv->rx_cbs and the buffer address in priv->rx_bds, such that there is
> no need to keep computing the buffer address when refilling a control
> block.
>
> Signed-off-by: Florian Fainelli f.faine...@gmail.com

Reviewed-by: Petri Gynther pgynt...@google.com

> ---
>  drivers/net/ethernet/broadcom/bcmsysport.c | 18 +-
>  drivers/net/ethernet/broadcom/bcmsysport.h |  2 --
>  2 files changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
> index 084a50a555de..267330ccd595 100644
> --- a/drivers/net/ethernet/broadcom/bcmsysport.c
> +++ b/drivers/net/ethernet/broadcom/bcmsysport.c
> @@ -549,12 +549,7 @@ static int bcm_sysport_rx_refill(struct bcm_sysport_priv *priv,
>  	}
>  
>  	dma_unmap_addr_set(cb, dma_addr, mapping);
> -	dma_desc_set_addr(priv, priv->rx_bd_assign_ptr, mapping);
> -
> -	priv->rx_bd_assign_index++;
> -	priv->rx_bd_assign_index &= (priv->num_rx_bds - 1);
> -	priv->rx_bd_assign_ptr = priv->rx_bds +
> -		(priv->rx_bd_assign_index * DESC_SIZE);
> +	dma_desc_set_addr(priv, cb->bd_addr, mapping);
>  
>  	netif_dbg(priv, rx_status, ndev, "RX refill\n");
>  
> @@ -568,7 +563,7 @@ static int bcm_sysport_alloc_rx_bufs(struct bcm_sysport_priv *priv)
>  	unsigned int i;
>  
>  	for (i = 0; i < priv->num_rx_bds; i++) {
> -		cb = &priv->rx_cbs[priv->rx_bd_assign_index];
> +		cb = &priv->rx_cbs[i];
>  		if (cb->skb)
>  			continue;
>  
> @@ -1330,14 +1325,14 @@ static inline int tdma_enable_set(struct bcm_sysport_priv *priv,
>  
>  static int bcm_sysport_init_rx_ring(struct bcm_sysport_priv *priv)
>  {
> +	struct bcm_sysport_cb *cb;
>  	u32 reg;
>  	int ret;
> +	int i;
>  
>  	/* Initialize SW view of the RX ring */
>  	priv->num_rx_bds = NUM_RX_DESC;
>  	priv->rx_bds = priv->base + SYS_PORT_RDMA_OFFSET;
> -	priv->rx_bd_assign_ptr = priv->rx_bds;
> -	priv->rx_bd_assign_index = 0;
>  	priv->rx_c_index = 0;
>  	priv->rx_read_ptr = 0;
>  	priv->rx_cbs = kcalloc(priv->num_rx_bds, sizeof(struct bcm_sysport_cb),
> @@ -1347,6 +1342,11 @@ static int bcm_sysport_init_rx_ring(struct bcm_sysport_priv *priv)
>  		return -ENOMEM;
>  	}
>  
> +	for (i = 0; i < priv->num_rx_bds; i++) {
> +		cb = priv->rx_cbs + i;
> +		cb->bd_addr = priv->rx_bds + i * DESC_SIZE;
> +	}
> +
>  	ret = bcm_sysport_alloc_rx_bufs(priv);
>  	if (ret) {
>  		netif_err(priv, hw, priv->netdev, "SKB allocation failed\n");
> diff --git a/drivers/net/ethernet/broadcom/bcmsysport.h b/drivers/net/ethernet/broadcom/bcmsysport.h
> index 42a4b4a0bc14..f28bf545d7f4 100644
> --- a/drivers/net/ethernet/broadcom/bcmsysport.h
> +++ b/drivers/net/ethernet/broadcom/bcmsysport.h
> @@ -663,8 +663,6 @@ struct bcm_sysport_priv {
>  
>  	/* Receive queue */
>  	void __iomem		*rx_bds;
> -	void __iomem		*rx_bd_assign_ptr;
> -	unsigned int		rx_bd_assign_index;
>  	struct bcm_sysport_cb	*rx_cbs;
>  	unsigned int		num_rx_bds;
>  	unsigned int		rx_read_ptr;
> --
> 2.1.0
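The 1:1 mapping the commit message relies on can be sketched outside the driver. Everything below is a hypothetical model: the ring lives in ordinary memory rather than behind __iomem, and `MODEL_NUM_RX_DESC`, `MODEL_DESC_SIZE` and the `model_*` names are placeholders, not the driver's real values or helpers.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_RX_DESC 8	/* hypothetical ring size */
#define MODEL_DESC_SIZE   8	/* hypothetical bytes per descriptor */

struct model_cb {
	uint8_t *bd_addr;	/* descriptor paired with this control block */
};

struct model_ring {
	uint8_t bds[MODEL_NUM_RX_DESC * MODEL_DESC_SIZE];
	struct model_cb cbs[MODEL_NUM_RX_DESC];
};

/* One-time setup: control block i always owns descriptor i. */
static void model_ring_init(struct model_ring *r)
{
	size_t i;

	for (i = 0; i < MODEL_NUM_RX_DESC; i++)
		r->cbs[i].bd_addr = r->bds + i * MODEL_DESC_SIZE;
}

/* Refill path: write through the cached address. No running index and
 * no wrap-around arithmetic are needed here any more.
 */
static void model_refill(struct model_cb *cb, uint8_t tag)
{
	cb->bd_addr[0] = tag;
}
```

Because the pairing never changes after init, the refill path no longer has to keep a separate assign index in step with the control block it was handed.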
Re: [PATCH net v2] openvswitch: disable LRO
On Thu, May 28, 2015 at 6:04 AM, Jiri Benc jb...@redhat.com wrote:
> Currently, openvswitch tries to disable LRO from the user space. This does
> not work correctly when the device added is a vlan interface, though.
> Instead of dealing with possibly complex stacked cross name space relations
> in the user space, do the same as bridging does and call dev_disable_lro in
> the kernel.
>
> Signed-off-by: Jiri Benc jb...@redhat.com

Looks good.

Acked-by: Pravin B Shelar pshe...@nicira.com
Re: [PATCH net-next] neigh: Add missing rcu_assign_pointer
On Fri, 2015-05-29 at 09:21 +0800, Ying Xue wrote: On 05/28/2015 06:13 PM, Eric Dumazet wrote: This patch is not needed. You really should read Documentation/RCU , because it looks like you are quite confused. When we remove an element from a RCU protected list, all the objects in the chain are already ready to be caught by rcu readers. Therefore, no additional memory barrier is needed before doing *np = n-next; Please do not add spurious memory barriers. Like atomic operations, we want all of them being required and possibly documented. Yes, you are right, thanks for your clear explanation :) However, there are still three places where we use rcu_assign_pointer() to remove a neigh entry from a RCU-protected list, and the three places are neigh_forced_gc(), neigh_flush_dev(), and __neigh_for_each_release() respectively. This means it's redundant for us to use rcu_assign_pointer() in the three places, right? I count 5 places of redundancy. diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 3a74df750af4044eba0e7d88ae01ca9b4dac0e72..ac3b69183cc982e722d9683d6de7a39f66b50b64 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -141,9 +141,7 @@ static int neigh_forced_gc(struct neigh_table *tbl) write_lock(n-lock); if (atomic_read(n-refcnt) == 1 !(n-nud_state NUD_PERMANENT)) { - rcu_assign_pointer(*np, - rcu_dereference_protected(n-next, - lockdep_is_held(tbl-lock))); + *np = n-next; n-dead = 1; shrunk = 1; write_unlock(n-lock); @@ -210,9 +208,7 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev) np = n-next; continue; } - rcu_assign_pointer(*np, - rcu_dereference_protected(n-next, - lockdep_is_held(tbl-lock))); + *np = n-next; write_lock(n-lock); neigh_del_timer(n); n-dead = 1; @@ -380,10 +376,8 @@ static struct neigh_hash_table *neigh_hash_grow(struct neigh_table *tbl, next = rcu_dereference_protected(n-next, lockdep_is_held(tbl-lock)); - rcu_assign_pointer(n-next, - rcu_dereference_protected( - 
new_nht-hash_buckets[hash], - lockdep_is_held(tbl-lock))); + n-next = new_nht-hash_buckets[hash]; + rcu_assign_pointer(new_nht-hash_buckets[hash], n); } } @@ -515,9 +509,7 @@ struct neighbour *__neigh_create(struct neigh_table *tbl, const void *pkey, n-dead = 0; if (want_ref) neigh_hold(n); - rcu_assign_pointer(n-next, - rcu_dereference_protected(nht-hash_buckets[hash_val], - lockdep_is_held(tbl-lock))); + n-next = nht-hash_buckets[hash_val]; rcu_assign_pointer(nht-hash_buckets[hash_val], n); write_unlock_bh(tbl-lock); neigh_dbg(2, neigh %p is created\n, n); @@ -2381,9 +2373,7 @@ void __neigh_for_each_release(struct neigh_table *tbl, write_lock(n-lock); release = cb(n); if (release) { - rcu_assign_pointer(*np, - rcu_dereference_protected(n-next, - lockdep_is_held(tbl-lock))); + *np = n-next; n-dead = 1; } else np = n-next; -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
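Eric's distinction between publishing a new element and unlinking an existing one can be illustrated with C11 atomics. This is an analogy only, not kernel code: the release store stands in for rcu_assign_pointer(), the relaxed store stands in for the plain assignment in the diff above, and `struct model_node`, `model_insert()` and `model_unlink()` are hypothetical names. Writers are assumed to be serialized by a lock, as they are under tbl->lock.

```c
#include <stdatomic.h>
#include <stddef.h>

struct model_node {
	int key;
	struct model_node *_Atomic next;
};

/* Insert at head. The node's fields must be visible to readers before
 * the pointer to the node is, so the final store uses release ordering.
 */
static void model_insert(struct model_node *_Atomic *head,
			 struct model_node *n, int key)
{
	n->key = key;
	/* n is still private here, so its next pointer needs no ordering. */
	atomic_init(&n->next,
		    atomic_load_explicit(head, memory_order_relaxed));
	atomic_store_explicit(head, n, memory_order_release);
}

/* Unlink the node that *np points to. Its successor was published
 * earlier and is already fully initialized, so nothing new has to be
 * ordered before this store and relaxed ordering is enough.
 */
static void model_unlink(struct model_node *_Atomic *np)
{
	struct model_node *n = atomic_load_explicit(np, memory_order_relaxed);

	atomic_store_explicit(np,
			      atomic_load_explicit(&n->next,
						   memory_order_relaxed),
			      memory_order_relaxed);
}
```

The sketch leaves out the reader side and the deferred free of the unlinked node, both of which the real code still needs.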
Re: [PATCH 3/7] net: dsa: ar8xxx: add regmap support
> Fair enough, are there other global things besides counters that could
> deserve adding maybe some sort of global/master net_device to help
> query switch-wide information?

This was discussed a while back. I like the current abstraction: all
interfaces are real interfaces you can send and receive packets over.
This pseudo interface cannot be used for packet transfer, which seems
odd.

Access to registers is for debugging, so debugfs seems like the best
option to me.

Andrew
Re: [PATCH] netevent: remove automatic variable in register_netevent_notifier()
On 2015/5/28 22:07, Sergei Shtylyov wrote:
> Hello.
>
> On 5/28/2015 1:00 PM, Wang Long wrote:
>
>> Remove automatic variable 'err' in register_netevent_notifier() and
>> return the return value of atomic_notifier_chain_register() directly.
>
> s/return value/result/, in order to avoid tautology.
>
>> Signed-off-by: Wang Long long.wangl...@huawei.com
>
> [...]
>
> WBR, Sergei

Thanks, I will fix that.

Best Regards
Wang Long
[PATCH 0/7] net: dsa: add QCA AR8xxx switch family support
This patch set adds initial support for AR8xxx switches using the DSA
subsystem. It currently supports QCA8337 switch, and can be extended to
other hardware in the same family.

This switch was already discussed in the following thread:
https://www.marc.info/?t=14260141744r=1w=2

Below is a typical picture of a QCA8337 used in a standard home gateway
configuration:

+-----------+       +-----------+
|           | SGMII |           |
|       eth0+-------+           +-- 1000baseT MDI (WAN)
|        wan|       |  7-port   +-- 1000baseT MDI (LAN1)
|   CPU     |       | ethernet  +-- 1000baseT MDI (LAN2)
|           | RGMII |  switch   +-- 1000baseT MDI (LAN3)
|       eth1+-------+ w/5 PHYs  +-- 1000baseT MDI (LAN4)
|        lan|       |           |
+-----------+       +-----------+
      |       MDIO      |
      \-----------------/

The switch is connected to the CPU using 2 xMII interfaces. As DSA only
supports one logical interface to the switch, we split the switch using
device-tree information into 2 parts:
 * port 6 (one of the xMII switch ports) will be dedicated to one
   particular Ethernet port. From a system perspective, it will be seen
   as a regular PHY.
 * port 0 (the other xMII port) will act as the switch master interface

When 2 xMII are used, the switch will therefore be seen as 2 devices:
1 PHY + 1 DSA switch. The configuration of this split is done using
driver specific options in device-tree. The exact properties are
detailed in the Documentation patch below.

Mathieu Olivari (7):
  net: dsa: add new driver for ar8xxx family
  net: dsa: ar8xxx: add ethtool hw statistics support
  net: dsa: ar8xxx: add regmap support
  net: dsa: add QCA tag support
  net: dsa: ar8xxx: enable QCA header support on AR8xxx
  net: dsa: ar8xxx: add support for second xMII interfaces through DT
  Documentation: devicetree: add ar8xxx binding

 .../devicetree/bindings/net/dsa/qca-ar8xxx.txt |  70 +++
 drivers/net/dsa/Kconfig                        |   9 +
 drivers/net/dsa/Makefile                       |   1 +
 drivers/net/dsa/ar8xxx.c                       | 530 +
 drivers/net/dsa/ar8xxx.h                       | 157 ++
 include/net/dsa.h                              |   1 +
 net/dsa/Kconfig                                |   3 +
 net/dsa/Makefile                               |   1 +
 net/dsa/dsa.c                                  |   6 +
 net/dsa/dsa_priv.h                             |   2 +
 net/dsa/slave.c                                |   5 +
 net/dsa/tag_qca.c                              | 159 +++
 12 files changed, 944 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
 create mode 100644 drivers/net/dsa/ar8xxx.c
 create mode 100644 drivers/net/dsa/ar8xxx.h
 create mode 100644 net/dsa/tag_qca.c

-- 
2.1.4
Re: [PATCH 7/7] Documentation: devicetree: add ar8xxx binding
On 05/28/15 18:42, Mathieu Olivari wrote:
> Add device-tree binding for ar8xxx switch families.
>
> Signed-off-by: Mathieu Olivari math...@codeaurora.org
> ---
>  .../devicetree/bindings/net/dsa/qca-ar8xxx.txt | 70 ++
>  1 file changed, 70 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
>
> diff --git a/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt b/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
> new file mode 100644
> index 000..f4fd3f1
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/dsa/qca-ar8xxx.txt
> @@ -0,0 +1,70 @@
> +* Qualcomm Atheros AR8xxx switch family
> +
> +Required properties:
> +
> +- compatible: should be "qca,ar8xxx"
> +- dsa,mii-bus: phandle to the MDIO bus controller, see dsa/dsa.txt
> +- dsa,ethernet: phandle to the CPU network interface controller, see dsa/dsa.txt
> +- #size-cells: must be 0
> +- #address-cells: must be 2, see dsa/dsa.txt
> +
> +Subnodes:
> +
> +The integrated switch subnode should be specified according to the binding
> +described in dsa/dsa.txt.
> +
> +Optional properties:
> +
> +- qca,port6-phy-mode: if specified, the driver will configure Port 6 in the
> +  given phy-mode. See Documentation/devicetree/bindings/net/ethernet.txt for
> +  the list of valid phy-mode.

Is there a reason why this is a custom property and not a standard
phy-mode property here such that you could utilize of_get_phy_mode()
with this directly?

> +
> +- qca,port6-phy-id: if specified, the driver will connect Port 6 to the PHY
> +  given as a parameter. In this case, Port6 and the corresponding PHY will be
> +  isolated from the rest of the switch. From a system perspective, they will
> +  act as a regular PHY.

Same here, is there a reason why this is not a phy-handle property to a
PHY node that sits on a (potentially different) MDIO bus?

> +
> +Example:
> +
> +	dsa@0 {
> +		compatible = "qca,ar8xxx";
> +		#address-cells = <2>;
> +		#size-cells = <0>;
> +
> +		dsa,ethernet = <&ethernet0>;
> +		dsa,mii-bus = <&mii_bus0>;
> +
> +		switch@0 {
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +			reg = <0 0>;	/* MDIO address 0, switch 0 in tree */
> +
> +			qca,port6-phy-mode = "sgmii";
> +			qca,port6-phy-id = <4>;
> +
> +			port@0 {
> +				reg = <11>;
> +				label = "cpu";
> +			};
> +
> +			port@1 {
> +				reg = <0>;
> +				label = "lan1";
> +			};
> +
> +			port@2 {
> +				reg = <1>;
> +				label = "lan2";
> +			};
> +
> +			port@3 {
> +				reg = <2>;
> +				label = "lan3";
> +			};
> +
> +			port@4 {
> +				reg = <3>;
> +				label = "lan4";
> +			};
> +		};
> +	};

-- 
Florian
Re: [V5 PATCH 2/5] arm64 : Introduce support for ACPI _CCA object
On Wed, 2015-05-20 at 17:09 -0500, Suravee Suthikulpanit wrote:
> From http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf,
> section 6.2.17 _CCA states that ARM platforms require ACPI _CCA
> object to be specified for DMA-capable devices. Therefore, this patch
> specifies ACPI_CCA_REQUIRED in arm64 Kconfig.
>
> In addition, to handle the case when _CCA is missing, arm64 would assign
> dummy_dma_ops to disable DMA capability of the device.
>
> Acked-by: Catalin Marinas catalin.mari...@arm.com
> Signed-off-by: Mark Salter msal...@redhat.com
> Signed-off-by: Suravee Suthikulpanit suravee.suthikulpa...@amd.com
> ---
>  arch/arm64/Kconfig                   |  1 +
>  arch/arm64/include/asm/dma-mapping.h | 18 ++-
>  arch/arm64/mm/dma-mapping.c          | 92
>  3 files changed, 109 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 4269dba..95307b4 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1,5 +1,6 @@
>  config ARM64
>  	def_bool y
> +	select ACPI_CCA_REQUIRED if ACPI
>  	select ACPI_GENERIC_GSI if ACPI
>  	select ACPI_REDUCED_HARDWARE_ONLY if ACPI
>  	select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
> diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
> index 9437e3d..f0d6d0b 100644
> --- a/arch/arm64/include/asm/dma-mapping.h
> +++ b/arch/arm64/include/asm/dma-mapping.h
> @@ -18,6 +18,7 @@
>  
>  #ifdef __KERNEL__
>  
> +#include <linux/acpi.h>
>  #include <linux/types.h>
>  #include <linux/vmalloc.h>
>  

^^^ This hunk causes build issues with a couple of drivers:

drivers/scsi/megaraid/megaraid_sas_fp.c:69:0: warning: "FALSE" redefined [enabled by default]
 #define FALSE 0
 ^
In file included from include/acpi/acpi.h:58:0,
                 from include/linux/acpi.h:37,
                 from ./arch/arm64/include/asm/dma-mapping.h:21,
                 from include/linux/dma-mapping.h:86,
                 from ./arch/arm64/include/asm/pci.h:7,
                 from include/linux/pci.h:1460,
                 from drivers/scsi/megaraid/megaraid_sas_fp.c:37:
include/acpi/actypes.h:433:0: note: this is the location of the previous definition
 #define FALSE (1 == 0)
 ^

In file included from include/acpi/acpi.h:58:0,
                 from include/linux/acpi.h:37,
                 from ./arch/arm64/include/asm/dma-mapping.h:21,
                 from include/linux/dma-mapping.h:86,
                 from include/scsi/scsi_cmnd.h:4,
                 from drivers/scsi/ufs/ufshcd.h:60,
                 from drivers/scsi/ufs/ufshcd.c:43:
include/acpi/actypes.h:433:41: error: expected identifier before ‘(’ token
 #define FALSE (1 == 0)
               ^
drivers/scsi/ufs/unipro.h:203:2: note: in expansion of macro ‘FALSE’
  FALSE = 0,
  ^

This happens because the ACPI definitions of TRUE and FALSE conflict with
local definitions in megaraid and an enum declaration in ufs.

> @@ -28,13 +29,23 @@
>  #define DMA_ERROR_CODE	(~(dma_addr_t)0)
>  extern struct dma_map_ops *dma_ops;
> +extern struct dma_map_ops dummy_dma_ops;
>  
>  static inline struct dma_map_ops *__generic_dma_ops(struct device *dev)
>  {
> -	if (unlikely(!dev) || !dev->archdata.dma_ops)
> +	if (unlikely(!dev))
>  		return dma_ops;
> -	else
> +	else if (dev->archdata.dma_ops)
>  		return dev->archdata.dma_ops;
> +	else if (acpi_disabled)
> +		return dma_ops;
> +
> +	/*
> +	 * When ACPI is enabled, if arch_set_dma_ops is not called,
> +	 * we will disable device DMA capability by setting it
> +	 * to dummy_dma_ops.
> +	 */
> +	return &dummy_dma_ops;
>  }
>  
>  static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> @@ -48,6 +59,9 @@ static inline struct dma_map_ops *get_dma_ops(struct device *dev)
>  static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
>  				      struct iommu_ops *iommu, bool coherent)
>  {
> +	if (!acpi_disabled && !dev->archdata.dma_ops)
> +		dev->archdata.dma_ops = dma_ops;
> +
>  	dev->archdata.dma_coherent = coherent;
>  }
>  #define arch_setup_dma_ops	arch_setup_dma_ops
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index ef7d112..6e6d6ad 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -415,6 +415,98 @@ out:
>  	return -ENOMEM;
>  }
>  
> +/********************************************
> + * The following APIs are for dummy DMA ops *
> + ********************************************/
> +
> +static void *__dummy_alloc(struct device *dev, size_t size,
> +			   dma_addr_t *dma_handle, gfp_t flags,
> +			   struct dma_attrs *attrs)
> +{
> +	return NULL;
> +}
> +
> +static void __dummy_free(struct device *dev, size_t size,
> +			 void *vaddr, dma_addr_t dma_handle,
> +			 struct dma_attrs
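The decision order in the quoted __generic_dma_ops() hunk can be restated as a small stand-alone function. This is a hypothetical model for reading the logic, not the arm64 code: `struct model_dma_ops`, `struct model_device` and `model_generic_dma_ops()` are invented names, and ACPI state is passed in as a parameter instead of being read from the kernel's global.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_dma_ops {
	const char *name;
};

static struct model_dma_ops model_default_ops = { "default" };
static struct model_dma_ops model_dummy_ops = { "dummy" };

struct model_device {
	struct model_dma_ops *dma_ops;	/* set by arch setup, otherwise NULL */
};

/* Mirrors the order of checks in the quoted hunk. */
static struct model_dma_ops *
model_generic_dma_ops(const struct model_device *dev, bool acpi_disabled)
{
	if (!dev)
		return &model_default_ops;
	if (dev->dma_ops)
		return dev->dma_ops;
	if (acpi_disabled)
		return &model_default_ops;

	/* ACPI boot and nobody configured this device: refuse DMA. */
	return &model_dummy_ops;
}
```

Only the last case changes behaviour: on an ACPI boot, a device whose ops were never set up now gets ops that fail every request instead of silently falling back to the defaults.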
[PATCH 0/4] Netfilter updates for net-next
Hi David,

The following patchset contains Netfilter updates for net-next, they are:

1) default CONFIG_NETFILTER_INGRESS to y for easier compile-testing of all
   options.

2) Allow to bind a table to net_device. This introduces the internal
   NFT_AF_NEEDS_DEV flag to perform a mandatory check for this binding.
   This is required by the next patch.

3) Add the 'netdev' table family, this new table allows you to create
   ingress filter basechains. This provides access to the existing
   nf_tables features from ingress.

4) Kill unused argument from compat_find_calc_{match,target} in ip_tables
   and ip6_tables, from Florian Westphal.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks!

The following changes since commit 76d7c457659dfc05d5a23cd0b21fea333d1788cd:

  Merge branch 'icmp_frag' (2015-05-19 00:15:50 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git master

for you to fetch changes up to ed6c4136f1571bd6ab362afc3410905a8a69ca42:

  netfilter: nf_tables: add netdev table to filter from ingress (2015-05-26 18:41:23 +0200)

Florian Westphal (1):
      netfilter: remove unused comefrom hookmask argument

Pablo Neira Ayuso (3):
      netfilter: default CONFIG_NETFILTER_INGRESS to y
      netfilter: nf_tables: allow to bind table to net_device
      netfilter: nf_tables: add netdev table to filter from ingress

 include/net/netfilter/nf_tables.h        |    8 ++
 include/net/netns/nftables.h             |    1 +
 include/uapi/linux/netfilter/nf_tables.h |    2 +
 net/ipv4/netfilter/ip_tables.c           |    4 +-
 net/ipv6/netfilter/ip6_tables.c          |    4 +-
 net/netfilter/Kconfig                    |    6 +
 net/netfilter/Makefile                   |    1 +
 net/netfilter/nf_tables_api.c            |   46 +++-
 net/netfilter/nf_tables_netdev.c         |  183 ++
 9 files changed, 244 insertions(+), 11 deletions(-)
 create mode 100644 net/netfilter/nf_tables_netdev.c