Re: [RFC PATCH 4/5] mlx4: add support for fast rx drop bpf program
On Tue, Apr 05, 2016 at 05:15:20PM +0300, Or Gerlitz wrote:
> On 4/4/2016 9:50 PM, Alexei Starovoitov wrote:
> >On Mon, Apr 04, 2016 at 08:22:03AM -0700, Eric Dumazet wrote:
> >>A single flow is able to use 40Gbit on those 40Gbit NIC, so there is not
> >>a single 10GB trunk used for a given flow.
> >>
> >>This 14Mpps thing seems to be a queue limitation on mlx4.
> >yeah, could be queueing related. Multiple cpus can send ~30Mpps of the same
> >64 byte packet,
> >but mlx4 can only receive 14.5Mpps. Odd.
> >
> >Or (and other mellanox guys), what is really going on inside 40G nic?
>
> Hi Alexei,
>
> Not that I know everything that goes inside there, and not that if I
> knew it all I could have posted that here (I heard HWs sometimes
> have IP)... but, anyway, as for your questions:
>
> ConnectX3 40Gbs NIC can receive > 10Gbs packet-worthy (14.5M) in
> single ring and Mellanox
> 100Gbs NICs can receive > 25Gbs packet-worthy (37.5M) in single
> ring, people that use DPDK (...) even see this numbers and AFAIU we
> now attempt to see that in the kernel with XDP :)
>
> I realize that we might have some issues in the mlx4 driver
> reporting on HW drops. Eran (cc-ed) and Co are looking on that.

Thanks!

>
> In parallel to doing so, I would suggest you to do some experiments
> that might shed some more light, if on the TX side you do
>
> $ ./pktgen_sample03_burst_single_flow.sh -i $DEV -d $IP -m $MAC -t 4
>
> On the RX side, skip RSS and force the packets that match that
> traffic pattern to go to (say) ring (==action) 0
>
> $ ethtool -U $DEV flow-type ip4 dst-mac $MAC dst-ip $IP action 0 loc 0

I added the module parameter:

  options mlx4_core log_num_mgm_entry_size=-1

With this I was able to reach >20 Mpps. This is actually regardless of
the ethtool settings mentioned above.

 25.31%  ksoftirqd/0  [mlx4_en]         [k] mlx4_en_process_rx_cq
 20.18%  ksoftirqd/0  [mlx4_en]         [k] mlx4_en_alloc_frags
  8.42%  ksoftirqd/0  [mlx4_en]         [k] mlx4_en_free_frag
  5.59%  swapper      [kernel.vmlinux]  [k] poll_idle
  5.38%  ksoftirqd/0  [kernel.vmlinux]  [k] get_page_from_freelist
  3.06%  ksoftirqd/0  [mlx4_en]         [k] mlx4_call_bpf
  2.73%  ksoftirqd/0  [mlx4_en]         [k] 0x0001cf94
  2.72%  ksoftirqd/0  [kernel.vmlinux]  [k] free_pages_prepare
  2.19%  ksoftirqd/0  [kernel.vmlinux]  [k] percpu_array_map_lookup_elem
  2.08%  ksoftirqd/0  [kernel.vmlinux]  [k] sk_load_byte_positive_offset
  1.72%  ksoftirqd/0  [kernel.vmlinux]  [k] free_one_page
  1.59%  ksoftirqd/0  [kernel.vmlinux]  [k] bpf_map_lookup_elem
  1.30%  ksoftirqd/0  [mlx4_en]         [k] 0x0001cfc1
  1.07%  ksoftirqd/0  [kernel.vmlinux]  [k] __alloc_pages_nodemask
  1.00%  ksoftirqd/0  [mlx4_en]         [k] mlx4_alloc_pages.isra.23

>
> to go back to RSS remove the rule
>
> $ ethtool -U $DEV delete action 0
>
> FWIW (not that I see how it helps you now), you can do HW drop on
> the RX side with ring -1
>
> $ ethtool -U $DEV flow-type ip4 dst-mac $MAC dst-ip $IP action -1 loc 0
>
> Or.
>

Here also is the output from the two machines using a tool to get
ethtool delta stats at 1 second intervals:

--- sender ---
tx_packets:           20,246,059
tx_bytes:             1,214,763,540   bps= 9,267.91 Mbps
xmit_more:            19,463,226
queue_stopped:        36,982
wake_queue:           36,982
rx_pause:             6,351
tx_pause_duration:    124,974
tx_pause_transition:  3,176
tx_novlan_packets:    20,244,344
tx_novlan_bytes:      1,295,629,440   bps= 9,884.86 Mbps
tx0_packets:          5,151,029
tx0_bytes:            309,061,680     bps = 2,357.95 Mbps
tx1_packets:          5,094,532
tx1_bytes:            305,671,920     bps = 2,332.9 Mbps
tx2_packets:          5,130,996
tx2_bytes:            307,859,760     bps = 2,348.78 Mbps
tx3_packets:          5,135,513
tx3_bytes:            308,130,780     bps = 2,350.85 Mbps
UP 0: 9,389.68 Mbps = 100.00%
UP 0: 20,512,070 Tran/sec = 100.00%

--- receiver ---
rx_packets:           20,207,929
rx_bytes:             1,212,475,740   bps= 9,250.45 Mbps
rx_dropped:           236,604
rx_pause_duration:    128,436
rx_pause_transition:  3,258
tx_pause:             6,516
rx_novlan_packets:    20,208,906
rx_novlan_bytes:      1,293,369,984   bps= 9,867.62 Mbps
rx0_packets:          20,444,526
rx0_bytes:            1,226,671,560   bps= 9,358.76 Mbps
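
For readers who want to cross-check the "bps=" column in the stats above, here is a minimal sketch. It rests on an assumption that is not stated in the thread: the figures look consistent with bytes * 8 / 2^20, truncated to two decimals. The helper names are hypothetical and this is not the stats tool's own code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption (not from the thread): "Mbps" = bytes * 8 / 2^20, truncated.
 * Returns hundredths of a Mbps so the arithmetic stays in integers.
 */
static uint64_t mbps_hundredths(uint64_t bytes_per_sec)
{
	/* Guard against overflow of bytes * 8 * 100. */
	assert(bytes_per_sec <= UINT64_MAX / 800);
	return bytes_per_sec * 8 * 100 / (UINT64_C(1) << 20);
}

/* Average frame size implied by a byte/packet counter pair. */
static uint64_t avg_frame_bytes(uint64_t bytes, uint64_t packets)
{
	assert(packets != 0);
	return bytes / packets;
}
```

Feeding in the sender's tx_bytes and tx_packets gives 9,267.91 Mbps and 60 bytes per frame, which matches the table and suggests the byte counter excludes the 4-byte FCS of a 64-byte frame.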
[net-next 02/14] i40e: Enable Geneve offload for FW API ver > 1.4 for XL710/X710 devices
From: Anjali Singhai Jain

This patch enables the Capability for XL710/X710 devices with FW API
version higher than 1.4 to do geneve Rx offload.

Change-ID: I9a8f87772c48d7d67dc85e3701d2e0b845034c0b
Signed-off-by: Anjali Singhai Jain
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 297fd39..fdcb50a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -9158,6 +9158,12 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
 					I40E_VLAN_ANY, false, true);
 			spin_unlock_bh(>mac_filter_list_lock);
 		}
+	} else if ((pf->hw.aq.api_maj_ver > 1) ||
+		   ((pf->hw.aq.api_maj_ver == 1) &&
+		    (pf->hw.aq.api_min_ver > 4))) {
+		/* Supported in FW API version higher than 1.4 */
+		pf->flags |= I40E_FLAG_GENEVE_OFFLOAD_CAPABLE;
+		pf->auto_disable_flags = I40E_FLAG_HW_ATR_EVICT_CAPABLE;
 	} else {
 		/* relate the VSI_VMDQ name to the VSI_MAIN name */
 		snprintf(netdev->name, IFNAMSIZ, "%sv%%d",
-- 
2.5.5
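
The "FW API version higher than 1.4" test in the patch is a two-field major/minor comparison. A standalone sketch of that comparison follows; the struct and function names are hypothetical stand-ins for the driver's api_maj_ver/api_min_ver fields, not driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the firmware API version pair. */
struct fw_api_version {
	uint16_t major;
	uint16_t minor;
};

/*
 * True when v is strictly newer than maj.min: either the major number is
 * larger, or the majors are equal and the minor number is larger.
 */
static bool fw_api_newer_than(struct fw_api_version v,
			      uint16_t maj, uint16_t min)
{
	return v.major > maj || (v.major == maj && v.minor > min);
}
```

With this shape, 1.4 itself does not qualify, 1.5 does, and any 2.x version qualifies regardless of its minor number.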
[net-next 01/14] i40e: remove redundant check on vsi->active_vlans
From: Colin King

active_vlans is an unsigned long array, hence a null check on this
array is superfluous and can be removed.

Detected with static analysis by smatch:
drivers/net/ethernet/intel/i40e/i40e_debugfs.c:386 i40e_dbg_dump_vsi_seid()
warn: this array is probably non-NULL. 'vsi->active_vlans'

Signed-off-by: Colin Ian King
Acked-by: Shannon Nelson
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
index 0c97733..83dccf1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c
@@ -147,9 +147,8 @@ static void i40e_dbg_dump_vsi_seid(struct i40e_pf *pf, int seid)
 		dev_info(>pdev->dev, "vlan_features = 0x%08lx\n",
 			 (unsigned long int)nd->vlan_features);
 	}
-	if (vsi->active_vlans)
-		dev_info(>pdev->dev,
-			 "vlgrp: & = %p\n", vsi->active_vlans);
+	dev_info(>pdev->dev,
+		 "vlgrp: & = %p\n", vsi->active_vlans);
 	dev_info(>pdev->dev,
 		 "state = %li flags = 0x%08lx, netdev_registered = %i, current_netdev_flags = 0x%04x\n",
 		 vsi->state, vsi->flags,
-- 
2.5.5
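
Why the check is redundant can be shown with a small sketch: an array that is a struct member lives inside the struct, so its address is just the struct's address plus a fixed offset and cannot be NULL for a valid struct pointer. The struct below is a hypothetical miniature, not the driver's i40e_vsi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_VLAN_WORDS 64	/* hypothetical size, for illustration only */

struct demo_vsi {
	int id;
	unsigned long active_vlans[DEMO_VLAN_WORDS];	/* embedded array */
};

/*
 * The array expression decays to the address of its first element, which
 * is the struct base plus the member offset. No NULL is possible here.
 */
static bool array_member_is_inside_struct(const struct demo_vsi *vsi)
{
	assert(vsi != NULL);
	const char *base = (const char *)vsi;
	const char *member = (const char *)vsi->active_vlans;

	return member == base + offsetof(struct demo_vsi, active_vlans);
}
```

A pointer member (unsigned long *) would be a different story, since that can legitimately hold NULL; the smatch warning applies only to the embedded-array case.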
[net-next 10/14] i40e: Fix for supported link modes in 10GBaseT PHY's
From: Avinash Dayanand100baseT/Full is now listed and supported link mode for 10GBaseT PHY. This is a fix to list all the supported link modes of 10GBaseT PHY. Change-ID: If2be3212ef0fef85fd5d6e4550c7783de2f915e9 Signed-off-by: Avinash Dayanand Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 16 1 file changed, 16 insertions(+) diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c index 410d237..8a83d45 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c @@ -313,6 +313,13 @@ static void i40e_get_settings_link_up(struct i40e_hw *hw, ecmd->advertising |= ADVERTISED_1baseT_Full; if (hw_link_info->requested_speeds & I40E_LINK_SPEED_1GB) ecmd->advertising |= ADVERTISED_1000baseT_Full; + /* adding 100baseT support for 10GBASET_PHY */ + if (pf->flags & I40E_FLAG_HAVE_10GBASET_PHY) { + ecmd->supported |= SUPPORTED_100baseT_Full; + ecmd->advertising |= ADVERTISED_100baseT_Full | +ADVERTISED_1000baseT_Full | +ADVERTISED_1baseT_Full; + } break; case I40E_PHY_TYPE_1000BASE_T_OPTICAL: ecmd->supported = SUPPORTED_Autoneg | @@ -325,6 +332,15 @@ static void i40e_get_settings_link_up(struct i40e_hw *hw, SUPPORTED_100baseT_Full; if (hw_link_info->requested_speeds & I40E_LINK_SPEED_100MB) ecmd->advertising |= ADVERTISED_100baseT_Full; + /* firmware detects 10G phy as 100M phy at 100M speed */ + if (pf->flags & I40E_FLAG_HAVE_10GBASET_PHY) { + ecmd->supported |= SUPPORTED_1baseT_Full | + SUPPORTED_1000baseT_Full; + ecmd->advertising |= ADVERTISED_Autoneg | +ADVERTISED_100baseT_Full | +ADVERTISED_1000baseT_Full | +ADVERTISED_1baseT_Full; + } break; case I40E_PHY_TYPE_10GBASE_CR1_CU: case I40E_PHY_TYPE_10GBASE_CR1: -- 2.5.5
[net-next 05/14] i40e: Add new device ID for X722
From: Catherine SullivanThe new device ID is 0x37D3 and it should follow the same flows and branding string as for 0x37D0. Change-ID: Ia5ad4a1910268c4666a3fd46a7afffbec55b4fc2 Signed-off-by: Catherine Sullivan Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/i40e/i40e_common.c | 1 + drivers/net/ethernet/intel/i40e/i40e_devids.h | 1 + drivers/net/ethernet/intel/i40e/i40e_main.c | 1 + drivers/net/ethernet/intel/i40evf/i40e_common.c | 1 + drivers/net/ethernet/intel/i40evf/i40e_devids.h | 1 + 5 files changed, 5 insertions(+) diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c index 8276a13..ebcc0d3 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_common.c +++ b/drivers/net/ethernet/intel/i40e/i40e_common.c @@ -60,6 +60,7 @@ static i40e_status i40e_set_mac_type(struct i40e_hw *hw) case I40E_DEV_ID_SFP_X722: case I40E_DEV_ID_1G_BASE_T_X722: case I40E_DEV_ID_10G_BASE_T_X722: + case I40E_DEV_ID_SFP_I_X722: hw->mac.type = I40E_MAC_X722; break; default: diff --git a/drivers/net/ethernet/intel/i40e/i40e_devids.h b/drivers/net/ethernet/intel/i40e/i40e_devids.h index 99257fc..dd4457d 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_devids.h +++ b/drivers/net/ethernet/intel/i40e/i40e_devids.h @@ -44,6 +44,7 @@ #define I40E_DEV_ID_SFP_X722 0x37D0 #define I40E_DEV_ID_1G_BASE_T_X722 0x37D1 #define I40E_DEV_ID_10G_BASE_T_X7220x37D2 +#define I40E_DEV_ID_SFP_I_X722 0x37D3 #define i40e_is_40G_device(d) ((d) == I40E_DEV_ID_QSFP_A || \ (d) == I40E_DEV_ID_QSFP_B || \ diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index fdcb50a..73d4bea 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -90,6 +90,7 @@ static const struct pci_device_id i40e_pci_tbl[] = { {PCI_VDEVICE(INTEL, I40E_DEV_ID_SFP_X722), 0}, {PCI_VDEVICE(INTEL, I40E_DEV_ID_1G_BASE_T_X722), 0}, {PCI_VDEVICE(INTEL, 
I40E_DEV_ID_10G_BASE_T_X722), 0}, + {PCI_VDEVICE(INTEL, I40E_DEV_ID_SFP_I_X722), 0}, {PCI_VDEVICE(INTEL, I40E_DEV_ID_20G_KR2), 0}, {PCI_VDEVICE(INTEL, I40E_DEV_ID_20G_KR2_A), 0}, /* required last entry */ diff --git a/drivers/net/ethernet/intel/i40evf/i40e_common.c b/drivers/net/ethernet/intel/i40evf/i40e_common.c index 771ac6a..4db0c03 100644 --- a/drivers/net/ethernet/intel/i40evf/i40e_common.c +++ b/drivers/net/ethernet/intel/i40evf/i40e_common.c @@ -58,6 +58,7 @@ i40e_status i40e_set_mac_type(struct i40e_hw *hw) case I40E_DEV_ID_SFP_X722: case I40E_DEV_ID_1G_BASE_T_X722: case I40E_DEV_ID_10G_BASE_T_X722: + case I40E_DEV_ID_SFP_I_X722: hw->mac.type = I40E_MAC_X722; break; case I40E_DEV_ID_X722_VF: diff --git a/drivers/net/ethernet/intel/i40evf/i40e_devids.h b/drivers/net/ethernet/intel/i40evf/i40e_devids.h index ca8b58c..7023570 100644 --- a/drivers/net/ethernet/intel/i40evf/i40e_devids.h +++ b/drivers/net/ethernet/intel/i40evf/i40e_devids.h @@ -44,6 +44,7 @@ #define I40E_DEV_ID_SFP_X722 0x37D0 #define I40E_DEV_ID_1G_BASE_T_X722 0x37D1 #define I40E_DEV_ID_10G_BASE_T_X7220x37D2 +#define I40E_DEV_ID_SFP_I_X722 0x37D3 #define I40E_DEV_ID_X722_VF0x37CD #define I40E_DEV_ID_X722_VF_HV 0x37D9 -- 2.5.5
[net-next 11/14] i40e: Lower some message levels
From: Mitch Williams

These conditions can happen any time VFs are enabled or disabled and are
not really indicative of fatal problems unless they happen continuously.
Lower the log level so that people don't get scared.

Change-ID: I1ceb4adbd10d03cbeed54d1f5b7f20d60328351d
Signed-off-by: Mitch Williams
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 169c256..9924503 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1232,8 +1232,8 @@ static int i40e_vc_send_msg_to_vf(struct i40e_vf *vf, u32 v_opcode,
 	/* single place to detect unsuccessful return values */
 	if (v_retval) {
 		vf->num_invalid_msgs++;
-		dev_err(>pdev->dev, "VF %d failed opcode %d, error: %d\n",
-			vf->vf_id, v_opcode, v_retval);
+		dev_info(>pdev->dev, "VF %d failed opcode %d, retval: %d\n",
+			 vf->vf_id, v_opcode, v_retval);
 		if (vf->num_invalid_msgs >
 		    I40E_DEFAULT_NUM_INVALID_MSGS_ALLOWED) {
 			dev_err(>pdev->dev,
@@ -1251,9 +1251,9 @@ static int i40e_vc_send_msg_to_vf(struct i40e_vf *vf, u32 v_opcode,
 	aq_ret = i40e_aq_send_msg_to_vf(hw, abs_vf_id, v_opcode, v_retval,
 					msg, msglen, NULL);
 	if (aq_ret) {
-		dev_err(>pdev->dev,
-			"Unable to send the message to VF %d aq_err %d\n",
-			vf->vf_id, pf->hw.aq.asq_last_status);
+		dev_info(>pdev->dev,
+			 "Unable to send the message to VF %d aq_err %d\n",
+			 vf->vf_id, pf->hw.aq.asq_last_status);
 		return -EIO;
 	}
 
-- 
2.5.5
[net-next 09/14] i40evf: Fix get_rss_aq
From: Catherine Sullivan

We were passing in the seed where we should just be passing false
because we want the VSI table not the pf table.

Change-ID: I9b633ab06eb59468087f0c0af8539857e99f9495
Signed-off-by: Catherine Sullivan
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 6561a33..2d1fe56 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1341,7 +1341,7 @@ static int i40evf_get_rss_aq(struct i40e_vsi *vsi, const u8 *seed,
 	}
 
 	if (lut) {
-		ret = i40evf_aq_get_rss_lut(hw, vsi->id, seed, lut, lut_size);
+		ret = i40evf_aq_get_rss_lut(hw, vsi->id, false, lut, lut_size);
 		if (ret) {
 			dev_err(>pdev->dev,
 				"Cannot get RSS lut, err %s aq_err %s\n",
-- 
2.5.5
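
The bug class fixed here compiles silently in C: a pointer passed where a bool is expected converts to true whenever it is non-NULL. A minimal sketch, assuming a selector parameter that picks between the PF and VSI tables as the commit message describes; all names below are hypothetical and are not the driver's functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum lut_source { LUT_FROM_VSI, LUT_FROM_PF };

/* Hypothetical stand-in for a "get LUT" call that takes a bool selector. */
static enum lut_source pick_lut(bool pf_lut)
{
	return pf_lut ? LUT_FROM_PF : LUT_FROM_VSI;
}

/* Mirrors the old call: the seed pointer ends up as the selector. */
static enum lut_source call_with_seed(const uint8_t *seed)
{
	/* Pointer-to-bool conversion: true for any non-NULL pointer. */
	bool pf_lut = seed;

	return pick_lut(pf_lut);
}
```

With a valid seed buffer the old-style call therefore selects the PF table, which is the opposite of what the caller wanted; passing a literal false removes the accident.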
[net-next 08/14] i40e: Disable link polling
From: Shannon Nelson

Periodic link polling was added when the link events were found not to
be trustworthy. This was the case early on, but was likely because the
link event mask was being used incorrectly. As this has been fixed in
recent code, we can disable the link polling to lessen the AQ traffic.

Change-ID: Id890b5ee3c2d04381fc76ffa434777644f5d8eb0
Signed-off-by: Shannon Nelson
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 73d4bea..184f3f9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -8448,7 +8448,6 @@ static int i40e_sw_init(struct i40e_pf *pf)
 	/* Set default capability flags */
 	pf->flags = I40E_FLAG_RX_CSUM_ENABLED |
 		    I40E_FLAG_MSI_ENABLED |
-		    I40E_FLAG_LINK_POLLING_ENABLED |
 		    I40E_FLAG_MSIX_ENABLED;
 
 	if (iommu_present(_bus_type))
-- 
2.5.5
[net-next 04/14] i40evf: Fix VLAN features
From: Mitch Williams

Users of ethtool were being given the mistaken impression that this
driver was able to change its VLAN tagging features, and were
disappointed that this was not actually the case.

Implement ndo_fix_features method so that we can adjust these flags as
needed to avoid false impressions.

Change-ID: I08584f103a4fa73d6a4128d472e4ef44dcfda57f
Signed-off-by: Mitch Williams
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index e397368..2d018b4 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2252,6 +2252,28 @@ static int i40evf_change_mtu(struct net_device *netdev, int new_mtu)
 	return 0;
 }
 
+#define I40EVF_VLAN_FEATURES (NETIF_F_HW_VLAN_CTAG_TX |\
+			      NETIF_F_HW_VLAN_CTAG_RX |\
+			      NETIF_F_HW_VLAN_CTAG_FILTER)
+
+/**
+ * i40evf_fix_features - fix up the netdev feature bits
+ * @netdev: our net device
+ * @features: desired feature bits
+ *
+ * Returns fixed-up features bits
+ **/
+static netdev_features_t i40evf_fix_features(struct net_device *netdev,
+					     netdev_features_t features)
+{
+	struct i40evf_adapter *adapter = netdev_priv(netdev);
+
+	features &= ~I40EVF_VLAN_FEATURES;
+	if (adapter->vf_res->vf_offload_flags & I40E_VIRTCHNL_VF_OFFLOAD_VLAN)
+		features |= I40EVF_VLAN_FEATURES;
+	return features;
+}
+
 static const struct net_device_ops i40evf_netdev_ops = {
 	.ndo_open		= i40evf_open,
 	.ndo_stop		= i40evf_close,
@@ -2264,6 +2286,7 @@ static const struct net_device_ops i40evf_netdev_ops = {
 	.ndo_tx_timeout		= i40evf_tx_timeout,
 	.ndo_vlan_rx_add_vid	= i40evf_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= i40evf_vlan_rx_kill_vid,
+	.ndo_fix_features	= i40evf_fix_features,
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= i40evf_netpoll,
 #endif
-- 
2.5.5
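
The clear-then-conditionally-set pattern used by the new fix_features hook can be tried outside the kernel. The sketch below uses hypothetical bit values and a plain uint64_t in place of netdev_features_t; it only illustrates the masking logic, not the netdev API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, chosen only for this illustration. */
#define DEMO_F_VLAN_TX		(UINT64_C(1) << 0)
#define DEMO_F_VLAN_RX		(UINT64_C(1) << 1)
#define DEMO_F_VLAN_FILTER	(UINT64_C(1) << 2)
#define DEMO_F_OTHER		(UINT64_C(1) << 3)

#define DEMO_VLAN_FEATURES (DEMO_F_VLAN_TX | DEMO_F_VLAN_RX | \
			    DEMO_F_VLAN_FILTER)

/*
 * Whatever the caller requested for the VLAN bits is discarded; the bits
 * end up all set or all clear depending on what was granted. Unrelated
 * bits pass through untouched.
 */
static uint64_t demo_fix_features(uint64_t requested, bool vlan_granted)
{
	requested &= ~DEMO_VLAN_FEATURES;
	if (vlan_granted)
		requested |= DEMO_VLAN_FEATURES;
	return requested;
}
```

The effect is that the VLAN bits always reflect what the VF was actually granted, so a user request to toggle them is overridden rather than silently accepted.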
[net-next 12/14] i40e: Request PHY media event at reset time
From: Shannon Nelson

Add the Media Not Available flag to the link event mask. It seems that
event comes first if you have a DA cable pulled out, but there's no
follow-up event for Link Down; if you're not looking for MEDIA_NA you
will get no event, even though there's now no Link.

Change-ID: cb3340a2849805bb881f64f6f2ae810eef46eba7
Signed-off-by: Shannon Nelson
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 184f3f9..d2c0106 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -6859,6 +6859,7 @@ static void i40e_reset_and_rebuild(struct i40e_pf *pf, bool reinit)
 	 */
 	ret = i40e_aq_set_phy_int_mask(>hw,
 				       ~(I40E_AQ_EVENT_LINK_UPDOWN |
+					 I40E_AQ_EVENT_MEDIA_NA |
 					 I40E_AQ_EVENT_MODULE_QUAL_FAIL), NULL);
 	if (ret)
 		dev_info(>pdev->dev, "set phy mask fail, err %s aq_err %s\n",
@@ -11070,6 +11071,7 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	 */
 	err = i40e_aq_set_phy_int_mask(>hw,
 				       ~(I40E_AQ_EVENT_LINK_UPDOWN |
+					 I40E_AQ_EVENT_MEDIA_NA |
 					 I40E_AQ_EVENT_MODULE_QUAL_FAIL), NULL);
 	if (err)
 		dev_info(>pdev->dev, "set phy mask fail, err %s aq_err %s\n",
-- 
2.5.5
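
The `~(...)` in both hunks suggests the mask uses negative logic: a set bit suppresses an event, so the driver passes the complement of the events it wants. A small sketch under that assumption follows; the bit values and helper names are hypothetical and not taken from the driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits, for illustration only. */
#define DEMO_EVENT_LINK_UPDOWN		UINT16_C(0x0002)
#define DEMO_EVENT_MEDIA_NA		UINT16_C(0x0004)
#define DEMO_EVENT_MODULE_QUAL_FAIL	UINT16_C(0x0100)

/* Build a suppress-mask from the set of events the caller wants. */
static uint16_t demo_build_mask(uint16_t wanted)
{
	return (uint16_t)~wanted;
}

/* Assumed semantics: an event is delivered only if its mask bit is clear. */
static bool demo_event_delivered(uint16_t mask, uint16_t event)
{
	return (mask & event) == 0;
}
```

Under these assumed semantics, leaving MEDIA_NA out of the wanted set means its mask bit is set and the event never arrives, which matches the symptom described in the commit message.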
[net-next 07/14] i40evf: Add longer wait after remove module
From: Mitch Williams

Upon module remove, wait a little longer after requesting a reset before
checking to see if the firmware responded. This change prevents double
resets when the firmware is busy.

Change-ID: Ieedc988ee82fac1f32a074bf4d9e4dba426bfa58
Signed-off-by: Mitch Williams
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 2d018b4..6561a33 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2854,11 +2854,11 @@ static void i40evf_remove(struct pci_dev *pdev)
 	adapter->state = __I40EVF_REMOVE;
 	adapter->aq_required = 0;
 	i40evf_request_reset(adapter);
-	msleep(20);
+	msleep(50);
 	/* If the FW isn't responding, kick it once, but only once. */
 	if (!i40evf_asq_done(hw)) {
 		i40evf_request_reset(adapter);
-		msleep(20);
+		msleep(50);
 	}
 
 	if (adapter->msix_entries) {
-- 
2.5.5
[net-next 03/14] i40e: Remove unused variable
From: Mitch Williams

This variable is vestigial, a remnant of the primordial code from which
this driver spawned. We can safely remove it.

Change-ID: I24e0fe338e7c7c50d27dc5515564f33caefbb93a
Signed-off-by: Mitch Williams
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 47b9e62..150002e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -1311,8 +1311,8 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
 	struct i40e_pf *pf = vf->pf;
 	i40e_status aq_ret = 0;
 	struct i40e_vsi *vsi;
-	int i = 0, len = 0;
 	int num_vsis = 1;
+	int len = 0;
 	int ret;
 
 	if (!test_bit(I40E_VF_STAT_INIT, >vf_states)) {
@@ -1374,15 +1374,14 @@ static int i40e_vc_get_vf_resources_msg(struct i40e_vf *vf, u8 *msg)
 	vfres->num_queue_pairs = vf->num_queue_pairs;
 	vfres->max_vectors = pf->hw.func_caps.num_msix_vectors_vf;
 	if (vf->lan_vsi_idx) {
-		vfres->vsi_res[i].vsi_id = vf->lan_vsi_id;
-		vfres->vsi_res[i].vsi_type = I40E_VSI_SRIOV;
-		vfres->vsi_res[i].num_queue_pairs = vsi->alloc_queue_pairs;
+		vfres->vsi_res[0].vsi_id = vf->lan_vsi_id;
+		vfres->vsi_res[0].vsi_type = I40E_VSI_SRIOV;
+		vfres->vsi_res[0].num_queue_pairs = vsi->alloc_queue_pairs;
 		/* VFs only use TC 0 */
-		vfres->vsi_res[i].qset_handle
+		vfres->vsi_res[0].qset_handle
 			= le16_to_cpu(vsi->info.qs_handle[0]);
-		ether_addr_copy(vfres->vsi_res[i].default_mac_addr,
+		ether_addr_copy(vfres->vsi_res[0].default_mac_addr,
 				vf->default_lan_addr.addr);
-		i++;
 	}
 	set_bit(I40E_VF_STAT_ACTIVE, >vf_states);
 
-- 
2.5.5
[net-next 14/14] i40e/i40evf: Fix TSO checksum pseudo-header adjustment
From: Alexander DuyckWith IPv4 and IPv6 now using the same format for checksums based on the length of the frame we need to update the i40e and i40evf drivers so that they correctly account for lengths greater than or equal to 64K. With this patch the driver should now correctly update checksums for frames up to 16776960 in length which should be more than large enough for all possible TSO frames in the near future. Signed-off-by: Alexander Duyck Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 11 --- drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 11 --- 2 files changed, 8 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index 5bef5b0..5d5fa53 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -2304,10 +2304,8 @@ static int i40e_tso(struct i40e_ring *tx_ring, struct sk_buff *skb, l4_offset = l4.hdr - skb->data; /* remove payload length from outer checksum */ - paylen = (__force u16)l4.udp->check; - paylen += ntohs((__force __be16)1) * - (u16)~(skb->len - l4_offset); - l4.udp->check = ~csum_fold((__force __wsum)paylen); + paylen = skb->len - l4_offset; + csum_replace_by_diff(>check, htonl(paylen)); } /* reset pointers to inner headers */ @@ -2327,9 +2325,8 @@ static int i40e_tso(struct i40e_ring *tx_ring, struct sk_buff *skb, l4_offset = l4.hdr - skb->data; /* remove payload length from inner checksum */ - paylen = (__force u16)l4.tcp->check; - paylen += ntohs((__force __be16)1) * (u16)~(skb->len - l4_offset); - l4.tcp->check = ~csum_fold((__force __wsum)paylen); + paylen = skb->len - l4_offset; + csum_replace_by_diff(>check, htonl(paylen)); /* compute length of segmentation header */ *hdr_len = (l4.tcp->doff * 4) + l4_offset; diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c index 570348d..04aabc5 100644 --- 
a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c @@ -1571,10 +1571,8 @@ static int i40e_tso(struct i40e_ring *tx_ring, struct sk_buff *skb, l4_offset = l4.hdr - skb->data; /* remove payload length from outer checksum */ - paylen = (__force u16)l4.udp->check; - paylen += ntohs((__force __be16)1) * - (u16)~(skb->len - l4_offset); - l4.udp->check = ~csum_fold((__force __wsum)paylen); + paylen = skb->len - l4_offset; + csum_replace_by_diff(>check, htonl(paylen)); } /* reset pointers to inner headers */ @@ -1594,9 +1592,8 @@ static int i40e_tso(struct i40e_ring *tx_ring, struct sk_buff *skb, l4_offset = l4.hdr - skb->data; /* remove payload length from inner checksum */ - paylen = (__force u16)l4.tcp->check; - paylen += ntohs((__force __be16)1) * (u16)~(skb->len - l4_offset); - l4.tcp->check = ~csum_fold((__force __wsum)paylen); + paylen = skb->len - l4_offset; + csum_replace_by_diff(>check, htonl(paylen)); /* compute length of segmentation header */ *hdr_len = (l4.tcp->doff * 4) + l4_offset; -- 2.5.5
[net-next 13/14] i40e/i40evf: Bump patch from 1.5.1 to 1.5.2
From: Avinash Dayanand

Signed-off-by: Avinash Dayanand
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c     | 2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index d2c0106..d6147f8 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -46,7 +46,7 @@ static const char i40e_driver_string[] =
 
 #define DRV_VERSION_MAJOR 1
 #define DRV_VERSION_MINOR 5
-#define DRV_VERSION_BUILD 1
+#define DRV_VERSION_BUILD 2
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
 	     __stringify(DRV_VERSION_MINOR) "." \
 	     __stringify(DRV_VERSION_BUILD)    DRV_KERN
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 2d1fe56..f4dada0 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -38,7 +38,7 @@ static const char i40evf_driver_string[] =
 
 #define DRV_VERSION_MAJOR 1
 #define DRV_VERSION_MINOR 5
-#define DRV_VERSION_BUILD 1
+#define DRV_VERSION_BUILD 2
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
 	     __stringify(DRV_VERSION_MINOR) "." \
 	     __stringify(DRV_VERSION_BUILD) \
-- 
2.5.5
[net-next 06/14] i40e: Make VF resets more reliable
From: Mitch Williams

Clear the VFLR bit immediately after triggering a reset instead of
waiting until after cleanup is complete. Make sure to trigger a reset
every time, not just if the PF is up.

These changes fix a problem where VF resets would get lost by the PF,
preventing the VF driver from initializing.

Change-ID: I5945cf2884095b7b0554867c64df8617e71d9d29
Signed-off-by: Mitch Williams
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 150002e..169c256 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -937,6 +937,10 @@ void i40e_reset_vf(struct i40e_vf *vf, bool flr)
 		wr32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id), reg);
 		i40e_flush(hw);
 	}
+	/* clear the VFLR bit in GLGEN_VFLRSTAT */
+	reg_idx = (hw->func_caps.vf_base_id + vf->vf_id) / 32;
+	bit_idx = (hw->func_caps.vf_base_id + vf->vf_id) % 32;
+	wr32(hw, I40E_GLGEN_VFLRSTAT(reg_idx), BIT(bit_idx));
 
 	if (i40e_quiesce_vf_pci(vf))
 		dev_err(>pdev->dev, "VF %d PCI transactions stuck\n",
@@ -989,10 +993,6 @@ complete_reset:
 	/* tell the VF the reset is done */
 	wr32(hw, I40E_VFGEN_RSTAT1(vf->vf_id), I40E_VFR_VFACTIVE);
 
-	/* clear the VFLR bit in GLGEN_VFLRSTAT */
-	reg_idx = (hw->func_caps.vf_base_id + vf->vf_id) / 32;
-	bit_idx = (hw->func_caps.vf_base_id + vf->vf_id) % 32;
-	wr32(hw, I40E_GLGEN_VFLRSTAT(reg_idx), BIT(bit_idx));
 	i40e_flush(hw);
 	clear_bit(__I40E_VF_DISABLE, >state);
 }
@@ -2296,11 +2296,9 @@ int i40e_vc_process_vflr_event(struct i40e_pf *pf)
 		/* read GLGEN_VFLRSTAT register to find out the flr VFs */
 		vf = >vf[vf_id];
 		reg = rd32(hw, I40E_GLGEN_VFLRSTAT(reg_idx));
-		if (reg & BIT(bit_idx)) {
+		if (reg & BIT(bit_idx))
 			/* i40e_reset_vf will clear the bit in GLGEN_VFLRSTAT */
-			if (!test_bit(__I40E_DOWN, >state))
-				i40e_reset_vf(vf, true);
-		}
+			i40e_reset_vf(vf, true);
 	}
 
 	return 0;
-- 
2.5.5
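
The reg_idx/bit_idx arithmetic in the patch maps an absolute VF number onto an array of 32-bit status registers. A standalone sketch of that mapping is below; the struct and function names are hypothetical, and only the divide/modulo logic is taken from the diff.

```c
#include <stdint.h>

/* Position of one VF's status bit within a bank of 32-bit registers. */
struct vflr_pos {
	uint32_t reg_idx;	/* which 32-bit register */
	uint32_t bit_idx;	/* which bit inside that register */
};

/* Absolute VF number = function's VF base id + VF id relative to the PF. */
static struct vflr_pos vflr_position(uint32_t vf_base_id, uint32_t vf_id)
{
	uint32_t abs_vf = vf_base_id + vf_id;
	struct vflr_pos pos = { abs_vf / 32, abs_vf % 32 };

	return pos;
}

/* Value with only this VF's bit set, as written back in the patch. */
static uint32_t vflr_clear_value(struct vflr_pos pos)
{
	return UINT32_C(1) << pos.bit_idx;
}
```

For example, a base id of 64 and a VF id of 5 land in register 2, bit 5, so the value written is 0x20.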
[net-next 00/14][pull request] 40GbE Intel Wired LAN Driver Updates 2016-04-05
This series contains updates to i40e and i40evf only.

Colin Ian King cleaned up a redundant NULL check which was found by
static analysis.

Anjali enables geneve receive offload for XL710/X710 devices.

Mitch cleans up an unused variable in i40e_vc_get_vf_resources_msg().
Fixed the driver to actually be able to adjust VLAN tagging features
through ethtool, as expected. Fixed a problem where VF resets would get
lost by the PF, preventing the VF driver from initializing. Also put
users' minds at ease by lowering some message levels, since many of
these conditions can happen any time VFs are enabled or disabled and are
not really indicative of fatal problems unless they happen continuously.

Shannon disables the link polling to lessen the admin queue traffic,
especially since the link event mask usage has been fixed recently.

Alex Duyck fixes the i40e and i40evf drivers to correctly update
checksums for frames up to 16776960 in length, which should be more than
large enough for all possible TSO frames in the near future.
The following are changes since commit 4da46cebbd3b4dc445195a9672c99c1353af5695:
  net/core/dev: Warn on a too-short GRO frame
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 40GbE

Alexander Duyck (1):
  i40e/i40evf: Fix TSO checksum pseudo-header adjustment

Anjali Singhai Jain (1):
  i40e: Enable Geneve offload for FW API ver > 1.4 for XL710/X710
    devices

Avinash Dayanand (2):
  i40e: Fix for supported link modes in 10GBaseT PHY's
  i40e/i40evf: Bump patch from 1.5.1 to 1.5.2

Catherine Sullivan (2):
  i40e: Add new device ID for X722
  i40evf: Fix get_rss_aq

Colin King (1):
  i40e: remove redundant check on vsi->active_vlans

Mitch Williams (5):
  i40e: Remove unused variable
  i40evf: Fix VLAN features
  i40e: Make VF resets more reliable
  i40evf: Add longer wait after remove module
  i40e: Lower some message levels

Shannon Nelson (2):
  i40e: Disable link polling
  i40e: Request phy media event at reset time

 drivers/net/ethernet/intel/i40e/i40e_common.c      |  1 +
 drivers/net/ethernet/intel/i40e/i40e_debugfs.c     |  5 ++-
 drivers/net/ethernet/intel/i40e/i40e_devids.h      |  1 +
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c     | 16 ++
 drivers/net/ethernet/intel/i40e/i40e_main.c        | 12 +--
 drivers/net/ethernet/intel/i40e/i40e_txrx.c        | 11 +++
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 37 ++
 drivers/net/ethernet/intel/i40evf/i40e_common.c    |  1 +
 drivers/net/ethernet/intel/i40evf/i40e_devids.h    |  1 +
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c      | 11 +++
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    | 31 +++---
 11 files changed, 84 insertions(+), 43 deletions(-)

-- 
2.5.5
Re: [PATCH net-next 1/3] net: dsa: make the STP state function return void
Hi Andrew,

Andrew Lunn writes:

>> -- port_stp_update: bridge layer function invoked when a given switch port
>> STP
>> +- port_stp_state: bridge layer function invoked when a given switch port STP
>
> port_stp_state_set might be a better name, to make it clear it is
> setting the state, not getting the current state, etc. Most of the
> other functions are _add, _prepare, _join, _leave, so _set would fit
> the pattern.

I agree, I'm changing that.

> Changing to a void makes sense.

Thanks,
Vivien
Re: [PATCH net-next 2/3] net: dsa: make the FDB add function return void
Hi Andrew,

Andrew Lunn writes:

>> mutex_lock(>smi_mutex);
>> -ret = _mv88e6xxx_port_fdb_load(ds, port, fdb->addr, fdb->vid, state);
>> +if (_mv88e6xxx_port_fdb_load(ds, port, fdb->addr, fdb->vid, state))
>> +netdev_warn(ds->ports[port], "cannot load address\n");
>
> In the SF2 driver you use pr_err, but here netdev_warn. We probably
> should be consistent if we error or warn. I would use netdev_error,
> since if this fails we probably have a real hardware problem.

I used pr_err in the SF2 driver to be consistent with the rest of the
code which only uses pr_err and pr_info.

I was thinking about adding ds_err and ds_port_err to print errors for
ds->master_dev and ds->ports[port], but that might be overkill. What do
you think? Or local to the driver for the moment, like mvsw_err maybe?

I tend to use warn for cases where the user cannot really do something
about the situation, but a hardware problem is indeed critical, so I
agree with you to use error over warn here.

Thanks,
Vivien
[PATCH net-next V3 13/16] net: fec: detect tx int lost
If a tx int is lost, no need to reset the fec. Just mark the event
and call napi_schedule.

Signed-off-by: Troy Kisky
---
v3: no change
---
 drivers/net/ethernet/freescale/fec_main.c | 38 ++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index be875fd..445443d 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1094,14 +1094,50 @@ fec_stop(struct net_device *ndev)
 	}
 }
 
+static const uint txint_flags[] = {
+	FEC_ENET_TXF_0, FEC_ENET_TXF_1, FEC_ENET_TXF_2
+};
 
 static void
 fec_timeout(struct net_device *ndev)
 {
 	struct fec_enet_private *fep = netdev_priv(ndev);
+	struct bufdesc *bdp;
+	unsigned short status;
+	int i;
+	uint events = 0;
 
-	fec_dump(ndev);
+	for (i = 0; i < fep->num_tx_queues; i++) {
+		struct fec_enet_priv_tx_q *txq = fep->tx_queue[i];
+		int index;
+		struct sk_buff *skb = NULL;
+		bdp = txq->dirty_tx;
+		while (1) {
+			bdp = fec_enet_get_nextdesc(bdp, >bd);
+			if (bdp == txq->bd.cur)
+				break;
+			index = fec_enet_get_bd_index(bdp, >bd);
+			skb = txq->tx_skbuff[index];
+			if (skb) {
+				status = fec16_to_cpu(bdp->cbd_sc);
+				if ((status & BD_ENET_TX_READY) == 0)
+					events |= txint_flags[i];
+				break;
+			}
+		}
+	}
+	if (events) {
+		fep->events |= events;
+		/* Disable the RX/TX interrupt */
+		writel(FEC_NAPI_IMASK, fep->hwp + FEC_IMASK);
+		napi_schedule(>napi);
+		netif_wake_queue(fep->netdev);
+		pr_err("%s: tx int lost\n", __func__);
+		return;
+	}
+
+	fec_dump(ndev);
 	ndev->stats.tx_errors++;
 
 	schedule_work(>tx_timeout_work);
--
2.5.0
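The detection loop in the patch above can be illustrated outside the driver. What follows is only a minimal sketch under stated assumptions: the names sketch_txbd, SKETCH_TX_READY and sketch_tx_int_lost are hypothetical stand-ins (not driver symbols), the ring is modelled as a plain array with indices, and n is assumed nonzero with dirty and cur both less than n. It walks from the descriptor after dirty toward cur, and at the first slot that still holds an skb it reports a lost interrupt if the controller has already cleared the ready bit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value for this sketch, not taken from fec.h. */
#define SKETCH_TX_READY 0x8000u

/* Hypothetical stand-in for a tx buffer descriptor plus its skb slot. */
struct sketch_txbd {
	uint16_t sc;      /* status/control word written by hardware */
	int      has_skb; /* nonzero while a packet is still attached */
};

/*
 * Return 1 if the first pending descriptor after 'dirty' was already
 * completed by hardware (ready bit clear) although its skb was never
 * reclaimed, i.e. the completion interrupt was probably lost.
 * Return 0 if nothing is pending or the hardware is still busy.
 */
static int sketch_tx_int_lost(const struct sketch_txbd *ring, size_t n,
			      size_t dirty, size_t cur)
{
	size_t i = dirty;

	for (;;) {
		i = (i + 1) % n;	/* advance with wrap-around */
		if (i == cur)
			return 0;	/* reached the producer, nothing pending */
		if (ring[i].has_skb)
			return (ring[i].sc & SKETCH_TX_READY) == 0;
	}
}
```

Only the first descriptor with an skb is inspected, which mirrors the break after the status check in the patch.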
[PATCH net-next V3 09/16] net: fec: eliminate calls to fec_enet_get_prevdesc
Eliminating calls to fec_enet_get_prevdesc shrinks the code a little. Signed-off-by: Troy Kisky--- v3: Change commit message s/unsigned status/unsigned int status/ as requested --- drivers/net/ethernet/freescale/fec_main.c | 37 +-- 1 file changed, 11 insertions(+), 26 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 21d2cd0..349fda1 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -758,6 +758,7 @@ static void fec_enet_bd_init(struct net_device *dev) struct bufdesc *bdp; unsigned int i; unsigned int q; + unsigned int status; for (q = 0; q < fep->num_rx_queues; q++) { /* Initialize the receive buffer descriptors. */ @@ -765,19 +766,13 @@ static void fec_enet_bd_init(struct net_device *dev) bdp = rxq->bd.base; for (i = 0; i < rxq->bd.ring_size; i++) { - /* Initialize the BD for every fragment in the page. */ - if (bdp->cbd_bufaddr) - bdp->cbd_sc = cpu_to_fec16(BD_ENET_RX_EMPTY); - else - bdp->cbd_sc = cpu_to_fec16(0); + status = bdp->cbd_bufaddr ? BD_ENET_RX_EMPTY : 0; + if (bdp == rxq->bd.last) + status |= BD_SC_WRAP; + bdp->cbd_sc = cpu_to_fec16(status); bdp = fec_enet_get_nextdesc(bdp, >bd); } - - /* Set the last buffer to wrap */ - bdp = fec_enet_get_prevdesc(bdp, >bd); - bdp->cbd_sc |= cpu_to_fec16(BD_SC_WRAP); - rxq->bd.cur = rxq->bd.base; } @@ -789,18 +784,16 @@ static void fec_enet_bd_init(struct net_device *dev) for (i = 0; i < txq->bd.ring_size; i++) { /* Initialize the BD for every fragment in the page. */ - bdp->cbd_sc = cpu_to_fec16(0); if (txq->tx_skbuff[i]) { dev_kfree_skb_any(txq->tx_skbuff[i]); txq->tx_skbuff[i] = NULL; } bdp->cbd_bufaddr = cpu_to_fec32(0); + bdp->cbd_sc = cpu_to_fec16((bdp == txq->bd.last) ? 
+ BD_SC_WRAP : 0); bdp = fec_enet_get_nextdesc(bdp, >bd); } - - /* Set the last buffer to wrap */ bdp = fec_enet_get_prevdesc(bdp, >bd); - bdp->cbd_sc |= cpu_to_fec16(BD_SC_WRAP); txq->dirty_tx = bdp; } } @@ -2717,19 +2710,16 @@ fec_enet_alloc_rxq_buffers(struct net_device *ndev, unsigned int queue) } rxq->rx_skbuff[i] = skb; - bdp->cbd_sc = cpu_to_fec16(BD_ENET_RX_EMPTY); if (fep->bufdesc_ex) { struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp; ebdp->cbd_esc = cpu_to_fec32(BD_ENET_RX_INT); } + bdp->cbd_sc = cpu_to_fec16(BD_ENET_RX_EMPTY | + ((bdp == rxq->bd.last) ? BD_SC_WRAP : 0)); bdp = fec_enet_get_nextdesc(bdp, >bd); } - - /* Set the last buffer to wrap. */ - bdp = fec_enet_get_prevdesc(bdp, >bd); - bdp->cbd_sc |= cpu_to_fec16(BD_SC_WRAP); return 0; err_alloc: @@ -2752,21 +2742,16 @@ fec_enet_alloc_txq_buffers(struct net_device *ndev, unsigned int queue) if (!txq->tx_bounce[i]) goto err_alloc; - bdp->cbd_sc = cpu_to_fec16(0); bdp->cbd_bufaddr = cpu_to_fec32(0); if (fep->bufdesc_ex) { struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp; ebdp->cbd_esc = cpu_to_fec32(BD_ENET_TX_INT); } - + bdp->cbd_sc = cpu_to_fec16((bdp == txq->bd.last) ? + BD_SC_WRAP : 0); bdp = fec_enet_get_nextdesc(bdp, >bd); } - - /* Set the last buffer to wrap. */ - bdp = fec_enet_get_prevdesc(bdp, >bd); - bdp->cbd_sc |= cpu_to_fec16(BD_SC_WRAP); - return 0; err_alloc: -- 2.5.0
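The idea of the patch above, computing the wrap bit while each descriptor is written instead of stepping back afterwards, can be sketched in isolation. This is an illustration under assumptions, not driver code: sketch_bd, SKETCH_BD_EMPTY, SKETCH_BD_WRAP and sketch_ring_init are hypothetical names, and the ring is a plain array rather than the driver's bufdesc layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch. */
#define SKETCH_BD_EMPTY 0x8000u
#define SKETCH_BD_WRAP  0x2000u

/* Hypothetical stand-in for a buffer descriptor. */
struct sketch_bd {
	uint16_t sc;      /* status/control word */
	uint32_t bufaddr; /* nonzero when a buffer is attached */
};

/*
 * Initialize every descriptor in a single pass. The last entry gets the
 * wrap bit in the same store, so no "previous descriptor" lookup and no
 * read-modify-write of sc is needed after the loop.
 */
static void sketch_ring_init(struct sketch_bd *ring, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint16_t status = ring[i].bufaddr ? SKETCH_BD_EMPTY : 0;

		if (i == n - 1)
			status |= SKETCH_BD_WRAP;
		ring[i].sc = status;
	}
}
```

Each status word is written once with its final value, which is the property the patch relies on.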
[PATCH net-next V3 14/16] net: fec: create subroutine reset_tx_queue
Create subroutine reset_tx_queue to have one place to release any queued tx skbs. Signed-off-by: Troy Kisky--- v3: change commit message --- drivers/net/ethernet/freescale/fec_main.c | 50 +++ 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 445443d..a38acf2 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -752,12 +752,33 @@ fec_enet_start_xmit(struct sk_buff *skb, struct net_device *ndev) return NETDEV_TX_OK; } +static void reset_tx_queue(struct fec_enet_private *fep, + struct fec_enet_priv_tx_q *txq) +{ + struct bufdesc *bdp = txq->bd.base; + unsigned int i; + + txq->bd.cur = bdp; + for (i = 0; i < txq->bd.ring_size; i++) { + /* Initialize the BD for every fragment in the page. */ + if (txq->tx_skbuff[i]) { + dev_kfree_skb_any(txq->tx_skbuff[i]); + txq->tx_skbuff[i] = NULL; + } + bdp->cbd_bufaddr = cpu_to_fec32(0); + bdp->cbd_sc = cpu_to_fec16((bdp == txq->bd.last) ? + BD_SC_WRAP : 0); + bdp = fec_enet_get_nextdesc(bdp, >bd); + } + bdp = fec_enet_get_prevdesc(bdp, >bd); + txq->dirty_tx = bdp; +} + /* Init RX & TX buffer descriptors */ static void fec_enet_bd_init(struct net_device *dev) { struct fec_enet_private *fep = netdev_priv(dev); - struct fec_enet_priv_tx_q *txq; struct fec_enet_priv_rx_q *rxq; struct bufdesc *bdp; unsigned int i; @@ -780,26 +801,8 @@ static void fec_enet_bd_init(struct net_device *dev) rxq->bd.cur = rxq->bd.base; } - for (q = 0; q < fep->num_tx_queues; q++) { - /* ...and the same for transmit */ - txq = fep->tx_queue[q]; - bdp = txq->bd.base; - txq->bd.cur = bdp; - - for (i = 0; i < txq->bd.ring_size; i++) { - /* Initialize the BD for every fragment in the page. */ - if (txq->tx_skbuff[i]) { - dev_kfree_skb_any(txq->tx_skbuff[i]); - txq->tx_skbuff[i] = NULL; - } - bdp->cbd_bufaddr = cpu_to_fec32(0); - bdp->cbd_sc = cpu_to_fec16((bdp == txq->bd.last) ? 
- BD_SC_WRAP : 0); - bdp = fec_enet_get_nextdesc(bdp, >bd); - } - bdp = fec_enet_get_prevdesc(bdp, >bd); - txq->dirty_tx = bdp; - } + for (q = 0; q < fep->num_tx_queues; q++) + reset_tx_queue(fep, fep->tx_queue[q]); } static void fec_enet_active_rxring(struct net_device *ndev) @@ -2648,13 +2651,10 @@ static void fec_enet_free_buffers(struct net_device *ndev) for (q = 0; q < fep->num_tx_queues; q++) { txq = fep->tx_queue[q]; - bdp = txq->bd.base; + reset_tx_queue(fep, txq); for (i = 0; i < txq->bd.ring_size; i++) { kfree(txq->tx_bounce[i]); txq->tx_bounce[i] = NULL; - skb = txq->tx_skbuff[i]; - txq->tx_skbuff[i] = NULL; - dev_kfree_skb(skb); } } } -- 2.5.0
[PATCH net-next V3 05/16] net: fec: reduce interrupts
By clearing the NAPI interrupts in the NAPI routine and not in the interrupt handler, we can reduce the number of interrupts. We also don't need any status variables as the registers are still valid. Also, notice that if budget pkts are received, the next call to fec_enet_rx_napi will now continue to receive the previously pending packets. To test that this actually reduces interrupts, try this command before/after patch cat /proc/interrupts |grep ether; \ ping -s2800 192.168.0.201 -f -c1000 ; \ cat /proc/interrupts |grep ether For me, before this patch is 2996 interrupts. After patch is 2010 interrupts. Signed-off-by: Troy Kisky--- v3: Fix introduced bug of checking for FEC_ENET_TS_TIMER before calling fec_ptp_check_pps_event Changed commit message to show measured changes. Used netdev_info instead of pr_info. Fugang Duan suggested splitting TX and RX into two NAPI contexts, but that should be a separate patch as it is unrelated to what this patch does. --- drivers/net/ethernet/freescale/fec.h | 6 +- drivers/net/ethernet/freescale/fec_main.c | 118 +++--- 2 files changed, 45 insertions(+), 79 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h index 6dd0ba8..9d5bdc6 100644 --- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -505,11 +505,7 @@ struct fec_enet_private { unsigned int total_tx_ring_size; unsigned int total_rx_ring_size; - - unsigned long work_tx; - unsigned long work_rx; - unsigned long work_ts; - unsigned long work_mdio; + uintevents; struct platform_device *pdev; diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index b4d46f8..918ac82 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -70,8 +70,6 @@ static void fec_enet_itr_coal_init(struct net_device *ndev); #define DRIVER_NAME"fec" -#define FEC_ENET_GET_QUQUE(_x) ((_x == 0) ? 1 : ((_x == 1) ? 
2 : 0)) - /* Pause frame feild and FIFO threshold */ #define FEC_ENET_FCE (1 << 5) #define FEC_ENET_RSEM_V0x84 @@ -1257,21 +1255,6 @@ static void fec_txq(struct net_device *ndev, struct fec_enet_priv_tx_q *txq) writel(0, txq->bd.reg_desc_active); } -static void -fec_enet_tx(struct net_device *ndev) -{ - struct fec_enet_private *fep = netdev_priv(ndev); - struct fec_enet_priv_tx_q *txq; - u16 queue_id; - /* First process class A queue, then Class B and Best Effort queue */ - for_each_set_bit(queue_id, >work_tx, FEC_ENET_MAX_TX_QS) { - clear_bit(queue_id, >work_tx); - txq = fep->tx_queue[FEC_ENET_GET_QUQUE(queue_id)]; - fec_txq(ndev, txq); - } - return; -} - static int fec_enet_new_rxbdp(struct net_device *ndev, struct bufdesc *bdp, struct sk_buff *skb) { @@ -1505,70 +1488,34 @@ rx_processing_done: return pkt_received; } -static int -fec_enet_rx(struct net_device *ndev, int budget) -{ - int pkt_received = 0; - u16 queue_id; - struct fec_enet_private *fep = netdev_priv(ndev); - struct fec_enet_priv_rx_q *rxq; - - for_each_set_bit(queue_id, >work_rx, FEC_ENET_MAX_RX_QS) { - clear_bit(queue_id, >work_rx); - rxq = fep->rx_queue[FEC_ENET_GET_QUQUE(queue_id)]; - pkt_received += fec_rxq(ndev, rxq, budget - pkt_received); - } - return pkt_received; -} - -static bool -fec_enet_collect_events(struct fec_enet_private *fep, uint int_events) -{ - if (int_events == 0) - return false; - - if (int_events & FEC_ENET_RXF_0) - fep->work_rx |= (1 << 2); - if (int_events & FEC_ENET_RXF_1) - fep->work_rx |= (1 << 0); - if (int_events & FEC_ENET_RXF_2) - fep->work_rx |= (1 << 1); - - if (int_events & FEC_ENET_TXF_0) - fep->work_tx |= (1 << 2); - if (int_events & FEC_ENET_TXF_1) - fep->work_tx |= (1 << 0); - if (int_events & FEC_ENET_TXF_2) - fep->work_tx |= (1 << 1); - - return true; -} - static irqreturn_t fec_enet_interrupt(int irq, void *dev_id) { struct net_device *ndev = dev_id; struct fec_enet_private *fep = netdev_priv(ndev); - uint int_events; irqreturn_t ret = IRQ_NONE; + uint eir 
= readl(fep->hwp + FEC_IEVENT); + uint int_events = eir & readl(fep->hwp + FEC_IMASK); - int_events = readl(fep->hwp + FEC_IEVENT); - writel(int_events, fep->hwp + FEC_IEVENT); - fec_enet_collect_events(fep, int_events); - - if ((fep->work_tx || fep->work_rx) && fep->link) { - ret = IRQ_HANDLED; - + if (int_events & (FEC_ENET_RXF | FEC_ENET_TXF)) { if (napi_schedule_prep(>napi)) {
[PATCH net-next V3 10/16] net: fec: move restart test for efficiency
Move restart test to earlier in fec_txq() which saves one
comparison.

Signed-off-by: Troy Kisky
---
v3: change commit message
---
 drivers/net/ethernet/freescale/fec_main.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 349fda1..a2a9dca 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1157,8 +1157,13 @@ static void fec_txq(struct net_device *ndev, struct fec_enet_priv_tx_q *txq)
 		/* Order the load of bd.cur and cbd_sc */
 		rmb();
 		status = fec16_to_cpu(READ_ONCE(bdp->cbd_sc));
-		if (status & BD_ENET_TX_READY)
+		if (status & BD_ENET_TX_READY) {
+			if (!readl(txq->bd.reg_desc_active)) {
+				/* ERR006358 has hit, restart tx */
+				writel(0, txq->bd.reg_desc_active);
+			}
 			break;
+		}
 
 		index = fec_enet_get_bd_index(bdp, >bd);
 
@@ -1230,11 +1235,6 @@ static void fec_txq(struct net_device *ndev, struct fec_enet_priv_tx_q *txq)
 			netif_tx_wake_queue(nq);
 		}
 	}
-
-	/* ERR006538: Keep the transmitter going */
-	if (bdp != txq->bd.cur &&
-	    readl(txq->bd.reg_desc_active) == 0)
-		writel(0, txq->bd.reg_desc_active);
 }
 
 static int
--
2.5.0
[PATCH net-next V3 11/16] net: fec: clear cbd_sc after transmission to help with debugging
When the tx queue is dumped, it is easier to see that this entry
is idle if cbd_sc is cleared after transmission.

Signed-off-by: Troy Kisky
---
v3: change commit message
---
 drivers/net/ethernet/freescale/fec_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index a2a9dca..f96ea97 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1164,6 +1164,8 @@ static void fec_txq(struct net_device *ndev, struct fec_enet_priv_tx_q *txq)
 			}
 			break;
 		}
+		bdp->cbd_sc = cpu_to_fec16((bdp == txq->bd.last) ?
+					   BD_SC_WRAP : 0);
 
 		index = fec_enet_get_bd_index(bdp, >bd);
 
--
2.5.0
[PATCH net-next V3 12/16] net: fec: dump all tx queues in fec_dump
Dump all tx queues, not just queue 0. Also, disable fec interrupts
first. The interrupts will be reenabled in fec_restart.

Signed-off-by: Troy Kisky
---
v3: no change
---
 drivers/net/ethernet/freescale/fec_main.c | 40 +--
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index f96ea97..be875fd 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -271,28 +271,32 @@ static void swap_buffer2(void *dst_buf, void *src_buf, int len)
 static void fec_dump(struct net_device *ndev)
 {
 	struct fec_enet_private *fep = netdev_priv(ndev);
-	struct bufdesc *bdp;
-	struct fec_enet_priv_tx_q *txq;
-	int index = 0;
+	int i;
 
+	/* Disable all FEC interrupts */
+	writel(0, fep->hwp + FEC_IMASK);
 	netdev_info(ndev, "TX ring dump\n");
 	pr_info("Nr SC addr len SKB\n");
 
-	txq = fep->tx_queue[0];
-	bdp = txq->bd.base;
-
-	do {
-		pr_info("%3u %c%c 0x%04x 0x%08x %4u %p\n",
-			index,
-			bdp == txq->bd.cur ? 'S' : ' ',
-			bdp == txq->dirty_tx ? 'H' : ' ',
-			fec16_to_cpu(bdp->cbd_sc),
-			fec32_to_cpu(bdp->cbd_bufaddr),
-			fec16_to_cpu(bdp->cbd_datlen),
-			txq->tx_skbuff[index]);
-		bdp = fec_enet_get_nextdesc(bdp, >bd);
-		index++;
-	} while (bdp != txq->bd.base);
+	for (i = 0; i < fep->num_tx_queues; i++) {
+		struct fec_enet_priv_tx_q *txq = fep->tx_queue[i];
+		struct bufdesc *bdp = txq->bd.base;
+		int index = 0;
+
+		pr_info("tx queue %d\n", i);
+		do {
+			pr_info("%3u %c%c 0x%04x 0x%08x %4u %p\n",
+				index,
+				bdp == txq->bd.cur ? 'S' : ' ',
+				bdp == txq->dirty_tx ? 'H' : ' ',
+				fec16_to_cpu(bdp->cbd_sc),
+				fec32_to_cpu(bdp->cbd_bufaddr),
+				fec16_to_cpu(bdp->cbd_datlen),
+				txq->tx_skbuff[index]);
+			bdp = fec_enet_get_nextdesc(bdp, >bd);
+			index++;
+		} while (bdp != txq->bd.base);
+	}
 }
 
 static inline bool is_ipv4_pkt(struct sk_buff *skb)
--
2.5.0
[PATCH net-next V3 03/16] net: fec: return IRQ_HANDLED if fec_ptp_check_pps_event handled it
fec_ptp_check_pps_event will return 1 if FEC_T_TF_MASK caused
an interrupt. Don't return IRQ_NONE in this case.

Signed-off-by: Troy Kisky
---
v3: New patch, came from feedback from another patch.
---
 drivers/net/ethernet/freescale/fec_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index a011719..7993040 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1579,8 +1579,8 @@ fec_enet_interrupt(int irq, void *dev_id)
 	}
 
 	if (fep->ptp_clock)
-		fec_ptp_check_pps_event(fep);
-
+		if (fec_ptp_check_pps_event(fep))
+			ret = IRQ_HANDLED;
 	return ret;
 }
 
--
2.5.0
[PATCH net-next V3 16/16] net: fec: don't set cbd_bufaddr unless no mapping error
Not assigning cbd_bufaddr on error will prevent trying to unmap
the error in case the FEC is reset.

Signed-off-by: Troy Kisky
---
v3: no change
---
 drivers/net/ethernet/freescale/fec_main.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 101d820..c2ed8be 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -600,7 +600,7 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq,
 	int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
 	struct bufdesc_ex *ebdp = container_of(bdp, struct bufdesc_ex, desc);
 	void *bufaddr;
-	unsigned long dmabuf;
+	dma_addr_t dmabuf;
 	unsigned int estatus = 0;
 
 	bufaddr = txq->tso_hdrs + index * TSO_HEADER_SIZE;
@@ -1295,17 +1295,21 @@ fec_enet_new_rxbdp(struct net_device *ndev, struct bufdesc *bdp, struct sk_buff
 {
 	struct fec_enet_private *fep = netdev_priv(ndev);
 	int off;
+	dma_addr_t dmabuf;
 
 	off = ((unsigned long)skb->data) & fep->rx_align;
 	if (off)
 		skb_reserve(skb, fep->rx_align + 1 - off);
 
-	bdp->cbd_bufaddr = cpu_to_fec32(dma_map_single(>pdev->dev, skb->data, FEC_ENET_RX_FRSIZE - fep->rx_align, DMA_FROM_DEVICE));
-	if (dma_mapping_error(>pdev->dev, fec32_to_cpu(bdp->cbd_bufaddr))) {
+	dmabuf = dma_map_single(>pdev->dev, skb->data,
+				FEC_ENET_RX_FRSIZE - fep->rx_align,
+				DMA_FROM_DEVICE);
+	if (dma_mapping_error(>pdev->dev, dmabuf)) {
 		if (net_ratelimit())
 			netdev_err(ndev, "Rx DMA memory map failed\n");
 		return -ENOMEM;
 	}
+	bdp->cbd_bufaddr = cpu_to_fec32(dmabuf);
 
 	return 0;
 }
--
2.5.0
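The ordering this patch introduces (map into a local, check for failure, and only then store into the descriptor) can be shown with a small userspace model. Everything here is an assumption made for illustration: sketch_map, SKETCH_MAP_ERROR, sketch_desc and sketch_attach are hypothetical names, and the fake mapper merely stands in for a real DMA mapping call.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error cookie returned by the fake mapper below. */
#define SKETCH_MAP_ERROR UINT64_MAX

/* Hypothetical stand-in for a buffer descriptor. */
struct sketch_desc {
	uint32_t bufaddr; /* 0 means "nothing mapped, nothing to unmap" */
};

/* Fake mapper for the sketch: fails for NULL, otherwise returns a fixed address. */
static uint64_t sketch_map(const void *buf)
{
	return buf ? (uint64_t)0x1000 : SKETCH_MAP_ERROR;
}

/*
 * Map first, check, and only then publish the address in the descriptor.
 * On failure the descriptor keeps its old value, so a later reset path
 * that unmaps every nonzero bufaddr never sees an error cookie.
 */
static int sketch_attach(struct sketch_desc *d, const void *buf)
{
	uint64_t dma = sketch_map(buf);

	if (dma == SKETCH_MAP_ERROR)
		return -1;
	d->bufaddr = (uint32_t)dma;
	return 0;
}
```

The descriptor field doubles as the "is mapped" flag, which is why the failed value must never reach it.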
[PATCH net-next V3 15/16] net: fec: call dma_unmap_single on mapped tx buffers at restart
Make sure any pending tx buffers are unmapped when the fec is
restarted.

Signed-off-by: Troy Kisky
---
v3: no change
---
 drivers/net/ethernet/freescale/fec_main.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index a38acf2..101d820 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -404,6 +404,7 @@ dma_mapping_error:
 		bdp = fec_enet_get_nextdesc(bdp, >bd);
 		dma_unmap_single(>pdev->dev, fec32_to_cpu(bdp->cbd_bufaddr),
 				 fec16_to_cpu(bdp->cbd_datlen), DMA_TO_DEVICE);
+		bdp->cbd_bufaddr = cpu_to_fec32(0);
 	}
 	return ERR_PTR(-ENOMEM);
 }
@@ -761,11 +762,18 @@ static void reset_tx_queue(struct fec_enet_private *fep,
 	txq->bd.cur = bdp;
 	for (i = 0; i < txq->bd.ring_size; i++) {
 		/* Initialize the BD for every fragment in the page. */
+		if (bdp->cbd_bufaddr) {
+			if (!IS_TSO_HEADER(txq, fec32_to_cpu(bdp->cbd_bufaddr)))
+				dma_unmap_single(>pdev->dev,
+						 fec32_to_cpu(bdp->cbd_bufaddr),
+						 fec16_to_cpu(bdp->cbd_datlen),
+						 DMA_TO_DEVICE);
+			bdp->cbd_bufaddr = cpu_to_fec32(0);
+		}
 		if (txq->tx_skbuff[i]) {
 			dev_kfree_skb_any(txq->tx_skbuff[i]);
 			txq->tx_skbuff[i] = NULL;
 		}
-		bdp->cbd_bufaddr = cpu_to_fec32(0);
 		bdp->cbd_sc = cpu_to_fec16((bdp == txq->bd.last) ?
 					   BD_SC_WRAP : 0);
 		bdp = fec_enet_get_nextdesc(bdp, >bd);
@@ -2643,6 +2651,7 @@ static void fec_enet_free_buffers(struct net_device *ndev)
 						 fec32_to_cpu(bdp->cbd_bufaddr),
 						 FEC_ENET_RX_FRSIZE - fep->rx_align,
 						 DMA_FROM_DEVICE);
+				bdp->cbd_bufaddr = cpu_to_fec32(0);
 				dev_kfree_skb(skb);
 			}
 			bdp = fec_enet_get_nextdesc(bdp, >bd);
--
2.5.0
[PATCH net-next V3 07/16] net: fec: don't clear all rx queue bits when just one is being checked
FEC_ENET_RXF is 3 separate bits, we only check one queue
at a time. So, when the last queue is being checked, it is bad
to remove the interrupt on the 1st queue.

Also, since tx/rx interrupts are now cleared in the napi
routine and not the interrupt, it is not needed here any longer.

Signed-off-by: Troy Kisky
---
v3: change commit message
---
 drivers/net/ethernet/freescale/fec_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 17140ea..3cd0cdf 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -1339,8 +1339,6 @@ static int fec_rxq(struct net_device *ndev, struct fec_enet_priv_rx_q *rxq,
 			break;
 		pkt_received++;
 
-		writel(FEC_ENET_RXF, fep->hwp + FEC_IEVENT);
-
 		/* Check for errors. */
 		status ^= BD_ENET_RX_LAST;
 		if (status & (BD_ENET_RX_LG | BD_ENET_RX_SH | BD_ENET_RX_NO |
--
2.5.0
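The hazard described in this commit message can be modelled in a few lines, assuming the event register behaves as write-1-to-clear. The bit positions and the names SKETCH_RXF_0, SKETCH_RXF_1, SKETCH_RXF_2, SKETCH_RXF and sketch_w1c_write are hypothetical and chosen only for this sketch; they are not the values from fec.h.

```c
#include <stdint.h>

/* Hypothetical bit assignments for three rx queues. */
#define SKETCH_RXF_0 (1u << 0)
#define SKETCH_RXF_1 (1u << 1)
#define SKETCH_RXF_2 (1u << 2)
#define SKETCH_RXF   (SKETCH_RXF_0 | SKETCH_RXF_1 | SKETCH_RXF_2)

/*
 * Model of a write-1-to-clear event register: every 1 bit in 'written'
 * clears the matching pending bit, 0 bits leave the state untouched.
 */
static uint32_t sketch_w1c_write(uint32_t pending, uint32_t written)
{
	return pending & ~written;
}
```

With queues 0 and 2 pending, writing the combined mask while servicing queue 2 also discards the still-unserviced event for queue 0, whereas writing only that queue's bit leaves it pending.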
[PATCH net-next V3 08/16] net: fec: set cbd_sc without relying on previous value
Relying on the wrap bit of cdb_sc to stay valid once initialized when the controller also writes to this byte seems undesirable since we can easily know what the value should be. Signed-off-by: Troy Kisky--- v3: change commit message --- drivers/net/ethernet/freescale/fec_main.c | 38 +-- 1 file changed, 11 insertions(+), 27 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 3cd0cdf..21d2cd0 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -340,9 +340,8 @@ fec_enet_txq_submit_frag_skb(struct fec_enet_priv_tx_q *txq, bdp = fec_enet_get_nextdesc(bdp, >bd); ebdp = (struct bufdesc_ex *)bdp; - status = fec16_to_cpu(bdp->cbd_sc); - status &= ~BD_ENET_TX_STATS; - status |= (BD_ENET_TX_TC | BD_ENET_TX_READY); + status = BD_ENET_TX_TC | BD_ENET_TX_READY | + ((bdp == txq->bd.last) ? BD_SC_WRAP : 0); frag_len = skb_shinfo(skb)->frags[frag].size; /* Handle the last BD specially */ @@ -436,8 +435,6 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq, /* Fill in a Tx ring entry */ bdp = txq->bd.cur; last_bdp = bdp; - status = fec16_to_cpu(bdp->cbd_sc); - status &= ~BD_ENET_TX_STATS; /* Set buffer length and buffer pointer */ bufaddr = skb->data; @@ -462,6 +459,8 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq, return NETDEV_TX_OK; } + status = BD_ENET_TX_TC | BD_ENET_TX_READY | + ((bdp == txq->bd.last) ? BD_SC_WRAP : 0); if (nr_frags) { last_bdp = fec_enet_txq_submit_frag_skb(txq, skb, ndev); if (IS_ERR(last_bdp)) { @@ -512,7 +511,6 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq, /* Send it on its way. Tell FEC it's ready, interrupt when done, * it's the last BD of the frame, and to put the CRC on the end. */ - status |= (BD_ENET_TX_READY | BD_ENET_TX_TC); bdp->cbd_sc = cpu_to_fec16(status); /* If this was the last BD in the ring, start at the beginning again. 
*/ @@ -544,11 +542,6 @@ fec_enet_txq_put_data_tso(struct fec_enet_priv_tx_q *txq, struct sk_buff *skb, unsigned int estatus = 0; dma_addr_t addr; - status = fec16_to_cpu(bdp->cbd_sc); - status &= ~BD_ENET_TX_STATS; - - status |= (BD_ENET_TX_TC | BD_ENET_TX_READY); - if (((unsigned long) data) & fep->tx_align || fep->quirks & FEC_QUIRK_SWAP_FRAME) { memcpy(txq->tx_bounce[index], data, size); @@ -578,15 +571,16 @@ fec_enet_txq_put_data_tso(struct fec_enet_priv_tx_q *txq, struct sk_buff *skb, ebdp->cbd_esc = cpu_to_fec32(estatus); } + status = BD_ENET_TX_TC | BD_ENET_TX_READY | + ((bdp == txq->bd.last) ? BD_SC_WRAP : 0); /* Handle the last BD specially */ if (last_tcp) - status |= (BD_ENET_TX_LAST | BD_ENET_TX_TC); + status |= BD_ENET_TX_LAST; if (is_last) { status |= BD_ENET_TX_INTR; if (fep->bufdesc_ex) ebdp->cbd_esc |= cpu_to_fec32(BD_ENET_TX_INT); } - bdp->cbd_sc = cpu_to_fec16(status); return 0; @@ -602,13 +596,8 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq, struct bufdesc_ex *ebdp = container_of(bdp, struct bufdesc_ex, desc); void *bufaddr; unsigned long dmabuf; - unsigned short status; unsigned int estatus = 0; - status = fec16_to_cpu(bdp->cbd_sc); - status &= ~BD_ENET_TX_STATS; - status |= (BD_ENET_TX_TC | BD_ENET_TX_READY); - bufaddr = txq->tso_hdrs + index * TSO_HEADER_SIZE; dmabuf = txq->tso_hdrs_dma + index * TSO_HEADER_SIZE; if (((unsigned long)bufaddr) & fep->tx_align || @@ -641,8 +630,8 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq, ebdp->cbd_esc = cpu_to_fec32(estatus); } - bdp->cbd_sc = cpu_to_fec16(status); - + bdp->cbd_sc = cpu_to_fec16(BD_ENET_TX_TC | BD_ENET_TX_READY | + ((bdp == txq->bd.last) ? 
BD_SC_WRAP : 0)); return 0; } @@ -1454,12 +1443,6 @@ static int fec_rxq(struct net_device *ndev, struct fec_enet_priv_rx_q *rxq, } rx_processing_done: - /* Clear the status flags for this buffer */ - status &= ~BD_ENET_RX_STATS; - - /* Mark the buffer empty */ - status |= BD_ENET_RX_EMPTY; - if (fep->bufdesc_ex) { struct bufdesc_ex *ebdp = (struct bufdesc_ex *)bdp; @@ -1471,7 +1454,8 @@ rx_processing_done:
[PATCH net-next V3 02/16] net: fec: remove unused interrupt FEC_ENET_TS_TIMER
FEC_ENET_TS_TIMER is not checked in the interrupt routine
so there is no need to enable it.

Signed-off-by: Troy Kisky
---
v3: New patch
    Frank Li said "TS_TIMER should never be triggered." when discussing
    another patch.
---
 drivers/net/ethernet/freescale/fec.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h
index 195122e..6dd0ba8 100644
--- a/drivers/net/ethernet/freescale/fec.h
+++ b/drivers/net/ethernet/freescale/fec.h
@@ -374,8 +374,8 @@ struct bufdesc_ex {
 #define FEC_ENET_TS_AVAIL ((uint)0x0001)
 #define FEC_ENET_TS_TIMER ((uint)0x8000)
 
-#define FEC_DEFAULT_IMASK (FEC_ENET_TXF | FEC_ENET_RXF | FEC_ENET_MII | FEC_ENET_TS_TIMER)
-#define FEC_NAPI_IMASK (FEC_ENET_MII | FEC_ENET_TS_TIMER)
+#define FEC_DEFAULT_IMASK (FEC_ENET_TXF | FEC_ENET_RXF | FEC_ENET_MII)
+#define FEC_NAPI_IMASK FEC_ENET_MII
 #define FEC_RX_DISABLED_IMASK (FEC_DEFAULT_IMASK & (~FEC_ENET_RXF))
 
 /* ENET interrupt coalescing macro define */
--
2.5.0
[PATCH net-next V3 06/16] net: fec: split off napi routine with 3 queues
If we only have 1 tx/rx queue, we need not check the other queues. Signed-off-by: Troy Kisky--- v3: rebase changes only, fep is no longer passed as a parameter to fec_rxq/fec_txq --- drivers/net/ethernet/freescale/fec_main.c | 39 +-- 1 file changed, 37 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 918ac82..17140ea 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1524,7 +1524,7 @@ fec_enet_interrupt(int irq, void *dev_id) return ret; } -static int fec_enet_rx_napi(struct napi_struct *napi, int budget) +static int fec_enet_napi_q3(struct napi_struct *napi, int budget) { struct net_device *ndev = napi->dev; struct fec_enet_private *fep = netdev_priv(ndev); @@ -1564,6 +1564,39 @@ static int fec_enet_rx_napi(struct napi_struct *napi, int budget) return pkts; } +static int fec_enet_napi_q1(struct napi_struct *napi, int budget) +{ + struct net_device *ndev = napi->dev; + struct fec_enet_private *fep = netdev_priv(ndev); + int pkts = 0; + uint events; + + do { + events = readl(fep->hwp + FEC_IEVENT); + if (fep->events) { + events |= fep->events; + fep->events = 0; + } + events &= FEC_ENET_RXF_0 | FEC_ENET_TXF_0; + if (!events) { + if (budget) { + napi_complete(napi); + writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK); + } + return pkts; + } + + writel(events, fep->hwp + FEC_IEVENT); + if (events & FEC_ENET_RXF_0) + pkts += fec_rxq(ndev, fep->rx_queue[0], + budget - pkts); + if (events & FEC_ENET_TXF_0) + fec_txq(ndev, fep->tx_queue[0]); + } while (pkts < budget); + fep->events |= FEC_ENET_RXF_0; /* save for next callback */ + return pkts; +} + /* - */ static void fec_get_mac(struct net_device *ndev) { @@ -3123,7 +3156,9 @@ static int fec_enet_init(struct net_device *ndev) ndev->ethtool_ops = _enet_ethtool_ops; writel(FEC_RX_DISABLED_IMASK, fep->hwp + FEC_IMASK); - netif_napi_add(ndev, >napi, fec_enet_rx_napi, NAPI_POLL_WEIGHT); + 
netif_napi_add(ndev, >napi, (fep->num_rx_queues | + fep->num_tx_queues) == 1 ? fec_enet_napi_q1 : + fec_enet_napi_q3, NAPI_POLL_WEIGHT); if (fep->quirks & FEC_QUIRK_HAS_VLAN) /* enable hw VLAN support */ -- 2.5.0
[PATCH net-next V3 00/16] net: fec: cleanup and fixes
V3 has

1 dropped patch
	"net: fec: print more debug info in fec_timeout"

2 new patches
	0002-net-fec-remove-unused-interrupt-FEC_ENET_TS_TIMER.patch
	0003-net-fec-return-IRQ_HANDLED-if-fec_ptp_check_pps_even.patch

1 combined patch
	0004-net-fec-pass-rxq-txq-to-fec_enet_rx-tx_queue-instead.patch

The changes are noted on individual patches

My measured performance of this series is
before patch set
	365 Mbits/sec Tx/407 RX

after patch set
	374 Tx/427 Rx

Troy Kisky (16):
  net: fec: only check queue 0 if RXF_0/TXF_0 interrupt is set
  net: fec: remove unused interrupt FEC_ENET_TS_TIMER
  net: fec: return IRQ_HANDLED if fec_ptp_check_pps_event handled it
  net: fec: pass rxq/txq to fec_enet_rx/tx_queue instead of queue_id
  net: fec: reduce interrupts
  net: fec: split off napi routine with 3 queues
  net: fec: don't clear all rx queue bits when just one is being checked
  net: fec: set cbd_sc without relying on previous value
  net: fec: eliminate calls to fec_enet_get_prevdesc
  net: fec: move restart test for efficiency
  net: fec: clear cbd_sc after transmission to help with debugging
  net: fec: dump all tx queues in fec_dump
  net: fec: detect tx int lost
  net: fec: create subroutine reset_tx_queue
  net: fec: call dma_unmap_single on mapped tx buffers at restart
  net: fec: don't set cbd_bufaddr unless no mapping error

 drivers/net/ethernet/freescale/fec.h | 10 +-
 drivers/net/ethernet/freescale/fec_main.c | 410 --
 2 files changed, 218 insertions(+), 202 deletions(-)

--
2.5.0
[PATCH net-next V3 04/16] net: fec: pass rxq/txq to fec_enet_rx/tx_queue instead of queue_id
The queue_id is the qid member of struct bufdesc_prop. Passing rxq/txq will allow the macro FEC_ENET_GET_QUQUE to be removed in the next patch. Signed-off-by: Troy KiskyAcked-by: Fugang Duan --- v3: add Acked-by combine with "net: fec: pass txq to fec_enet_tx_queue instead of queue_id" reverted change that passed fep as a parameter --- drivers/net/ethernet/freescale/fec_main.c | 29 +++-- 1 file changed, 11 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 7993040..b4d46f8 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1156,25 +1156,18 @@ fec_enet_hwtstamp(struct fec_enet_private *fep, unsigned ts, hwtstamps->hwtstamp = ns_to_ktime(ns); } -static void -fec_enet_tx_queue(struct net_device *ndev, u16 queue_id) +static void fec_txq(struct net_device *ndev, struct fec_enet_priv_tx_q *txq) { - struct fec_enet_private *fep; + struct fec_enet_private *fep = netdev_priv(ndev); struct bufdesc *bdp; unsigned short status; struct sk_buff *skb; - struct fec_enet_priv_tx_q *txq; struct netdev_queue *nq; int index = 0; int entries_free; - fep = netdev_priv(ndev); - - queue_id = FEC_ENET_GET_QUQUE(queue_id); - - txq = fep->tx_queue[queue_id]; /* get next bdp of dirty_tx */ - nq = netdev_get_tx_queue(ndev, queue_id); + nq = netdev_get_tx_queue(ndev, txq->bd.qid); bdp = txq->dirty_tx; /* get next bdp of dirty_tx */ @@ -1268,11 +1261,13 @@ static void fec_enet_tx(struct net_device *ndev) { struct fec_enet_private *fep = netdev_priv(ndev); + struct fec_enet_priv_tx_q *txq; u16 queue_id; /* First process class A queue, then Class B and Best Effort queue */ for_each_set_bit(queue_id, >work_tx, FEC_ENET_MAX_TX_QS) { clear_bit(queue_id, >work_tx); - fec_enet_tx_queue(ndev, queue_id); + txq = fep->tx_queue[FEC_ENET_GET_QUQUE(queue_id)]; + fec_txq(ndev, txq); } return; } @@ -1328,11 +1323,10 @@ static bool fec_enet_copybreak(struct net_device 
*ndev, struct sk_buff **skb, * not been given to the system, we just set the empty indicator, * effectively tossing the packet. */ -static int -fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) +static int fec_rxq(struct net_device *ndev, struct fec_enet_priv_rx_q *rxq, + int budget) { struct fec_enet_private *fep = netdev_priv(ndev); - struct fec_enet_priv_rx_q *rxq; struct bufdesc *bdp; unsigned short status; struct sk_buff *skb_new = NULL; @@ -1350,8 +1344,6 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) #ifdef CONFIG_M532x flush_cache_all(); #endif - queue_id = FEC_ENET_GET_QUQUE(queue_id); - rxq = fep->rx_queue[queue_id]; /* First, grab all of the stats for the incoming packet. * These get messed up if we get called due to a busy condition. @@ -1519,11 +1511,12 @@ fec_enet_rx(struct net_device *ndev, int budget) int pkt_received = 0; u16 queue_id; struct fec_enet_private *fep = netdev_priv(ndev); + struct fec_enet_priv_rx_q *rxq; for_each_set_bit(queue_id, &fep->work_rx, FEC_ENET_MAX_RX_QS) { clear_bit(queue_id, &fep->work_rx); - pkt_received += fec_enet_rx_queue(ndev, - budget - pkt_received, queue_id); + rxq = fep->rx_queue[FEC_ENET_GET_QUQUE(queue_id)]; + pkt_received += fec_rxq(ndev, rxq, budget - pkt_received); } return pkt_received; } -- 2.5.0
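The shape of this refactoring (the caller resolves the queue structure once and hands the pointer to the worker) can be sketched outside the driver. What follows is a hypothetical userspace model, not the fec code: struct demo_rxq, struct demo_priv, demo_bit_to_qid(), demo_rxq_poll() and demo_rx() are invented names, and the bit-to-queue mapping is an assumption inferred from the work_rx bit assignments in patch 01/16 of this series.

```c
#include <assert.h>

#define DEMO_MAX_QS 3

/* Hypothetical stand-in for struct fec_enet_priv_rx_q. */
struct demo_rxq {
	unsigned int qid;	/* plays the role of bd.qid in the patch */
	int pending;		/* packets waiting on this queue */
};

struct demo_priv {
	unsigned long work_rx;	/* one bit per queue with pending work */
	struct demo_rxq rx_queue[DEMO_MAX_QS];
};

/* Assumed mapping from work bit to queue index: bits 0 and 1 are the
 * class A/B queues (1 and 2), bit 2 is the best-effort queue 0.
 */
static unsigned int demo_bit_to_qid(unsigned int bit)
{
	return bit == 0 ? 1 : (bit == 1 ? 2 : 0);
}

/* The worker receives the queue itself, so it does no lookup of its own. */
static int demo_rxq_poll(struct demo_rxq *rxq, int budget)
{
	int done;

	assert(rxq);
	done = rxq->pending < budget ? rxq->pending : budget;
	if (done < 0)
		done = 0;
	rxq->pending -= done;
	return done;
}

/* The caller resolves the pointer once per set bit, then delegates. */
static int demo_rx(struct demo_priv *priv, int budget)
{
	int received = 0;
	unsigned int bit;

	assert(priv);
	for (bit = 0; bit < DEMO_MAX_QS; bit++) {
		struct demo_rxq *rxq;

		if (!(priv->work_rx & (1UL << bit)))
			continue;
		priv->work_rx &= ~(1UL << bit);
		rxq = &priv->rx_queue[demo_bit_to_qid(bit)];
		received += demo_rxq_poll(rxq, budget - received);
	}
	return received;
}
```

With the lookup kept in the caller, the worker no longer needs the mapping macro at all, which is what lets a later patch drop it.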
[PATCH net-next V3 01/16] net: fec: only check queue 0 if RXF_0/TXF_0 interrupt is set
Before queue 0 was always checked if any queue caused an interrupt. It is better to just mark queue 0 if queue 0 has caused an interrupt. Signed-off-by: Troy Kisky Acked-by: Fugang Duan --- v3: add Acked-by --- drivers/net/ethernet/freescale/fec_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 08243c2..a011719 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1534,14 +1534,14 @@ fec_enet_collect_events(struct fec_enet_private *fep, uint int_events) if (int_events == 0) return false; - if (int_events & FEC_ENET_RXF) + if (int_events & FEC_ENET_RXF_0) fep->work_rx |= (1 << 2); if (int_events & FEC_ENET_RXF_1) fep->work_rx |= (1 << 0); if (int_events & FEC_ENET_RXF_2) fep->work_rx |= (1 << 1); - if (int_events & FEC_ENET_TXF) + if (int_events & FEC_ENET_TXF_0) fep->work_tx |= (1 << 2); if (int_events & FEC_ENET_TXF_1) fep->work_tx |= (1 << 0); -- 2.5.0
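The effect of the change is that each per-queue interrupt bit marks only its own queue. A minimal standalone sketch of that mapping follows. The DEMO_* interrupt bit positions are made up for illustration (the real FEC_ENET_* values live in the driver headers and are not reproduced here), and struct demo_work and demo_collect_events() are hypothetical names; only the work-bit positions (2, 0, 1) are taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt bit positions, chosen only for this sketch. */
#define DEMO_RXF_0 (1u << 0)
#define DEMO_RXF_1 (1u << 1)
#define DEMO_RXF_2 (1u << 2)
#define DEMO_TXF_0 (1u << 3)
#define DEMO_TXF_1 (1u << 4)
#define DEMO_TXF_2 (1u << 5)

struct demo_work {
	unsigned long rx;
	unsigned long tx;
};

/* Mark a queue as having work only if that queue's own bit is set. */
static bool demo_collect_events(struct demo_work *w, uint32_t int_events)
{
	assert(w);
	if (int_events == 0)
		return false;

	if (int_events & DEMO_RXF_0)
		w->rx |= (1 << 2);	/* queue 0 is served last */
	if (int_events & DEMO_RXF_1)
		w->rx |= (1 << 0);
	if (int_events & DEMO_RXF_2)
		w->rx |= (1 << 1);

	if (int_events & DEMO_TXF_0)
		w->tx |= (1 << 2);
	if (int_events & DEMO_TXF_1)
		w->tx |= (1 << 0);
	if (int_events & DEMO_TXF_2)
		w->tx |= (1 << 1);

	return true;
}
```

In this model an interrupt raised only by queue 1 leaves bit 2 of the rx work mask clear, so queue 0 is not polled for it.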
Re: [PATCH net-next] net: dsa: document missing functions
Hi Andrew, Andrew Lunnwrites: > On Tue, Apr 05, 2016 at 11:22:40AM -0400, Vivien Didelot wrote: >> Add description for the missing port_vlan_prepare, port_fdb_prepare, >> port_fdb_dump functions in the DSA documentation. >> >> Signed-off-by: Vivien Didelot > > Hi Vivien > > A few English improvements: > >> --- >> Documentation/networking/dsa/dsa.txt | 16 >> 1 file changed, 16 insertions(+) >> >> diff --git a/Documentation/networking/dsa/dsa.txt >> b/Documentation/networking/dsa/dsa.txt >> index 3b196c3..8ba3369 100644 >> --- a/Documentation/networking/dsa/dsa.txt >> +++ b/Documentation/networking/dsa/dsa.txt >> @@ -542,6 +542,12 @@ Bridge layer >> Bridge VLAN filtering >> - >> >> +- port_vlan_prepare: bridge layer function invoked when the bridge prepares >> the >> + configuration of a VLAN on the given port. If the operation is not >> + programmable, this function should return -EOPNOTSUPP to inform the bridge > > s/programmable/supported by the hardware > >> + code to fallback to a software implementation. No hardware programmation > > s/programmation/setup > >> + must be done in this function. See port_vlan_add for this and details. >> + >> - port_vlan_add: bridge layer function invoked when a VLAN is configured >>(tagged or untagged) for the given switch port >> >> @@ -552,6 +558,12 @@ Bridge VLAN filtering >>function that the driver has to call for each VLAN the given port is a >> member >>of. A switchdev object is used to carry the VID and bridge flags. >> >> +- port_fdb_prepare: bridge layer function invoked when the bridge prepares >> the >> + installation of a Forwarding Database entry. If the operation is not >> + programmable, this function should return -EOPNOTSUPP to inform the bridge > > s/programmable/supported > >> + code to fallback to a software implementation. No hardware programmation > > s/programmation/setup Done, v2 on its way. Thanks, Vivien
Re: [RFC PATCH 5/6] ppp: define reusable device creation functions
On Tue, 5 Apr 2016 23:14:56 +0200 Guillaume Nault wrote: > On Tue, Apr 05, 2016 at 08:28:32AM -0700, Stephen Hemminger wrote: > > On Tue, 5 Apr 2016 02:56:29 +0200 > > Guillaume Nault wrote: > > > > > Move PPP device initialisation and registration out of > > > ppp_create_interface(). > > > This prepares code for device registration with rtnetlink. > > > > > > > Does PPP module autoload correctly based on the netlink attributes? > > > Patch #6 has MODULE_ALIAS_RTNL_LINK("ppp"). This works fine for > auto-loading ppp_generic when creating a PPP device with rtnetlink. > Is there anything else required? > That should be enough.
Re: [RFC PATCH net 3/4] ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update
On Mon, Apr 04, 2016 at 01:45:02PM -0700, Cong Wang wrote: > I see your point, but calling __ip6_datagram_connect() seems overkill > here, we don't need to update so many things in the pmtu update context, > at least IPv4 doesn't do that either. I don't think you have to do that. > > So why just updating the dst cache (also some addr cache) here is not > enough? I am not sure I understand. I could be missing something. This patch uses ip6_datagram_dst_update() to do the route lookup and sk->sk_dst_cache update. ip6_datagram_dst_update() is created in the first two refactoring patches and is also used by __ip6_datagram_connect(). Which operations in ip6_datagram_dst_update() could be saved during the pmtu update?
Re: [net PATCH v2 2/2] ipv4/GRO: Make GRO conform to RFC 6864
On Tue, Apr 05, 2016 at 12:36:40PM -0300, Tom Herbert wrote: > On Tue, Apr 5, 2016 at 12:07 PM, Edward Creewrote: > > On 05/04/16 05:32, Herbert Xu wrote: > >> On Mon, Apr 04, 2016 at 09:26:55PM -0700, Alexander Duyck wrote: > >>> The question I would have is what are you really losing with increment > >>> from 0 versus fixed 0? From what I see it is essentially just garbage > >>> in/garbage out. > >> GRO is meant to be lossless, that is, you should not be able to > >> detect its presence from the outside. If you lose information then > >> you're breaking this rule and people will soon start asking for it > >> to be disabled in various situations. > >> > >> I'm not against doing this per se but it should not be part of the > >> default configuration. > > I'm certainly in favour of this being configurable - indeed IMHO it should > > also be possible to configure GRO with the 'looser' semantics of LRO, so > > that people who want that can get it without all the horrible "don't confuse > > Slow Start" hacks, and so that LRO can go away (AIUI the only reasons it > > exists are (a) improved performance from the 'loose' semantics and (b) old > > kernels without GRO. We may not be able to kill (b) but we can certainly > > address (a)). > > > > But I don't agree that the default has to be totally lossless; anyone who is > > caring about the ID fields in atomic datagrams is breaking the RFCs, and can > > be assumed to Know What They're Doing sufficiently to configure this. > > > > On the gripping hand, I feel like GRO+TSO is the wrong model for speeding up > > forwarding/routing workloads. Instead we should be looking into having > > lists > > of SKBs traverse the stack together, splitting the list whenever e.g. the > > destination changes. That seems like it ought to be much more efficient > > than > > rewriting headers twice, once to coalesce a superframe and once to segment > > it > > again - and it also means this worry about GRO being lossless can go away. 
> > But until someone tries implementing skb batches, we won't know for sure if > > it works (and I don't have time right now ;) > > > Ed, > > I thought about that some. It seems like we would want to do both GRO > and retain all the individual packets in the skb so that we could use > those for forwarding instead of GSO as I think you're saying. This Retaining the individual packets would also help to make GRO feasible for SCTP. SCTP needs to know where each packet ended because of AUTH chunks and we cannot rely on something like gso_size as each original packet had its own size. I could do it for the tx side (see my SCTP/GSO RFC patches) using skb_gro_receive() with a specially crafted header skb, but I'm not seeing a way to do it on the rx side as I cannot guarantee incoming skbs will follow that pattern. Marcelo > would work great in the plain forwarding case, but one problem > is what to do if the host modifies the super packet (for instance when > forwarding over a tunnel we might add encapsulation header). This > should work in GSO (although we need to address the limitations around > 1 encap level), not sure this is easy if we need to add a header to > each packet in a batch. > > Tom > > > > > -Ed >
Re: [PATCH v3 -next] net/core/dev: Warn on a too-short GRO frame
From: Aaron Conole Date: Sat, 2 Apr 2016 15:26:43 -0400 > From: Aaron Conole > > When signaling that a GRO frame is ready to be processed, the network stack > correctly checks length and aborts processing when a frame is less than 14 > bytes. However, such a condition is really indicative of a broken driver, > and should be loudly signaled, rather than silently dropped as the case is > today. > > Convert the condition to use net_warn_ratelimited() to ensure the stack > loudly complains about such broken drivers. > > Signed-off-by: Aaron Conole Applied, thanks.
Re: [PATCH v2] net: remove unimplemented RTNH_F_PERVASIVE
From: Quentin Armitage Date: Sat, 2 Apr 2016 17:51:28 +0100 > Linux 2.1.68 introduced RTNH_F_PERVASIVE, but it had no implementation > and couldn't be enabled since the required config parameter wasn't in > any Kconfig file (see commit d088dde7b196 ("ipv4: obsolete config in > kernel source (IP_ROUTE_PERVASIVE)")). > > This commit removes all remaining references to RTNH_F_PERVASIVE. > Although this will cause userspace applications that were using the > flag to fail to build, they will be alerted to the fact that using > RTNH_F_PERVASIVE was not achieving anything. > > Signed-off-by: Quentin Armitage Can't really delete values like this from user visible headers. It can break the build. What if some library or tool has a table translating RTNH_F_* values into strings to display to the user? Those sources will stop building if I apply your changes.
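To make the objection concrete, here is a hypothetical example of the kind of userspace table being described. The struct, table and function names are invented; the three flag values are copied from include/uapi/linux/rtnetlink.h as it stood at the time, and are defined locally only so the sketch builds on its own. A real tool would get them from the uapi header, and its table entry for RTNH_F_PERVASIVE is exactly the line that would stop compiling once the define is removed.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the uapi values, so this sketch is self-contained. */
#define RTNH_F_DEAD		1
#define RTNH_F_PERVASIVE	2
#define RTNH_F_ONLINK		4

struct demo_flag_name {
	unsigned int flag;
	const char *name;
};

/* Hypothetical display table of the sort a routing tool might carry. */
static const struct demo_flag_name demo_rtnh_flags[] = {
	{ RTNH_F_DEAD,		"dead" },
	{ RTNH_F_PERVASIVE,	"pervasive" },	/* breaks if the define goes away */
	{ RTNH_F_ONLINK,	"onlink" },
};

/* Return the display name for a single flag bit, or NULL if unknown. */
static const char *demo_rtnh_flag_name(unsigned int flag)
{
	size_t i;

	assert((flag & (flag - 1)) == 0);	/* at most one bit set */
	for (i = 0; i < sizeof(demo_rtnh_flags) / sizeof(demo_rtnh_flags[0]); i++) {
		if (demo_rtnh_flags[i].flag == flag)
			return demo_rtnh_flags[i].name;
	}
	return NULL;
}
```

Such a tool never relied on the flag doing anything in the kernel, it only needs the name to exist at build time.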
Re: [RFC PATCH net 3/4] ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update
From: Cong WangDate: Mon, 4 Apr 2016 13:45:02 -0700 > On Sat, Apr 2, 2016 at 7:33 PM, Martin KaFai Lau wrote: >> One thing to note is that this patch uses the addresses from the sk >> instead of iph when updating sk->sk_dst_cache. It is basically the >> same logic that the __ip6_datagram_connect() is doing, so some >> refactoring works in the first two patches. >> >> AFAIK, a UDP socket can become connected after sending out some >> datagrams in un-connected state. or It can be connected >> multiple times to different destinations. I did some quick >> tests but I could be wrong. >> >> I am thinking if there could be a chance that the skb->data, which >> has the original outgoing iph, is not related to the current >> connected address. If it is possible, we have to specifically >> use the addresses in the sk instead of skb->data (i.e. iph) when >> updating the sk->sk_dst_cache. >> >> If we need to use the sk addresses (and other info) to find out a >> new dst for a connected udp socket, it is better not doing it while >> the userland is connecting to somewhere else. >> >> If the above case is impossible, we can keep using the info from iph to >> do the dst update for a connected-udp sk without taking the lock. > > I see your point, but calling __ip6_datagram_connect() seems overkill > here, we don't need to update so many things in the pmtu update context, > at least IPv4 doesn't do that either. I don't think you have to do that. > > So why just updating the dst cache (also some addr cache) here is not > enough? I think we are steadily getting closer to a version of this fix that we have some agreement on, right? Martin can you address Cong's feedback and spin another version of this series? Thanks.
Re: [net-next PATCH 2/2 v4] ibmvnic: enable RX checksum offload
From: Thomas Falcon Date: Fri, 1 Apr 2016 17:20:35 -0500 > Enable RX Checksum offload feature in the ibmvnic driver. > > Signed-off-by: Thomas Falcon Applied.
Re: [PATCH net-next 2/3] net: dsa: make the FDB add function return void
On Tue, Apr 05, 2016 at 11:24:34AM -0400, Vivien Didelot wrote: > The switchdev design implies that a software error should not happen in > the commit phase since it must have been previously reported in the > prepare phase. If an hardware error occurs during the commit phase, > there is nothing switchdev can do about it. > > The DSA layer separates port_fdb_prepare and port_fdb_add for simplicity > and convenience. If an hardware error occurs during the commit phase, > there is no need to report it outside the DSA driver itself. > > Make the DSA port_fdb_add routine return void for explicitness. > > Signed-off-by: Vivien Didelot> --- > drivers/net/dsa/bcm_sf2.c | 9 + > drivers/net/dsa/mv88e6xxx.c | 12 +--- > drivers/net/dsa/mv88e6xxx.h | 6 +++--- > include/net/dsa.h | 2 +- > net/dsa/slave.c | 16 > 5 files changed, 22 insertions(+), 23 deletions(-) > > diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c > index b847624..feebeaa 100644 > --- a/drivers/net/dsa/bcm_sf2.c > +++ b/drivers/net/dsa/bcm_sf2.c > @@ -722,13 +722,14 @@ static int bcm_sf2_sw_fdb_prepare(struct dsa_switch > *ds, int port, > return 0; > } > > -static int bcm_sf2_sw_fdb_add(struct dsa_switch *ds, int port, > - const struct switchdev_obj_port_fdb *fdb, > - struct switchdev_trans *trans) > +static void bcm_sf2_sw_fdb_add(struct dsa_switch *ds, int port, > +const struct switchdev_obj_port_fdb *fdb, > +struct switchdev_trans *trans) > { > struct bcm_sf2_priv *priv = ds_to_priv(ds); > > - return bcm_sf2_arl_op(priv, 0, port, fdb->addr, fdb->vid, true); > + if (bcm_sf2_arl_op(priv, 0, port, fdb->addr, fdb->vid, true)) > + pr_err("%s: failed to add address\n", __func__); > } > > static int bcm_sf2_sw_fdb_del(struct dsa_switch *ds, int port, > diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c > index 5a2e46d..bca9a2c 100644 > --- a/drivers/net/dsa/mv88e6xxx.c > +++ b/drivers/net/dsa/mv88e6xxx.c > @@ -2090,21 +2090,19 @@ int mv88e6xxx_port_fdb_prepare(struct dsa_switch 
*ds, > int port, > return 0; > } > > -int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port, > -const struct switchdev_obj_port_fdb *fdb, > -struct switchdev_trans *trans) > +void mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port, > + const struct switchdev_obj_port_fdb *fdb, > + struct switchdev_trans *trans) > { > int state = is_multicast_ether_addr(fdb->addr) ? > GLOBAL_ATU_DATA_STATE_MC_STATIC : > GLOBAL_ATU_DATA_STATE_UC_STATIC; > struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); > - int ret; > > mutex_lock(&ps->smi_mutex); > - ret = _mv88e6xxx_port_fdb_load(ds, port, fdb->addr, fdb->vid, state); > + if (_mv88e6xxx_port_fdb_load(ds, port, fdb->addr, fdb->vid, state)) > + netdev_warn(ds->ports[port], "cannot load address\n"); In the SF2 driver you use pr_err, but here netdev_warn. We probably should be consistent if we error or warn. I would use netdev_err, since if this fails we probably have a real hardware problem. Andrew
Re: [net-next PATCH 1/2 v4] ibmvnic: map L2/L3/L4 header descriptors to firmware
From: Thomas Falcon Date: Fri, 1 Apr 2016 17:20:34 -0500 > Allow the VNIC driver to provide descriptors containing > L2/L3/L4 headers to firmware. This feature is needed > for greater hardware compatibility and enablement of checksum > and TCP offloading features. > > A new function is included for the hypervisor call, > H_SEND_SUBCRQ_INDIRECT, allowing a DMA-mapped array of SCRQ > descriptor elements to be sent to the VNIC server. > > These additions will help fully enable checksum offloading as > well as other features as they are included later. > > Signed-off-by: Thomas Falcon Applied.
Re: [PATCH] ip6_tunnel: set rtnl_link_ops before calling register_netdevice
From: Thadeu Lima de Souza Cascardo Date: Fri, 1 Apr 2016 17:17:50 -0300 > When creating an ip6tnl tunnel with ip tunnel, rtnl_link_ops is not set > before ip6_tnl_create2 is called. When register_netdevice is called, there > is no linkinfo attribute in the NEWLINK message because of that. > > Setting rtnl_link_ops before calling register_netdevice fixes that. > > Signed-off-by: Thadeu Lima de Souza Cascardo Applied and queued up for -stable.
Re: [PATCH net-next 1/3] net: dsa: make the STP state function return void
> -- port_stp_update: bridge layer function invoked when a given switch port STP > +- port_stp_state: bridge layer function invoked when a given switch port STP Hi Vivien port_stp_state_set might be a better name, to make it clear it is setting the state, not getting the current state, etc. Most of the other functions are _add, _prepare, _join, _leave, so _set would fit the pattern. Changing to a void makes sense. Andrew
Re: [net PATCH v2 2/2] ipv4/GRO: Make GRO conform to RFC 6864
From: Edward Cree Date: Tue, 5 Apr 2016 16:07:49 +0100 > On the gripping hand, I feel like GRO+TSO is the wrong model for > speeding up forwarding/routing workloads. Instead we should be > looking into having lists of SKBs traverse the stack together, > splitting the list whenever e.g. the destination changes. "Destination" is a very complicated beast. It's not just a destination IP address. It's not even just a full saddr/daddr/TOS triplet. Packets can be forwarded around based upon any key whatsoever in the headers. Netfilter can mangle them based upon arbitrary bits in the packet, as can the packet scheduler classifier actions. It's therefore not profitable to try this at all, it's completely pointless unless all the keys match up exactly. This is why GRO _is_ the proper model to speed this stuff and do bulk processing, because it still presents a full "packet" to all of these layers to mangle, rewrite, route, and do whatever else however they like.
Re: [PATCH net-next] net: dsa: document missing functions
On Tue, Apr 05, 2016 at 11:22:40AM -0400, Vivien Didelot wrote: > Add description for the missing port_vlan_prepare, port_fdb_prepare, > port_fdb_dump functions in the DSA documentation. > > Signed-off-by: Vivien DidelotHi Vivien A few English improvements: > --- > Documentation/networking/dsa/dsa.txt | 16 > 1 file changed, 16 insertions(+) > > diff --git a/Documentation/networking/dsa/dsa.txt > b/Documentation/networking/dsa/dsa.txt > index 3b196c3..8ba3369 100644 > --- a/Documentation/networking/dsa/dsa.txt > +++ b/Documentation/networking/dsa/dsa.txt > @@ -542,6 +542,12 @@ Bridge layer > Bridge VLAN filtering > - > > +- port_vlan_prepare: bridge layer function invoked when the bridge prepares > the > + configuration of a VLAN on the given port. If the operation is not > + programmable, this function should return -EOPNOTSUPP to inform the bridge s/programmable/supported by the hardware > + code to fallback to a software implementation. No hardware programmation s/programmation/setup > + must be done in this function. See port_vlan_add for this and details. > + > - port_vlan_add: bridge layer function invoked when a VLAN is configured >(tagged or untagged) for the given switch port > > @@ -552,6 +558,12 @@ Bridge VLAN filtering >function that the driver has to call for each VLAN the given port is a > member >of. A switchdev object is used to carry the VID and bridge flags. > > +- port_fdb_prepare: bridge layer function invoked when the bridge prepares > the > + installation of a Forwarding Database entry. If the operation is not > + programmable, this function should return -EOPNOTSUPP to inform the bridge s/programmable/supported > + code to fallback to a software implementation. No hardware programmation s/programmation/setup Andrew
Re: [RESEND PATCH net-next v2 0/3] bcmgenet cleanups
From: Petri Gynther Date: Tue, 5 Apr 2016 13:59:58 -0700 > Three cleanup patches for bcmgenet. Series applied, thanks.
Re: [PATCH] sctp: Fix error handling for switch statement case in the function sctp_cmd_interprete
On 2016-04-05 07:29 PM, David Miller wrote: > From: Daniel Borkmann > Date: Tue, 05 Apr 2016 23:53:52 +0200 > >> On 04/05/2016 11:36 PM, Bastien Philbert wrote: >>> This fixes error handling for the switch statement case >>> SCTP_CMD_SEND_PKT by making the error value of the call >>> to sctp_packet_transmit equal the variable error due to >>> this function being able to fail with a error code. In >> >> What actual issue have you observed that you fix? >> >>> addition allow the call to sctp_ootb_pkt_free afterwards >>> to free up the no longer in use sctp packet even if the >>> call to the function sctp_packet_transmit fails in order >>> to avoid a memory leak here for not freeing the sctp >> >> Not sure how this relates to your code? > > Bastien, I'm seeing a clear negative pattern with the bug fixes > you are submitting. > > Just now you submitted the ICMP change which obviously was never > tested because it tried to take the RTNL mutex in atomic context, > and now this sctp thing. > > If you don't start actually testing your changes and explaining > clearly what the problem actually is, how you discovered it, > and how you actually tested your patch, I will start completely > ignoring your patch submissions. > Ok sure I will be more careful with my future patches. Sorry about those two patches :(. Bastien
Re: [net 0/3][pull request] Intel Wired LAN Driver Updates 2016-04-05
From: Jeff Kirsher Date: Tue, 5 Apr 2016 15:30:48 -0700 > This series contains updates to i40e and e1000. Pulled, thanks Jeff.
Re: [PATCH] sctp: Fix error handling for switch statement case in the function sctp_cmd_interprete
From: Daniel Borkmann Date: Tue, 05 Apr 2016 23:53:52 +0200 > On 04/05/2016 11:36 PM, Bastien Philbert wrote: >> This fixes error handling for the switch statement case >> SCTP_CMD_SEND_PKT by making the error value of the call >> to sctp_packet_transmit equal the variable error due to >> this function being able to fail with a error code. In > > What actual issue have you observed that you fix? > >> addition allow the call to sctp_ootb_pkt_free afterwards >> to free up the no longer in use sctp packet even if the >> call to the function sctp_packet_transmit fails in order >> to avoid a memory leak here for not freeing the sctp > > Not sure how this relates to your code? Bastien, I'm seeing a clear negative pattern with the bug fixes you are submitting. Just now you submitted the ICMP change which obviously was never tested because it tried to take the RTNL mutex in atomic context, and now this sctp thing. If you don't start actually testing your changes and explaining clearly what the problem actually is, how you discovered it, and how you actually tested your patch, I will start completely ignoring your patch submissions.
Re: [RESEND PATCH net-next v2 3/3] net: bcmgenet: cleanup for dmadesc_set()
2016-04-05 14:00 GMT-07:00 Petri Gynther: > dmadesc_set() is used for setting the Tx buffer DMA address, length, > and status bits on a Tx ring descriptor when a frame is being Tx'ed. > > Always set the Tx buffer DMA address first, before updating the length > and status bits, i.e. giving the Tx descriptor to the hardware. > > The reason this is a cleanup rather than a fix is that the hardware > won't transmit anything from a Tx ring until the TDMA producer index > has been incremented. As long as the dmadesc_set() writes complete > before the TDMA producer index write, life is good. > > Signed-off-by: Petri Gynther Acked-by: Florian Fainelli -- Florian
Re: [PATCH] sctp: Fix error handling for switch statement case in the function sctp_cmd_interprete
On 2016-04-05 06:12 PM, Marcelo Ricardo Leitner wrote: > On Tue, Apr 05, 2016 at 05:36:41PM -0400, Bastien Philbert wrote: >> This fixes error handling for the switch statement case >> SCTP_CMD_SEND_PKT by making the error value of the call >> to sctp_packet_transmit equal the variable error due to >> this function being able to fail with a error code. In >> addition allow the call to sctp_ootb_pkt_free afterwards >> to free up the no longer in use sctp packet even if the >> call to the function sctp_packet_transmit fails in order >> to avoid a memory leak here for not freeing the sctp > > This leak shouldn't exist as sctp_packet_transmit() will free the packet > if it returns ENOMEM, through the nomem: handling. > > But about making it visible to the user, that looks interesting to me > although I cannot foresee yet its effects, like the comment at the end > of sctp_packet_transmit() on not returning EHOSTUNREACH. Did you check > it? > I was aware of the -EHOSTUNREACH issue but assumed that this needs to be known to functions internal to the kernel. To rephrase: does it matter whether the callers of this function know or care that sctp_packet_transmit() failed, or is this unnecessary because the cleanup we do elsewhere is enough, so the new error check is not needed? Again, if there is a certain test you would like me to run on this patch to make sure it's OK, I don't mind, just let me know :). Cheers, Bastien >> >> Signed-off-by: Bastien Philbert >> --- >> net/sctp/sm_sideeffect.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c >> index 7fe56d0..f3a8b58 100644 >> --- a/net/sctp/sm_sideeffect.c >> +++ b/net/sctp/sm_sideeffect.c >> @@ -1434,7 +1434,7 @@ static int sctp_cmd_interpreter(sctp_event_t >> event_type, >> case SCTP_CMD_SEND_PKT: >> /* Send a full packet to our peer. 
*/ >> packet = cmd->obj.packet; >> -sctp_packet_transmit(packet, gfp); >> +error = sctp_packet_transmit(packet, gfp); >> sctp_ootb_pkt_free(packet); >> break; >> >> -- >> 2.5.0 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >>
[net 2/3] e1000: Do not overestimate descriptor counts in Tx pre-check
From: Alexander Duyck The current code path is capable of grossly overestimating the number of descriptors needed to transmit a new frame. This specifically occurs if the skb contains a number of 4K pages. The issue is that the logic for determining the descriptors needed is ((S) >> (X)) + 1. When X is 12 it means that we were indicating that we required 2 descriptors for each 4K page when we only needed one. This change corrects this by instead adding (1 << (X)) - 1 to the S value instead of adding 1 after the fact. This way we get an accurate descriptor needed count as we are essentially doing a DIV_ROUND_UP(). Reported-by: Ivan Suzdal Signed-off-by: Alexander Duyck Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/e1000/e1000_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c index 3fc7bde..d213fb4 100644 --- a/drivers/net/ethernet/intel/e1000/e1000_main.c +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c @@ -3106,7 +3106,7 @@ static int e1000_maybe_stop_tx(struct net_device *netdev, return __e1000_maybe_stop_tx(netdev, size); } -#define TXD_USE_COUNT(S, X) (((S) >> (X)) + 1) +#define TXD_USE_COUNT(S, X) (((S) + ((1 << (X)) - 1)) >> (X)) static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb, struct net_device *netdev) { -- 2.5.5
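The arithmetic behind the fix is easy to check in isolation. The sketch below copies the two macro bodies from the diff under distinct names so they can be compared side by side; the helper demo_desc_for_pages() is hypothetical and simply assumes an skb built from full 4K pages with one buffer per page.

```c
#include <assert.h>

/* Old form from the patch: shift, then add one unconditionally. */
#define TXD_USE_COUNT_OLD(S, X) (((S) >> (X)) + 1)
/* New form from the patch: round up before shifting. */
#define TXD_USE_COUNT_NEW(S, X) (((S) + ((1 << (X)) - 1)) >> (X))

/* Hypothetical helper: total descriptors for nr_pages buffers of exactly
 * 4096 bytes each, with X = 12 as in the commit message.
 */
static unsigned int demo_desc_for_pages(unsigned int nr_pages, int use_new)
{
	const unsigned int page_len = 4096;
	const unsigned int shift = 12;
	unsigned int i, count = 0;

	assert(nr_pages <= 64);	/* keep the illustration small */
	for (i = 0; i < nr_pages; i++)
		count += use_new ? TXD_USE_COUNT_NEW(page_len, shift)
				 : TXD_USE_COUNT_OLD(page_len, shift);
	return count;
}
```

For a single 4096-byte buffer the old macro gives (4096 >> 12) + 1 = 2 while the new one gives (4096 + 4095) >> 12 = 1. A 4097-byte buffer needs 2 under both, so the old form was only wrong on exact multiples of the buffer size, which is precisely the full-page case.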
[net 1/3] i40e: fix errant PCIe bandwidth message
From: Jesse Brandeburg There was an error introduced with commit 3fced535079a ("i40e: X722 is on the IOSF bus and does not report the PCI bus info"), where code was added but the enabling flag is never set. CC: Anjali Singhai Jain CC: Stefan Assman Fixes: 3fced535079a ("i40e: X722 is on the IOSF bus ...") Reported-by: Steve Best Signed-off-by: Jesse Brandeburg Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/i40e/i40e_main.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 6700643..3449129 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -8559,6 +8559,7 @@ static int i40e_sw_init(struct i40e_pf *pf) I40E_FLAG_OUTER_UDP_CSUM_CAPABLE | I40E_FLAG_WB_ON_ITR_CAPABLE | I40E_FLAG_MULTIPLE_TCP_UDP_RSS_PCTYPE | +I40E_FLAG_NO_PCI_LINK_CHECK | I40E_FLAG_100M_SGMII_CAPABLE | I40E_FLAG_USE_SET_LLDP_MIB | I40E_FLAG_GENEVE_OFFLOAD_CAPABLE; -- 2.5.5
[net 3/3] e1000: Double Tx descriptors needed check for 82544
From: Alexander Duyck The 82544 has code that adds one additional descriptor per data buffer. However we weren't taking that into account when determining the descriptors needed for the next transmit at the end of the xmit_frame path. This change takes that into account by doubling the number of descriptors needed for the 82544 so that we can avoid a potential issue where we could hang the Tx ring by loading frames with xmit_more enabled and then stopping the ring without writing the tail. In addition it adds a few more descriptors to account for some additional workarounds that have been added over time. Signed-off-by: Alexander Duyck Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/e1000/e1000_main.c | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c index d213fb4..ae90d4f 100644 --- a/drivers/net/ethernet/intel/e1000/e1000_main.c +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c @@ -3256,12 +3256,29 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb, nr_frags, mss); if (count) { + /* The descriptors needed is higher than other Intel drivers +* due to a number of workarounds. The breakdown is below: +* Data descriptors: MAX_SKB_FRAGS + 1 +* Context Descriptor: 1 +* Keep head from touching tail: 2 +* Workarounds: 3 +*/ + int desc_needed = MAX_SKB_FRAGS + 7; + netdev_sent_queue(netdev, skb->len); skb_tx_timestamp(skb); e1000_tx_queue(adapter, tx_ring, tx_flags, count); + + /* 82544 potentially requires twice as many data descriptors +* in order to guarantee buffers don't end on evenly-aligned +* dwords +*/ + if (adapter->pcix_82544) + desc_needed += MAX_SKB_FRAGS + 1; + /* Make sure there is space in the ring for the next send. 
*/ - e1000_maybe_stop_tx(netdev, tx_ring, MAX_SKB_FRAGS + 2); + e1000_maybe_stop_tx(netdev, tx_ring, desc_needed); if (!skb->xmit_more || netif_xmit_stopped(netdev_get_tx_queue(netdev, 0))) { -- 2.5.5
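The descriptor budget in the new comment can be worked through with concrete numbers. The sketch below is a standalone model, not driver code: DEMO_MAX_SKB_FRAGS is an assumed value (kernels of this era define MAX_SKB_FRAGS as 65536/PAGE_SIZE + 1, which is 17 with 4K pages), and demo_desc_needed() and demo_should_stop() are hypothetical names. The stop condition assumes the ring is stopped when fewer free descriptors remain than the reserve asks for.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value for 4K pages; the real constant comes from the kernel. */
#define DEMO_MAX_SKB_FRAGS 17

/* Worst-case descriptors to reserve for the next transmit. */
static int demo_desc_needed(bool pcix_82544)
{
	int desc_needed = (DEMO_MAX_SKB_FRAGS + 1)	/* data descriptors */
			  + 1				/* context descriptor */
			  + 2				/* keep head off tail */
			  + 3;				/* workarounds */

	/* 82544 may need a second descriptor per data buffer. */
	if (pcix_82544)
		desc_needed += DEMO_MAX_SKB_FRAGS + 1;
	return desc_needed;
}

/* Model of the pre-check: stop the queue if the reserve does not fit. */
static bool demo_should_stop(int free_desc, bool pcix_82544)
{
	assert(free_desc >= 0);
	return free_desc < demo_desc_needed(pcix_82544);
}
```

Under these assumptions the reserve grows from the old MAX_SKB_FRAGS + 2 = 19 to 24 on most parts and 42 on the 82544, so a ring with 30 free descriptors keeps running on the former and is stopped on the latter.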
[net 0/3][pull request] Intel Wired LAN Driver Updates 2016-04-05
This series contains updates to i40e and e1000. Jesse fixes an issue where code was added by a previous commit but the flag to enable it was never set. Alex fixes the e1000 driver grossly overestimating the descriptors needed to transmit a frame. The following are changes since commit eb8e97715f29a1240cdf67b0df725be27433259f: sctp: use list_* in sctp_list_dequeue and are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue master Alexander Duyck (2): e1000: Do not overestimate descriptor counts in Tx pre-check e1000: Double Tx descriptors needed check for 82544 Jesse Brandeburg (1): i40e: fix errant PCIe bandwidth message drivers/net/ethernet/intel/e1000/e1000_main.c | 21 +++-- drivers/net/ethernet/intel/i40e/i40e_main.c | 1 + 2 files changed, 20 insertions(+), 2 deletions(-) -- 2.5.5
Re: [PATCH] sctp: Fix error handling for switch statement case in the function sctp_cmd_interprete
On Tue, Apr 05, 2016 at 05:36:41PM -0400, Bastien Philbert wrote: > This fixes error handling for the switch statement case > SCTP_CMD_SEND_PKT by making the error value of the call > to sctp_packet_transmit equal the variable error due to > this function being able to fail with a error code. In > addition allow the call to sctp_ootb_pkt_free afterwards > to free up the no longer in use sctp packet even if the > call to the function sctp_packet_transmit fails in order > to avoid a memory leak here for not freeing the sctp This leak shouldn't exist as sctp_packet_transmit() will free the packet if it returns ENOMEM, through the nomem: handling. But about making it visible to the user, that looks interesting to me although I cannot foresee yet its effects, like the comment at the end of sctp_packet_transmit() on not returning EHOSTUNREACH. Did you check it? > > Signed-off-by: Bastien Philbert> --- > net/sctp/sm_sideeffect.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c > index 7fe56d0..f3a8b58 100644 > --- a/net/sctp/sm_sideeffect.c > +++ b/net/sctp/sm_sideeffect.c > @@ -1434,7 +1434,7 @@ static int sctp_cmd_interpreter(sctp_event_t event_type, > case SCTP_CMD_SEND_PKT: > /* Send a full packet to our peer. */ > packet = cmd->obj.packet; > - sctp_packet_transmit(packet, gfp); > + error = sctp_packet_transmit(packet, gfp); > sctp_ootb_pkt_free(packet); > break; > > -- > 2.5.0
Re: [RFC PATCH 1/5] bpf: add PHYS_DEV prog type for early driver filter
On Tue, Apr 05, 2016 at 11:29:05AM +0200, Jesper Dangaard Brouer wrote: > > > > Of course, there are other pieces to accelerate: > > 12.71% ksoftirqd/1 [mlx4_en] [k] mlx4_en_alloc_frags > > 6.87% ksoftirqd/1 [mlx4_en] [k] mlx4_en_free_frag > > 4.20% ksoftirqd/1 [kernel.vmlinux] [k] get_page_from_freelist > > 4.09% swapper [mlx4_en] [k] mlx4_en_process_rx_cq > > and I think Jesper's work on batch allocation is going to help that a lot. > > Actually, it looks like all of this "overhead" comes from the page > alloc/free (+ dma unmap/map). We would need a page-pool recycle > mechanism to solve/remove this overhead. For the early drop case we > might be able to hack recycle the page directly in the driver (and also > avoid dma_unmap/map cycle). Exactly. A cache of allocated and mapped pages will help both the drop and redirect use cases a lot. After tx completion we can recycle the still-mapped page into the cache (we need to make sure the pages are mapped PCI_DMA_BIDIRECTIONAL) and rx can refill the ring with it. In load balancer steady state we won't make any calls to the page allocator or the DMA API. Being able to do a cheap percpu pool like this is a huge advantage that no kernel bypass can have. I'm pretty sure it will be possible to avoid local_cmpxchg as well.
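To make the recycling idea concrete, here is a minimal userspace model of such a cache. It is a sketch under stated assumptions, not driver code: `struct page_pool`, `pool_get()`, `pool_put()`, `pool_run()` and `POOL_SIZE` are hypothetical names, `malloc()` stands in for the page allocator plus DMA mapping, and a real implementation would keep one pool per CPU (or per ring) so that no locking is needed.

```c
#include <stddef.h>
#include <stdlib.h>

#define POOL_SIZE  64    /* hypothetical cache depth */
#define PAGE_BYTES 4096  /* stand-in for one mapped rx page */

struct page_pool {
	void *pages[POOL_SIZE];
	size_t count;
	unsigned long slow_allocs;  /* trips to the "page allocator" */
};

/* rx refill: take a recycled page if one is cached, else allocate. */
static void *pool_get(struct page_pool *p)
{
	if (p->count > 0)
		return p->pages[--p->count];

	p->slow_allocs++;
	return malloc(PAGE_BYTES);
}

/* tx completion (or early drop): hand the still-mapped page back. */
static void pool_put(struct page_pool *p, void *page)
{
	if (p->count < POOL_SIZE) {
		p->pages[p->count++] = page;
		return;
	}
	free(page);  /* cache full: give the page back to the allocator */
}

/* Push n packets through rx -> tx one at a time and report how many
 * of them had to go to the allocator.
 */
static unsigned long pool_run(struct page_pool *p, unsigned long n)
{
	for (unsigned long i = 0; i < n; i++) {
		void *page = pool_get(p);

		if (!page)
			break;
		pool_put(p, page);
	}
	return p->slow_allocs;
}
```

In this model only the first packet reaches the allocator; every later refill is served from the cache, which is the steady state described above.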
Re: [PATCH] sctp: Fix error handling for switch statement case in the function sctp_cmd_interprete
On 2016-04-05 05:53 PM, Daniel Borkmann wrote: > On 04/05/2016 11:36 PM, Bastien Philbert wrote: >> This fixes error handling for the switch statement case >> SCTP_CMD_SEND_PKT by making the error value of the call >> to sctp_packet_transmit equal the variable error due to >> this function being able to fail with a error code. In > > What actual issue have you observed that you fix? > The issue here is basically that sctp_packet_transmit can return an error if it fails to transmit the sk_buff passed as a parameter. It seems that we should signal the user/caller(s) when an sctp packet transmission fails here. If you would like, I can resend with a better commit message in a V2 if this explains the issue better. Bastien >> addition allow the call to sctp_ootb_pkt_free afterwards >> to free up the no longer in use sctp packet even if the >> call to the function sctp_packet_transmit fails in order >> to avoid a memory leak here for not freeing the sctp > > Not sure how this relates to your code? > >> Signed-off-by: Bastien Philbert>> --- >> net/sctp/sm_sideeffect.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c >> index 7fe56d0..f3a8b58 100644 >> --- a/net/sctp/sm_sideeffect.c >> +++ b/net/sctp/sm_sideeffect.c >> @@ -1434,7 +1434,7 @@ static int sctp_cmd_interpreter(sctp_event_t >> event_type, >> case SCTP_CMD_SEND_PKT: >> /* Send a full packet to our peer. */ >> packet = cmd->obj.packet; >> -sctp_packet_transmit(packet, gfp); >> +error = sctp_packet_transmit(packet, gfp); >> sctp_ootb_pkt_free(packet); >> break; >> >> >
Re: [PATCH] sctp: Fix error handling for switch statement case in the function sctp_cmd_interprete
On 04/05/2016 11:36 PM, Bastien Philbert wrote: This fixes error handling for the switch statement case SCTP_CMD_SEND_PKT by making the error value of the call to sctp_packet_transmit equal the variable error due to this function being able to fail with a error code. In What actual issue have you observed that you fix? addition allow the call to sctp_ootb_pkt_free afterwards to free up the no longer in use sctp packet even if the call to the function sctp_packet_transmit fails in order to avoid a memory leak here for not freeing the sctp Not sure how this relates to your code? Signed-off-by: Bastien Philbert--- net/sctp/sm_sideeffect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c index 7fe56d0..f3a8b58 100644 --- a/net/sctp/sm_sideeffect.c +++ b/net/sctp/sm_sideeffect.c @@ -1434,7 +1434,7 @@ static int sctp_cmd_interpreter(sctp_event_t event_type, case SCTP_CMD_SEND_PKT: /* Send a full packet to our peer. */ packet = cmd->obj.packet; - sctp_packet_transmit(packet, gfp); + error = sctp_packet_transmit(packet, gfp); sctp_ootb_pkt_free(packet); break;
[PATCH] sctp: Fix error handling for switch statement case in the function sctp_cmd_interprete
This fixes error handling for the switch statement case SCTP_CMD_SEND_PKT by making the error value of the call to sctp_packet_transmit equal the variable error due to this function being able to fail with a error code. In addition allow the call to sctp_ootb_pkt_free afterwards to free up the no longer in use sctp packet even if the call to the function sctp_packet_transmit fails in order to avoid a memory leak here for not freeing the sctp Signed-off-by: Bastien Philbert--- net/sctp/sm_sideeffect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c index 7fe56d0..f3a8b58 100644 --- a/net/sctp/sm_sideeffect.c +++ b/net/sctp/sm_sideeffect.c @@ -1434,7 +1434,7 @@ static int sctp_cmd_interpreter(sctp_event_t event_type, case SCTP_CMD_SEND_PKT: /* Send a full packet to our peer. */ packet = cmd->obj.packet; - sctp_packet_transmit(packet, gfp); + error = sctp_packet_transmit(packet, gfp); sctp_ootb_pkt_free(packet); break; -- 2.5.0
Re: [PATCH] ipv6: icmp: Add protection from concurrent users in the function icmpv6_echo_reply
On 05.04.2016 23:27, Bastien Philbert wrote: This adds protection from concurrent users in the function icmpv6_echo_reply around the call to the function __in6_dev_get by locking/unlocking around this call with calls to the functions rtnl_lock and rtnl_unlock to protect against concurrent users when calling this function in icmpv6_echo_reply as stated in the comments for locking requirements for the function, __in6_dev_get. Signed-off-by: Bastien Philbert --- net/ipv6/icmp.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index 0a37ddc..798434f 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -607,7 +607,9 @@ static void icmpv6_echo_reply(struct sk_buff *skb) hlimit = ip6_sk_dst_hoplimit(np, , dst); + rtnl_lock(); idev = __in6_dev_get(skb->dev); + rtnl_unlock(); msg.skb = skb; msg.offset = 0; We can't hold rtnl_lock in bh context. Have you seen an rcu verifier report? I am sure we hold the rcu read lock at this point. Bye, Hannes
[PATCH] ipv6: icmp: Add protection from concurrent users in the function icmpv6_echo_reply
This adds protection from concurrent users in the function icmpv6_echo_reply around the call to the function __in6_dev_get by locking/unlocking around this call with calls to the functions rtnl_lock and rtnl_unlock to protect against concurrent users when calling this function in icmpv6_echo_reply as stated in the comments for locking requirements for the function, __in6_dev_get. Signed-off-by: Bastien Philbert --- net/ipv6/icmp.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index 0a37ddc..798434f 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -607,7 +607,9 @@ static void icmpv6_echo_reply(struct sk_buff *skb) hlimit = ip6_sk_dst_hoplimit(np, , dst); + rtnl_lock(); idev = __in6_dev_get(skb->dev); + rtnl_unlock(); msg.skb = skb; msg.offset = 0; -- 2.5.0
Re: [RFC PATCH 6/6] ppp: add rtnetlink device creation support
On Tue, Apr 05, 2016 at 07:18:14PM +0200, walter harms wrote: > > > Am 05.04.2016 02:56, schrieb Guillaume Nault: > > @@ -1043,12 +1048,39 @@ static int ppp_dev_configure(struct net *src_net, > > struct net_device *dev, > > const struct ppp_config *conf) > > { > > struct ppp *ppp = netdev_priv(dev); > > + struct file *file; > > int indx; > > + int err; > > + > > + if (conf->fd < 0) { > > + file = conf->file; > > + if (!file) { > > + err = -EBADF; > > + goto out; > > why not just return -EBADF; > > > + } > > + } else { > > + file = fget(conf->fd); > > + if (!file) { > > + err = -EBADF; > > + goto out; > > why not just return -EBADF; > Just because the 'out' label is declared anyway and because this centralises the return point. But I agree returning -EBADF directly could be more readable. I don't have strong opinion.
Re: [PATCH net-next 05/10] bnxt_en: Add get_eee() and set_eee() ethtool support.
On Tue, 2016-04-05 at 03:36 -0700, Michael Chan wrote: > On Tue, Apr 5, 2016 at 3:07 AM, Ben Hutchingswrote: [...] > > > +static int bnxt_get_eee(struct net_device *dev, struct ethtool_eee > > > *edata) > > > +{ > > > + struct bnxt *bp = netdev_priv(dev); > > > + > > > + if (!(bp->flags & BNXT_FLAG_EEE_CAP)) > > > + return -EOPNOTSUPP; > > > + > > > + *edata = bp->eee; > > > + if (!bp->eee.eee_enabled) { > > > + edata->advertised = 0; > > > + edata->tx_lpi_enabled = 0; > > What about tx_lpi_timer? > We want to keep the tx_lpi_timer value so that it can be used again > when it is turned on again. > > The user doesn't have to figure out what value to use if he just wants > to use the default or the last value. OK, that seems like a good reason. > > > > > > And, wouldn't it make more sense to do these fixups to the internal > > state in bnxt_set_eee()? > I don't understand.  If the user is enabling EEE, we take all the > parameters.  If he is disabling, we don't take any of the parameters. Right - it's just a bit weird that you keep the internal state in a struct ethtool_eee but get_eee() returns a slightly different version of the state. Ben. -- Ben Hutchings No political challenge can be met by shopping. - George Monbiot
Re: [RFC PATCH 5/6] ppp: define reusable device creation functions
On Tue, Apr 05, 2016 at 08:28:32AM -0700, Stephen Hemminger wrote: > On Tue, 5 Apr 2016 02:56:29 +0200 > Guillaume Naultwrote: > > > Move PPP device initialisation and registration out of > > ppp_create_interface(). > > This prepares code for device registration with rtnetlink. > > > > Does PPP module autoload correctly based on the netlink attributes? > Patch #6 has MODULE_ALIAS_RTNL_LINK("ppp"). This works fine for auto-loading ppp_generic when creating a PPP device with rtnetlink. Is there anything else required?
Re: [RFC PATCH 0/6] ppp: add rtnetlink support
On Tue, Apr 05, 2016 at 08:27:45AM -0700, Stephen Hemminger wrote: > On Tue, 5 Apr 2016 02:56:17 +0200 > Guillaume Nault wrote: > > > The rtnetlink handlers implemented in this series are minimal, and can > > only replace the PPPIOCNEWUNIT ioctl. The rest of PPP ioctls remains > > necessary for any other operation on channels and units. > > It is perfectly possible to mix PPP devices created by rtnl > > and by ioctl(PPPIOCNEWUNIT). Devices will behave in the same way, > > except for a few specific cases (as detailed in patch #6). > > What blocks PPP from being fully netlink (use attributes), > I just didn't implement other netlink attributes because I wanted to get the foundations validated first. Implementing PPP unit ioctls with rtnetlink attributes shouldn't be a problem because there's a 1:1 mapping between units and netdevices. So we could have some kind of feature parity (I'm not sure if all ioctls are worth a netlink attribute though). But there's the problem of getting the unit identifier of a PPP device. If that device was created with a kernel-assigned name and index, then the user space daemon has no ifindex or ifname for building an RTM_GETLINK message. So the ability to retrieve the unit identifier with rtnetlink wouldn't be enough to fully replace ioctls on units. If by "fully netlink", you also meant implementing a netlink replacement for all supported ioctls, then that's going to be even trickier. A genetlink API would probably need to be created for handling generic operations on PPP channels. But that wouldn't be enough since unknown ioctls on channels are passed to the chan->ops->ioctl() callback. So netlink support would also have to be added to the channel handlers (pptp, pppoatm, sync_ppp, irda...). > and work with same API set independent of how device was created. > Special cases are nuisance and source of bugs. 
> It looks like handling rtnetlink messages in ioctl based PPP devices is just a matter of assigning ->rtnl_link_ops in ppp_create_interface(). I'll consider that for v3. > > I'm sending the series only as RFC this time, because there are a few > > points I'm unsatisfied with. > > > > First, I'm not fond of passing file descriptors as netlink attributes, > > as done with IFLA_PPP_DEV_FD (which is filled with a /dev/ppp fd). But > > given how PPP units work, we have to associate a /dev/ppp fd somehow. > > > > More importantly, the locking constraints of PPP are quite problematic. > > The rtnetlink handler has to associate the new PPP unit with the > > /dev/ppp file descriptor passed as parameter. This requires holding the > > ppp_mutex (see e8e56ffd9d29 "ppp: ensure file->private_data can't be > > overridden"), while the rtnetlink callback is already protected by > > rtnl_lock(). Since other parts of the module take these locks in > > reverse order, most of this series deals with preparing the code for > > inverting the dependency between rtnl_lock and ppp_mutex. Some more > > work is needed on that part (see patch #4 for details), but I wanted > > to be sure that approach is worth it before spending some more time on > > it. > > One other way to handle the locking is to use trylock. Yes it just > pushes the problem back to userspace, but that is how lock reordering was > handled in sysfs. > If that's considered a valid approach, then I'll use it for v3. That'd simplify things nicely.
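For readers unfamiliar with the trylock approach to lock ordering, the following is a minimal userspace model of the pattern. It is an assumption-laden sketch rather than the planned v3 code: `struct model_lock`, `model_trylock()`, `model_unlock()` and `newlink_model()` are hypothetical names, a C11 `atomic_flag` stands in for a kernel mutex, and the choice of `-EAGAIN` as the "try again" result is an assumption.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel mutex; only trylock is modelled. */
struct model_lock {
	atomic_flag held;
};

static bool model_trylock(struct model_lock *l)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set(&l->held);
}

static void model_unlock(struct model_lock *l)
{
	atomic_flag_clear(&l->held);
}

/* Models a handler that is entered with the outer lock already held and
 * needs an inner lock that other paths take first. Blocking here could
 * deadlock against those paths, so it only tries the lock and otherwise
 * backs out, leaving the retry to the caller.
 */
static int newlink_model(struct model_lock *inner)
{
	if (!model_trylock(inner))
		return -EAGAIN;  /* assumption: caller retries the request */

	/* ... work that needs both locks would go here ... */

	model_unlock(inner);
	return 0;
}
```

The handler never sleeps on the second lock, so the two acquisition orders can no longer deadlock; the cost, as noted in the thread, is that the retry is pushed back to the caller.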
[RESEND PATCH net-next v2 3/3] net: bcmgenet: cleanup for dmadesc_set()
dmadesc_set() is used for setting the Tx buffer DMA address, length, and status bits on a Tx ring descriptor when a frame is being Tx'ed. Always set the Tx buffer DMA address first, before updating the length and status bits, i.e. giving the Tx descriptor to the hardware. The reason this is a cleanup rather than a fix is that the hardware won't transmit anything from a Tx ring until the TDMA producer index has been incremented. As long as the dmadesc_set() writes complete before the TDMA producer index write, life is good. Signed-off-by: Petri Gynther--- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index d77cd6d..f7b42b9 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -104,8 +104,8 @@ static inline void dmadesc_set_addr(struct bcmgenet_priv *priv, static inline void dmadesc_set(struct bcmgenet_priv *priv, void __iomem *d, dma_addr_t addr, u32 val) { - dmadesc_set_length_status(priv, d, val); dmadesc_set_addr(priv, d, addr); + dmadesc_set_length_status(priv, d, val); } static inline dma_addr_t dmadesc_get_addr(struct bcmgenet_priv *priv, -- 2.8.0.rc3.226.g39d4020
[RESEND PATCH net-next v2 1/3] net: bcmgenet: cleanup for bcmgenet_xmit()
1. Readability: Move nr_frags assignment a few lines down in order to bundle index -> ring -> txq calculations together. 2. Readability: Add parentheses around nr_frags + 1. 3. Minor fix: Stop the Tx queue and throw the error message only if the Tx queue hasn't already been stopped. Signed-off-by: Petri GyntherAcked-by: Florian Fainelli --- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index cf6445d..7f85a84 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -1447,15 +1447,19 @@ static netdev_tx_t bcmgenet_xmit(struct sk_buff *skb, struct net_device *dev) else index -= 1; - nr_frags = skb_shinfo(skb)->nr_frags; ring = >tx_rings[index]; txq = netdev_get_tx_queue(dev, ring->queue); + nr_frags = skb_shinfo(skb)->nr_frags; + spin_lock_irqsave(>lock, flags); - if (ring->free_bds <= nr_frags + 1) { - netif_tx_stop_queue(txq); - netdev_err(dev, "%s: tx ring %d full when queue %d awake\n", - __func__, index, ring->queue); + if (ring->free_bds <= (nr_frags + 1)) { + if (!netif_tx_queue_stopped(txq)) { + netif_tx_stop_queue(txq); + netdev_err(dev, + "%s: tx ring %d full when queue %d awake\n", + __func__, index, ring->queue); + } ret = NETDEV_TX_BUSY; goto out; } -- 2.8.0.rc3.226.g39d4020
[RESEND PATCH net-next v2 2/3] net: bcmgenet: cleanup for bcmgenet_xmit_frag()
Add frag_size = skb_frag_size(frag) and use it when needed. Signed-off-by: Petri GyntherAcked-by: Florian Fainelli --- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index 7f85a84..d77cd6d 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -1331,6 +1331,7 @@ static int bcmgenet_xmit_frag(struct net_device *dev, struct bcmgenet_priv *priv = netdev_priv(dev); struct device *kdev = >pdev->dev; struct enet_cb *tx_cb_ptr; + unsigned int frag_size; dma_addr_t mapping; int ret; @@ -1338,10 +1339,12 @@ static int bcmgenet_xmit_frag(struct net_device *dev, if (unlikely(!tx_cb_ptr)) BUG(); + tx_cb_ptr->skb = NULL; - mapping = skb_frag_dma_map(kdev, frag, 0, - skb_frag_size(frag), DMA_TO_DEVICE); + frag_size = skb_frag_size(frag); + + mapping = skb_frag_dma_map(kdev, frag, 0, frag_size, DMA_TO_DEVICE); ret = dma_mapping_error(kdev, mapping); if (ret) { priv->mib.tx_dma_failed++; @@ -1351,10 +1354,10 @@ static int bcmgenet_xmit_frag(struct net_device *dev, } dma_unmap_addr_set(tx_cb_ptr, dma_addr, mapping); - dma_unmap_len_set(tx_cb_ptr, dma_len, frag->size); + dma_unmap_len_set(tx_cb_ptr, dma_len, frag_size); dmadesc_set(priv, tx_cb_ptr->bd_addr, mapping, - (frag->size << DMA_BUFLENGTH_SHIFT) | dma_desc_flags | + (frag_size << DMA_BUFLENGTH_SHIFT) | dma_desc_flags | (priv->hw_params->qtag_mask << DMA_TX_QTAG_SHIFT)); return 0; -- 2.8.0.rc3.226.g39d4020
[RESEND PATCH net-next v2 0/3] bcmgenet cleanups
Three cleanup patches for bcmgenet. Petri Gynther (3): net: bcmgenet: cleanup for bcmgenet_xmit() net: bcmgenet: cleanup for bcmgenet_xmit_frag() net: bcmgenet: cleanup for dmadesc_set() drivers/net/ethernet/broadcom/genet/bcmgenet.c | 27 -- 1 file changed, 17 insertions(+), 10 deletions(-) -- 2.8.0.rc3.226.g39d4020
[PATCH] Revert "netpoll: Fix extra refcount release in netpoll_cleanup()"
This reverts commit 543e3a8da5a4c453e992d5351ef405d5e32f27d7. Direct callers of __netpoll_setup() depend on it to set np->dev, so we can't simply move that assignment up to netpoll_setup(). Reported-by: Bart Van Assche Signed-off-by: Bjorn Helgaas --- net/core/netpoll.c |3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/core/netpoll.c b/net/core/netpoll.c index a57bd17..94acfc8 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -603,6 +603,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev) const struct net_device_ops *ops; int err; + np->dev = ndev; strlcpy(np->dev_name, ndev->name, IFNAMSIZ); INIT_WORK(>cleanup_work, netpoll_async_cleanup); @@ -669,7 +670,6 @@ int netpoll_setup(struct netpoll *np) goto unlock; } dev_hold(ndev); - np->dev = ndev; if (netdev_master_upper_dev_get(ndev)) { np_err(np, "%s is a slave device, aborting\n", np->dev_name); @@ -770,7 +770,6 @@ int netpoll_setup(struct netpoll *np) return 0; put: - np->dev = NULL; dev_put(ndev); unlock: rtnl_unlock();
[PATCH net] bridge, netem: mark mailing lists as moderated
I moderate these (lightly loaded) lists to block spam. Signed-off-by: Stephen Hemminger--- MAINTAINERS | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/MAINTAINERS b/MAINTAINERS index 67d99dd..8355536 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4303,7 +4303,7 @@ F:drivers/net/ethernet/agere/ ETHERNET BRIDGE M: Stephen Hemminger -L: bri...@lists.linux-foundation.org +L: bri...@lists.linux-foundation.org (moderated for non-subscribers) L: netdev@vger.kernel.org W: http://www.linuxfoundation.org/en/Net:Bridge S: Maintained @@ -7576,7 +7576,7 @@ F:drivers/infiniband/hw/nes/ NETEM NETWORK EMULATOR M: Stephen Hemminger -L: ne...@lists.linux-foundation.org +L: ne...@lists.linux-foundation.org (moderated for non-subscribers) S: Maintained F: net/sched/sch_netem.c -- 2.1.4
[PATCH net-next v2 3/3] tipc: reduce transmission rate of reset messages when link is down
When a link is down, it will continuously try to re-establish contact with the peer by sending out a RESET or an ACTIVATE message at each timeout interval. The default value for this interval is currently 375 ms. This is wasteful, and may become a problem in very large clusters with dozens or hundreds of nodes being down simultaneously. We now introduce a simple backoff algorithm for these cases. The first five messages are sent at default rate; thereafter a message is sent only every 16th timer interval. This will cover the vast majority of link recycling cases, since the endpoint starting last will transmit at the higher speed, and the link should normally be established well before the rate needs to be reduced. The only case where we will see a degradation of link re-establishment is when the endpoints remain intact, and a glitch in the transmission media is causing the link reset. We will then experience a worst-case re-establishing time of 6 seconds, something we deem acceptable. 
Acked-by: Ying XueSigned-off-by: Jon Maloy --- net/tipc/link.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/net/tipc/link.c b/net/tipc/link.c index 7d2bb3e..42cdbd1 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -140,6 +140,7 @@ struct tipc_link { char if_name[TIPC_MAX_IF_NAME]; u32 priority; char net_plane; + u16 rst_cnt; /* Failover/synch */ u16 drop_point; @@ -701,8 +702,6 @@ static void link_profile_stats(struct tipc_link *l) /* tipc_link_timeout - perform periodic task as instructed from node timeout */ -/* tipc_link_timeout - perform periodic task as instructed from node timeout - */ int tipc_link_timeout(struct tipc_link *l, struct sk_buff_head *xmitq) { int rc = 0; @@ -730,11 +729,13 @@ int tipc_link_timeout(struct tipc_link *l, struct sk_buff_head *xmitq) l->silent_intv_cnt++; break; case LINK_RESET: - xmit = true; + if ((l->rst_cnt++ <= 4) || !(l->rst_cnt % 16)) + xmit = true; mtyp = RESET_MSG; break; case LINK_ESTABLISHING: - xmit = true; + if ((l->rst_cnt++ <= 4) || !(l->rst_cnt % 16)) + xmit = true; mtyp = ACTIVATE_MSG; break; case LINK_PEER_RESET: @@ -833,6 +834,7 @@ void tipc_link_reset(struct tipc_link *l) l->rcv_nxt = 1; l->acked = 0; l->silent_intv_cnt = 0; + l->rst_cnt = 0; l->stats.recv_info = 0; l->stale_count = 0; l->bc_peer_is_up = false; -- 1.9.1
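The backoff condition in the patch can be exercised in isolation. The sketch below copies the expression `(rst_cnt++ <= 4) || !(rst_cnt % 16)` into a standalone function and counts transmissions over a number of timer ticks; `reset_should_xmit()`, `count_xmits()` and `TIPC_TIMER_MS` are hypothetical names introduced here, and the 375 ms value is the default interval quoted in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIPC_TIMER_MS 375  /* default interval from the commit message */

/* Same expression as in the patch. The || operator is a sequence point,
 * so the right-hand side sees the already incremented counter.
 */
static bool reset_should_xmit(uint16_t *rst_cnt)
{
	return ((*rst_cnt)++ <= 4) || !(*rst_cnt % 16);
}

/* Count how many RESET/ACTIVATE messages go out in the first 'ticks'
 * timer intervals after a link reset (counter starts at 0, as set in
 * tipc_link_reset()).
 */
static unsigned int count_xmits(unsigned int ticks)
{
	uint16_t rst_cnt = 0;
	unsigned int sent = 0;

	for (unsigned int i = 0; i < ticks; i++)
		if (reset_should_xmit(&rst_cnt))
			sent++;

	return sent;
}
```

Ticks 1 to 5 transmit, then only ticks 16, 32, 48 and so on. The longest gap between two messages is therefore 16 intervals, and 16 x 375 ms = 6000 ms, which matches the 6 second worst case stated in the commit message.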
[PATCH net-next v2 1/3] tipc: eliminate buffer leak in bearer layer
When enabling a bearer we create a 'neighbor discoverer' instance by calling the function tipc_disc_create() before the bearer is actually registered in the list of enabled bearers. Because of this, the very first discovery broadcast message, created by the mentioned function, is lost, since it cannot find any valid bearer to use. Furthermore, the used send function, tipc_bearer_xmit_skb() does not free the given buffer when it cannot find a bearer, resulting in the leak of exactly one send buffer each time a bearer is enabled. This commit fixes this problem by introducing two changes: 1) Instead of attempting to send the discovery message directly, we let tipc_disc_create() return the discovery buffer to the calling function, tipc_enable_bearer(), so that the latter can send it when the enabling sequence is finished. 2) In tipc_bearer_xmit_skb(), as well as in the two other transmit functions at the bearer layer, we now free the indicated buffer or buffer chain when a valid bearer cannot be found. 
Acked-by: Ying XueSigned-off-by: Jon Maloy --- net/tipc/bearer.c | 51 ++- net/tipc/discover.c | 7 ++- net/tipc/discover.h | 2 +- 3 files changed, 29 insertions(+), 31 deletions(-) diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c index 27a5406..20566e9 100644 --- a/net/tipc/bearer.c +++ b/net/tipc/bearer.c @@ -205,6 +205,7 @@ static int tipc_enable_bearer(struct net *net, const char *name, struct tipc_bearer *b; struct tipc_media *m; struct tipc_bearer_names b_names; + struct sk_buff *skb; char addr_string[16]; u32 bearer_id; u32 with_this_prio; @@ -301,7 +302,7 @@ restart: b->net_plane = bearer_id + 'A'; b->priority = priority; - res = tipc_disc_create(net, b, >bcast_addr); + res = tipc_disc_create(net, b, >bcast_addr, ); if (res) { bearer_disable(net, b); pr_warn("Bearer <%s> rejected, discovery object creation failed\n", @@ -310,7 +311,8 @@ restart: } rcu_assign_pointer(tn->bearer_list[bearer_id], b); - + if (skb) + tipc_bearer_xmit_skb(net, bearer_id, skb, >bcast_addr); pr_info("Enabled bearer <%s>, discovery domain %s, priority %u\n", name, tipc_addr_string_fill(addr_string, disc_domain), priority); @@ -450,6 +452,8 @@ void tipc_bearer_xmit_skb(struct net *net, u32 bearer_id, b = rcu_dereference_rtnl(tn->bearer_list[bearer_id]); if (likely(b)) b->media->send_msg(net, skb, b, dest); + else + kfree_skb(skb); rcu_read_unlock(); } @@ -468,11 +472,11 @@ void tipc_bearer_xmit(struct net *net, u32 bearer_id, rcu_read_lock(); b = rcu_dereference_rtnl(tn->bearer_list[bearer_id]); - if (likely(b)) { - skb_queue_walk_safe(xmitq, skb, tmp) { - __skb_dequeue(xmitq); - b->media->send_msg(net, skb, b, dst); - } + if (unlikely(!b)) + __skb_queue_purge(xmitq); + skb_queue_walk_safe(xmitq, skb, tmp) { + __skb_dequeue(xmitq); + b->media->send_msg(net, skb, b, dst); } rcu_read_unlock(); } @@ -490,14 +494,14 @@ void tipc_bearer_bc_xmit(struct net *net, u32 bearer_id, rcu_read_lock(); b = rcu_dereference_rtnl(tn->bearer_list[bearer_id]); - if (likely(b)) { - 
skb_queue_walk_safe(xmitq, skb, tmp) { - hdr = buf_msg(skb); - msg_set_non_seq(hdr, 1); - msg_set_mc_netid(hdr, net_id); - __skb_dequeue(xmitq); - b->media->send_msg(net, skb, b, >bcast_addr); - } + if (unlikely(!b)) + __skb_queue_purge(xmitq); + skb_queue_walk_safe(xmitq, skb, tmp) { + hdr = buf_msg(skb); + msg_set_non_seq(hdr, 1); + msg_set_mc_netid(hdr, net_id); + __skb_dequeue(xmitq); + b->media->send_msg(net, skb, b, >bcast_addr); } rcu_read_unlock(); } @@ -513,24 +517,21 @@ void tipc_bearer_bc_xmit(struct net *net, u32 bearer_id, * ignores packets sent using interface multicast, and traffic sent to other * nodes (which can happen if interface is running in promiscuous mode). */ -static int tipc_l2_rcv_msg(struct sk_buff *buf, struct net_device *dev, +static int tipc_l2_rcv_msg(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev) { struct tipc_bearer *b; rcu_read_lock(); b = rcu_dereference_rtnl(dev->tipc_ptr); - if (likely(b)) { - if (likely(buf->pkt_type <= PACKET_BROADCAST)) { - buf->next = NULL; -
[PATCH net-next v2 0/3] tipc: some small fixes
When running TIPC in large clusters we experience behavior that may potentially become problematic in the future. This series picks some low-hanging fruit in this regard, and also fixes a couple of other minor issues. v2: Corrected typos in commit #3, as per feedback from S. Shtylyov Jon Maloy (3): tipc: eliminate buffer leak in bearer layer tipc: stricter filtering of packets in bearer layer tipc: reduce transmission rate of reset messages when link is down net/tipc/bearer.c | 101 ++-- net/tipc/discover.c | 7 ++-- net/tipc/discover.h | 2 +- net/tipc/link.c | 10 +++--- net/tipc/msg.h | 5 +++ 5 files changed, 73 insertions(+), 52 deletions(-) -- 1.9.1
[PATCH net-next v2 2/3] tipc: stricter filtering of packets in bearer layer
Resetting a bearer/interface, with the consequence of resetting all its
pertaining links, is not an atomic action. This becomes particularly
evident in very large clusters, where a lot of traffic may happen on the
remaining links while we are busy shutting them down. In extreme cases,
we may even see links being re-created and re-established before we are
finished with the job.

To solve this, we now introduce a solution where we temporarily detach
the bearer from the interface when the bearer is reset. This inhibits
all packet reception, while sending still is possible. For the latter,
we use the fact that the device's user pointer now is zero to filter out
which packets can be sent during this situation; i.e., outgoing RESET
messages only. This filtering serves to speed up the neighbors'
detection of the loss event, and saves us from unnecessary probing.

Acked-by: Ying Xue
Signed-off-by: Jon Maloy
---
 net/tipc/bearer.c | 50 +-
 net/tipc/msg.h    |  5 +
 2 files changed, 38 insertions(+), 17 deletions(-)

diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
index 20566e9..6f11c62 100644
--- a/net/tipc/bearer.c
+++ b/net/tipc/bearer.c
@@ -337,23 +337,16 @@ static int tipc_reset_bearer(struct net *net, struct tipc_bearer *b)
  */
 static void bearer_disable(struct net *net, struct tipc_bearer *b)
 {
-	struct tipc_net *tn = net_generic(net, tipc_net_id);
-	u32 i;
+	struct tipc_net *tn = tipc_net(net);
+	int bearer_id = b->identity;

 	pr_info("Disabling bearer <%s>\n", b->name);
 	b->media->disable_media(b);
-
-	tipc_node_delete_links(net, b->identity);
+	tipc_node_delete_links(net, bearer_id);
 	RCU_INIT_POINTER(b->media_ptr, NULL);
 	if (b->link_req)
 		tipc_disc_delete(b->link_req);
-
-	for (i = 0; i < MAX_BEARERS; i++) {
-		if (b == rtnl_dereference(tn->bearer_list[i])) {
-			RCU_INIT_POINTER(tn->bearer_list[i], NULL);
-			break;
-		}
-	}
+	RCU_INIT_POINTER(tn->bearer_list[bearer_id], NULL);
 	kfree_rcu(b, rcu);
 }

@@ -396,7 +389,7 @@ void tipc_disable_l2_media(struct tipc_bearer *b)

 /**
  * tipc_l2_send_msg - send a TIPC packet out over an L2 interface
- * @buf: the packet to be sent
+ * @skb: the packet to be sent
  * @b: the bearer through which the packet is to be sent
  * @dest: peer destination address
  */
@@ -405,17 +398,21 @@ int tipc_l2_send_msg(struct net *net, struct sk_buff *skb,
 {
 	struct net_device *dev;
 	int delta;
+	void *tipc_ptr;

 	dev = (struct net_device *)rcu_dereference_rtnl(b->media_ptr);
 	if (!dev)
 		return 0;

+	/* Send RESET message even if bearer is detached from device */
+	tipc_ptr = rtnl_dereference(dev->tipc_ptr);
+	if (unlikely(!tipc_ptr && !msg_is_reset(buf_msg(skb))))
+		goto drop;
+
 	delta = dev->hard_header_len - skb_headroom(skb);
 	if ((delta > 0) &&
-	    pskb_expand_head(skb, SKB_DATA_ALIGN(delta), 0, GFP_ATOMIC)) {
-		kfree_skb(skb);
-		return 0;
-	}
+	    pskb_expand_head(skb, SKB_DATA_ALIGN(delta), 0, GFP_ATOMIC))
+		goto drop;

 	skb_reset_network_header(skb);
 	skb->dev = dev;
@@ -424,6 +421,9 @@ int tipc_l2_send_msg(struct net *net, struct sk_buff *skb,
 			dev->dev_addr, skb->len);
 	dev_queue_xmit(skb);
 	return 0;
+drop:
+	kfree_skb(skb);
+	return 0;
 }

 int tipc_bearer_mtu(struct net *net, u32 bearer_id)
@@ -549,9 +549,18 @@ static int tipc_l2_device_event(struct notifier_block *nb, unsigned long evt,
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct net *net = dev_net(dev);
+	struct tipc_net *tn = tipc_net(net);
 	struct tipc_bearer *b;
+	int i;

 	b = rtnl_dereference(dev->tipc_ptr);
+	if (!b) {
+		for (i = 0; i < MAX_BEARERS; b = NULL, i++) {
+			b = rtnl_dereference(tn->bearer_list[i]);
+			if (b && (b->media_ptr == dev))
+				break;
+		}
+	}
 	if (!b)
 		return NOTIFY_DONE;

@@ -561,13 +570,20 @@ static int tipc_l2_device_event(struct notifier_block *nb, unsigned long evt,
 	case NETDEV_CHANGE:
 		if (netif_carrier_ok(dev))
 			break;
+	case NETDEV_UP:
+		rcu_assign_pointer(dev->tipc_ptr, b);
+		break;
 	case NETDEV_GOING_DOWN:
+		RCU_INIT_POINTER(dev->tipc_ptr, NULL);
+		synchronize_net();
+		tipc_reset_bearer(net, b);
+		break;
 	case NETDEV_CHANGEMTU:
 		tipc_reset_bearer(net, b);
 		break;
 	case NETDEV_CHANGEADDR:
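The send-side filtering described in the commit message can be pictured in plain user-space C. This is only a sketch under stated assumptions: `toy_dev`, `toy_msg_type` and `toy_may_send()` are hypothetical names made up for illustration and are not part of TIPC. The only thing taken from the patch is the rule itself: once the device's TIPC pointer is NULL, everything except RESET messages is dropped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not TIPC code. */
struct toy_dev {
	void *tipc_ptr;		/* NULL while the bearer is detached */
};

enum toy_msg_type {
	TOY_MSG_DATA,
	TOY_MSG_RESET,
};

/* Returns true if the packet may be handed to the device for sending.
 * While detached, only RESET messages pass, mirroring the check that
 * the patch adds to tipc_l2_send_msg().
 */
static bool toy_may_send(const struct toy_dev *dev, enum toy_msg_type type)
{
	if (dev->tipc_ptr == NULL && type != TOY_MSG_RESET)
		return false;
	return true;
}
```

With an attached device every message type passes; with `tipc_ptr` set to NULL only `TOY_MSG_RESET` does.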
[PATCH net-next] bpf, verifier: further improve search pruning
The verifier needs to go through every path of the program in order
to check that it terminates safely, which can be quite a lot of
instructions that need to be processed f.e. in cases with more branchy
programs. With search pruning from f1bca824dabb ("bpf: add search pruning
optimization to verifier") the search space can already be reduced
significantly when the verifier detects that a previously walked path
with same register and stack contents terminated already (see verifier's
states_equal()), so the search can skip walking those states.

When working with larger programs of > ~2000 (out of max 4096) insns,
we found that the current limit of 32k instructions is easily hit. For
example, a case we ran into is that the search space cannot be pruned
due to branches at the beginning of the program that make use of certain
stack space slots (STACK_MISC), which are never used in the remaining
program (STACK_INVALID). Therefore, the verifier needs to walk paths for
the slots in STACK_INVALID state, but also all remaining paths with a
stack structure, where the slots are in STACK_MISC, which can nearly
double the search space needed. After various experiments, we find that
a limit of 64k processed insns is a more reasonable choice when dealing
with larger programs in practice. This still allows rejecting extreme
crafted cases that can have a much higher complexity (f.e. > ~300k)
within the 4096 insns limit due to search pruning not being able to
take effect.

Furthermore, we found that a lot of states can be pruned after a call
instruction, f.e. we were able to reduce the search state by ~35% in
some cases with this heuristic; the trade-off is to keep a bit more
states in env->explored_states. Usually, call instructions have a number
of preceding register assignments and/or stack stores, where search
pruning has a better chance to succeed in the states_equal() test.
The current code marks the branch targets with STATE_LIST_MARK in case
of conditional jumps, and the next (t + 1) instruction in case of an
unconditional jump so that f.e. a backjump will walk it. We also did
experiments with using t + insns[t].off + 1 as a marker in the
unconditional jump case instead of t + 1, with the rationale that the
two branches of execution that converge after the label might have more
potential for pruning. We found that it was a bit better, but not
necessarily significantly better than the current state, perhaps also
due to clang not generating back jumps often. Hence, we left that as is
for now.

Signed-off-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
---
 kernel/bpf/verifier.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2e08f8e..212e52a 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -202,6 +202,9 @@ struct verifier_env {
 	bool allow_ptr_leaks;
 };

+#define BPF_COMPLEXITY_LIMIT_INSNS	65536
+#define BPF_COMPLEXITY_LIMIT_STACK	1024
+
 /* verbose verifier prints what it's seeing
  * bpf_check() is called under lock, so no race to access these global vars
  */
@@ -454,7 +457,7 @@ static struct verifier_state *push_stack(struct verifier_env *env, int insn_idx,
 	elem->next = env->head;
 	env->head = elem;
 	env->stack_size++;
-	if (env->stack_size > 1024) {
+	if (env->stack_size > BPF_COMPLEXITY_LIMIT_STACK) {
 		verbose("BPF program is too complex\n");
 		goto err;
 	}
@@ -1539,6 +1542,8 @@ peek_stack:
 				goto peek_stack;
 			else if (ret < 0)
 				goto err_free;
+			if (t + 1 < insn_cnt)
+				env->explored_states[t + 1] = STATE_LIST_MARK;
 		} else if (opcode == BPF_JA) {
 			if (BPF_SRC(insns[t].code) != BPF_K) {
 				ret = -EINVAL;
@@ -1743,7 +1748,7 @@ static int do_check(struct verifier_env *env)
 		insn = &insns[insn_idx];
 		class = BPF_CLASS(insn->code);

-		if (++insn_processed > 32768) {
+		if (++insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
 			verbose("BPF program is too large. Proccessed %d insn\n",
 				insn_processed);
 			return -E2BIG;
-- 
1.9.3
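To get a feel for why pruning and the processed-insn limit interact the way the commit message describes, here is a toy model in user-space C. It is a sketch under loud assumptions and not the verifier's algorithm: the "program" is just a chain of if/else diamonds, each arm costs one processed instruction, and the state at every join point is assumed to be identical for both arms (which is exactly the case where states_equal() would allow pruning). All `toy_*` names are hypothetical; only the 65536 limit and the -E2BIG return are taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define TOY_COMPLEXITY_LIMIT_INSNS	65536	/* same value as the patch */
#define TOY_MAX_BRANCHES		64

struct toy_env {
	unsigned int insn_processed;
	bool explored[TOY_MAX_BRANCHES + 1];	/* join points already walked */
	bool prune;
};

/* Walk a chain of n_branches if/else diamonds starting at diamond idx.
 * Without pruning every arm re-walks the rest of the program, so the
 * cost doubles per diamond; with pruning the second arrival at a join
 * point is skipped.
 */
static int toy_walk(struct toy_env *env, unsigned int idx,
		    unsigned int n_branches)
{
	int arm, ret;

	if (idx == n_branches)
		return 0;	/* reached the exit instruction */

	for (arm = 0; arm < 2; arm++) {
		if (++env->insn_processed > TOY_COMPLEXITY_LIMIT_INSNS)
			return -E2BIG;
		if (env->prune && env->explored[idx + 1])
			continue;	/* equivalent state seen before */
		ret = toy_walk(env, idx + 1, n_branches);
		if (ret)
			return ret;
		env->explored[idx + 1] = true;
	}
	return 0;
}

/* Returns 0 or -E2BIG and reports how many insns were processed. */
static int toy_check(unsigned int n_branches, bool prune,
		     unsigned int *processed)
{
	struct toy_env env;
	int ret;

	if (n_branches > TOY_MAX_BRANCHES)
		return -EINVAL;
	memset(&env, 0, sizeof(env));
	env.prune = prune;
	ret = toy_walk(&env, 0, n_branches);
	*processed = env.insn_processed;
	return ret;
}
```

In this model 20 diamonds cost 40 processed insns with pruning, while the unpruned walk would need about two million and hits the limit. Anything that defeats pruning, such as the STACK_MISC vs. STACK_INVALID mismatch mentioned above, pushes a real program toward the unpruned end of that range.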
Re: [PATCH net-next v2 0/3] udp: support SO_PEEK_OFF
From: Willem de Bruijn
Date: Tue, 5 Apr 2016 12:41:13 -0400

> From: Willem de Bruijn
>
> Support peeking at a non-zero offset for UDP sockets. Match the
> existing behavior on Unix datagram sockets.
>
> 1/3 makes the sk_peek_offset functions safe to use outside locks
> 2/3 removes udp headers before enqueue, to simplify offset arithmetic
> 3/3 introduces SO_PEEK_OFF support, with Unix socket peek semantics.
>
> Changes
>   v1->v2
>   - squash patches 3 and 4

Series applied, thanks Willem.
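For readers who have not used the option, a user-space sketch of what peeking at a non-zero offset looks like on a UDP socket. It uses only standard socket calls (`setsockopt()` with `SO_PEEK_OFF`, `recv()` with `MSG_PEEK`); the helper name `peek_udp_at_offset()` is made up for this example, and the function is assumed to run on a kernel that carries this series. On kernels without UDP support the `setsockopt()` call fails and the helper returns -1.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SO_PEEK_OFF
#define SO_PEEK_OFF 42	/* asm-generic value; assumption for old headers */
#endif

/* Send one datagram to ourselves over loopback, then peek at it
 * starting at byte 'off'. Returns the number of bytes peeked, or -1.
 */
static ssize_t peek_udp_at_offset(const char *payload, int off,
				  char *out, size_t outlen)
{
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	ssize_t n = -1;
	int fd;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* let the kernel pick a port */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	if (getsockname(fd, (struct sockaddr *)&addr, &alen) < 0)
		goto out;
	if (sendto(fd, payload, strlen(payload), 0,
		   (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	/* enable peek-at-offset; fails where UDP lacks support */
	if (setsockopt(fd, SOL_SOCKET, SO_PEEK_OFF, &off, sizeof(off)) < 0)
		goto out;

	n = recv(fd, out, outlen, MSG_PEEK);
out:
	close(fd);
	return n;
}
```

Calling it with the payload "hello world" and an offset of 6 is expected to copy the five bytes "world" into the buffer, with the datagram itself left in the receive queue.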
Re: [net-next 00/18][pull request] 40GbE Intel Wired LAN Driver Updates 2016-04-05
From: Jeff Kirsher
Date: Tue, 5 Apr 2016 13:17:17 -0700

> This series contains updates to i40e and i40evf only.

This looks fine, pulled, thanks Jeff.
[net-next 09/18] i40e: Fix up return code
From: Jesse Brandeburg

The i40e_common.c file typically uses i40e_status as a return code, but
this one case got missed.

Signed-off-by: Jesse Brandeburg
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_common.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index b0fd684..8276a13 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -1901,13 +1901,13 @@ i40e_status i40e_aq_set_phy_int_mask(struct i40e_hw *hw,
  *
  * Reset the external PHY.
  **/
-enum i40e_status_code i40e_aq_set_phy_debug(struct i40e_hw *hw, u8 cmd_flags,
-					struct i40e_asq_cmd_details *cmd_details)
+i40e_status i40e_aq_set_phy_debug(struct i40e_hw *hw, u8 cmd_flags,
+				  struct i40e_asq_cmd_details *cmd_details)
 {
 	struct i40e_aq_desc desc;
 	struct i40e_aqc_set_phy_debug *cmd =
 		(struct i40e_aqc_set_phy_debug *)&desc.params.raw;
-	enum i40e_status_code status;
+	i40e_status status;

 	i40e_fill_default_direct_cmd_desc(&desc,
 					  i40e_aqc_opc_set_phy_debug);
-- 
2.5.5
[net-next 11/18] i40e: Assure that adminq is alive in debug mode
From: Shannon Nelson

When dropping into debug mode in a failed probe, make sure that the
AdminQ is left alive for possible hand debug of driver and firmware
states. Move the mutex_init calls earlier in probe so that if init
fails, the admin queue interface is still available for debugging
purposes.

Signed-off-by: Shannon Nelson
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 2464dca..56d4416 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10822,6 +10822,12 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	hw->bus.func = PCI_FUNC(pdev->devfn);
 	pf->instance = pfs_found;

+	/* set up the locks for the AQ, do this only once in probe
+	 * and destroy them only once in remove
+	 */
+	mutex_init(&hw->aq.asq_mutex);
+	mutex_init(&hw->aq.arq_mutex);
+
 	if (debug != -1) {
 		pf->msg_enable = pf->hw.debug_mask;
 		pf->msg_enable = debug;
@@ -10867,12 +10873,6 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* set up a default setting for link flow control */
 	pf->hw.fc.requested_mode = I40E_FC_NONE;

-	/* set up the locks for the AQ, do this only once in probe
-	 * and destroy them only once in remove
-	 */
-	mutex_init(&hw->aq.asq_mutex);
-	mutex_init(&hw->aq.arq_mutex);
-
 	err = i40e_init_adminq(hw);
 	if (err) {
 		if (err == I40E_ERR_FIRMWARE_API_VERSION)
@@ -11265,7 +11265,6 @@ err_init_lan_hmc:
 	kfree(pf->qp_pile);
 err_sw_init:
 err_adminq_setup:
-	(void)i40e_shutdown_adminq(hw);
 err_pf_reset:
 	iounmap(hw->hw_addr);
 err_ioremap:
-- 
2.5.5
[net-next 04/18] i40e/i40evf: Fix handling of boolean logic in polling routines
From: Alexander Duyck

In the polling routines for i40e and i40evf we were using bitwise operators
to avoid the side effects of the logical operators, specifically the fact
that if the first case is true with "||" we skip the second case, or if it
is false with "&&" we skip the second case. This fixes an earlier patch
that converted the bitwise operators over to the logical operators and
instead replaces the entire thing with just an if statement since it should
be more readable what we are trying to do this way.

Fixes: 1a36d7fadd14 ("i40e/i40evf: use logical operators, not bitwise")
Signed-off-by: Alexander Duyck
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 13 -
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 13 -
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 9af1411..8fb2a96 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1975,9 +1975,11 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 	 * budget and be more aggressive about cleaning up the Tx descriptors.
 	 */
 	i40e_for_each_ring(ring, q_vector->tx) {
-		clean_complete = clean_complete &&
-				 i40e_clean_tx_irq(ring, vsi->work_limit);
-		arm_wb = arm_wb || ring->arm_wb;
+		if (!i40e_clean_tx_irq(ring, vsi->work_limit)) {
+			clean_complete = false;
+			continue;
+		}
+		arm_wb |= ring->arm_wb;
 		ring->arm_wb = false;
 	}

@@ -1999,8 +2001,9 @@ int i40e_napi_poll(struct napi_struct *napi, int budget)
 			cleaned = i40e_clean_rx_irq_1buf(ring, budget_per_ring);

 		work_done += cleaned;
-		/* if we didn't clean as many as budgeted, we must be done */
-		clean_complete = clean_complete && (budget_per_ring > cleaned);
+		/* if we clean as many as budgeted, we must not be done */
+		if (cleaned >= budget_per_ring)
+			clean_complete = false;
 	}

 	/* If work not completed, return budget and polling will return */
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 5f9c1bb..839a6df 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -1411,9 +1411,11 @@ int i40evf_napi_poll(struct napi_struct *napi, int budget)
 	 * budget and be more aggressive about cleaning up the Tx descriptors.
 	 */
 	i40e_for_each_ring(ring, q_vector->tx) {
-		clean_complete = clean_complete &&
-				 i40e_clean_tx_irq(ring, vsi->work_limit);
-		arm_wb = arm_wb || ring->arm_wb;
+		if (!i40e_clean_tx_irq(ring, vsi->work_limit)) {
+			clean_complete = false;
+			continue;
+		}
+		arm_wb |= ring->arm_wb;
 		ring->arm_wb = false;
 	}

@@ -1435,8 +1437,9 @@ int i40evf_napi_poll(struct napi_struct *napi, int budget)
 			cleaned = i40e_clean_rx_irq_1buf(ring, budget_per_ring);

 		work_done += cleaned;
-		/* if we didn't clean as many as budgeted, we must be done */
-		clean_complete = clean_complete && (budget_per_ring > cleaned);
+		/* if we clean as many as budgeted, we must not be done */
+		if (cleaned >= budget_per_ring)
+			clean_complete = false;
 	}

 	/* If work not completed, return budget and polling will return */
-- 
2.5.5
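The side effect the commit message talks about is easy to reproduce outside the driver. The sketch below is plain C with hypothetical `toy_*` names standing in for the rings and for i40e_clean_tx_irq(); it is not driver code. It contrasts the "&&" form, where the clean routine is no longer called once clean_complete has gone false, with the if-statement form from the patch, where every ring is still cleaned.

```c
#include <stdbool.h>

/* Hypothetical ring; 'cleaned' counts calls to the clean routine. */
struct toy_ring {
	bool has_more_work;
	bool arm_wb;
	unsigned int cleaned;
};

/* Stand-in for i40e_clean_tx_irq(): returns true when the ring is done. */
static bool toy_clean_ring(struct toy_ring *ring)
{
	ring->cleaned++;
	return !ring->has_more_work;
}

/* "&&" form: short-circuit evaluation skips later rings entirely. */
static bool toy_poll_short_circuit(struct toy_ring *rings, int n)
{
	bool clean_complete = true;
	int i;

	for (i = 0; i < n; i++)
		clean_complete = clean_complete && toy_clean_ring(&rings[i]);
	return clean_complete;
}

/* if-statement form, as in the patch: every ring gets cleaned. */
static bool toy_poll_if(struct toy_ring *rings, int n, bool *arm_wb)
{
	bool clean_complete = true;
	int i;

	for (i = 0; i < n; i++) {
		if (!toy_clean_ring(&rings[i])) {
			clean_complete = false;
			continue;
		}
		*arm_wb |= rings[i].arm_wb;
		rings[i].arm_wb = false;
	}
	return clean_complete;
}
```

With two rings where only the first still has work pending, the short-circuit version never touches the second ring, while the if-statement version cleans both and still reports that polling is not complete.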
[net-next 03/18] i40evf: remove dead code
From: Alan Cox

The only error case is when the malloc fails, in which case the clean up
loop does nothing at all, so remove it.

Signed-off-by: Alan Cox
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 4b70aae..820ad94 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1507,7 +1507,7 @@ static int i40evf_alloc_q_vectors(struct i40evf_adapter *adapter)
 	adapter->q_vectors = kcalloc(num_q_vectors, sizeof(*q_vector),
 				     GFP_KERNEL);
 	if (!adapter->q_vectors)
-		goto err_out;
+		return -ENOMEM;

 	for (q_idx = 0; q_idx < num_q_vectors; q_idx++) {
 		q_vector = &adapter->q_vectors[q_idx];
@@ -1519,15 +1519,6 @@ static int i40evf_alloc_q_vectors(struct i40evf_adapter *adapter)
 	}

 	return 0;
-
-err_out:
-	while (q_idx) {
-		q_idx--;
-		q_vector = &adapter->q_vectors[q_idx];
-		netif_napi_del(&q_vector->napi);
-	}
-	kfree(adapter->q_vectors);
-	return -ENOMEM;
 }

 /**
-- 
2.5.5
[net-next 13/18] i40e: Notify VFs of all resets
From: Mitch Williams

Notify VFs in the reset interrupt handler, instead of the actual reset
initiation code. This allows the VFs to get properly notified for all
resets, including resets initiated by different PFs on the same physical
device.

Signed-off-by: Mitch Williams
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index e615f66..98bc749 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5534,8 +5534,6 @@ void i40e_do_reset(struct i40e_pf *pf, u32 reset_flags)

 	WARN_ON(in_interrupt());

-	if (i40e_check_asq_alive(&pf->hw))
-		i40e_vc_notify_reset(pf);

 	/* do the biggest reset indicated */
 	if (reset_flags & BIT_ULL(__I40E_GLOBAL_RESET_REQUESTED)) {
@@ -6738,6 +6736,8 @@ static void i40e_prep_for_reset(struct i40e_pf *pf)
 	clear_bit(__I40E_RESET_INTR_RECEIVED, &pf->state);
 	if (test_and_set_bit(__I40E_RESET_RECOVERY_PENDING, &pf->state))
 		return;
+	if (i40e_check_asq_alive(&pf->hw))
+		i40e_vc_notify_reset(pf);

 	dev_dbg(&pf->pdev->dev, "Tearing down internal switch for reset\n");

-- 
2.5.5
[net-next 12/18] i40e: Remove timer and task only if created
From: Shannon Nelson

In some error scenarios, we may find ourselves trying to remove a
non-existent timer or worktask. This causes the kernel some bit of
consternation, so don't do it.

Signed-off-by: Shannon Nelson
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 56d4416..e615f66 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11306,8 +11306,10 @@ static void i40e_remove(struct pci_dev *pdev)
 	/* no more scheduling of any task */
 	set_bit(__I40E_SUSPENDED, &pf->state);
 	set_bit(__I40E_DOWN, &pf->state);
-	del_timer_sync(&pf->service_timer);
-	cancel_work_sync(&pf->service_task);
+	if (pf->service_timer.data)
+		del_timer_sync(&pf->service_timer);
+	if (pf->service_task.func)
+		cancel_work_sync(&pf->service_task);

 	if (pf->flags & I40E_FLAG_SRIOV_ENABLED) {
 		i40e_free_vfs(pf);
-- 
2.5.5
[net-next 06/18] i40e/i40evf: Fix casting in transmit code
From: Jesse Brandeburg

Simple cast to fix a sparse warning.

Fixes: commit 5453205cd097 ("i40e/i40evf: Enable support for SKB_GSO_UDP_TUNNEL_CSUM")
Signed-off-by: Jesse Brandeburg
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 5 +++--
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 01cff07..5bef5b0 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2305,7 +2305,8 @@ static int i40e_tso(struct i40e_ring *tx_ring, struct sk_buff *skb,

 		/* remove payload length from outer checksum */
 		paylen = (__force u16)l4.udp->check;
-		paylen += ntohs(1) * (u16)~(skb->len - l4_offset);
+		paylen += ntohs((__force __be16)1) *
+			  (u16)~(skb->len - l4_offset);
 		l4.udp->check = ~csum_fold((__force __wsum)paylen);
 	}

@@ -2327,7 +2328,7 @@ static int i40e_tso(struct i40e_ring *tx_ring, struct sk_buff *skb,

 	/* remove payload length from inner checksum */
 	paylen = (__force u16)l4.tcp->check;
-	paylen += ntohs(1) * (u16)~(skb->len - l4_offset);
+	paylen += ntohs((__force __be16)1) * (u16)~(skb->len - l4_offset);
 	l4.tcp->check = ~csum_fold((__force __wsum)paylen);

 	/* compute length of segmentation header */
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 9e91136..570348d 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -1572,7 +1572,8 @@ static int i40e_tso(struct i40e_ring *tx_ring, struct sk_buff *skb,

 		/* remove payload length from outer checksum */
 		paylen = (__force u16)l4.udp->check;
-		paylen += ntohs(1) * (u16)~(skb->len - l4_offset);
+		paylen += ntohs((__force __be16)1) *
+			  (u16)~(skb->len - l4_offset);
 		l4.udp->check = ~csum_fold((__force __wsum)paylen);
 	}

@@ -1594,7 +1595,7 @@ static int i40e_tso(struct i40e_ring *tx_ring, struct sk_buff *skb,

 	/* remove payload length from inner checksum */
 	paylen = (__force u16)l4.tcp->check;
-	paylen += ntohs(1) * (u16)~(skb->len - l4_offset);
+	paylen += ntohs((__force __be16)1) * (u16)~(skb->len - l4_offset);
 	l4.tcp->check = ~csum_fold((__force __wsum)paylen);

 	/* compute length of segmentation header */
-- 
2.5.5
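As background for the "remove payload length from ... checksum" lines touched above: in ones' complement arithmetic, subtracting a value is the same as adding its bitwise complement, which is why the code adds `~(skb->len - l4_offset)`. The sketch below shows only that property in user-space C. The `toy_*` helpers are hypothetical and deliberately ignore the byte-order handling that the `ntohs()` multiplication takes care of in the driver.

```c
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t toy_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement addition of two 16-bit words. */
static uint16_t toy_csum_add(uint16_t sum, uint16_t val)
{
	return toy_fold((uint32_t)sum + val);
}

/* Ones' complement subtraction: add the bitwise complement. */
static uint16_t toy_csum_sub(uint16_t sum, uint16_t val)
{
	return toy_csum_add(sum, (uint16_t)~val);
}
```

Adding a length of 1400 (0x0578) to a partial sum of 0x1234 and then removing it again with toy_csum_sub() gives back 0x1234.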
[net-next 17/18] i40e: Change comment to reflect correct function name
From: Mitch Williams

Minor correction in the comment to reflect the correct function name.

Signed-off-by: Mitch Williams
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 291d628..47b9e62 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -63,7 +63,7 @@ static void i40e_vc_vf_broadcast(struct i40e_pf *pf,
 }

 /**
- * i40e_vc_notify_link_state
+ * i40e_vc_notify_vf_link_state
  * @vf: pointer to the VF structure
  *
  * send a link status message to a single VF
-- 
2.5.5
[net-next 15/18] i40e: Change unknown event error msg to ignore message
From: Shannon Nelson

There's no real error in an unknown event from the Firmware, we're just
posting a useful FYI notice, so this patch simply removes the "Error"
word.

Signed-off-by: Shannon Nelson
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 98bc749..3841005 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -6371,7 +6371,7 @@ static void i40e_clean_adminq_subtask(struct i40e_pf *pf)
 			break;
 		default:
 			dev_info(&pf->pdev->dev,
-				 "ARQ Error: Unknown event 0x%04x received\n",
+				 "ARQ: Unknown event 0x%04x ignored\n",
 				 opcode);
 			break;
 		}
-- 
2.5.5
[net-next 08/18] i40e: Save off VSI resource count when updating VSI
From: Kevin Scott

When updating a VSI, save off the number of allocated and unallocated
VSIs as we do when adding a VSI.

Signed-off-by: Kevin Scott
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_common.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 4596294..b0fd684 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -2157,6 +2157,9 @@ i40e_status i40e_aq_update_vsi_params(struct i40e_hw *hw,
 	struct i40e_aq_desc desc;
 	struct i40e_aqc_add_get_update_vsi *cmd =
 		(struct i40e_aqc_add_get_update_vsi *)&desc.params.raw;
+	struct i40e_aqc_add_get_update_vsi_completion *resp =
+		(struct i40e_aqc_add_get_update_vsi_completion *)
+		&desc.params.raw;
 	i40e_status status;

 	i40e_fill_default_direct_cmd_desc(&desc,
@@ -2168,6 +2171,9 @@ i40e_status i40e_aq_update_vsi_params(struct i40e_hw *hw,
 	status = i40e_asq_send_command(hw, &desc, &vsi_ctx->info,
 				    sizeof(vsi_ctx->info), cmd_details);

+	vsi_ctx->vsis_allocated = le16_to_cpu(resp->vsi_used);
+	vsi_ctx->vsis_unallocated = le16_to_cpu(resp->vsi_free);
+
 	return status;
 }

-- 
2.5.5
[net-next 05/18] i40e/i40evf: Add support for bulk free in Tx cleanup
From: Alexander DuyckThis patch enables bulk Tx clean for skbs. In order to enable it we need to pass the napi_budget value as that is used to determine if we are truly running in NAPI mode or if we are simply calling the routine from netpoll with a budget of 0. In order to avoid adding too many more variables I thought it best to pass the VSI directly in a fashion similar to what we do on igb and ixgbe with the q_vector. Signed-off-by: Alexander Duyck Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 20 +++- drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 20 +++- 2 files changed, 22 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index 8fb2a96..01cff07 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -636,19 +636,21 @@ u32 i40e_get_tx_pending(struct i40e_ring *ring, bool in_sw) /** * i40e_clean_tx_irq - Reclaim resources after transmit completes - * @tx_ring: tx ring to clean - * @budget: how many cleans we're allowed + * @vsi: the VSI we care about + * @tx_ring: Tx ring to clean + * @napi_budget: Used to determine if we are in netpoll * * Returns true if there's any budget left (e.g. 
the clean is finished) **/ -static bool i40e_clean_tx_irq(struct i40e_ring *tx_ring, int budget) +static bool i40e_clean_tx_irq(struct i40e_vsi *vsi, + struct i40e_ring *tx_ring, int napi_budget) { u16 i = tx_ring->next_to_clean; struct i40e_tx_buffer *tx_buf; struct i40e_tx_desc *tx_head; struct i40e_tx_desc *tx_desc; - unsigned int total_packets = 0; - unsigned int total_bytes = 0; + unsigned int total_bytes = 0, total_packets = 0; + unsigned int budget = vsi->work_limit; tx_buf = _ring->tx_bi[i]; tx_desc = I40E_TX_DESC(tx_ring, i); @@ -678,7 +680,7 @@ static bool i40e_clean_tx_irq(struct i40e_ring *tx_ring, int budget) total_packets += tx_buf->gso_segs; /* free the skb */ - dev_consume_skb_any(tx_buf->skb); + napi_consume_skb(tx_buf->skb, napi_budget); /* unmap skb header data */ dma_unmap_single(tx_ring->dev, @@ -749,7 +751,7 @@ static bool i40e_clean_tx_irq(struct i40e_ring *tx_ring, int budget) if (budget && ((j / (WB_STRIDE + 1)) == 0) && (j != 0) && - !test_bit(__I40E_DOWN, _ring->vsi->state) && + !test_bit(__I40E_DOWN, >state) && (I40E_DESC_UNUSED(tx_ring) != tx_ring->count)) tx_ring->arm_wb = true; } @@ -767,7 +769,7 @@ static bool i40e_clean_tx_irq(struct i40e_ring *tx_ring, int budget) smp_mb(); if (__netif_subqueue_stopped(tx_ring->netdev, tx_ring->queue_index) && - !test_bit(__I40E_DOWN, _ring->vsi->state)) { + !test_bit(__I40E_DOWN, >state)) { netif_wake_subqueue(tx_ring->netdev, tx_ring->queue_index); ++tx_ring->tx_stats.restart_queue; @@ -1975,7 +1977,7 @@ int i40e_napi_poll(struct napi_struct *napi, int budget) * budget and be more aggressive about cleaning up the Tx descriptors. 
*/ i40e_for_each_ring(ring, q_vector->tx) { - if (!i40e_clean_tx_irq(ring, vsi->work_limit)) { + if (!i40e_clean_tx_irq(vsi, ring, budget)) { clean_complete = false; continue; } diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c index 839a6df..9e91136 100644 --- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c @@ -155,19 +155,21 @@ u32 i40evf_get_tx_pending(struct i40e_ring *ring, bool in_sw) /** * i40e_clean_tx_irq - Reclaim resources after transmit completes - * @tx_ring: tx ring to clean - * @budget: how many cleans we're allowed + * @vsi: the VSI we care about + * @tx_ring: Tx ring to clean + * @napi_budget: Used to determine if we are in netpoll * * Returns true if there's any budget left (e.g. the clean is finished) **/ -static bool i40e_clean_tx_irq(struct i40e_ring *tx_ring, int budget) +static bool i40e_clean_tx_irq(struct i40e_vsi *vsi, + struct i40e_ring *tx_ring, int napi_budget) { u16 i = tx_ring->next_to_clean; struct i40e_tx_buffer *tx_buf; struct i40e_tx_desc *tx_head; struct i40e_tx_desc *tx_desc; - unsigned int total_packets = 0; -
[net-next 02/18] i40e/i40evf: Allow up to 12K bytes of data per Tx descriptor instead of 8K
From: Alexander Duyck

From what I can tell the practical limitation on the size of the Tx data
buffer is the fact that the Tx descriptor is limited to 14 bits. As such
we cannot use 16K as is typically used on the other Intel drivers. However
artificially limiting ourselves to 8K can be expensive as this means that
we will consume up to 10 descriptors (1 context, 1 for header, and 9 for
payload, non-8K aligned) in a single send.

I propose that we can reduce this by increasing the maximum data for a 4K
aligned block to 12K. We can reduce the descriptors used for a 32K aligned
block by 1 by increasing the size like this. In addition we still have the
4K - 1 of space that is still unused. We can use this as a bit of extra
padding when dealing with data that is not aligned to 4K.

By aligning the descriptors after the first to 4K we can improve the
efficiency of PCIe accesses as we can avoid using byte enables and can
fetch full TLP transactions after the first fetch of the buffer. This
helps to improve PCIe efficiency. Below are the results of testing before
and after with this patch:

Recv   Send   Send                          Utilization      Service Demand
Socket Socket Message Elapsed               Send     Recv    Send    Recv
Size   Size   Size    Time     Throughput   local    remote  local   remote
bytes  bytes  bytes   secs.    10^6bits/s   % S      % U     us/KB   us/KB
Before:
87380  16384  16384   10.00    33682.24     20.27    -1.00   0.592   -1.00
After:
87380  16384  16384   10.00    34204.08     20.54    -1.00   0.590   -1.00

So the net result of this patch is that we have a small gain in throughput
due to a reduction in overhead for putting together the frame.
Signed-off-by: Alexander Duyck Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/i40e/i40e_fcoe.c | 2 +- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 13 +++--- drivers/net/ethernet/intel/i40e/i40e_txrx.h | 35 --- drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 13 +++--- drivers/net/ethernet/intel/i40evf/i40e_txrx.h | 35 --- 5 files changed, 83 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_fcoe.c b/drivers/net/ethernet/intel/i40e/i40e_fcoe.c index 8ad162c..92d2208 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_fcoe.c +++ b/drivers/net/ethernet/intel/i40e/i40e_fcoe.c @@ -1371,7 +1371,7 @@ static netdev_tx_t i40e_fcoe_xmit_frame(struct sk_buff *skb, if (i40e_chk_linearize(skb, count)) { if (__skb_linearize(skb)) goto out_drop; - count = TXD_USE_COUNT(skb->len); + count = i40e_txd_use_count(skb->len); tx_ring->tx_stats.tx_linearize++; } diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index 084d0ab..9af1411 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -2717,6 +2717,8 @@ static inline void i40e_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb, tx_bi = first; for (frag = _shinfo(skb)->frags[0];; frag++) { + unsigned int max_data = I40E_MAX_DATA_PER_TXD_ALIGNED; + if (dma_mapping_error(tx_ring->dev, dma)) goto dma_error; @@ -2724,12 +2726,14 @@ static inline void i40e_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb, dma_unmap_len_set(tx_bi, len, size); dma_unmap_addr_set(tx_bi, dma, dma); + /* align size to end of page */ + max_data += -dma & (I40E_MAX_READ_REQ_SIZE - 1); tx_desc->buffer_addr = cpu_to_le64(dma); while (unlikely(size > I40E_MAX_DATA_PER_TXD)) { tx_desc->cmd_type_offset_bsz = build_ctob(td_cmd, td_offset, - I40E_MAX_DATA_PER_TXD, td_tag); + max_data, td_tag); tx_desc++; i++; @@ -2740,9 +2744,10 @@ static inline void i40e_tx_map(struct i40e_ring *tx_ring, 
struct sk_buff *skb, i = 0; } - dma += I40E_MAX_DATA_PER_TXD; - size -= I40E_MAX_DATA_PER_TXD; + dma += max_data; + size -= max_data; + max_data = I40E_MAX_DATA_PER_TXD_ALIGNED; tx_desc->buffer_addr = cpu_to_le64(dma); } @@ -2892,7 +2897,7 @@ static netdev_tx_t i40e_xmit_frame_ring(struct sk_buff *skb, if (i40e_chk_linearize(skb, count)) { if (__skb_linearize(skb)) goto out_drop; -
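The descriptor arithmetic from the commit message can be checked with a small user-space model of the loop in i40e_tx_map(). This is a sketch, not the driver: the `TOY_*` constants are assumptions derived from the text (a 14-bit size field gives 16K - 1, the read request size is taken to be 4K, and the aligned maximum is then 12K), since the header hunk that defines the real macros is not quoted here.

```c
#include <stdint.h>

/* Assumed values, derived from the commit message rather than the header. */
#define TOY_MAX_READ_REQ_SIZE	4096u
#define TOY_MAX_DATA_PER_TXD	(16u * 1024 - 1)	/* 14-bit size field */
#define TOY_MAX_DATA_PER_TXD_ALIGNED \
	(TOY_MAX_DATA_PER_TXD & ~(TOY_MAX_READ_REQ_SIZE - 1))	/* 12K */

/* Count the data descriptors one buffer needs with the new scheme:
 * the first chunk is stretched so that it ends on a 4K boundary and
 * every following chunk is 12K.
 */
static unsigned int toy_txd_count(uint64_t dma, unsigned int size)
{
	unsigned int max_data = TOY_MAX_DATA_PER_TXD_ALIGNED;
	unsigned int count = 1;

	/* align size to end of page */
	max_data += (unsigned int)(-dma & (TOY_MAX_READ_REQ_SIZE - 1));

	while (size > TOY_MAX_DATA_PER_TXD) {
		count++;
		dma += max_data;
		size -= max_data;
		max_data = TOY_MAX_DATA_PER_TXD_ALIGNED;
	}
	return count;
}

/* Old scheme for comparison: a fixed 8K per descriptor. */
static unsigned int toy_txd_count_8k(unsigned int size)
{
	return (size + 8191) / 8192;
}
```

For a 4K-aligned 32K buffer this gives 3 descriptors instead of 4, which matches the "reduce the descriptors used for a 32K aligned block by 1" claim. A 32K buffer starting 100 bytes into a page also fits in 3, because the first descriptor absorbs the extra 3996 bytes up to the page boundary.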
[net-next 07/18] i40e/i40evf: Remove I40E_MAX_USER_PRIORITY define
From: Catherine Sullivan

This patch removes the duplicate definition of I40E_MAX_USER_PRIORITY in
i40e.h that is not needed.

Signed-off-by: Catherine Sullivan
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index f208570..d25b3be 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -244,7 +244,6 @@ struct i40e_fdir_filter {
 #define I40E_DCB_PRIO_TYPE_STRICT	0
 #define I40E_DCB_PRIO_TYPE_ETS		1
 #define I40E_DCB_STRICT_PRIO_CREDITS	127
-#define I40E_MAX_USER_PRIORITY		8
 /* DCB per TC information data structure */
 struct i40e_tc_info {
 	u16	qoffset;	/* Queue offset from base queue */
-- 
2.5.5
[net-next 16/18] i40evf: Add additional check for reset
From: Mitch Williams

If the driver happens to read a register during the time in which the
device is undergoing reset, it will receive a value of 0xdeadbeef
instead of a valid value. Unfortunately, the driver may misinterpret
this as a valid value, especially if it's just looking for individual
bits.

Add an explicit check for this value when we are looking for admin queue
errors, and trigger reset recovery if we find it.

Signed-off-by: Mitch Williams
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 820ad94..d783c1b 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1994,6 +1994,8 @@ static void i40evf_adminq_task(struct work_struct *work)

 	/* check for error indications */
 	val = rd32(hw, hw->aq.arq.len);
+	if (val == 0xdeadbeef) /* indicates device in reset */
+		goto freedom;
 	oldval = val;
 	if (val & I40E_VF_ARQLEN1_ARQVFE_MASK) {
 		dev_info(&adapter->pdev->dev, "ARQ VF Error detected\n");
-- 
2.5.5
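Why 0xdeadbeef is easy to misread when only individual bits are tested can be shown with a few lines of user-space C. This is a sketch with hypothetical `toy_*` names; in particular the error-bit position (bit 28) is an assumption made for the example and should not be read as the definition of I40E_VF_ARQLEN1_ARQVFE_MASK.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_REG_IN_RESET	0xdeadbeefu
#define TOY_ARQ_VF_ERR_MASK	(1u << 28)	/* assumed bit position */

enum toy_arq_status {
	TOY_ARQ_OK,
	TOY_ARQ_VF_ERROR,
	TOY_ARQ_IN_RESET,
};

/* Classify a value read from the (hypothetical) ARQ length register.
 * With check_reset == false this mimics the old behaviour, where the
 * reset pattern is only ever looked at bit by bit.
 */
static enum toy_arq_status toy_check_arq_len(uint32_t val, bool check_reset)
{
	if (check_reset && val == TOY_REG_IN_RESET)
		return TOY_ARQ_IN_RESET;
	if (val & TOY_ARQ_VF_ERR_MASK)
		return TOY_ARQ_VF_ERROR;
	return TOY_ARQ_OK;
}
```

Since 0xdeadbeef has bit 28 set, the bit test alone reports a VF error for a device that is merely in reset; with the explicit comparison in front, the same value is recognised as the reset pattern.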
[net-next 00/18][pull request] 40GbE Intel Wired LAN Driver Updates 2016-04-05
This series contains updates to i40e and i40evf only.

Stefan converts dev_close() to ndo_stop() for ethtool offline self test,
since dev_close() causes IFF_UP to be cleared which will remove the
interface routes and addresses.

Alex bumps up the size of the transmit data buffer to 12K rather than 8K,
which provides a gain in throughput and a reduction in overhead for
putting together the frame. Fixed an issue in the polling routines where
we were using bitwise operators to avoid the side effects of the logical
operators. Then added support for bulk transmit clean for skbs.

Jesse fixed a sparse issue in the type casting in the transmit code and
fixed i40e_aq_set_phy_debug() to use i40e_status as a return code.

Catherine cleans up duplicated code.

Shannon fixed the cleaning up of the interrupt handling to clean up the
IRQs only if we actually got them set up. Also fixed up the error
scenarios where we were trying to remove a non-existent timer or
worktask, which causes the kernel heartburn.

Mitch changes the notification of resets to the reset interrupt handler,
instead of the actual reset initiation code. This allows the VFs to get
properly notified for all resets, including resets initiated by different
PFs on the same physical device. Also moved the clearing of VFLR bit
after reset processing, instead of before which could lead to double
resets on VF init. Fixed code comment to match the actual function name.

The following are changes since commit 15f41e2ba13a6726632e44b1180e805a61e470ad:
  Merge branch 'tcp-udp-misc'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 40GbE

Alan Cox (1):
  i40evf: remove dead code

Alexander Duyck (3):
  i40e/i40evf: Allow up to 12K bytes of data per Tx descriptor instead
    of 8K
  i40e/i40evf: Fix handling of boolean logic in polling routines
  i40e/i40evf: Add support for bulk free in Tx cleanup

Catherine Sullivan (2):
  i40e/i40evf: Remove I40E_MAX_USER_PRIORITY define
  i40e/i40evf: Bump patch from 1.4.25 to 1.5.1

Jesse Brandeburg (2):
  i40e/i40evf: Fix casting in transmit code
  i40e: Fix up return code

Kevin Scott (1):
  i40e: Save off VSI resource count when updating VSI

Mitch Williams (4):
  i40e: Notify VFs of all resets
  i40e: Added code to prevent double resets
  i40evf: Add additional check for reset
  i40e: Change comment to reflect correct function name

Shannon Nelson (4):
  i40e: Remove MSIx only if created
  i40e: Assure that adminq is alive in debug mode
  i40e: Remove timer and task only if created
  i40e: Change unknown event error msg to ignore message

Stefan Assmann (1):
  i40e: call ndo_stop() instead of dev_close() when running offline
    selftest

 drivers/net/ethernet/intel/i40e/i40e.h             |  3 +-
 drivers/net/ethernet/intel/i40e/i40e_common.c      | 12 --
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c     |  4 +-
 drivers/net/ethernet/intel/i40e/i40e_fcoe.c        |  2 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c        | 35 +++-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c        | 49 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.h        | 35 ++--
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 13 +++---
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c      | 49 +-
 drivers/net/ethernet/intel/i40evf/i40e_txrx.h      | 35 ++--
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    | 17 +++-
 11 files changed, 166 insertions(+), 88 deletions(-)

-- 
2.5.5
[net-next 10/18] i40e: Remove MSIx only if created
From: Shannon Nelson

When cleaning up the interrupt handling, clean up the IRQs only if we actually got them set up. There are a couple of error recovery paths that were violating this and causing the kernel a bit of indigestion.

Signed-off-by: Shannon Nelson
Reviewed-by: Williams, Mitch A
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 650336e..2464dca 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -4164,7 +4164,7 @@ static void i40e_clear_interrupt_scheme(struct i40e_pf *pf)
 	int i;

 	i40e_stop_misc_vector(pf);
-	if (pf->flags & I40E_FLAG_MSIX_ENABLED) {
+	if (pf->flags & I40E_FLAG_MSIX_ENABLED && pf->msix_entries) {
 		synchronize_irq(pf->msix_entries[0].vector);
 		free_irq(pf->msix_entries[0].vector, pf);
 	}
-- 
2.5.5
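As a userspace illustration of why the extra check matters, the following sketch models the guard with hypothetical types. pf_sketch, msix_entry_sketch and FLAG_MSIX_ENABLED_SKETCH are made up for this example and only mimic the shape of the driver's fields; the assumption is that an error path can leave the flag set while the entries array was never allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for I40E_FLAG_MSIX_ENABLED. */
#define FLAG_MSIX_ENABLED_SKETCH 0x1u

struct msix_entry_sketch {
	int vector;
};

struct pf_sketch {
	unsigned int flags;
	struct msix_entry_sketch *msix_entries;	/* NULL if never set up */
	int freed_vector;			/* records what was released */
};

/*
 * Release vector 0 only when MSI-X is flagged as enabled AND the
 * entries array exists. '&' binds tighter than '&&', so the extra
 * parentheses are for readability only. Returns true if a vector
 * was released.
 */
bool clear_interrupt_scheme_sketch(struct pf_sketch *pf)
{
	assert(pf != NULL);

	if ((pf->flags & FLAG_MSIX_ENABLED_SKETCH) && pf->msix_entries) {
		pf->freed_vector = pf->msix_entries[0].vector;
		return true;
	}
	return false;
}
```

Without the second condition, the flag-set-but-unallocated case would dereference a NULL msix_entries pointer.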
[net-next 18/18] i40e/i40evf: Bump patch from 1.4.25 to 1.5.1
From: Catherine Sullivan

Signed-off-by: Catherine Sullivan
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c     | 4 ++--
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 3841005..297fd39 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -45,8 +45,8 @@ static const char i40e_driver_string[] =
 #define DRV_KERN "-k"

 #define DRV_VERSION_MAJOR 1
-#define DRV_VERSION_MINOR 4
-#define DRV_VERSION_BUILD 25
+#define DRV_VERSION_MINOR 5
+#define DRV_VERSION_BUILD 1
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
 	     __stringify(DRV_VERSION_MINOR) "." \
 	     __stringify(DRV_VERSION_BUILD)    DRV_KERN
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index d783c1b..e397368 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -37,8 +37,8 @@ static const char i40evf_driver_string[] =
 #define DRV_KERN "-k"

 #define DRV_VERSION_MAJOR 1
-#define DRV_VERSION_MINOR 4
-#define DRV_VERSION_BUILD 15
+#define DRV_VERSION_MINOR 5
+#define DRV_VERSION_BUILD 1
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
 	     __stringify(DRV_VERSION_MINOR) "." \
 	     __stringify(DRV_VERSION_BUILD) \
-- 
2.5.5
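For anyone wondering how the three numeric defines end up as one version string, here is a small userspace sketch of the two-level stringify idiom. STRINGIFY_SKETCH is a simplified stand-in for the kernel's __stringify() (which is variadic), and driver_version_sketch() is a hypothetical helper added only so the result can be inspected.

```c
#include <assert.h>
#include <string.h>

/*
 * Two-level expansion: the outer macro expands its argument first,
 * so the inner '#' sees the number rather than the macro name.
 */
#define STRINGIFY_SKETCH_1(x) #x
#define STRINGIFY_SKETCH(x) STRINGIFY_SKETCH_1(x)

#define DRV_KERN "-k"

#define DRV_VERSION_MAJOR 1
#define DRV_VERSION_MINOR 5
#define DRV_VERSION_BUILD 1

/* Adjacent string literals are concatenated by the compiler. */
#define DRV_VERSION STRINGIFY_SKETCH(DRV_VERSION_MAJOR) "." \
		    STRINGIFY_SKETCH(DRV_VERSION_MINOR) "." \
		    STRINGIFY_SKETCH(DRV_VERSION_BUILD) DRV_KERN

/* Hypothetical accessor so callers can examine the assembled string. */
const char *driver_version_sketch(void)
{
	return DRV_VERSION;
}

/* Sanity check: the pieces above assemble into "1.5.1-k". */
void check_driver_version_sketch(void)
{
	assert(strcmp(driver_version_sketch(), "1.5.1-k") == 0);
}
```

A single-level `#x` would instead produce the literal text "DRV_VERSION_MAJOR", which is why the indirection is needed.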