Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it
On Thu, Dec 6, 2018 at 2:37 AM Jakub Kicinski wrote:
>
> On Thu, 6 Dec 2018 00:57:05 -0800, Michael Chan wrote:
> > On Wed, Dec 5, 2018 at 11:11 PM Jakub Kicinski wrote:
> > >
> > > On Wed, 5 Dec 2018 22:41:43 -0800, Michael Chan wrote:
> > > >
> > > > It will be in the BIOS only for a LOM, I think. For a NIC, it should
> > > > be in the NIC's NVRAM.
> > >
> > > This is all vague. Could you please clearly state the use case.
> > >
> > Well, the WoL setting's use case should be quite simple, right? If
> > the card's NVRAM WoL setting is ON, when you plug the card in a slot
> > that has Vaux power, it will assert PME# when a magic packet is
> > received. Again, the WoL setting in this context is similar to other
> > power up settings such as PCIe Gen2 or Gen3.
>
> If there was some configuration of PME# involved, maybe, but
> basic networking configuration has its APIs already.
>
> > Let's say the power up setting is ON and it boots up to Linux for the
> > first time after receiving a magic packet. The Linux user can then
> > run ethtool -s to set the driver's non persistent WoL setting. It can
> > be the same as the NVRAM's power up setting, or different. Ethtool
> > may support additional WoL packet types that the power up setting does
> > not support. Let's say the Linux user sets the ethtool WoL setting to
> > OFF and shuts down the system. That card now will not wake up the
> > system. But if there is a power failure and power comes back on
> > later, the card will lose the ethtool setting and go back to the power
> > up WoL setting, which is ON in this example.
>
> So in your example there is a machine with a 25/40/100G NIC that
> doesn't have any remote BMC control, and connected to a L2 network
> where a magic packet can be received.
>
> In my experience machines are either low end/embedded and they just
> boot on power on fully (to Linux), or they are proper machines which
> support IPMI etc.
>
> If you could illuminate the use case some more I'd really appreciate
> that. In your hypothetical scenario you still have to get the link
> up, so if we apply this patch a logical extension would be to add all
> ethtool link settings as devlink parameters as well. Florian recently
> added an option to wake based on a packet that matched an n-tuple
> filter. If your use case is legit, doing the same thing with n-tuple
> filters instead of Magic Packets is very much legit, too. So we will
> poke n-tuple filters via devlink params?

We only store a magic packet WoL bit in the NVRAM for basic power up
WoL setting. I doubt that people will store the entire n-tuple WoL
pattern in NVRAM for basic power up WoL. The whole idea is to have a
basic method to wake up the machine after power up with Vaux. If the
cable is connected, the NIC will autoneg to some lower speed that Vaux
can support. I think we've been supporting this since the tg3 days.
Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it
On Wed, Dec 5, 2018 at 11:11 PM Jakub Kicinski wrote:
>
> On Wed, 5 Dec 2018 22:41:43 -0800, Michael Chan wrote:
> >
> > It will be in the BIOS only for a LOM, I think. For a NIC, it should
> > be in the NIC's NVRAM.
>
> This is all vague. Could you please clearly state the use case.
>

Well, the WoL setting's use case should be quite simple, right? If
the card's NVRAM WoL setting is ON, when you plug the card in a slot
that has Vaux power, it will assert PME# when a magic packet is
received. Again, the WoL setting in this context is similar to other
power up settings such as PCIe Gen2 or Gen3.

Let's say the power up setting is ON and it boots up to Linux for the
first time after receiving a magic packet. The Linux user can then
run ethtool -s to set the driver's non persistent WoL setting. It can
be the same as the NVRAM's power up setting, or different. Ethtool
may support additional WoL packet types that the power up setting does
not support. Let's say the Linux user sets the ethtool WoL setting to
OFF and shuts down the system. That card now will not wake up the
system. But if there is a power failure and power comes back on
later, the card will lose the ethtool setting and go back to the power
up WoL setting, which is ON in this example.
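The scenario in this message (a persistent power-up default plus a volatile runtime override) can be modeled in a few lines of C. This is only an illustrative sketch: `struct nic_model`, `nic_power_on()`, `nic_ethtool_set_wol()` and `nic_wakes_on_magic_packet()` are hypothetical names invented for the example and are not part of bnxt_en, devlink, or ethtool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one persistent NVRAM bit and one volatile runtime bit. */
struct nic_model {
	bool nvram_wol;   /* persistent power-up default (assumed stored in NVRAM) */
	bool runtime_wol; /* volatile setting, lost when power is removed */
};

/* Power is applied (including Vaux): firmware reloads the NVRAM default. */
static void nic_power_on(struct nic_model *nic)
{
	assert(nic != NULL);
	nic->runtime_wol = nic->nvram_wol;
}

/* Models the effect of an "ethtool -s" WoL change: runtime state only. */
static void nic_ethtool_set_wol(struct nic_model *nic, bool on)
{
	assert(nic != NULL);
	nic->runtime_wol = on;
}

/* Would a magic packet wake the machine in the current state? */
static bool nic_wakes_on_magic_packet(const struct nic_model *nic)
{
	assert(nic != NULL);
	return nic->runtime_wol;
}
```

With `nvram_wol` set, turning WoL off through `nic_ethtool_set_wol()` holds only until the next `nic_power_on()`, which matches the power-failure case described in the message.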
Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it
On Wed, Dec 5, 2018 at 10:00 PM Jakub Kicinski wrote:
>
> On Wed, 5 Dec 2018 17:18:52 -0800, Michael Chan wrote:
> > On Wed, Dec 5, 2018 at 4:42 PM Jakub Kicinski wrote:
> > > On Wed, 5 Dec 2018 16:01:08 -0800, Michael Chan wrote:
> > > > On Wed, Dec 5, 2018 at 3:33 PM Jakub Kicinski wrote:
> > > > > On Wed, 5 Dec 2018 11:27:00 +0530, Vasundhara Volam wrote:
> > > > > > Register devlink_port with devlink and create initial port params
> > > > > > table for bnxt_en. The table consists of a generic parameter:
> > > > > >
> > > > > > wake-on-lan: Enables Wake on Lan for this port when magic packet
> > > > > > is received with this port's MAC address using ACPI pattern.
> > > > > > If enabled, the controller asserts a wake pin upon reception of
> > > > > > WoL packet. ACPI (Advanced Configuration and Power Interface) is
> > > > > > an industry specification for the efficient handling of power
> > > > > > consumption in desktop and mobile computers.
> > > > > >
> > > > > > Cc: Michael Chan
> > > > > > Signed-off-by: Vasundhara Volam
> > > > >
> > > > > Why do we need a WoL as a devlink parameter (rather than ethtool -s)?
> > > >
> > > > I believe ethtool -s for WoL is a non-persistent setting, meaning that
> > > > if you power cycle the system, the WoL setting will go back to
> > > > default.
> > > >
> > > > devlink on the other hand is a permanent setting. ethtool should
> > > > initially report the default WoL setting and it can then be changed
> > > > (in a non permanent way) using ethtool -s.
> > >
> > > All network configuration settings in Linux are non-persistent AFAIK.
> > > That's why network configuration daemons exist:
> > >
> > > https://wiki.debian.org/WakeOnLan
> > >
> > > Perhaps the objective to move more of the network configuration into the
> > > firmware? That'd be a bleak scenario, so probably not..
> > >
> > > My understanding was the persistent devlink settings are for things
> > > which have to be set at device init time. Like say PCI endpoint
> > > configuration. FW loading configuration.
> > >
> > > Besides, the parameter you add is just true/false, when ethtool has
> > > multiple options.
> > >
> > > It feels to me like we moved from ioctls to Netlink, and now even
> > > before ethtool was converted to Netlink we may move to unstructured
> > > strings. That's not a step forward, if you ask me.
> >
> > We do have a parameter in NVRAM that controls default WoL. I think
> > this is to expose that parameter so it can be set one way or the
> > other. There are scenarios where Linux has not booted yet (and so
> > there is no opportunity to run ethtool -s or any daemons yet) and this
> > parameter will control whether the machine will wake up or not.
>
> Isn't that set in BIOS/setup? The config before any OS boots? Because
> the BMC or whatnot has to actually configure the board to power
> appropriate things up. Please clarify.

It will be in the BIOS only for a LOM, I think. For a NIC, it should
be in the NIC's NVRAM.

>
> And *if* it is proven this config is more than just setting the default
> IMHO the setting belongs in the ethtool API. We can't just add devlink
> params for all existing config APIs just because it has persistence.

I'm not sure I understand your point. I believe the NIC firmware will
set up the NIC's WoL setting right after power up based on this NVRAM
parameter. Similar to how the firmware will setup PCIe Gen2 or Gen3
right after power up, for example. So why would this belong to
ethtool?

I understand the confusion that ethtool -s has a similar WoL setting.
But again, that's different. This one is the power up setting that
impacts whether a magic packet can or cannot wake up the system right
after power up (before booting up to Linux or other OS).
Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it
On Wed, Dec 5, 2018 at 4:42 PM Jakub Kicinski wrote:
>
> On Wed, 5 Dec 2018 16:01:08 -0800, Michael Chan wrote:
> > On Wed, Dec 5, 2018 at 3:33 PM Jakub Kicinski wrote:
> > >
> > > On Wed, 5 Dec 2018 11:27:00 +0530, Vasundhara Volam wrote:
> > > > Register devlink_port with devlink and create initial port params
> > > > table for bnxt_en. The table consists of a generic parameter:
> > > >
> > > > wake-on-lan: Enables Wake on Lan for this port when magic packet
> > > > is received with this port's MAC address using ACPI pattern.
> > > > If enabled, the controller asserts a wake pin upon reception of
> > > > WoL packet. ACPI (Advanced Configuration and Power Interface) is
> > > > an industry specification for the efficient handling of power
> > > > consumption in desktop and mobile computers.
> > > >
> > > > Cc: Michael Chan
> > > > Signed-off-by: Vasundhara Volam
> > >
> > > Why do we need a WoL as a devlink parameter (rather than ethtool -s)?
> >
> > I believe ethtool -s for WoL is a non-persistent setting, meaning that
> > if you power cycle the system, the WoL setting will go back to
> > default.
> >
> > devlink on the other hand is a permanent setting. ethtool should
> > initially report the default WoL setting and it can then be changed
> > (in a non permanent way) using ethtool -s.
>
> All network configuration settings in Linux are non-persistent AFAIK.
> That's why network configuration daemons exist:
>
> https://wiki.debian.org/WakeOnLan
>
> Perhaps the objective to move more of the network configuration into the
> firmware? That'd be a bleak scenario, so probably not..
>
> My understanding was the persistent devlink settings are for things
> which have to be set at device init time. Like say PCI endpoint
> configuration. FW loading configuration.
>
> Besides, the parameter you add is just true/false, when ethtool has
> multiple options.
>
> It feels to me like we moved from ioctls to Netlink, and now even
> before ethtool was converted to Netlink we may move to unstructured
> strings. That's not a step forward, if you ask me.

We do have a parameter in NVRAM that controls default WoL. I think
this is to expose that parameter so it can be set one way or the
other. There are scenarios where Linux has not booted yet (and so
there is no opportunity to run ethtool -s or any daemons yet) and this
parameter will control whether the machine will wake up or not.
Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it
On Wed, Dec 5, 2018 at 3:33 PM Jakub Kicinski wrote:
>
> On Wed, 5 Dec 2018 11:27:00 +0530, Vasundhara Volam wrote:
> > Register devlink_port with devlink and create initial port params
> > table for bnxt_en. The table consists of a generic parameter:
> >
> > wake-on-lan: Enables Wake on Lan for this port when magic packet
> > is received with this port's MAC address using ACPI pattern.
> > If enabled, the controller asserts a wake pin upon reception of
> > WoL packet. ACPI (Advanced Configuration and Power Interface) is
> > an industry specification for the efficient handling of power
> > consumption in desktop and mobile computers.
> >
> > Cc: Michael Chan
> > Signed-off-by: Vasundhara Volam
>
> Why do we need a WoL as a devlink parameter (rather than ethtool -s)?

I believe ethtool -s for WoL is a non-persistent setting, meaning that
if you power cycle the system, the WoL setting will go back to
default.

devlink on the other hand is a permanent setting. ethtool should
initially report the default WoL setting and it can then be changed
(in a non permanent way) using ethtool -s.
[PATCH net 2/6] bnxt_en: Fix rx_l4_csum_errors counter on 57500 devices.
The software counter structure is defined in both the CP ring's structure
and the NQ ring's structure on the new devices. The legacy code adds the
counter to the CP ring's structure and the counter won't get displayed
since the ethtool code is looking at the NQ ring's structure.

Since all other counters are contained in the NQ ring's structure, it
makes more sense to count rx_l4_csum_errors in the NQ.

Fixes: 50e3ab7836b5 ("bnxt_en: Allocate completion ring structures for 57500 series chips.")
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 4a45a2b..5856099 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1675,7 +1675,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 	} else {
 		if (rxcmp1->rx_cmp_cfa_code_errors_v2 & RX_CMP_L4_CS_ERR_BITS) {
 			if (dev->features & NETIF_F_RXCSUM)
-				cpr->rx_l4_csum_errors++;
+				bnapi->cp_ring.rx_l4_csum_errors++;
 		}
 	}
 
-- 
2.5.1
[PATCH net 0/6] bnxt_en: Bug fixes.
Most of the bug fixes are related to the new 57500 chips, including some
initialization and counter fixes, disabling RDMA support, and a
workaround for occasional missing interrupts. The last patch from
Vasundhara fixes the year/month parameters for firmware coredump.

Michael Chan (5):
  bnxt_en: Fix RSS context allocation.
  bnxt_en: Fix rx_l4_csum_errors counter on 57500 devices.
  bnxt_en: Disable RDMA support on the 57500 chips.
  bnxt_en: Workaround occasional TX timeout on 57500 A0.
  bnxt_en: Add software "missed_irqs" counter.

Vasundhara Volam (1):
  bnxt_en: Fix filling time in bnxt_fill_coredump_record()

 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 70 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h         |  4 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  9 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c     |  3 +
 4 files changed, 81 insertions(+), 5 deletions(-)

-- 
2.5.1
[PATCH net 4/6] bnxt_en: Workaround occasional TX timeout on 57500 A0.
Hardware can sometimes not generate NQ MSIX with a single pending
CP ring entry. This seems to always happen at the last entry of
the CP ring before it wraps. Add logic to check all the CP rings for
pending entries without the CP ring consumer index advancing. Calling
HWRM_DBG_RING_INFO_GET to read the context of the CP ring will flush out
the NQ entry and MSIX.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 65 +++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  3 ++
 2 files changed, 68 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5856099..5d4147a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8714,6 +8714,26 @@ static int bnxt_set_features(struct net_device *dev, netdev_features_t features)
 	return rc;
 }
 
+static int bnxt_dbg_hwrm_ring_info_get(struct bnxt *bp, u8 ring_type,
+				       u32 ring_id, u32 *prod, u32 *cons)
+{
+	struct hwrm_dbg_ring_info_get_output *resp = bp->hwrm_cmd_resp_addr;
+	struct hwrm_dbg_ring_info_get_input req = {0};
+	int rc;
+
+	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_DBG_RING_INFO_GET, -1, -1);
+	req.ring_type = ring_type;
+	req.fw_ring_id = cpu_to_le32(ring_id);
+	mutex_lock(&bp->hwrm_cmd_lock);
+	rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+	if (!rc) {
+		*prod = le32_to_cpu(resp->producer_index);
+		*cons = le32_to_cpu(resp->consumer_index);
+	}
+	mutex_unlock(&bp->hwrm_cmd_lock);
+	return rc;
+}
+
 static void bnxt_dump_tx_sw_state(struct bnxt_napi *bnapi)
 {
 	struct bnxt_tx_ring_info *txr = bnapi->tx_ring;
@@ -8821,6 +8841,11 @@ static void bnxt_timer(struct timer_list *t)
 			bnxt_queue_sp_work(bp);
 		}
 	}
+
+	if ((bp->flags & BNXT_FLAG_CHIP_P5) && netif_carrier_ok(dev)) {
+		set_bit(BNXT_RING_COAL_NOW_SP_EVENT, &bp->sp_event);
+		bnxt_queue_sp_work(bp);
+	}
 bnxt_restart_timer:
 	mod_timer(&bp->timer, jiffies + bp->current_interval);
 }
@@ -8851,6 +8876,43 @@ static void bnxt_reset(struct bnxt *bp, bool silent)
 	bnxt_rtnl_unlock_sp(bp);
 }
 
+static void bnxt_chk_missed_irq(struct bnxt *bp)
+{
+	int i;
+
+	if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+		return;
+
+	for (i = 0; i < bp->cp_nr_rings; i++) {
+		struct bnxt_napi *bnapi = bp->bnapi[i];
+		struct bnxt_cp_ring_info *cpr;
+		u32 fw_ring_id;
+		int j;
+
+		if (!bnapi)
+			continue;
+
+		cpr = &bnapi->cp_ring;
+		for (j = 0; j < 2; j++) {
+			struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[j];
+			u32 val[2];
+
+			if (!cpr2 || cpr2->has_more_work ||
+			    !bnxt_has_work(bp, cpr2))
+				continue;
+
+			if (cpr2->cp_raw_cons != cpr2->last_cp_raw_cons) {
+				cpr2->last_cp_raw_cons = cpr2->cp_raw_cons;
+				continue;
+			}
+			fw_ring_id = cpr2->cp_ring_struct.fw_ring_id;
+			bnxt_dbg_hwrm_ring_info_get(bp,
+				DBG_RING_INFO_GET_REQ_RING_TYPE_L2_CMPL,
+				fw_ring_id, &val[0], &val[1]);
+		}
+	}
+}
+
 static void bnxt_cfg_ntp_filters(struct bnxt *);
 
 static void bnxt_sp_task(struct work_struct *work)
@@ -8930,6 +8992,9 @@ static void bnxt_sp_task(struct work_struct *work)
 	if (test_and_clear_bit(BNXT_FLOW_STATS_SP_EVENT, &bp->sp_event))
 		bnxt_tc_flow_stats_work(bp);
 
+	if (test_and_clear_bit(BNXT_RING_COAL_NOW_SP_EVENT, &bp->sp_event))
+		bnxt_chk_missed_irq(bp);
+
 	/* These functions below will clear BNXT_STATE_IN_SP_TASK. They
 	 * must be the last functions to be called before exiting.
 	 */
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 498b373..00bd17e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -798,6 +798,8 @@ struct bnxt_cp_ring_info {
 	u8			had_work_done:1;
 	u8			has_more_work:1;
 
+	u32			last_cp_raw_cons;
+
 	struct bnxt_coal	rx_ring_coal;
 	u64			rx_packets;
 	u64			rx_bytes;
@@ -1527,6 +1529,7 @@ struct bnxt {
 #define BNXT_LINK_SPEED_CHNG_SP_EVENT	14
 #define BNXT_FLOW_STATS_SP_EVENT	15
 #define BNXT_UPDATE_PHY_SP_EVENT	16
+#define
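The stall test inside bnxt_chk_missed_irq() comes down to "work is pending and the consumer index has not moved since the previous periodic check". A minimal user-space sketch of that decision follows. `struct ring_model` and `ring_missed_irq()` are hypothetical names for illustration, and the firmware call is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-ring state used by the patch. */
struct ring_model {
	uint32_t raw_cons;      /* consumer index advanced by the poll loop */
	uint32_t last_raw_cons; /* snapshot taken by the previous periodic check */
	bool has_work;          /* a valid completion entry is pending */
	bool has_more_work;     /* the poll loop already knows it must continue */
};

/*
 * Returns true when the ring looks stuck: an entry is pending, the poll
 * loop is not already scheduled to continue, and the consumer index is
 * unchanged since the last check.
 */
static bool ring_missed_irq(struct ring_model *r)
{
	assert(r != NULL);
	if (r->has_more_work || !r->has_work)
		return false;
	if (r->raw_cons != r->last_raw_cons) {
		/* Progress was made; remember the new position and wait. */
		r->last_raw_cons = r->raw_cons;
		return false;
	}
	/* The driver would issue HWRM_DBG_RING_INFO_GET at this point. */
	return true;
}
```

Because the snapshot is refreshed whenever the index has moved, a ring is reported only after at least two consecutive checks with no progress, which keeps a busy ring from being flagged.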
[PATCH net 1/6] bnxt_en: Fix RSS context allocation.
Recent commit has added the reservation of RSS context. This requires
bnxt_hwrm_vnic_qcaps() to be called before allocating any RSS contexts.
The bnxt_hwrm_vnic_qcaps() call sets up proper flags that will
determine how many RSS contexts to allocate to support NTUPLE.

This causes a regression that too many RSS contexts are being reserved
and causing resource shortage when enabling many VFs. Fix it by calling
bnxt_hwrm_vnic_qcaps() earlier.

Fixes: 41e8d7983752 ("bnxt_en: Modify the ring reservation functions for 57500 series chips.")
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index dd85d79..4a45a2b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -10087,6 +10087,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	}
 
 	bnxt_hwrm_func_qcfg(bp);
+	bnxt_hwrm_vnic_qcaps(bp);
 	bnxt_hwrm_port_led_qcaps(bp);
 	bnxt_ethtool_init(bp);
 	bnxt_dcb_init(bp);
@@ -10120,7 +10121,6 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 			VNIC_RSS_CFG_REQ_HASH_TYPE_UDP_IPV6;
 	}
 
-	bnxt_hwrm_vnic_qcaps(bp);
 	if (bnxt_rfs_supported(bp)) {
 		dev->hw_features |= NETIF_F_NTUPLE;
 		if (bnxt_rfs_capable(bp)) {
-- 
2.5.1
[PATCH net 3/6] bnxt_en: Disable RDMA support on the 57500 chips.
There is no RDMA support on 57500 chips yet, so prevent bnxt_re from
registering on these chips. There is intermittent failure if bnxt_re
is allowed to register and proceed with RDMA operations.

Fixes: 1ab968d2f1d6 ("bnxt_en: Add PCI ID for BCM57508 device.")
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index beee612..b59b382 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -43,6 +43,9 @@ static int bnxt_register_dev(struct bnxt_en_dev *edev, int ulp_id,
 	if (ulp_id == BNXT_ROCE_ULP) {
 		unsigned int max_stat_ctxs;
 
+		if (bp->flags & BNXT_FLAG_CHIP_P5)
+			return -EOPNOTSUPP;
+
 		max_stat_ctxs = bnxt_get_max_func_stat_ctxs(bp);
 		if (max_stat_ctxs <= BNXT_MIN_ROCE_STAT_CTXS ||
 		    bp->num_stat_ctxs == max_stat_ctxs)
-- 
2.5.1
[PATCH net 5/6] bnxt_en: Add software "missed_irqs" counter.
To keep track of the number of times the workaround code for 57500 A0
has been triggered. This is a per NQ counter.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.h         | 1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 5 -
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5d4147a..d4c3001 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8909,6 +8909,7 @@ static void bnxt_chk_missed_irq(struct bnxt *bp)
 			bnxt_dbg_hwrm_ring_info_get(bp,
 				DBG_RING_INFO_GET_REQ_RING_TYPE_L2_CMPL,
 				fw_ring_id, &val[0], &val[1]);
+			cpr->missed_irqs++;
 		}
 	}
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 00bd17e..9e99d4a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -818,6 +818,7 @@ struct bnxt_cp_ring_info {
 	dma_addr_t		hw_stats_map;
 	u32			hw_stats_ctx_id;
 	u64			rx_l4_csum_errors;
+	u64			missed_irqs;
 
 	struct bnxt_ring_struct	cp_ring_struct;
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 4807856..4b734cd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -137,7 +137,7 @@ static int bnxt_set_coalesce(struct net_device *dev,
 	return rc;
 }
 
-#define BNXT_NUM_STATS	21
+#define BNXT_NUM_STATS	22
 
 #define BNXT_RX_STATS_ENTRY(counter)	\
 	{ BNXT_RX_STATS_OFFSET(counter), __stringify(counter) }
@@ -384,6 +384,7 @@ static void bnxt_get_ethtool_stats(struct net_device *dev,
 		for (k = 0; k < stat_fields; j++, k++)
 			buf[j] = le64_to_cpu(hw_stats[k]);
 		buf[j++] = cpr->rx_l4_csum_errors;
+		buf[j++] = cpr->missed_irqs;
 
 		bnxt_sw_func_stats[RX_TOTAL_DISCARDS].counter +=
 			le64_to_cpu(cpr->hw_stats->rx_discard_pkts);
@@ -468,6 +469,8 @@ static void bnxt_get_strings(struct net_device *dev, u32 stringset, u8 *buf)
 			buf += ETH_GSTRING_LEN;
 			sprintf(buf, "[%d]: rx_l4_csum_errors", i);
 			buf += ETH_GSTRING_LEN;
+			sprintf(buf, "[%d]: missed_irqs", i);
+			buf += ETH_GSTRING_LEN;
 		}
 		for (i = 0; i < BNXT_NUM_SW_FUNC_STATS; i++) {
 			strcpy(buf, bnxt_sw_func_stats[i].string);
-- 
2.5.1
[PATCH net 6/6] bnxt_en: Fix filling time in bnxt_fill_coredump_record()
From: Vasundhara Volam

Fix the year and month offset while storing it in
bnxt_fill_coredump_record().

Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.")
Signed-off-by: Vasundhara Volam
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 4b734cd..6cc69a5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2945,8 +2945,8 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct bnxt_coredump_record *record,
 	record->asic_state = 0;
 	strlcpy(record->system_name, utsname()->nodename,
 		sizeof(record->system_name));
-	record->year = cpu_to_le16(tm.tm_year);
-	record->month = cpu_to_le16(tm.tm_mon);
+	record->year = cpu_to_le16(tm.tm_year + 1900);
+	record->month = cpu_to_le16(tm.tm_mon + 1);
 	record->day = cpu_to_le16(tm.tm_mday);
 	record->hour = cpu_to_le16(tm.tm_hour);
 	record->minute = cpu_to_le16(tm.tm_min);
-- 
2.5.1
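The two offsets in this fix follow from how broken-down time is encoded: tm_year counts years since 1900 and tm_mon runs from 0 to 11, while tm_mday is already 1-based. The sketch below shows the same conversion in user space with the C library's `struct tm`. `struct date_record` and `date_from_tm()` are hypothetical names, not the driver's coredump record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical record; field names mirror the patch, layout is illustrative. */
struct date_record {
	uint16_t year;
	uint16_t month;
	uint16_t day;
};

/* Convert a broken-down time into calendar values a human would expect. */
static struct date_record date_from_tm(const struct tm *tm)
{
	struct date_record rec;

	assert(tm != NULL);
	rec.year = (uint16_t)(tm->tm_year + 1900); /* tm_year is years since 1900 */
	rec.month = (uint16_t)(tm->tm_mon + 1);    /* tm_mon is 0..11 */
	rec.day = (uint16_t)tm->tm_mday;           /* tm_mday is already 1..31 */
	return rec;
}
```

Without the offsets, a dump taken in December 2018 would carry year 118 and month 11, which is the symptom the patch addresses.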
Re: [PATCH net-next] bnxt_en: Copy and paste bug in extended tx_stats
On Thu, Oct 18, 2018 at 1:02 AM Dan Carpenter wrote:
>
> The struct type was copied from the line before but it should be "tx"
> instead of "rx". I have reviewed the code and I can't immediately see
> that this bug causes a runtime issue.
>
> Fixes: 36e53349b60b ("bnxt_en: Add additional extended port statistics.")
> Signed-off-by: Dan Carpenter

Thanks. Luckily, we did not use sizeof(*bp->hw_tx_port_stats_ext) to
allocate the memory, so there is no run-time issue.

Acked-by: Michael Chan
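To see why the allocation style matters here, consider what sizeof() yields when a pointer is declared with the wrong struct type. The sketch below uses made-up structures of different sizes. `struct rx_stats_ext`, `struct tx_stats_ext` and `struct dev_model` are hypothetical and do not reflect the real firmware statistics layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structures of different sizes; not the real HWRM layouts. */
struct rx_stats_ext { uint64_t counters[8]; };
struct tx_stats_ext { uint64_t counters[16]; };

struct dev_model {
	struct rx_stats_ext *rx_ext;
	struct rx_stats_ext *tx_ext; /* copy-paste bug: should be struct tx_stats_ext * */
};

/* Size derived from the (wrongly typed) pointer; sizeof does not dereference. */
static size_t tx_size_from_pointer(const struct dev_model *d)
{
	assert(d != NULL);
	return sizeof(*d->tx_ext);
}

/* Size derived from the explicit, intended type. */
static size_t tx_size_from_type(void)
{
	return sizeof(struct tx_stats_ext);
}
```

In this model the pointer-derived size is half of what the TX structure needs, so an allocation sized that way would be too small. Sizing from the explicit type avoids that, which is the reason given in the reply for the absence of a run-time issue.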
[PATCH net-next 22/23] bnxt_en: Add new NAPI poll function for 57500 chips.
Add a new poll function that polls for NQ events. If the NQ event is
a CQ notification, we locate the CP ring from the cq_handle and call
__bnxt_poll_work() to handle RX/TX events on the CP ring.

Add a new has_more_work field in struct bnxt_cp_ring_info to indicate
budget has been reached. __bnxt_poll_cqs_done() is called to update or
ARM the CP rings if budget has not been reached or not. If budget has
been reached, the next bnxt_poll_p5() call will continue to poll from
the CQ rings directly. Otherwise, the NQ will be ARMed for the next IRQ.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 114 --
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |   4 ++
 2 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 10d713aa..f518119 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1900,6 +1900,7 @@ static int __bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 	u8 event = 0;
 	struct tx_cmp *txcmp;
 
+	cpr->has_more_work = 0;
 	while (1) {
 		int rc;
 
@@ -1920,6 +1921,8 @@ static int __bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 			if (unlikely(tx_pkts > bp->tx_wake_thresh)) {
 				rx_pkts = budget;
 				raw_cons = NEXT_RAW_CMP(raw_cons);
+				if (budget)
+					cpr->has_more_work = 1;
 				break;
 			}
 		} else if ((TX_CMP_TYPE(txcmp) & 0x30) == 0x10) {
@@ -1949,8 +1952,10 @@ static int __bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 		}
 		raw_cons = NEXT_RAW_CMP(raw_cons);
 
-		if (rx_pkts && rx_pkts == budget)
+		if (rx_pkts && rx_pkts == budget) {
+			cpr->has_more_work = 1;
 			break;
+		}
 	}
 
 	if (event & BNXT_TX_EVENT) {
@@ -2106,6 +2111,104 @@ static int bnxt_poll(struct napi_struct *napi, int budget)
 	return work_done;
 }
 
+static int __bnxt_poll_cqs(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
+{
+	struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+	int i, work_done = 0;
+
+	for (i = 0; i < 2; i++) {
+		struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[i];
+
+		if (cpr2) {
+			work_done += __bnxt_poll_work(bp, cpr2,
+						      budget - work_done);
+			cpr->has_more_work |= cpr2->has_more_work;
+		}
+	}
+	return work_done;
+}
+
+static void __bnxt_poll_cqs_done(struct bnxt *bp, struct bnxt_napi *bnapi,
+				 u64 dbr_type, bool all)
+{
+	struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+	int i;
+
+	for (i = 0; i < 2; i++) {
+		struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[i];
+		struct bnxt_db_info *db;
+
+		if (cpr2 && (all || cpr2->had_work_done)) {
+			db = &cpr2->cp_db;
+			writeq(db->db_key64 | dbr_type |
+			       RING_CMP(cpr2->cp_raw_cons), db->doorbell);
+			cpr2->had_work_done = 0;
+		}
+	}
+	__bnxt_poll_work_done(bp, bnapi);
+}
+
+static int bnxt_poll_p5(struct napi_struct *napi, int budget)
+{
+	struct bnxt_napi *bnapi = container_of(napi, struct bnxt_napi, napi);
+	struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+	u32 raw_cons = cpr->cp_raw_cons;
+	struct bnxt *bp = bnapi->bp;
+	struct nqe_cn *nqcmp;
+	int work_done = 0;
+	u32 cons;
+
+	if (cpr->has_more_work) {
+		cpr->has_more_work = 0;
+		work_done = __bnxt_poll_cqs(bp, bnapi, budget);
+		if (cpr->has_more_work) {
+			__bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ, false);
+			return work_done;
+		}
+		__bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ_ARMALL, true);
+		if (napi_complete_done(napi, work_done))
+			BNXT_DB_NQ_ARM_P5(&cpr->cp_db, cpr->cp_raw_cons);
+		return work_done;
+	}
+	while (1) {
+		cons = RING_CMP(raw_cons);
+		nqcmp = &cpr->nq_desc_ring[CP_RING(cons)][CP_IDX(cons)];
+
+		if (!NQ_CMP_VALID(nqcmp, raw_cons)) {
+			__bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ_ARMALL,
+					     false);
+			cpr->cp_raw_cons = raw_cons;
+
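The budget handling described in the commit message can be illustrated with a small model of two completion queues sharing one budget. This is a loose sketch rather than driver code: `struct cq_model`, `poll_one()` and `poll_all()` are hypothetical, and "more work" is simplified to mean that entries remain once the budget is used up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CQS 2 /* the patch iterates over two CP rings per NQ */

/* Hypothetical completion queue: a count of pending entries and a flag. */
struct cq_model {
	int pending;
	bool has_more_work;
};

/* Process entries until the queue is empty or the budget is used up. */
static int poll_one(struct cq_model *cq, int budget)
{
	int done = 0;

	assert(cq != NULL);
	cq->has_more_work = false;
	while (cq->pending > 0) {
		if (done == budget) {
			cq->has_more_work = true; /* stopped by budget, not by empty queue */
			break;
		}
		cq->pending--;
		done++;
	}
	return done;
}

/* Poll every queue with the remaining budget; report if any needs another pass. */
static int poll_all(struct cq_model cqs[NUM_CQS], int budget, bool *more)
{
	int work_done = 0;
	int i;

	assert(more != NULL);
	*more = false;
	for (i = 0; i < NUM_CQS; i++) {
		work_done += poll_one(&cqs[i], budget - work_done);
		if (cqs[i].has_more_work)
			*more = true;
	}
	return work_done;
}
```

When `*more` comes back true, the caller would poll the queues again directly on the next invocation instead of re-arming the notification source, which corresponds to the has_more_work path at the top of bnxt_poll_p5().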
[PATCH net-next 20/23] bnxt_en: Add coalescing setup for 57500 chips.
On legacy chips, the CP ring may be shared between RX and TX and so only
setup the RX coalescing parameters in such a case. On 57500 chips, we
always have a dedicated CP ring for TX so we can always set up the
TX coalescing parameters in bnxt_hwrm_set_coal().

Also, the min_timer coalescing parameter applies to the NQ on the new
chips and a separate firmware call needs to be made to set it up.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 46 +++
 1 file changed, 46 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5ec477f..065f4c2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5424,6 +5424,7 @@ static void bnxt_hwrm_coal_params_qcaps(struct bnxt *bp)
 	rc = _hwrm_send_message_silent(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
 	if (!rc) {
 		coal_cap->cmpl_params = le32_to_cpu(resp->cmpl_params);
+		coal_cap->nq_params = le32_to_cpu(resp->nq_params);
 		coal_cap->num_cmpl_dma_aggr_max =
 			le16_to_cpu(resp->num_cmpl_dma_aggr_max);
 		coal_cap->num_cmpl_dma_aggr_during_int_max =
@@ -5508,6 +5509,32 @@ static void bnxt_hwrm_set_coal_params(struct bnxt *bp,
 	req->enables |= cpu_to_le16(BNXT_COAL_CMPL_ENABLES);
 }
 
+/* Caller holds bp->hwrm_cmd_lock */
+static int __bnxt_hwrm_set_coal_nq(struct bnxt *bp, struct bnxt_napi *bnapi,
+				   struct bnxt_coal *hw_coal)
+{
+	struct hwrm_ring_cmpl_ring_cfg_aggint_params_input req = {0};
+	struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+	struct bnxt_coal_cap *coal_cap = &bp->coal_cap;
+	u32 nq_params = coal_cap->nq_params;
+	u16 tmr;
+
+	if (!(nq_params & RING_AGGINT_QCAPS_RESP_NQ_PARAMS_INT_LAT_TMR_MIN))
+		return 0;
+
+	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_RING_CMPL_RING_CFG_AGGINT_PARAMS,
+			       -1, -1);
+	req.ring_id = cpu_to_le16(cpr->cp_ring_struct.fw_ring_id);
+	req.flags =
+		cpu_to_le16(RING_CMPL_RING_CFG_AGGINT_PARAMS_REQ_FLAGS_IS_NQ);
+
+	tmr = bnxt_usec_to_coal_tmr(bp, hw_coal->coal_ticks) / 2;
+	tmr = clamp_t(u16, tmr, 1, coal_cap->int_lat_tmr_min_max);
+	req.int_lat_tmr_min = cpu_to_le16(tmr);
+	req.enables |= cpu_to_le16(BNXT_COAL_CMPL_MIN_TMR_ENABLE);
+	return _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+}
+
 int bnxt_hwrm_set_ring_coal(struct bnxt *bp, struct bnxt_napi *bnapi)
 {
 	struct hwrm_ring_cmpl_ring_cfg_aggint_params_input req_rx = {0};
@@ -5553,6 +5580,7 @@ int bnxt_hwrm_set_coal(struct bnxt *bp)
 	mutex_lock(&bp->hwrm_cmd_lock);
 	for (i = 0; i < bp->cp_nr_rings; i++) {
 		struct bnxt_napi *bnapi = bp->bnapi[i];
+		struct bnxt_coal *hw_coal;
 		u16 ring_id;
 
 		req = &req_rx;
@@ -5568,6 +5596,24 @@ int bnxt_hwrm_set_coal(struct bnxt *bp)
 					HWRM_CMD_TIMEOUT);
 		if (rc)
 			break;
+
+		if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+			continue;
+
+		if (bnapi->rx_ring && bnapi->tx_ring) {
+			req = &req_tx;
+			ring_id = bnxt_cp_ring_for_tx(bp, bnapi->tx_ring);
+			req->ring_id = cpu_to_le16(ring_id);
+			rc = _hwrm_send_message(bp, req, sizeof(*req),
+						HWRM_CMD_TIMEOUT);
+			if (rc)
+				break;
+		}
+		if (bnapi->rx_ring)
+			hw_coal = &bp->rx_coal;
+		else
+			hw_coal = &bp->tx_coal;
+		__bnxt_hwrm_set_coal_nq(bp, bnapi, hw_coal);
 	}
 	mutex_unlock(&bp->hwrm_cmd_lock);
 	return rc;
-- 
2.5.1
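The NQ minimum timer computed in __bnxt_hwrm_set_coal_nq() is half of the converted coalescing timer, clamped to the range from 1 to the reported maximum. A stand-alone sketch of that arithmetic follows. `nq_min_timer()` is a hypothetical helper, and it assumes the input is already in hardware timer units (the microsecond conversion is left out).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Halve the coalescing timer and clamp the result to [1, timer_max].
 * Both arguments are assumed to be in hardware timer units already.
 */
static uint16_t nq_min_timer(uint16_t coal_tmr, uint16_t timer_max)
{
	uint16_t tmr = coal_tmr / 2;

	assert(timer_max >= 1);
	if (tmr < 1)
		tmr = 1; /* never program a zero minimum timer */
	if (tmr > timer_max)
		tmr = timer_max; /* respect the capability limit */
	return tmr;
}
```

For example, with a maximum of 100, inputs of 0 and 1 both yield 1, an input of 50 yields 25, and an input of 1000 is capped at 100.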
[PATCH net-next 17/23] bnxt_en: Increase RSS context array count and skip ring groups on 57500 chips.
On the new 57500 chips, we need to allocate one RSS context for every 64 RX rings. In previous chips, only one RSS context per vnic is required regardless of the number of RX rings. So increase the max RSS context array count to 8. Hardware ring groups are not used on the new chips. Note that the software ring group structure is still maintained in the driver to keep track of the rings associated with the vnic. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 30 +- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 +- 2 files changed, 22 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 7952100..1a31328 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -2881,10 +2881,12 @@ static void bnxt_init_vnics(struct bnxt *bp) for (i = 0; i < bp->nr_vnics; i++) { struct bnxt_vnic_info *vnic = >vnic_info[i]; + int j; vnic->fw_vnic_id = INVALID_HW_RING_ID; - vnic->fw_rss_cos_lb_ctx[0] = INVALID_HW_RING_ID; - vnic->fw_rss_cos_lb_ctx[1] = INVALID_HW_RING_ID; + for (j = 0; j < BNXT_MAX_CTX_PER_VNIC; j++) + vnic->fw_rss_cos_lb_ctx[j] = INVALID_HW_RING_ID; + vnic->fw_l2_ctx_id = INVALID_HW_RING_ID; if (bp->vnic_info[i].rss_hash_key) { @@ -3098,6 +3100,9 @@ static int bnxt_alloc_vnic_attributes(struct bnxt *bp) } } + if (bp->flags & BNXT_FLAG_CHIP_P5) + goto vnic_skip_grps; + if (vnic->flags & BNXT_VNIC_RSS_FLAG) max_rings = bp->rx_nr_rings; else @@ -3108,7 +3113,7 @@ static int bnxt_alloc_vnic_attributes(struct bnxt *bp) rc = -ENOMEM; goto out; } - +vnic_skip_grps: if ((bp->flags & BNXT_FLAG_NEW_RSS_CAP) && !(vnic->flags & BNXT_VNIC_RSS_FLAG)) continue; @@ -4397,6 +4402,10 @@ static int bnxt_hwrm_vnic_alloc(struct bnxt *bp, u16 vnic_id, unsigned int i, j, grp_idx, end_idx = start_rx_ring_idx + nr_rings; struct hwrm_vnic_alloc_input req = {0}; struct hwrm_vnic_alloc_output *resp = bp->hwrm_cmd_resp_addr; + struct 
bnxt_vnic_info *vnic = >vnic_info[vnic_id]; + + if (bp->flags & BNXT_FLAG_CHIP_P5) + goto vnic_no_ring_grps; /* map ring groups to this vnic */ for (i = start_rx_ring_idx, j = 0; i < end_idx; i++, j++) { @@ -4406,12 +4415,12 @@ static int bnxt_hwrm_vnic_alloc(struct bnxt *bp, u16 vnic_id, j, nr_rings); break; } - bp->vnic_info[vnic_id].fw_grp_ids[j] = - bp->grp_info[grp_idx].fw_grp_id; + vnic->fw_grp_ids[j] = bp->grp_info[grp_idx].fw_grp_id; } - bp->vnic_info[vnic_id].fw_rss_cos_lb_ctx[0] = INVALID_HW_RING_ID; - bp->vnic_info[vnic_id].fw_rss_cos_lb_ctx[1] = INVALID_HW_RING_ID; +vnic_no_ring_grps: + for (i = 0; i < BNXT_MAX_CTX_PER_VNIC; i++) + vnic->fw_rss_cos_lb_ctx[i] = INVALID_HW_RING_ID; if (vnic_id == 0) req.flags = cpu_to_le32(VNIC_ALLOC_REQ_FLAGS_DEFAULT); @@ -4420,7 +4429,7 @@ static int bnxt_hwrm_vnic_alloc(struct bnxt *bp, u16 vnic_id, mutex_lock(>hwrm_cmd_lock); rc = _hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT); if (!rc) - bp->vnic_info[vnic_id].fw_vnic_id = le32_to_cpu(resp->vnic_id); + vnic->fw_vnic_id = le32_to_cpu(resp->vnic_id); mutex_unlock(>hwrm_cmd_lock); return rc; } @@ -4456,6 +4465,9 @@ static int bnxt_hwrm_ring_grp_alloc(struct bnxt *bp) u16 i; u32 rc = 0; + if (bp->flags & BNXT_FLAG_CHIP_P5) + return 0; + mutex_lock(>hwrm_cmd_lock); for (i = 0; i < bp->rx_nr_rings; i++) { struct hwrm_ring_grp_alloc_input req = {0}; @@ -4488,7 +4500,7 @@ static int bnxt_hwrm_ring_grp_free(struct bnxt *bp) u32 rc = 0; struct hwrm_ring_grp_free_input req = {0}; - if (!bp->grp_info) + if (!bp->grp_info || (bp->flags & BNXT_FLAG_CHIP_P5)) return 0; bnxt_hwrm_cmd_hdr_init(bp, , HWRM_RING_GRP_FREE, -1, -1); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 560e8b7..50b129e 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -862,7 +862,7 @@ struct bnxt_ri
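The sizing rule in the commit message (one RSS context per 64 RX rings on 57500 chips, one per vnic on older chips, array size 8) can be sketched in plain C. This is a userspace illustration only: the function and macro names are hypothetical, and rounding up for a partially filled group of 64 rings is an assumption, not something the patch text states.

```c
#include <assert.h>

/* Hypothetical names. Only the "one context per 64 RX rings" rule and the
 * array size of 8 come from the commit message above.
 */
#define RX_RINGS_PER_RSS_CTX	64
#define MAX_CTX_PER_VNIC	8

/* RSS contexts a vnic would need on a 57500-class chip.
 * Assumption: a partially filled group of 64 rings still needs a context,
 * so the division rounds up.
 */
static inline int rss_ctx_needed_p5(int rx_rings)
{
	int n = (rx_rings + RX_RINGS_PER_RSS_CTX - 1) / RX_RINGS_PER_RSS_CTX;

	assert(n <= MAX_CTX_PER_VNIC);
	return n;
}

/* Older chips: one context per vnic regardless of the RX ring count. */
static inline int rss_ctx_needed_legacy(int rx_rings)
{
	(void)rx_rings;
	return 1;
}
```

With these assumptions, an array of 8 contexts covers up to 512 RX rings per vnic.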
[PATCH net-next 23/23] bnxt_en: Add PCI ID for BCM57508 device.
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f518119..de987cc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -111,6 +111,7 @@ enum board_idx {
 	BCM57452,
 	BCM57454,
 	BCM5745x_NPAR,
+	BCM57508,
 	BCM58802,
 	BCM58804,
 	BCM58808,
@@ -152,6 +153,7 @@ static const struct {
 	[BCM57452] = { "Broadcom BCM57452 NetXtreme-E 10Gb/25Gb/40Gb/50Gb Ethernet" },
 	[BCM57454] = { "Broadcom BCM57454 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb Ethernet" },
 	[BCM5745x_NPAR] = { "Broadcom BCM5745x NetXtreme-E Ethernet Partition" },
+	[BCM57508] = { "Broadcom BCM57508 NetXtreme-E 10Gb/25Gb/50Gb/100Gb/200Gb Ethernet" },
 	[BCM58802] = { "Broadcom BCM58802 NetXtreme-S 10Gb/25Gb/40Gb/50Gb Ethernet" },
 	[BCM58804] = { "Broadcom BCM58804 NetXtreme-S 10Gb/25Gb/40Gb/50Gb/100Gb Ethernet" },
 	[BCM58808] = { "Broadcom BCM58808 NetXtreme-S 10Gb/25Gb/40Gb/50Gb/100Gb Ethernet" },
@@ -196,6 +198,7 @@ static const struct pci_device_id bnxt_pci_tbl[] = {
 	{ PCI_VDEVICE(BROADCOM, 0x16ef), .driver_data = BCM57416_NPAR },
 	{ PCI_VDEVICE(BROADCOM, 0x16f0), .driver_data = BCM58808 },
 	{ PCI_VDEVICE(BROADCOM, 0x16f1), .driver_data = BCM57452 },
+	{ PCI_VDEVICE(BROADCOM, 0x1750), .driver_data = BCM57508 },
 	{ PCI_VDEVICE(BROADCOM, 0xd802), .driver_data = BCM58802 },
 	{ PCI_VDEVICE(BROADCOM, 0xd804), .driver_data = BCM58804 },
 #ifdef CONFIG_BNXT_SRIOV
-- 
2.5.1
[PATCH net-next 21/23] bnxt_en: Refactor bnxt_poll_work().
Separate the CP ring polling logic in bnxt_poll_work() into 2 separate functions __bnxt_poll_work() and __bnxt_poll_work_done(). Since the logic is separated, we need to add tx_pkts and events fields to struct bnxt_napi to keep track of the events to handle between the 2 functions. We also add had_work_done field to struct bnxt_cp_ring_info to indicate whether some work was performed on the CP ring. This is needed to better support the 57500 chips. We need to poll up to 2 separate CP rings before we update or ARM the CP rings on the 57500 chips. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 44 +++ drivers/net/ethernet/broadcom/bnxt/bnxt.h | 5 2 files changed, 38 insertions(+), 11 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 065f4c2..10d713aa 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -1889,8 +1889,8 @@ static irqreturn_t bnxt_inta(int irq, void *dev_instance) return IRQ_HANDLED; } -static int bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr, - int budget) +static int __bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr, + int budget) { struct bnxt_napi *bnapi = cpr->bnapi; u32 raw_cons = cpr->cp_raw_cons; @@ -1913,6 +1913,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr, * reading any further. */ dma_rmb(); + cpr->had_work_done = 1; if (TX_CMP_TYPE(txcmp) == CMP_TYPE_TX_L2_CMP) { tx_pkts++; /* return full budget so NAPI will complete. */ @@ -1963,22 +1964,43 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr, } cpr->cp_raw_cons = raw_cons; - /* ACK completion ring before freeing tx ring and producing new -* buffers in rx/agg rings to prevent overflowing the completion -* ring. 
-*/ - bnxt_db_cq(bp, >cp_db, cpr->cp_raw_cons); + bnapi->tx_pkts += tx_pkts; + bnapi->events |= event; + return rx_pkts; +} - if (tx_pkts) - bnapi->tx_int(bp, bnapi, tx_pkts); +static void __bnxt_poll_work_done(struct bnxt *bp, struct bnxt_napi *bnapi) +{ + if (bnapi->tx_pkts) { + bnapi->tx_int(bp, bnapi, bnapi->tx_pkts); + bnapi->tx_pkts = 0; + } - if (event & BNXT_RX_EVENT) { + if (bnapi->events & BNXT_RX_EVENT) { struct bnxt_rx_ring_info *rxr = bnapi->rx_ring; bnxt_db_write(bp, >rx_db, rxr->rx_prod); - if (event & BNXT_AGG_EVENT) + if (bnapi->events & BNXT_AGG_EVENT) bnxt_db_write(bp, >rx_agg_db, rxr->rx_agg_prod); } + bnapi->events = 0; +} + +static int bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr, + int budget) +{ + struct bnxt_napi *bnapi = cpr->bnapi; + int rx_pkts; + + rx_pkts = __bnxt_poll_work(bp, cpr, budget); + + /* ACK completion ring before freeing tx ring and producing new +* buffers in rx/agg rings to prevent overflowing the completion +* ring. +*/ + bnxt_db_cq(bp, >cp_db, cpr->cp_raw_cons); + + __bnxt_poll_work_done(bp, bnapi); return rx_pkts; } diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 50b129e..48cb2d5 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -792,6 +792,8 @@ struct bnxt_cp_ring_info { u32 cp_raw_cons; struct bnxt_db_info cp_db; + u8 had_work_done:1; + struct bnxt_coalrx_ring_coal; u64 rx_packets; u64 rx_bytes; @@ -829,6 +831,9 @@ struct bnxt_napi { void(*tx_int)(struct bnxt *, struct bnxt_napi *, int); + int tx_pkts; + u8 events; + u32 flags; #define BNXT_NAPI_FLAG_XDP 0x1 -- 2.5.1
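The point of the split is that the first phase can run on more than one CP ring while the second phase runs once per NAPI instance. A minimal userspace model of that flow is sketched below; every type and function name in it is hypothetical, the "rings" are just counters, and it does not reproduce the driver's completion parsing or doorbell writes.

```c
#include <assert.h>

/* Hypothetical stand-ins for BNXT_RX_EVENT and the bnxt_napi / CP ring state. */
#define MODEL_RX_EVENT	0x1

struct napi_model {
	int tx_pkts;		/* TX completions seen but not yet processed */
	unsigned char events;	/* RX events seen but not yet acted on */
	int tx_done;		/* total TX completions handed to the TX handler */
	int rx_doorbells;	/* times the RX producer doorbell was rung */
};

struct ring_model {
	int pending_rx;
	int pending_tx;
	int had_work_done;
};

/* Phase 1: consume completions from one ring and only accumulate the
 * results in the per-NAPI state. Returns the number of RX packets.
 */
static inline int model_poll_work(struct napi_model *n, struct ring_model *r,
				  int budget)
{
	int rx_pkts = 0;

	while (rx_pkts < budget && (r->pending_rx || r->pending_tx)) {
		r->had_work_done = 1;
		if (r->pending_tx) {
			r->pending_tx--;
			n->tx_pkts++;
		} else {
			r->pending_rx--;
			rx_pkts++;
			n->events |= MODEL_RX_EVENT;
		}
	}
	return rx_pkts;
}

/* Phase 2: act once on everything accumulated, then reset the state. */
static inline void model_poll_work_done(struct napi_model *n)
{
	if (n->tx_pkts) {
		n->tx_done += n->tx_pkts;
		n->tx_pkts = 0;
	}
	if (n->events & MODEL_RX_EVENT)
		n->rx_doorbells++;
	n->events = 0;
}
```

A caller modelling the 57500 case would invoke model_poll_work() on the RX ring and then on the TX ring, acknowledge both, and only then call model_poll_work_done() once.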
[PATCH net-next 16/23] bnxt_en: Allocate/Free CP rings for 57500 series chips.
On the new 57500 chips, we allocate/free one CP ring for each RX ring or TX ring separately. Using separate CP rings for RX/TX is an improvement as TX events will no longer be stuck behind RX events. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 71 --- 1 file changed, 66 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index db1dbad..7952100 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -2758,7 +2758,7 @@ static int bnxt_init_one_rx_ring(struct bnxt *bp, int ring_nr) static void bnxt_init_cp_rings(struct bnxt *bp) { - int i; + int i, j; for (i = 0; i < bp->cp_nr_rings; i++) { struct bnxt_cp_ring_info *cpr = >bnapi[i]->cp_ring; @@ -2767,6 +2767,17 @@ static void bnxt_init_cp_rings(struct bnxt *bp) ring->fw_ring_id = INVALID_HW_RING_ID; cpr->rx_ring_coal.coal_ticks = bp->rx_coal.coal_ticks; cpr->rx_ring_coal.coal_bufs = bp->rx_coal.coal_bufs; + for (j = 0; j < 2; j++) { + struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[j]; + + if (!cpr2) + continue; + + ring = >cp_ring_struct; + ring->fw_ring_id = INVALID_HW_RING_ID; + cpr2->rx_ring_coal.coal_ticks = bp->rx_coal.coal_ticks; + cpr2->rx_ring_coal.coal_bufs = bp->rx_coal.coal_bufs; + } } } @@ -4711,9 +4722,28 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp) type = HWRM_RING_ALLOC_TX; for (i = 0; i < bp->tx_nr_rings; i++) { struct bnxt_tx_ring_info *txr = >tx_ring[i]; - struct bnxt_ring_struct *ring = >tx_ring_struct; - u32 map_idx = i; + struct bnxt_ring_struct *ring; + u32 map_idx; + if (bp->flags & BNXT_FLAG_CHIP_P5) { + struct bnxt_napi *bnapi = txr->bnapi; + struct bnxt_cp_ring_info *cpr, *cpr2; + u32 type2 = HWRM_RING_ALLOC_CMPL; + + cpr = >cp_ring; + cpr2 = cpr->cp_ring_arr[BNXT_TX_HDL]; + ring = >cp_ring_struct; + ring->handle = BNXT_TX_HDL; + map_idx = bnapi->index; + rc = hwrm_ring_alloc_send_msg(bp, ring, type2, map_idx); + if (rc) + goto 
err_out; + bnxt_set_db(bp, >cp_db, type2, map_idx, + ring->fw_ring_id); + bnxt_db_cq(bp, >cp_db, cpr2->cp_raw_cons); + } + ring = >tx_ring_struct; + map_idx = i; rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx); if (rc) goto err_out; @@ -4724,7 +4754,8 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp) for (i = 0; i < bp->rx_nr_rings; i++) { struct bnxt_rx_ring_info *rxr = >rx_ring[i]; struct bnxt_ring_struct *ring = >rx_ring_struct; - u32 map_idx = rxr->bnapi->index; + struct bnxt_napi *bnapi = rxr->bnapi; + u32 map_idx = bnapi->index; rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx); if (rc) @@ -4732,6 +4763,21 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp) bnxt_set_db(bp, >rx_db, type, map_idx, ring->fw_ring_id); bnxt_db_write(bp, >rx_db, rxr->rx_prod); bp->grp_info[map_idx].rx_fw_ring_id = ring->fw_ring_id; + if (bp->flags & BNXT_FLAG_CHIP_P5) { + struct bnxt_cp_ring_info *cpr = >cp_ring; + u32 type2 = HWRM_RING_ALLOC_CMPL; + struct bnxt_cp_ring_info *cpr2; + + cpr2 = cpr->cp_ring_arr[BNXT_RX_HDL]; + ring = >cp_ring_struct; + ring->handle = BNXT_RX_HDL; + rc = hwrm_ring_alloc_send_msg(bp, ring, type2, map_idx); + if (rc) + goto err_out; + bnxt_set_db(bp, >cp_db, type2, map_idx, + ring->fw_ring_id); + bnxt_db_cq(bp, >cp_db, cpr2->cp_raw_cons); + } } if (bp->flags & BNXT_FLAG_AGG_RINGS) { @@ -4858,8 +4904,23 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path) for (i = 0; i < bp->cp_nr_ri
[PATCH net-next 03/23] bnxt_en: Add maximum extended request length fw message support.
Support the max_ext_req_len field from the HWRM_VER_GET_RESPONSE. If this field is valid and greater than the mailbox size, use the short command format to send firmware messages greater than the mailbox size. Newer devices use this method to send larger messages to the firmware. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 34 --- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 + 2 files changed, 28 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 84c1e6c..4c068e6 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -3042,7 +3042,7 @@ static void bnxt_free_hwrm_short_cmd_req(struct bnxt *bp) if (bp->hwrm_short_cmd_req_addr) { struct pci_dev *pdev = bp->pdev; - dma_free_coherent(>dev, BNXT_HWRM_MAX_REQ_LEN, + dma_free_coherent(>dev, bp->hwrm_max_ext_req_len, bp->hwrm_short_cmd_req_addr, bp->hwrm_short_cmd_req_dma_addr); bp->hwrm_short_cmd_req_addr = NULL; @@ -3054,7 +3054,7 @@ static int bnxt_alloc_hwrm_short_cmd_req(struct bnxt *bp) struct pci_dev *pdev = bp->pdev; bp->hwrm_short_cmd_req_addr = - dma_alloc_coherent(>dev, BNXT_HWRM_MAX_REQ_LEN, + dma_alloc_coherent(>dev, bp->hwrm_max_ext_req_len, >hwrm_short_cmd_req_dma_addr, GFP_KERNEL); if (!bp->hwrm_short_cmd_req_addr) @@ -3469,12 +3469,27 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len, cp_ring_id = le16_to_cpu(req->cmpl_ring); intr_process = (cp_ring_id == INVALID_HW_RING_ID) ? 0 : 1; - if (bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) { + if (msg_len > BNXT_HWRM_MAX_REQ_LEN) { + if (msg_len > bp->hwrm_max_ext_req_len || + !bp->hwrm_short_cmd_req_addr) + return -EINVAL; + } + + if ((bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) || + msg_len > BNXT_HWRM_MAX_REQ_LEN) { void *short_cmd_req = bp->hwrm_short_cmd_req_addr; + u16 max_msg_len; + + /* Set boundary for maximum extended request length for short +* cmd format. 
If passed up from device use the max supported +* internal req length. +*/ + max_msg_len = bp->hwrm_max_ext_req_len; memcpy(short_cmd_req, req, msg_len); - memset(short_cmd_req + msg_len, 0, BNXT_HWRM_MAX_REQ_LEN - - msg_len); + if (msg_len < max_msg_len) + memset(short_cmd_req + msg_len, 0, + max_msg_len - msg_len); short_input.req_type = req->req_type; short_input.signature = @@ -5381,8 +5396,12 @@ static int bnxt_hwrm_ver_get(struct bnxt *bp) if (!bp->hwrm_cmd_timeout) bp->hwrm_cmd_timeout = DFLT_HWRM_CMD_TIMEOUT; - if (resp->hwrm_intf_maj_8b >= 1) + if (resp->hwrm_intf_maj_8b >= 1) { bp->hwrm_max_req_len = le16_to_cpu(resp->max_req_win_len); + bp->hwrm_max_ext_req_len = le16_to_cpu(resp->max_ext_req_len); + } + if (bp->hwrm_max_ext_req_len < HWRM_MAX_REQ_LEN) + bp->hwrm_max_ext_req_len = HWRM_MAX_REQ_LEN; bp->chip_num = le16_to_cpu(resp->chip_num); if (bp->chip_num == CHIP_NUM_58700 && !resp->chip_rev && @@ -8908,7 +8927,8 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) if (rc) goto init_err_pci_clean; - if (bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) { + if ((bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) || + bp->hwrm_max_ext_req_len > BNXT_HWRM_MAX_REQ_LEN) { rc = bnxt_alloc_hwrm_short_cmd_req(bp); if (rc) goto init_err_pci_clean; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 2cd7ee5..8b6874c 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1315,6 +1315,7 @@ struct bnxt { u16 fw_tx_stats_ext_size; u16 hwrm_max_req_len; + u16 hwrm_max_ext_req_len; int hwrm_cmd_timeout; struct mutexhwrm_cmd_lock; /* serialize hwrm messages */ struct hwrm_ver_get_output ver_resp; -- 2.5.1
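The length checks and zero padding added to the send path can be modelled outside the kernel as follows. This is a sketch under stated assumptions: the function name, its return convention (0 = use the mailbox directly, 1 = staged in the short command buffer), and the value 128 for the mailbox limit are all illustrative and not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for the mailbox request size limit. */
#define MODEL_MAILBOX_MAX_REQ_LEN	128

/* Decide how a request of msg_len bytes is sent.
 * Returns 0 if it fits the mailbox and the short command format is not
 * required, 1 if it was copied into the short command buffer (zero padded
 * up to max_ext_req_len), or -EINVAL if it cannot be sent at all.
 */
static inline int model_stage_request(unsigned char *short_cmd_buf,
				      size_t max_ext_req_len,
				      int short_cmd_required,
				      const void *msg, size_t msg_len)
{
	/* Same floor the patch applies after reading the firmware version. */
	if (max_ext_req_len < MODEL_MAILBOX_MAX_REQ_LEN)
		max_ext_req_len = MODEL_MAILBOX_MAX_REQ_LEN;

	if (msg_len > MODEL_MAILBOX_MAX_REQ_LEN &&
	    (msg_len > max_ext_req_len || !short_cmd_buf))
		return -EINVAL;

	if (!short_cmd_required && msg_len <= MODEL_MAILBOX_MAX_REQ_LEN)
		return 0;

	/* Extra guard for the sketch only. */
	if (!short_cmd_buf)
		return -EINVAL;

	memcpy(short_cmd_buf, msg, msg_len);
	if (msg_len < max_ext_req_len)
		memset(short_cmd_buf + msg_len, 0, max_ext_req_len - msg_len);
	return 1;
}
```

Padding only when msg_len is smaller than the buffer avoids the unsigned underflow that a fixed-size subtraction would hit once messages can exceed the mailbox size.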
[PATCH net-next 04/23] bnxt_en: Update interrupt coalescing logic.
New firmware spec. allows interrupt coalescing parameters, such as maximums, timer units, supported features to be queried. Update the driver to make use of the new call to query these parameters and provide the legacy defaults if the call is not available. Replace the hard-coded values with these parameters. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 107 -- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 39 ++- 2 files changed, 125 insertions(+), 21 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 4c068e6..83b1313 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -4944,46 +4944,113 @@ static int bnxt_hwrm_check_rings(struct bnxt *bp, int tx_rings, int rx_rings, cp_rings, vnics); } -static void bnxt_hwrm_set_coal_params(struct bnxt_coal *hw_coal, +static void bnxt_hwrm_coal_params_qcaps(struct bnxt *bp) +{ + struct hwrm_ring_aggint_qcaps_output *resp = bp->hwrm_cmd_resp_addr; + struct bnxt_coal_cap *coal_cap = >coal_cap; + struct hwrm_ring_aggint_qcaps_input req = {0}; + int rc; + + coal_cap->cmpl_params = BNXT_LEGACY_COAL_CMPL_PARAMS; + coal_cap->num_cmpl_dma_aggr_max = 63; + coal_cap->num_cmpl_dma_aggr_during_int_max = 63; + coal_cap->cmpl_aggr_dma_tmr_max = 65535; + coal_cap->cmpl_aggr_dma_tmr_during_int_max = 65535; + coal_cap->int_lat_tmr_min_max = 65535; + coal_cap->int_lat_tmr_max_max = 65535; + coal_cap->num_cmpl_aggr_int_max = 65535; + coal_cap->timer_units = 80; + + if (bp->hwrm_spec_code < 0x10902) + return; + + bnxt_hwrm_cmd_hdr_init(bp, , HWRM_RING_AGGINT_QCAPS, -1, -1); + mutex_lock(>hwrm_cmd_lock); + rc = _hwrm_send_message_silent(bp, , sizeof(req), HWRM_CMD_TIMEOUT); + if (!rc) { + coal_cap->cmpl_params = le32_to_cpu(resp->cmpl_params); + coal_cap->num_cmpl_dma_aggr_max = + le16_to_cpu(resp->num_cmpl_dma_aggr_max); + coal_cap->num_cmpl_dma_aggr_during_int_max = + 
le16_to_cpu(resp->num_cmpl_dma_aggr_during_int_max); + coal_cap->cmpl_aggr_dma_tmr_max = + le16_to_cpu(resp->cmpl_aggr_dma_tmr_max); + coal_cap->cmpl_aggr_dma_tmr_during_int_max = + le16_to_cpu(resp->cmpl_aggr_dma_tmr_during_int_max); + coal_cap->int_lat_tmr_min_max = + le16_to_cpu(resp->int_lat_tmr_min_max); + coal_cap->int_lat_tmr_max_max = + le16_to_cpu(resp->int_lat_tmr_max_max); + coal_cap->num_cmpl_aggr_int_max = + le16_to_cpu(resp->num_cmpl_aggr_int_max); + coal_cap->timer_units = le16_to_cpu(resp->timer_units); + } + mutex_unlock(>hwrm_cmd_lock); +} + +static u16 bnxt_usec_to_coal_tmr(struct bnxt *bp, u16 usec) +{ + struct bnxt_coal_cap *coal_cap = >coal_cap; + + return usec * 1000 / coal_cap->timer_units; +} + +static void bnxt_hwrm_set_coal_params(struct bnxt *bp, + struct bnxt_coal *hw_coal, struct hwrm_ring_cmpl_ring_cfg_aggint_params_input *req) { - u16 val, tmr, max, flags; + struct bnxt_coal_cap *coal_cap = >coal_cap; + u32 cmpl_params = coal_cap->cmpl_params; + u16 val, tmr, max, flags = 0; max = hw_coal->bufs_per_record * 128; if (hw_coal->budget) max = hw_coal->bufs_per_record * hw_coal->budget; + max = min_t(u16, max, coal_cap->num_cmpl_aggr_int_max); val = clamp_t(u16, hw_coal->coal_bufs, 1, max); req->num_cmpl_aggr_int = cpu_to_le16(val); - /* This is a 6-bit value and must not be 0, or we'll get non stop IRQ */ - val = min_t(u16, val, 63); + val = min_t(u16, val, coal_cap->num_cmpl_dma_aggr_max); req->num_cmpl_dma_aggr = cpu_to_le16(val); - /* This is a 6-bit value and must not be 0, or we'll get non stop IRQ */ - val = clamp_t(u16, hw_coal->coal_bufs_irq, 1, 63); + val = clamp_t(u16, hw_coal->coal_bufs_irq, 1, + coal_cap->num_cmpl_dma_aggr_during_int_max); req->num_cmpl_dma_aggr_during_int = cpu_to_le16(val); - tmr = BNXT_USEC_TO_COAL_TIMER(hw_coal->coal_ticks); - tmr = max_t(u16, tmr, 1); + tmr = bnxt_usec_to_coal_tmr(bp, hw_coal->coal_ticks); + tmr = clamp_t(u16, tmr, 1, coal_cap->int_lat_tmr_max_max); req->int_lat_tmr_max = 
cpu_to_le16(tmr); /* min timer set to 1/2 of interrupt timer */ - val = tmr / 2; - req->int_lat_tmr_min = cpu_to_le16(val
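The new timer conversion multiplies microseconds by 1000 and divides by the reported timer_units, which implies the unit is expressed in nanoseconds (80 in the legacy defaults). A small standalone sketch of that arithmetic follows. The function names are hypothetical, and saturating at 65535 is a choice made for this sketch; the helper shown in the patch simply returns the quotient as a u16.

```c
#include <assert.h>
#include <stdint.h>

/* Convert microseconds to coalescing timer ticks.
 * timer_units_ns is the length of one tick in nanoseconds (assumption
 * inferred from the "usec * 1000 / timer_units" expression).
 * Sketch-only behaviour: the result saturates instead of wrapping.
 */
static inline uint16_t model_usec_to_coal_tmr(uint16_t usec,
					      uint16_t timer_units_ns)
{
	uint32_t ticks;

	assert(timer_units_ns != 0);
	ticks = (uint32_t)usec * 1000u / timer_units_ns;
	return ticks > UINT16_MAX ? UINT16_MAX : (uint16_t)ticks;
}

/* Userspace equivalent of clamping a u16 into [lo, hi]. */
static inline uint16_t model_clamp_u16(uint16_t v, uint16_t lo, uint16_t hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}
```

For example, with 80 ns ticks a setting of 8 usec becomes 100 ticks, and a result of 0 is raised to 1 by the clamp so the timer is never programmed to zero.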
[PATCH net-next 01/23] bnxt_en: Update firmware interface spec. to 1.10.0.3.
Among the new changes are trusted VF support, 200Gbps support, and new API to dump ring information on the new chips. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 6 +- drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h | 310 ++ 2 files changed, 224 insertions(+), 92 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index bde3846..766c50b 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -12,11 +12,11 @@ #define BNXT_H #define DRV_MODULE_NAME"bnxt_en" -#define DRV_MODULE_VERSION "1.9.2" +#define DRV_MODULE_VERSION "1.10.0" #define DRV_VER_MAJ1 -#define DRV_VER_MIN9 -#define DRV_VER_UPD2 +#define DRV_VER_MIN10 +#define DRV_VER_UPD0 #include #include diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h index 971ace5d..5dd0860 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h @@ -37,6 +37,8 @@ struct hwrm_resp_hdr { #define TLV_TYPE_HWRM_REQUEST0x1UL #define TLV_TYPE_HWRM_RESPONSE 0x2UL #define TLV_TYPE_ROCE_SP_COMMAND 0x3UL +#define TLV_TYPE_QUERY_ROCE_CC_GEN1 0x4UL +#define TLV_TYPE_MODIFY_ROCE_CC_GEN1 0x5UL #define TLV_TYPE_ENGINE_CKV_DEVICE_SERIAL_NUMBER 0x8001UL #define TLV_TYPE_ENGINE_CKV_NONCE0x8002UL #define TLV_TYPE_ENGINE_CKV_IV 0x8003UL @@ -186,6 +188,7 @@ struct cmd_nums { #define HWRM_TUNNEL_DST_PORT_QUERY0xa0UL #define HWRM_TUNNEL_DST_PORT_ALLOC0xa1UL #define HWRM_TUNNEL_DST_PORT_FREE 0xa2UL + #define HWRM_STAT_CTX_ENG_QUERY 0xafUL #define HWRM_STAT_CTX_ALLOC 0xb0UL #define HWRM_STAT_CTX_FREE0xb1UL #define HWRM_STAT_CTX_QUERY 0xb2UL @@ -235,6 +238,7 @@ struct cmd_nums { #define HWRM_CFA_PAIR_INFO0x10fUL #define HWRM_FW_IPC_MSG 0x110UL #define HWRM_CFA_REDIRECT_TUNNEL_TYPE_INFO0x111UL + #define HWRM_CFA_REDIRECT_QUERY_TUNNEL_TYPE 0x112UL #define HWRM_ENGINE_CKV_HELLO 0x12dUL #define 
HWRM_ENGINE_CKV_STATUS0x12eUL #define HWRM_ENGINE_CKV_CKEK_ADD 0x12fUL @@ -295,6 +299,7 @@ struct cmd_nums { #define HWRM_DBG_COREDUMP_RETRIEVE0xff19UL #define HWRM_DBG_FW_CLI 0xff1aUL #define HWRM_DBG_I2C_CMD 0xff1bUL + #define HWRM_DBG_RING_INFO_GET0xff1cUL #define HWRM_NVM_FACTORY_DEFAULTS 0xffeeUL #define HWRM_NVM_VALIDATE_OPTION 0xffefUL #define HWRM_NVM_FLUSH0xfff0UL @@ -320,20 +325,21 @@ struct cmd_nums { /* ret_codes (size:64b/8B) */ struct ret_codes { __le16 error_code; - #define HWRM_ERR_CODE_SUCCESS0x0UL - #define HWRM_ERR_CODE_FAIL 0x1UL - #define HWRM_ERR_CODE_INVALID_PARAMS 0x2UL - #define HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED 0x3UL - #define HWRM_ERR_CODE_RESOURCE_ALLOC_ERROR 0x4UL - #define HWRM_ERR_CODE_INVALID_FLAGS 0x5UL - #define HWRM_ERR_CODE_INVALID_ENABLES0x6UL - #define HWRM_ERR_CODE_UNSUPPORTED_TLV0x7UL - #define HWRM_ERR_CODE_NO_BUFFER 0x8UL - #define HWRM_ERR_CODE_UNSUPPORTED_OPTION_ERR 0x9UL - #define HWRM_ERR_CODE_HWRM_ERROR 0xfUL - #define HWRM_ERR_CODE_UNKNOWN_ERR0xfffeUL - #define HWRM_ERR_CODE_CMD_NOT_SUPPORTED 0xUL - #define HWRM_ERR_CODE_LAST HWRM_ERR_CODE_CMD_NOT_SUPPORTED + #define HWRM_ERR_CODE_SUCCESS 0x0UL + #define HWRM_ERR_CODE_FAIL 0x1UL + #define HWRM_ERR_CODE_INVALID_PARAMS0x2UL + #define HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED0x3UL + #define HWRM_ERR_CODE_RESOURCE_ALLOC_ERROR 0x4UL + #define HWRM_ERR_CODE_INVALID_FLAGS 0x5UL + #define HWRM_ERR_CODE_INVALID_ENABLES 0x6UL + #define HWRM_ERR_CODE_UNSUPPORTED_TLV 0x7UL + #define HWRM_ERR_CODE_NO_BUFFER 0x8UL + #define HWRM_ERR_CODE_UNSUPPORTED_OPTION_ERR0x9UL + #define HWRM_ERR_CODE_HWRM_ERROR0xfUL + #define HWRM_ERR_CODE_TLV_ENCAPSULATED_RESPONSE 0x8000UL + #define HWRM_ERR_CODE_UNKNOWN_ERR 0xfffeUL + #define HWRM_ERR_CODE_CMD_
[PATCH net-next 19/23] bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX path.
In the RX code path, we currently use the bnxt_napi struct pointer to identify the associated RX/CP rings. Change it to use the struct bnxt_cp_ring_info pointer instead since there are now up to 2 CP rings per MSIX.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 69 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 19 ---
 2 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index d1f9130..5ec477f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -807,11 +807,11 @@ static inline int bnxt_alloc_rx_page(struct bnxt *bp,
 	return 0;
 }
 
-static void bnxt_reuse_rx_agg_bufs(struct bnxt_napi *bnapi, u16 cp_cons,
+static void bnxt_reuse_rx_agg_bufs(struct bnxt_cp_ring_info *cpr, u16 cp_cons,
 				   u32 agg_bufs)
 {
+	struct bnxt_napi *bnapi = cpr->bnapi;
 	struct bnxt *bp = bnapi->bp;
-	struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
 	struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
 	u16 prod = rxr->rx_agg_prod;
 	u16 sw_prod = rxr->rx_sw_agg_prod;
@@ -934,12 +934,13 @@ static struct sk_buff *bnxt_rx_skb(struct bnxt *bp,
 	return skb;
 }
 
-static struct sk_buff *bnxt_rx_pages(struct bnxt *bp, struct bnxt_napi *bnapi,
+static struct sk_buff *bnxt_rx_pages(struct bnxt *bp,
+				     struct bnxt_cp_ring_info *cpr,
 				     struct sk_buff *skb, u16 cp_cons,
 				     u32 agg_bufs)
 {
+	struct bnxt_napi *bnapi = cpr->bnapi;
 	struct pci_dev *pdev = bp->pdev;
-	struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
 	struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
 	u16 prod = rxr->rx_agg_prod;
 	u32 i;
@@ -986,7 +987,7 @@ static struct sk_buff *bnxt_rx_pages(struct bnxt *bp, struct bnxt_napi *bnapi,
 			 * allocated already.
 			 */
 			rxr->rx_agg_prod = prod;
-			bnxt_reuse_rx_agg_bufs(bnapi, cp_cons, agg_bufs - i);
+			bnxt_reuse_rx_agg_bufs(cpr, cp_cons, agg_bufs - i);
 			return NULL;
 		}
 
@@ -1043,10 +1044,9 @@ static inline struct sk_buff *bnxt_copy_skb(struct bnxt_napi *bnapi, u8 *data,
 	return skb;
 }
 
-static int bnxt_discard_rx(struct bnxt *bp, struct bnxt_napi *bnapi,
+static int bnxt_discard_rx(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
 			   u32 *raw_cons, void *cmp)
 {
-	struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
 	struct rx_cmp *rxcmp = cmp;
 	u32 tmp_raw_cons = *raw_cons;
 	u8 cmp_type, agg_bufs = 0;
@@ -1172,11 +1172,11 @@ static void bnxt_tpa_start(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
 	cons_rx_buf->data = NULL;
 }
 
-static void bnxt_abort_tpa(struct bnxt *bp, struct bnxt_napi *bnapi,
-			   u16 cp_cons, u32 agg_bufs)
+static void bnxt_abort_tpa(struct bnxt_cp_ring_info *cpr, u16 cp_cons,
+			   u32 agg_bufs)
 {
 	if (agg_bufs)
-		bnxt_reuse_rx_agg_bufs(bnapi, cp_cons, agg_bufs);
+		bnxt_reuse_rx_agg_bufs(cpr, cp_cons, agg_bufs);
 }
 
 static struct sk_buff *bnxt_gro_func_5731x(struct bnxt_tpa_info *tpa_info,
@@ -1370,13 +1370,13 @@ static struct net_device *bnxt_get_pkt_dev(struct bnxt *bp, u16 cfa_code)
 }
 
 static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
-					   struct bnxt_napi *bnapi,
+					   struct bnxt_cp_ring_info *cpr,
 					   u32 *raw_cons,
 					   struct rx_tpa_end_cmp *tpa_end,
 					   struct rx_tpa_end_cmp_ext *tpa_end1,
 					   u8 *event)
 {
-	struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+	struct bnxt_napi *bnapi = cpr->bnapi;
 	struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
 	u8 agg_id = TPA_END_AGG_ID(tpa_end);
 	u8 *data_ptr, agg_bufs;
@@ -1388,7 +1388,7 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
 	void *data;
 
 	if (unlikely(bnapi->in_reset)) {
-		int rc = bnxt_discard_rx(bp, bnapi, raw_cons, tpa_end);
+		int rc = bnxt_discard_rx(bp, cpr, raw_cons, tpa_end);
 
 		if (rc < 0)
 			return ERR_PTR(-EBUSY);
@@ -1414,7 +1414,7 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
 	}
 
 	if (unlikely(agg_bufs > MAX_SKB_FRAGS || TPA_END_ERRORS(tpa_end1))) {
-		bnxt_abort_tpa(bp, bnapi, cp_cons, agg_bufs);
+		bnxt_abort_tpa(cpr, cp_cons, agg_bufs);
 		if (a
[PATCH net-next 09/23] bnxt_en: Add 57500 new chip ID and basic structures.
57500 series is a new chip class (P5) that requires some driver changes in the next several patches. This adds basic chip ID, doorbells, and the notification queue (NQ) structures. Each MSIX is associated with an NQ instead of a CP ring in legacy chips. Each NQ has up to 2 associated CP rings for RX and TX. The same bnxt_cp_ring_info struct will be used for the NQ. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 48 --- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 55 +-- 2 files changed, 88 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index b0e2416..88ea8c7 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -3322,6 +3322,13 @@ static int bnxt_alloc_mem(struct bnxt *bp, bool irq_re_init) bp->bnapi[i] = bnapi; bp->bnapi[i]->index = i; bp->bnapi[i]->bp = bp; + if (bp->flags & BNXT_FLAG_CHIP_P5) { + struct bnxt_cp_ring_info *cpr = + >bnapi[i]->cp_ring; + + cpr->cp_ring_struct.ring_mem.flags = + BNXT_RMEM_RING_PTE_FLAG; + } } bp->rx_ring = kcalloc(bp->rx_nr_rings, @@ -3331,7 +3338,15 @@ static int bnxt_alloc_mem(struct bnxt *bp, bool irq_re_init) return -ENOMEM; for (i = 0; i < bp->rx_nr_rings; i++) { - bp->rx_ring[i].bnapi = bp->bnapi[i]; + struct bnxt_rx_ring_info *rxr = >rx_ring[i]; + + if (bp->flags & BNXT_FLAG_CHIP_P5) { + rxr->rx_ring_struct.ring_mem.flags = + BNXT_RMEM_RING_PTE_FLAG; + rxr->rx_agg_ring_struct.ring_mem.flags = + BNXT_RMEM_RING_PTE_FLAG; + } + rxr->bnapi = bp->bnapi[i]; bp->bnapi[i]->rx_ring = >rx_ring[i]; } @@ -3353,12 +3368,16 @@ static int bnxt_alloc_mem(struct bnxt *bp, bool irq_re_init) j = bp->rx_nr_rings; for (i = 0; i < bp->tx_nr_rings; i++, j++) { - bp->tx_ring[i].bnapi = bp->bnapi[j]; - bp->bnapi[j]->tx_ring = >tx_ring[i]; + struct bnxt_tx_ring_info *txr = >tx_ring[i]; + + if (bp->flags & BNXT_FLAG_CHIP_P5) + txr->tx_ring_struct.ring_mem.flags = + BNXT_RMEM_RING_PTE_FLAG; + 
txr->bnapi = bp->bnapi[j]; + bp->bnapi[j]->tx_ring = txr; bp->tx_ring_map[i] = bp->tx_nr_rings_xdp + i; if (i >= bp->tx_nr_rings_xdp) { - bp->tx_ring[i].txq_index = i - - bp->tx_nr_rings_xdp; + txr->txq_index = i - bp->tx_nr_rings_xdp; bp->bnapi[j]->tx_int = bnxt_tx_int; } else { bp->bnapi[j]->flags |= BNXT_NAPI_FLAG_XDP; @@ -9326,6 +9345,9 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) goto init_err_pci_clean; } + if (BNXT_CHIP_P5(bp)) + bp->flags |= BNXT_FLAG_CHIP_P5; + rc = bnxt_hwrm_func_reset(bp); if (rc) goto init_err_pci_clean; @@ -9340,7 +9362,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH | NETIF_F_RXCSUM | NETIF_F_GRO; - if (!BNXT_CHIP_TYPE_NITRO_A0(bp)) + if (BNXT_SUPPORTS_TPA(bp)) dev->hw_features |= NETIF_F_LRO; dev->hw_enc_features = @@ -9354,7 +9376,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) dev->vlan_features = dev->hw_features | NETIF_F_HIGHDMA; dev->hw_features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_RX | NETIF_F_HW_VLAN_STAG_TX; - if (!BNXT_CHIP_TYPE_NITRO_A0(bp)) + if (BNXT_SUPPORTS_TPA(bp)) dev->hw_features |= NETIF_F_GRO_HW; dev->features |= dev->hw_features | NETIF_F_HIGHDMA; if (dev->features & NETIF_F_GRO_HW) @@ -9365,10 +
[PATCH net-next 13/23] bnxt_en: Allocate completion ring structures for 57500 series chips.
On 57500 chips, the original bnxt_cp_ring_info struct now refers to the NQ. bp->cp_nr_rings refers to the number of NQs on 57500 chips. There are now 2 pointers for the CP rings associated with RX and TX rings. Modify bnxt_alloc_cp_rings() and bnxt_free_cp_rings() accordingly. With multiple CP rings per NAPI, we need to add a pointer in bnxt_cp_ring_info struct to point back to the bnxt_napi struct.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 64 +++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 3 ++
 2 files changed, 67 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index a0d7237..9af99dd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2482,6 +2482,7 @@ static void bnxt_free_cp_rings(struct bnxt *bp)
 		struct bnxt_napi *bnapi = bp->bnapi[i];
 		struct bnxt_cp_ring_info *cpr;
 		struct bnxt_ring_struct *ring;
+		int j;
 
 		if (!bnapi)
 			continue;
@@ -2490,11 +2491,50 @@ static void bnxt_free_cp_rings(struct bnxt *bp)
 		ring = &cpr->cp_ring_struct;
 
 		bnxt_free_ring(bp, &ring->ring_mem);
+
+		for (j = 0; j < 2; j++) {
+			struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[j];
+
+			if (cpr2) {
+				ring = &cpr2->cp_ring_struct;
+				bnxt_free_ring(bp, &ring->ring_mem);
+				kfree(cpr2);
+				cpr->cp_ring_arr[j] = NULL;
+			}
+		}
 	}
 }
 
+static struct bnxt_cp_ring_info *bnxt_alloc_cp_sub_ring(struct bnxt *bp)
+{
+	struct bnxt_ring_mem_info *rmem;
+	struct bnxt_ring_struct *ring;
+	struct bnxt_cp_ring_info *cpr;
+	int rc;
+
+	cpr = kzalloc(sizeof(*cpr), GFP_KERNEL);
+	if (!cpr)
+		return NULL;
+
+	ring = &cpr->cp_ring_struct;
+	rmem = &ring->ring_mem;
+	rmem->nr_pages = bp->cp_nr_pages;
+	rmem->page_size = HW_CMPD_RING_SIZE;
+	rmem->pg_arr = (void **)cpr->cp_desc_ring;
+	rmem->dma_arr = cpr->cp_desc_mapping;
+	rmem->flags = BNXT_RMEM_RING_PTE_FLAG;
+	rc = bnxt_alloc_ring(bp, rmem);
+	if (rc) {
+		bnxt_free_ring(bp, rmem);
+		kfree(cpr);
+		cpr = NULL;
+	}
+	return cpr;
+}
+
 static int bnxt_alloc_cp_rings(struct bnxt *bp)
 {
+	bool sh = !!(bp->flags & BNXT_FLAG_SHARED_RINGS);
 	int i, rc, ulp_base_vec, ulp_msix;
 
 	ulp_msix = bnxt_get_ulp_msix_num(bp);
@@ -2508,6 +2548,7 @@ static int bnxt_alloc_cp_rings(struct bnxt *bp)
 			continue;
 
 		cpr = &bnapi->cp_ring;
+		cpr->bnapi = bnapi;
 		ring = &cpr->cp_ring_struct;
 
 		rc = bnxt_alloc_ring(bp, &ring->ring_mem);
@@ -2518,6 +2559,29 @@ static int bnxt_alloc_cp_rings(struct bnxt *bp)
 			ring->map_idx = i + ulp_msix;
 		else
 			ring->map_idx = i;
+
+		if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+			continue;
+
+		if (i < bp->rx_nr_rings) {
+			struct bnxt_cp_ring_info *cpr2 =
+				bnxt_alloc_cp_sub_ring(bp);
+
+			cpr->cp_ring_arr[BNXT_RX_HDL] = cpr2;
+			if (!cpr2)
+				return -ENOMEM;
+			cpr2->bnapi = bnapi;
+		}
+		if ((sh && i < bp->tx_nr_rings) ||
+		    (!sh && i >= bp->rx_nr_rings)) {
+			struct bnxt_cp_ring_info *cpr2 =
+				bnxt_alloc_cp_sub_ring(bp);
+
+			cpr->cp_ring_arr[BNXT_TX_HDL] = cpr2;
+			if (!cpr2)
+				return -ENOMEM;
+			cpr2->bnapi = bnapi;
+		}
 	}
 	return 0;
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 25d592d..589b0be 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -787,6 +787,7 @@ struct bnxt_rx_ring_info {
 };
 
 struct bnxt_cp_ring_info {
+	struct bnxt_napi	*bnapi;
 	u32			cp_raw_cons;
 	struct bnxt_db_info	cp_db;
 
@@ -812,6 +813,8 @@ struct bnxt_cp_ring_info {
 	struct bnxt_ring_struct	cp_ring_struct;
 
 	struct bnxt_cp_ring_info *cp_ring_arr[2];
+#define BNXT_RX_HDL	0
+#define BNXT_TX_HDL	1
 };
 
 struct bnxt_napi {
-- 
2.5.1
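The two conditions in bnxt_alloc_cp_rings() decide which CP sub-rings a given NQ index receives, depending on whether RX and TX share an MSIX vector. The decision can be restated as a small standalone predicate. The function and mask names below are hypothetical; only the two conditions themselves are taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result bits for the sketch. */
#define MODEL_HAS_RX_CP	0x1
#define MODEL_HAS_TX_CP	0x2

/* Which CP sub-rings would be allocated for NQ index i.
 * shared mirrors the BNXT_FLAG_SHARED_RINGS test in the patch.
 */
static inline unsigned int model_cp_sub_rings(int i, int rx_nr_rings,
					      int tx_nr_rings, bool shared)
{
	unsigned int mask = 0;

	/* The first rx_nr_rings indexes always carry an RX CP ring. */
	if (i < rx_nr_rings)
		mask |= MODEL_HAS_RX_CP;

	/* Shared: TX rings reuse the low indexes.
	 * Not shared: TX rings occupy the indexes after the RX rings.
	 */
	if ((shared && i < tx_nr_rings) || (!shared && i >= rx_nr_rings))
		mask |= MODEL_HAS_TX_CP;

	return mask;
}
```

With 4 RX and 2 TX rings in shared mode, indexes 0 and 1 get both sub-rings while indexes 2 and 3 get only an RX sub-ring; with 2 RX and 2 TX rings in non-shared mode, indexes 0 and 1 are RX only and indexes 2 and 3 are TX only.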
[PATCH net-next 14/23] bnxt_en: Add helper functions to get firmware CP ring ID.
On the new 57500 chips, getting the CP ring ID associated with an RX ring or TX ring is different than before. On the legacy chips, we find the associated ring group and look up the CP ring ID. On the 57500 chips, each RX ring and TX ring has a dedicated CP ring even if they share the MSIX. Use these helper functions at appropriate places to get the CP ring ID.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 67 ++-
 1 file changed, 56 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 9af99dd..99af288 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2358,6 +2358,7 @@ static int bnxt_alloc_rx_rings(struct bnxt *bp)
 		if (rc)
 			return rc;
 
+		ring->grp_idx = i;
 		if (agg_rings) {
 			u16 mem_size;
 
@@ -4145,6 +4146,40 @@ static int bnxt_hwrm_vnic_set_tpa(struct bnxt *bp, u16 vnic_id, u32 tpa_flags)
 	return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
 }
 
+static u16 bnxt_cp_ring_from_grp(struct bnxt *bp, struct bnxt_ring_struct *ring)
+{
+	struct bnxt_ring_grp_info *grp_info;
+
+	grp_info = &bp->grp_info[ring->grp_idx];
+	return grp_info->cp_fw_ring_id;
+}
+
+static u16 bnxt_cp_ring_for_rx(struct bnxt *bp, struct bnxt_rx_ring_info *rxr)
+{
+	if (bp->flags & BNXT_FLAG_CHIP_P5) {
+		struct bnxt_napi *bnapi = rxr->bnapi;
+		struct bnxt_cp_ring_info *cpr;
+
+		cpr = bnapi->cp_ring.cp_ring_arr[BNXT_RX_HDL];
+		return cpr->cp_ring_struct.fw_ring_id;
+	} else {
+		return bnxt_cp_ring_from_grp(bp, &rxr->rx_ring_struct);
+	}
+}
+
+static u16 bnxt_cp_ring_for_tx(struct bnxt *bp, struct bnxt_tx_ring_info *txr)
+{
+	if (bp->flags & BNXT_FLAG_CHIP_P5) {
+		struct bnxt_napi *bnapi = txr->bnapi;
+		struct bnxt_cp_ring_info *cpr;
+
+		cpr = bnapi->cp_ring.cp_ring_arr[BNXT_TX_HDL];
+		return cpr->cp_ring_struct.fw_ring_id;
+	} else {
+		return bnxt_cp_ring_from_grp(bp, &txr->tx_ring_struct);
+	}
+}
+
 static int bnxt_hwrm_vnic_set_rss(struct bnxt *bp, u16 vnic_id, bool set_rss)
 {
 	u32 i, j, max_rings;
@@ -4491,15 +4526,20 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
 	req.logical_id = cpu_to_le16(map_index);
 
 	switch (ring_type) {
-	case HWRM_RING_ALLOC_TX:
+	case HWRM_RING_ALLOC_TX: {
+		struct bnxt_tx_ring_info *txr;
+
+		txr = container_of(ring, struct bnxt_tx_ring_info,
+				   tx_ring_struct);
 		req.ring_type = RING_ALLOC_REQ_RING_TYPE_TX;
 		/* Association of transmit ring with completion ring */
 		grp_info = &bp->grp_info[ring->grp_idx];
-		req.cmpl_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
+		req.cmpl_ring_id = cpu_to_le16(bnxt_cp_ring_for_tx(bp, txr));
 		req.length = cpu_to_le32(bp->tx_ring_mask + 1);
 		req.stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
 		req.queue_id = cpu_to_le16(ring->queue_id);
 		break;
+	}
 	case HWRM_RING_ALLOC_RX:
 		req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
 		req.length = cpu_to_le32(bp->rx_ring_mask + 1);
@@ -4711,9 +4751,9 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
 	for (i = 0; i < bp->tx_nr_rings; i++) {
 		struct bnxt_tx_ring_info *txr = &bp->tx_ring[i];
 		struct bnxt_ring_struct *ring = &txr->tx_ring_struct;
-		u32 grp_idx = txr->bnapi->index;
-		u32 cmpl_ring_id = bp->grp_info[grp_idx].cp_fw_ring_id;
+		u32 cmpl_ring_id;
 
+		cmpl_ring_id = bnxt_cp_ring_for_tx(bp, txr);
 		if (ring->fw_ring_id != INVALID_HW_RING_ID) {
 			hwrm_ring_free_send_msg(bp, ring,
 						RING_FREE_REQ_RING_TYPE_TX,
@@ -4727,8 +4767,9 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
 		struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
 		struct bnxt_ring_struct *ring = &rxr->rx_ring_struct;
 		u32 grp_idx = rxr->bnapi->index;
-		u32 cmpl_ring_id = bp->grp_info[grp_idx].cp_fw_ring_id;
+		u32 cmpl_ring_id;
 
+		cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
 		if (ring->fw_ring_id != INVALID_HW_RING_ID) {
 			hwrm_ring_free_send_msg(bp, ring,
 						RING_FREE_REQ_RING_TYPE_RX,
@@ -4744,8 +4785,9 @@ static void
[PATCH net-next 02/23] bnxt_en: Add additional extended port statistics.
Latest firmware spec. has some additional rx extended port stats and new tx extended port stats added. We now need to check the size of the returned rx and tx extended stats and determine how many counters are valid. New counters added include CoS byte and packet counts for rx and tx.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 30 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 7 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 91 +--
 3 files changed, 121 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index e2d9254..84c1e6c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3078,6 +3078,13 @@ static void bnxt_free_stats(struct bnxt *bp)
 		bp->hw_rx_port_stats = NULL;
 	}
 
+	if (bp->hw_tx_port_stats_ext) {
+		dma_free_coherent(&pdev->dev, sizeof(struct tx_port_stats_ext),
+				  bp->hw_tx_port_stats_ext,
+				  bp->hw_tx_port_stats_ext_map);
+		bp->hw_tx_port_stats_ext = NULL;
+	}
+
 	if (bp->hw_rx_port_stats_ext) {
 		dma_free_coherent(&pdev->dev, sizeof(struct rx_port_stats_ext),
 				  bp->hw_rx_port_stats_ext,
@@ -3152,6 +3159,13 @@ static int bnxt_alloc_stats(struct bnxt *bp)
 		if (!bp->hw_rx_port_stats_ext)
 			return 0;
 
+		if (bp->hwrm_spec_code >= 0x10902) {
+			bp->hw_tx_port_stats_ext =
+				dma_zalloc_coherent(&pdev->dev,
+					    sizeof(struct tx_port_stats_ext),
+					    &bp->hw_tx_port_stats_ext_map,
+					    GFP_KERNEL);
+		}
 		bp->flags |= BNXT_FLAG_PORT_STATS_EXT;
 	}
 	return 0;
@@ -5425,8 +5439,10 @@ static int bnxt_hwrm_port_qstats(struct bnxt *bp)
 
 static int bnxt_hwrm_port_qstats_ext(struct bnxt *bp)
 {
+	struct hwrm_port_qstats_ext_output *resp = bp->hwrm_cmd_resp_addr;
 	struct hwrm_port_qstats_ext_input req = {0};
 	struct bnxt_pf_info *pf = &bp->pf;
+	int rc;
 
 	if (!(bp->flags & BNXT_FLAG_PORT_STATS_EXT))
 		return 0;
@@ -5435,7 +5451,19 @@ static int bnxt_hwrm_port_qstats_ext(struct bnxt *bp)
 	req.port_id = cpu_to_le16(pf->port_id);
 	req.rx_stat_size = cpu_to_le16(sizeof(struct rx_port_stats_ext));
 	req.rx_stat_host_addr = cpu_to_le64(bp->hw_rx_port_stats_ext_map);
-	return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+	req.tx_stat_size = cpu_to_le16(sizeof(struct tx_port_stats_ext));
+	req.tx_stat_host_addr = cpu_to_le64(bp->hw_tx_port_stats_ext_map);
+	mutex_lock(&bp->hwrm_cmd_lock);
+	rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+	if (!rc) {
+		bp->fw_rx_stats_ext_size = le16_to_cpu(resp->rx_stat_size) / 8;
+		bp->fw_tx_stats_ext_size = le16_to_cpu(resp->tx_stat_size) / 8;
+	} else {
+		bp->fw_rx_stats_ext_size = 0;
+		bp->fw_tx_stats_ext_size = 0;
+	}
+	mutex_unlock(&bp->hwrm_cmd_lock);
+	return rc;
 }
 
 static void bnxt_hwrm_free_tunnel_ports(struct bnxt *bp)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 766c50b..2cd7ee5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1305,10 +1305,14 @@ struct bnxt {
 	struct rx_port_stats	*hw_rx_port_stats;
 	struct tx_port_stats	*hw_tx_port_stats;
 	struct rx_port_stats_ext	*hw_rx_port_stats_ext;
+	struct rx_port_stats_ext	*hw_tx_port_stats_ext;
 	dma_addr_t		hw_rx_port_stats_map;
 	dma_addr_t		hw_tx_port_stats_map;
 	dma_addr_t		hw_rx_port_stats_ext_map;
+	dma_addr_t		hw_tx_port_stats_ext_map;
 	int			hw_port_stats_size;
+	u16			fw_rx_stats_ext_size;
+	u16			fw_tx_stats_ext_size;
 
 	u16			hwrm_max_req_len;
 	int			hwrm_cmd_timeout;
@@ -1425,6 +1429,9 @@ struct bnxt {
 #define BNXT_RX_STATS_EXT_OFFSET(counter)		\
 	(offsetof(struct rx_port_stats_ext, counter) / 8)
 
+#define BNXT_TX_STATS_EXT_OFFSET(counter)		\
+	(offsetof(struct tx_port_stats_ext, counter) / 8)
+
 #define I2C_DEV_ADDR_A0				0xa0
 #define I2C_DEV_ADDR_A2				0xa2
 #define SFF_DIAG_SUPPORT_OFFSET			0x5c
diff --git a/drivers/net/et
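The size check described in the commit message comes down to treating the returned byte count as a number of 64-bit counters and comparing a counter's index against it. A minimal user-space sketch of that arithmetic follows; `ext_stats_demo` and both helper names are hypothetical stand-ins, and the real stats structs are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an extended stats block of 64-bit counters. */
struct ext_stats_demo {
	uint64_t counter_a;
	uint64_t counter_b;
	uint64_t counter_c;
};

/* Firmware reports a size in bytes; each counter occupies 8 bytes. */
static size_t valid_counters(uint16_t returned_bytes)
{
	return returned_bytes / 8;
}

/* A counter is usable only if its index is below the valid count. */
static bool counter_is_valid(size_t byte_offset, uint16_t returned_bytes)
{
	return byte_offset / 8 < valid_counters(returned_bytes);
}
```

For example, if firmware returns 16 bytes for this three-counter layout, only the first two counters would be reported and `counter_c` would be skipped.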
[PATCH net-next 05/23] bnxt_en: Refactor bnxt_ring_struct.
Move the DMA page table and vmem fields in bnxt_ring_struct to a new bnxt_ring_mem_info struct. This will allow context memory management for a new device to re-use some of the existing infrastructure.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 138 --
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 6 +-
 2 files changed, 77 insertions(+), 67 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 83b1313..602dc09 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2202,60 +2202,60 @@ static void bnxt_free_skbs(struct bnxt *bp)
 	bnxt_free_rx_skbs(bp);
 }
 
-static void bnxt_free_ring(struct bnxt *bp, struct bnxt_ring_struct *ring)
+static void bnxt_free_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
 {
 	struct pci_dev *pdev = bp->pdev;
 	int i;
 
-	for (i = 0; i < ring->nr_pages; i++) {
-		if (!ring->pg_arr[i])
+	for (i = 0; i < rmem->nr_pages; i++) {
+		if (!rmem->pg_arr[i])
 			continue;
 
-		dma_free_coherent(&pdev->dev, ring->page_size,
-				  ring->pg_arr[i], ring->dma_arr[i]);
+		dma_free_coherent(&pdev->dev, rmem->page_size,
+				  rmem->pg_arr[i], rmem->dma_arr[i]);
 
-		ring->pg_arr[i] = NULL;
+		rmem->pg_arr[i] = NULL;
 	}
-	if (ring->pg_tbl) {
-		dma_free_coherent(&pdev->dev, ring->nr_pages * 8,
-				  ring->pg_tbl, ring->pg_tbl_map);
-		ring->pg_tbl = NULL;
+	if (rmem->pg_tbl) {
+		dma_free_coherent(&pdev->dev, rmem->nr_pages * 8,
+				  rmem->pg_tbl, rmem->pg_tbl_map);
+		rmem->pg_tbl = NULL;
 	}
-	if (ring->vmem_size && *ring->vmem) {
-		vfree(*ring->vmem);
-		*ring->vmem = NULL;
+	if (rmem->vmem_size && *rmem->vmem) {
+		vfree(*rmem->vmem);
+		*rmem->vmem = NULL;
 	}
 }
 
-static int bnxt_alloc_ring(struct bnxt *bp, struct bnxt_ring_struct *ring)
+static int bnxt_alloc_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
 {
-	int i;
 	struct pci_dev *pdev = bp->pdev;
+	int i;
 
-	if (ring->nr_pages > 1) {
-		ring->pg_tbl = dma_alloc_coherent(&pdev->dev,
-						  ring->nr_pages * 8,
-						  &ring->pg_tbl_map,
+	if (rmem->nr_pages > 1) {
+		rmem->pg_tbl = dma_alloc_coherent(&pdev->dev,
+						  rmem->nr_pages * 8,
+						  &rmem->pg_tbl_map,
 						  GFP_KERNEL);
-		if (!ring->pg_tbl)
+		if (!rmem->pg_tbl)
 			return -ENOMEM;
 	}
 
-	for (i = 0; i < ring->nr_pages; i++) {
-		ring->pg_arr[i] = dma_alloc_coherent(&pdev->dev,
-						     ring->page_size,
-						     &ring->dma_arr[i],
+	for (i = 0; i < rmem->nr_pages; i++) {
+		rmem->pg_arr[i] = dma_alloc_coherent(&pdev->dev,
+						     rmem->page_size,
+						     &rmem->dma_arr[i],
 						     GFP_KERNEL);
-		if (!ring->pg_arr[i])
+		if (!rmem->pg_arr[i])
 			return -ENOMEM;
 
-		if (ring->nr_pages > 1)
-			ring->pg_tbl[i] = cpu_to_le64(ring->dma_arr[i]);
+		if (rmem->nr_pages > 1)
+			rmem->pg_tbl[i] = cpu_to_le64(rmem->dma_arr[i]);
 	}
 
-	if (ring->vmem_size) {
-		*ring->vmem = vzalloc(ring->vmem_size);
-		if (!(*ring->vmem))
+	if (rmem->vmem_size) {
+		*rmem->vmem = vzalloc(rmem->vmem_size);
+		if (!(*rmem->vmem))
 			return -ENOMEM;
 	}
 	return 0;
@@ -2285,10 +2285,10 @@ static void bnxt_free_rx_rings(struct bnxt *bp)
 		rxr->rx_agg_bmap = NULL;
 
 		ring = &rxr->rx_ring_struct;
-		bnxt_free_ring(bp, ring);
+		bnxt_free_ring(bp, &ring->ring_mem);
 
 		ring = &rxr->rx_agg_ring_struct;
-		bnxt_free_ring(bp, ring);
+		bnxt_free_ring(bp, &ring->ring_mem);
 	}
 }
 
@@ -2315,7 +2315,7 @@ static int bnxt_alloc_rx_rings(struct bnxt *bp)
[PATCH net-next 18/23] bnxt_en: Add RSS support for 57500 chips.
RSS context allocation and RSS indirection table setup are very different on the new chip. Refactor bnxt_setup_vnic() to call 2 different functions to set up RSS for the vnic based on chip type. On the new chip, the number of RSS contexts and the indirection table size depend on the number of RX rings. Each indirection table entry is also different on the new chip since ring groups are no longer used.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 113 --
 1 file changed, 109 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1a31328..d1f9130 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4202,7 +4202,8 @@ static int bnxt_hwrm_vnic_set_rss(struct bnxt *bp, u16 vnic_id, bool set_rss)
 	struct bnxt_vnic_info *vnic = &bp->vnic_info[vnic_id];
 	struct hwrm_vnic_rss_cfg_input req = {0};
 
-	if (vnic->fw_rss_cos_lb_ctx[0] == INVALID_HW_RING_ID)
+	if ((bp->flags & BNXT_FLAG_CHIP_P5) ||
+	    vnic->fw_rss_cos_lb_ctx[0] == INVALID_HW_RING_ID)
 		return 0;
 
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_VNIC_RSS_CFG, -1, -1);
@@ -4233,6 +4234,51 @@ static int bnxt_hwrm_vnic_set_rss(struct bnxt *bp, u16 vnic_id, bool set_rss)
 	return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
 }
 
+static int bnxt_hwrm_vnic_set_rss_p5(struct bnxt *bp, u16 vnic_id, bool set_rss)
+{
+	struct bnxt_vnic_info *vnic = &bp->vnic_info[vnic_id];
+	u32 i, j, k, nr_ctxs, max_rings = bp->rx_nr_rings;
+	struct bnxt_rx_ring_info *rxr = &bp->rx_ring[0];
+	struct hwrm_vnic_rss_cfg_input req = {0};
+
+	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_VNIC_RSS_CFG, -1, -1);
+	req.vnic_id = cpu_to_le16(vnic->fw_vnic_id);
+	if (!set_rss) {
+		hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+		return 0;
+	}
+	req.hash_type = cpu_to_le32(bp->rss_hash_cfg);
+	req.hash_mode_flags = VNIC_RSS_CFG_REQ_HASH_MODE_FLAGS_DEFAULT;
+	req.ring_grp_tbl_addr = cpu_to_le64(vnic->rss_table_dma_addr);
+	req.hash_key_tbl_addr = cpu_to_le64(vnic->rss_hash_key_dma_addr);
+	nr_ctxs = DIV_ROUND_UP(bp->rx_nr_rings, 64);
+	for (i = 0, k = 0; i < nr_ctxs; i++) {
+		__le16 *ring_tbl = vnic->rss_table;
+		int rc;
+
+		req.ring_table_pair_index = i;
+		req.rss_ctx_idx = cpu_to_le16(vnic->fw_rss_cos_lb_ctx[i]);
+		for (j = 0; j < 64; j++) {
+			u16 ring_id;
+
+			ring_id = rxr->rx_ring_struct.fw_ring_id;
+			*ring_tbl++ = cpu_to_le16(ring_id);
+			ring_id = bnxt_cp_ring_for_rx(bp, rxr);
+			*ring_tbl++ = cpu_to_le16(ring_id);
+			rxr++;
+			k++;
+			if (k == max_rings) {
+				k = 0;
+				rxr = &bp->rx_ring[0];
+			}
+		}
+		rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+		if (rc)
+			return -EIO;
+	}
+	return 0;
+}
+
 static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, u16 vnic_id)
 {
 	struct bnxt_vnic_info *vnic = &bp->vnic_info[vnic_id];
@@ -4316,6 +4362,18 @@ int bnxt_hwrm_vnic_cfg(struct bnxt *bp, u16 vnic_id)
 
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_VNIC_CFG, -1, -1);
 
+	if (bp->flags & BNXT_FLAG_CHIP_P5) {
+		struct bnxt_rx_ring_info *rxr = &bp->rx_ring[0];
+
+		req.default_rx_ring_id =
+			cpu_to_le16(rxr->rx_ring_struct.fw_ring_id);
+		req.default_cmpl_ring_id =
+			cpu_to_le16(bnxt_cp_ring_for_rx(bp, rxr));
+		req.enables =
+			cpu_to_le32(VNIC_CFG_REQ_ENABLES_DEFAULT_RX_RING_ID |
+				    VNIC_CFG_REQ_ENABLES_DEFAULT_CMPL_RING_ID);
+		goto vnic_mru;
+	}
 	req.enables = cpu_to_le32(VNIC_CFG_REQ_ENABLES_DFLT_RING_GRP);
 	/* Only RSS support for now TBD: COS & LB */
 	if (vnic->fw_rss_cos_lb_ctx[0] != INVALID_HW_RING_ID) {
@@ -4348,13 +4406,13 @@ int bnxt_hwrm_vnic_cfg(struct bnxt *bp, u16 vnic_id)
 		ring = bp->rx_nr_rings - 1;
 
 	grp_idx = bp->rx_ring[ring].bnapi->index;
-	req.vnic_id = cpu_to_le16(vnic->fw_vnic_id);
 	req.dflt_ring_grp = cpu_to_le16(bp->grp_info[grp_idx].fw_grp_id);
-
 	req.lb_rule = cpu_to_le16(0x);
+vnic_mru:
 	req.mru = cpu_to_le16(bp->dev->mtu + ETH_HLEN + ETH_FCS_LEN +
 			      VLAN_HLEN);
 
+	req.vnic_id = cpu_to_le16(vnic->fw_vnic_id);
 #ifdef CONFIG_BNX
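The way the indirection table is filled on the new chip can be shown without the firmware messaging around it. Below is a minimal user-space sketch, assuming plain host-endian 16-bit IDs (the driver converts with cpu_to_le16()); the names `rss_nr_ctxs`, `rss_fill_ctx` and `RSS_ENTRIES_PER_CTX` are hypothetical. Each context holds 64 (RX ring ID, CP ring ID) pairs, and the ring position wraps around and carries over from one context to the next.

```c
#include <stdint.h>

#define RSS_ENTRIES_PER_CTX	64	/* pairs per RSS context, as in the patch */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* One RSS context is needed for every 64 RX rings (rounded up). */
static unsigned int rss_nr_ctxs(unsigned int rx_nr_rings)
{
	return DIV_ROUND_UP(rx_nr_rings, RSS_ENTRIES_PER_CTX);
}

/*
 * Fill one context's table with 64 (rx ring id, cp ring id) pairs.
 * *next is the index of the next RX ring to use; it wraps at nr_rings
 * and is preserved across calls so consecutive contexts continue the
 * round-robin instead of restarting at ring 0.
 */
static void rss_fill_ctx(uint16_t *tbl, const uint16_t *rx_ids,
			 const uint16_t *cp_ids, unsigned int nr_rings,
			 unsigned int *next)
{
	unsigned int j;

	for (j = 0; j < RSS_ENTRIES_PER_CTX; j++) {
		tbl[2 * j] = rx_ids[*next];
		tbl[2 * j + 1] = cp_ids[*next];
		*next = (*next + 1) % nr_rings;
	}
}
```

The table passed in needs room for 128 16-bit entries per context, since every slot is a pair.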
[PATCH net-next 07/23] bnxt_en: Check context memory requirements from firmware.
The new device requires host context memory as a backing store. Call firmware to check for context memory requirements and store the parameters. Allocate host pages accordingly.

We also need to move the call to bnxt_hwrm_queue_qportcfg() earlier so that all the supported hardware queues and the IDs are known before checking and allocating context memory.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 208 --
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 48 +++
 2 files changed, 248 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f0da558..83427da 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5255,6 +5255,187 @@ static int bnxt_hwrm_func_qcfg(struct bnxt *bp)
 	return rc;
 }
 
+static int bnxt_hwrm_func_backing_store_qcaps(struct bnxt *bp)
+{
+	struct hwrm_func_backing_store_qcaps_input req = {0};
+	struct hwrm_func_backing_store_qcaps_output *resp =
+		bp->hwrm_cmd_resp_addr;
+	int rc;
+
+	if (bp->hwrm_spec_code < 0x10902 || BNXT_VF(bp) || bp->ctx)
+		return 0;
+
+	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_BACKING_STORE_QCAPS, -1, -1);
+	mutex_lock(&bp->hwrm_cmd_lock);
+	rc = _hwrm_send_message_silent(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+	if (!rc) {
+		struct bnxt_ctx_pg_info *ctx_pg;
+		struct bnxt_ctx_mem_info *ctx;
+		int i;
+
+		ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+		if (!ctx) {
+			rc = -ENOMEM;
+			goto ctx_err;
+		}
+		ctx_pg = kzalloc(sizeof(*ctx_pg) * (bp->max_q + 1), GFP_KERNEL);
+		if (!ctx_pg) {
+			kfree(ctx);
+			rc = -ENOMEM;
+			goto ctx_err;
+		}
+		for (i = 0; i < bp->max_q + 1; i++, ctx_pg++)
+			ctx->tqm_mem[i] = ctx_pg;
+
+		bp->ctx = ctx;
+		ctx->qp_max_entries = le32_to_cpu(resp->qp_max_entries);
+		ctx->qp_min_qp1_entries = le16_to_cpu(resp->qp_min_qp1_entries);
+		ctx->qp_max_l2_entries = le16_to_cpu(resp->qp_max_l2_entries);
+		ctx->qp_entry_size = le16_to_cpu(resp->qp_entry_size);
+		ctx->srq_max_l2_entries = le16_to_cpu(resp->srq_max_l2_entries);
+		ctx->srq_max_entries = le32_to_cpu(resp->srq_max_entries);
+		ctx->srq_entry_size = le16_to_cpu(resp->srq_entry_size);
+		ctx->cq_max_l2_entries = le16_to_cpu(resp->cq_max_l2_entries);
+		ctx->cq_max_entries = le32_to_cpu(resp->cq_max_entries);
+		ctx->cq_entry_size = le16_to_cpu(resp->cq_entry_size);
+		ctx->vnic_max_vnic_entries =
+			le16_to_cpu(resp->vnic_max_vnic_entries);
+		ctx->vnic_max_ring_table_entries =
+			le16_to_cpu(resp->vnic_max_ring_table_entries);
+		ctx->vnic_entry_size = le16_to_cpu(resp->vnic_entry_size);
+		ctx->stat_max_entries = le32_to_cpu(resp->stat_max_entries);
+		ctx->stat_entry_size = le16_to_cpu(resp->stat_entry_size);
+		ctx->tqm_entry_size = le16_to_cpu(resp->tqm_entry_size);
+		ctx->tqm_min_entries_per_ring =
+			le32_to_cpu(resp->tqm_min_entries_per_ring);
+		ctx->tqm_max_entries_per_ring =
+			le32_to_cpu(resp->tqm_max_entries_per_ring);
+		ctx->tqm_entries_multiple = resp->tqm_entries_multiple;
+		if (!ctx->tqm_entries_multiple)
+			ctx->tqm_entries_multiple = 1;
+		ctx->mrav_max_entries = le32_to_cpu(resp->mrav_max_entries);
+		ctx->mrav_entry_size = le16_to_cpu(resp->mrav_entry_size);
+		ctx->tim_entry_size = le16_to_cpu(resp->tim_entry_size);
+		ctx->tim_max_entries = le32_to_cpu(resp->tim_max_entries);
+	} else {
+		rc = 0;
+	}
+ctx_err:
+	mutex_unlock(&bp->hwrm_cmd_lock);
+	return rc;
+}
+
+static int bnxt_alloc_ctx_mem_blk(struct bnxt *bp,
+				  struct bnxt_ctx_pg_info *ctx_pg, u32 mem_size)
+{
+	struct bnxt_ring_mem_info *rmem = &ctx_pg->ring_mem;
+
+	if (!mem_size)
+		return 0;
+
+	rmem->nr_pages = DIV_ROUND_UP(mem_size, BNXT_PAGE_SIZE);
+	if (rmem->nr_pages > MAX_CTX_PAGES) {
+		rmem->nr_pages = 0;
+		return -EINVAL;
+	}
+	rmem->page_size = BNXT_PAGE_SIZE;
+	rmem->pg_arr = ctx_pg->ctx_pg_arr;
+
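The page-count step at the top of bnxt_alloc_ctx_mem_blk() is a round-up division followed by a bounds check. A minimal user-space sketch of just that step is shown here; the function name `ctx_nr_pages` is hypothetical, and the page size and page limit are passed as parameters because the values of BNXT_PAGE_SIZE and MAX_CTX_PAGES are not shown in this patch.

```c
#include <errno.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/*
 * Number of backing-store pages needed for mem_size bytes.
 * Returns 0 when nothing is requested, -EINVAL when the request
 * would exceed max_pages, otherwise the page count.
 */
static int ctx_nr_pages(uint32_t mem_size, uint32_t page_size,
			uint32_t max_pages)
{
	uint32_t nr_pages;

	if (!mem_size)
		return 0;

	nr_pages = DIV_ROUND_UP(mem_size, page_size);
	if (nr_pages > max_pages)
		return -EINVAL;
	return (int)nr_pages;
}
```

A request one byte larger than a page therefore costs two pages, and oversized requests fail before any memory is allocated.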
[PATCH net-next 08/23] bnxt_en: Configure context memory on new devices.
Call firmware to configure the DMA addresses of all context memory pages on new devices requiring context memory.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 123 +-
 1 file changed, 120 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 83427da..b0e2416 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5325,6 +5325,114 @@ static int bnxt_hwrm_func_backing_store_qcaps(struct bnxt *bp)
 	return rc;
 }
 
+static void bnxt_hwrm_set_pg_attr(struct bnxt_ring_mem_info *rmem, u8 *pg_attr,
+				  __le64 *pg_dir)
+{
+	u8 pg_size = 0;
+
+	if (BNXT_PAGE_SHIFT == 13)
+		pg_size = 1 << 4;
+	else if (BNXT_PAGE_SIZE == 16)
+		pg_size = 2 << 4;
+
+	*pg_attr = pg_size;
+	if (rmem->nr_pages > 1) {
+		*pg_attr |= 1;
+		*pg_dir = cpu_to_le64(rmem->pg_tbl_map);
+	} else {
+		*pg_dir = cpu_to_le64(rmem->dma_arr[0]);
+	}
+}
+
+#define FUNC_BACKING_STORE_CFG_REQ_DFLT_ENABLES			\
+	(FUNC_BACKING_STORE_CFG_REQ_ENABLES_QP |		\
+	 FUNC_BACKING_STORE_CFG_REQ_ENABLES_SRQ |		\
+	 FUNC_BACKING_STORE_CFG_REQ_ENABLES_CQ |		\
+	 FUNC_BACKING_STORE_CFG_REQ_ENABLES_VNIC |		\
+	 FUNC_BACKING_STORE_CFG_REQ_ENABLES_STAT)
+
+static int bnxt_hwrm_func_backing_store_cfg(struct bnxt *bp, u32 enables)
+{
+	struct hwrm_func_backing_store_cfg_input req = {0};
+	struct bnxt_ctx_mem_info *ctx = bp->ctx;
+	struct bnxt_ctx_pg_info *ctx_pg;
+	__le32 *num_entries;
+	__le64 *pg_dir;
+	u8 *pg_attr;
+	int i, rc;
+	u32 ena;
+
+	if (!ctx)
+		return 0;
+
+	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_BACKING_STORE_CFG, -1, -1);
+	req.enables = cpu_to_le32(enables);
+
+	if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_QP) {
+		ctx_pg = &ctx->qp_mem;
+		req.qp_num_entries = cpu_to_le32(ctx_pg->entries);
+		req.qp_num_qp1_entries = cpu_to_le16(ctx->qp_min_qp1_entries);
+		req.qp_num_l2_entries = cpu_to_le16(ctx->qp_max_l2_entries);
+		req.qp_entry_size = cpu_to_le16(ctx->qp_entry_size);
+		bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem,
+				      &req.qpc_pg_size_qpc_lvl,
+				      &req.qpc_page_dir);
+	}
+	if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_SRQ) {
+		ctx_pg = &ctx->srq_mem;
+		req.srq_num_entries = cpu_to_le32(ctx_pg->entries);
+		req.srq_num_l2_entries = cpu_to_le16(ctx->srq_max_l2_entries);
+		req.srq_entry_size = cpu_to_le16(ctx->srq_entry_size);
+		bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem,
+				      &req.srq_pg_size_srq_lvl,
+				      &req.srq_page_dir);
+	}
+	if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_CQ) {
+		ctx_pg = &ctx->cq_mem;
+		req.cq_num_entries = cpu_to_le32(ctx_pg->entries);
+		req.cq_num_l2_entries = cpu_to_le16(ctx->cq_max_l2_entries);
+		req.cq_entry_size = cpu_to_le16(ctx->cq_entry_size);
+		bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem, &req.cq_pg_size_cq_lvl,
+				      &req.cq_page_dir);
+	}
+	if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_VNIC) {
+		ctx_pg = &ctx->vnic_mem;
+		req.vnic_num_vnic_entries =
+			cpu_to_le16(ctx->vnic_max_vnic_entries);
+		req.vnic_num_ring_table_entries =
+			cpu_to_le16(ctx->vnic_max_ring_table_entries);
+		req.vnic_entry_size = cpu_to_le16(ctx->vnic_entry_size);
+		bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem,
+				      &req.vnic_pg_size_vnic_lvl,
+				      &req.vnic_page_dir);
+	}
+	if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_STAT) {
+		ctx_pg = &ctx->stat_mem;
+		req.stat_num_entries = cpu_to_le32(ctx->stat_max_entries);
+		req.stat_entry_size = cpu_to_le16(ctx->stat_entry_size);
+		bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem,
+				      &req.stat_pg_size_stat_lvl,
+				      &req.stat_page_dir);
+	}
+	for (i = 0, num_entries = &req.tqm_sp_num_entries,
+	     pg_attr = &req.tqm_sp_pg_size_tqm_sp_lvl,
+	     pg_dir = &req.tqm_sp_page_dir,
+	     ena = FUNC_BACKING_STORE_CFG_REQ_ENABLES_TQM_SP;
+	     i < 9; i++, num_entries++, pg_attr++, pg_dir++, ena <<= 1) {
+		if (!(enables & ena))
+
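The attribute byte built by bnxt_hwrm_set_pg_attr() packs a page-size code into the upper nibble and a one-level-indirection flag into bit 0. A minimal user-space sketch of that encoding follows. It assumes both branches are keyed on the page shift; the diff as posted tests BNXT_PAGE_SIZE in the second branch, so treating that as a shift comparison is an assumption about the intent, and the function name `pg_attr_for` is hypothetical.

```c
#include <stdint.h>

/*
 * Encode the page attribute byte:
 *   upper nibble: 0 for the default page size, 1 for an 8K page
 *                 (shift 13), 2 for a 64K page (shift 16, assumed)
 *   bit 0:        set when more than one page is used, meaning the
 *                 directory address points at a page table rather
 *                 than at the single data page
 */
static uint8_t pg_attr_for(unsigned int page_shift, unsigned int nr_pages)
{
	uint8_t attr = 0;

	if (page_shift == 13)
		attr = 1 << 4;
	else if (page_shift == 16)
		attr = 2 << 4;

	if (nr_pages > 1)
		attr |= 1;
	return attr;
}
```

Under these assumptions a single default-size page encodes as 0x00, while a multi-page 64K-page block encodes as 0x21.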
[PATCH net-next 00/23] bnxt_en: Add support for new 57500 chips.
This patch-set is larger than normal because I wanted a complete series to add basic support for the new 57500 chips. The new chips have the following main differences compared to legacy chips:

1. Requires the PF driver to allocate DMA context memory as a backing store.
2. New NQ (notification queue) for interrupt events.
3. One or more CP rings can be associated with an NQ.
4. 64-bit doorbells.

Most other structures and firmware APIs are compatible with legacy devices with some exceptions. For example, ring groups are no longer used and RSS table format has changed.

The patch-set includes the usual firmware spec. update, some refactoring and restructuring, and adding the new code to add basic support for the new class of devices.

Michael Chan (23):
  bnxt_en: Update firmware interface spec. to 1.10.0.3.
  bnxt_en: Add additional extended port statistics.
  bnxt_en: Add maximum extended request length fw message support.
  bnxt_en: Update interrupt coalescing logic.
  bnxt_en: Refactor bnxt_ring_struct.
  bnxt_en: Add new flags to setup new page table PTE bits on newer devices.
  bnxt_en: Check context memory requirements from firmware.
  bnxt_en: Configure context memory on new devices.
  bnxt_en: Add 57500 new chip ID and basic structures.
  bnxt_en: Re-structure doorbells.
  bnxt_en: Adjust MSIX and ring groups for 57500 series chips.
  bnxt_en: Modify the ring reservation functions for 57500 series chips.
  bnxt_en: Allocate completion ring structures for 57500 series chips.
  bnxt_en: Add helper functions to get firmware CP ring ID.
  bnxt_en: Modify bnxt_ring_alloc_send_msg() to support 57500 chips.
  bnxt_en: Allocate/Free CP rings for 57500 series chips.
  bnxt_en: Increase RSS context array count and skip ring groups on 57500 chips.
  bnxt_en: Add RSS support for 57500 chips.
  bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX path.
  bnxt_en: Add coalescing setup for 57500 chips.
  bnxt_en: Refactor bnxt_poll_work().
  bnxt_en: Add new NAPI poll function for 57500 chips.
  bnxt_en: Add PCI ID for BCM57508 device.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 1671 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.h         |  250 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  112 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h     |  310 ++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c     |    2 +-
 5 files changed, 1944 insertions(+), 401 deletions(-)

-- 
2.5.1
[PATCH net-next 15/23] bnxt_en: Modify bnxt_ring_alloc_send_msg() to support 57500 chips.
Firmware ring allocation semantics are slightly different for most ring types on 57500 chips. Allocation/deallocation for NQ rings are also added for the new chips.

A CP ring handle is also added so that from the NQ interrupt event, we can locate the CP ring.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 61 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 +
 2 files changed, 56 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 99af288..db1dbad 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4543,14 +4543,53 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
 	case HWRM_RING_ALLOC_RX:
 		req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
 		req.length = cpu_to_le32(bp->rx_ring_mask + 1);
+		if (bp->flags & BNXT_FLAG_CHIP_P5) {
+			u16 flags = 0;
+
+			/* Association of rx ring with stats context */
+			grp_info = &bp->grp_info[ring->grp_idx];
+			req.rx_buf_size = cpu_to_le16(bp->rx_buf_use_size);
+			req.stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
+			req.enables |= cpu_to_le32(
+				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+			if (NET_IP_ALIGN == 2)
+				flags = RING_ALLOC_REQ_FLAGS_RX_SOP_PAD;
+			req.flags = cpu_to_le16(flags);
+		}
 		break;
 	case HWRM_RING_ALLOC_AGG:
-		req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
+		if (bp->flags & BNXT_FLAG_CHIP_P5) {
+			req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX_AGG;
+			/* Association of agg ring with rx ring */
+			grp_info = &bp->grp_info[ring->grp_idx];
+			req.rx_ring_id = cpu_to_le16(grp_info->rx_fw_ring_id);
+			req.rx_buf_size = cpu_to_le16(BNXT_RX_PAGE_SIZE);
+			req.stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
+			req.enables |= cpu_to_le32(
+				RING_ALLOC_REQ_ENABLES_RX_RING_ID_VALID |
+				RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+		} else {
+			req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
+		}
 		req.length = cpu_to_le32(bp->rx_agg_ring_mask + 1);
 		break;
 	case HWRM_RING_ALLOC_CMPL:
 		req.ring_type = RING_ALLOC_REQ_RING_TYPE_L2_CMPL;
 		req.length = cpu_to_le32(bp->cp_ring_mask + 1);
+		if (bp->flags & BNXT_FLAG_CHIP_P5) {
+			/* Association of cp ring with nq */
+			grp_info = &bp->grp_info[map_index];
+			req.nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
+			req.cq_handle = cpu_to_le64(ring->handle);
+			req.enables |= cpu_to_le32(
+				RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
+		} else if (bp->flags & BNXT_FLAG_USING_MSIX) {
+			req.int_mode = RING_ALLOC_REQ_INT_MODE_MSIX;
+		}
+		break;
+	case HWRM_RING_ALLOC_NQ:
+		req.ring_type = RING_ALLOC_REQ_RING_TYPE_NQ;
+		req.length = cpu_to_le32(bp->cp_ring_mask + 1);
 		if (bp->flags & BNXT_FLAG_USING_MSIX)
 			req.int_mode = RING_ALLOC_REQ_INT_MODE_MSIX;
 		break;
@@ -4645,7 +4684,10 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
 	int i, rc = 0;
 	u32 type;
 
-	type = HWRM_RING_ALLOC_CMPL;
+	if (bp->flags & BNXT_FLAG_CHIP_P5)
+		type = HWRM_RING_ALLOC_NQ;
+	else
+		type = HWRM_RING_ALLOC_CMPL;
 	for (i = 0; i < bp->cp_nr_rings; i++) {
 		struct bnxt_napi *bnapi = bp->bnapi[i];
 		struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
@@ -4743,6 +4785,7 @@ static int hwrm_ring_free_send_msg(struct bnxt *bp,
 
 static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
 {
+	u32 type;
 	int i;
 
 	if (!bp->bnapi)
@@ -4781,6 +4824,10 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
 		}
 	}
 
+	if (bp->flags & BNXT_FLAG_CHIP_P5)
+		type = RING_FREE_REQ_RING_TYPE_RX_AGG;
+	else
+		type = RING_FREE_REQ_RING_TYPE_RX;
 	for (i = 0; i < bp->rx_nr_rings; i++) {
 		struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
 		struct bnxt_ring_struct *ring = &rxr->rx_agg_ring_struct
[PATCH net-next 10/23] bnxt_en: Re-structure doorbells.
The 57500 series chips have a new 64-bit doorbell format. Use a new bnxt_db_info structure to unify the new and the old 32-bit doorbells. Add a new bnxt_set_db() function to set up the doorbell addresses and doorbell keys ahead of time. Modify and introduce new doorbell helpers to help abstract and unify the old and new doorbells.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 164 +++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 65 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 2 +-
 4 files changed, 171 insertions(+), 62 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 88ea8c7..56439a4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -241,15 +241,46 @@ static bool bnxt_vf_pciid(enum board_idx idx)
 #define DB_CP_FLAGS		(DB_KEY_CP | DB_IDX_VALID | DB_IRQ_DIS)
 #define DB_CP_IRQ_DIS_FLAGS	(DB_KEY_CP | DB_IRQ_DIS)
 
-#define BNXT_CP_DB_REARM(db, raw_cons)					\
-		writel(DB_CP_REARM_FLAGS | RING_CMP(raw_cons), db)
-
-#define BNXT_CP_DB(db, raw_cons)					\
-		writel(DB_CP_FLAGS | RING_CMP(raw_cons), db)
-
 #define BNXT_CP_DB_IRQ_DIS(db)						\
 		writel(DB_CP_IRQ_DIS_FLAGS, db)
 
+#define BNXT_DB_CQ(db, idx)						\
+	writel(DB_CP_FLAGS | RING_CMP(idx), (db)->doorbell)
+
+#define BNXT_DB_NQ_P5(db, idx)						\
+	writeq((db)->db_key64 | DBR_TYPE_NQ | RING_CMP(idx), (db)->doorbell)
+
+#define BNXT_DB_CQ_ARM(db, idx)						\
+	writel(DB_CP_REARM_FLAGS | RING_CMP(idx), (db)->doorbell)
+
+#define BNXT_DB_NQ_ARM_P5(db, idx)					\
+	writeq((db)->db_key64 | DBR_TYPE_NQ_ARM | RING_CMP(idx), (db)->doorbell)
+
+static void bnxt_db_nq(struct bnxt *bp, struct bnxt_db_info *db, u32 idx)
+{
+	if (bp->flags & BNXT_FLAG_CHIP_P5)
+		BNXT_DB_NQ_P5(db, idx);
+	else
+		BNXT_DB_CQ(db, idx);
+}
+
+static void bnxt_db_nq_arm(struct bnxt *bp, struct bnxt_db_info *db, u32 idx)
+{
+	if (bp->flags & BNXT_FLAG_CHIP_P5)
+		BNXT_DB_NQ_ARM_P5(db, idx);
+	else
+		BNXT_DB_CQ_ARM(db, idx);
+}
+
+static void bnxt_db_cq(struct bnxt *bp, struct bnxt_db_info *db, u32 idx)
+{
+	if (bp->flags & BNXT_FLAG_CHIP_P5)
+		writeq(db->db_key64 | DBR_TYPE_CQ_ARMALL | RING_CMP(idx),
+		       db->doorbell);
+	else
+		BNXT_DB_CQ(db, idx);
+}
+
 const u16 bnxt_lhint_arr[] = {
 	TX_BD_FLAGS_LHINT_512_AND_SMALLER,
 	TX_BD_FLAGS_LHINT_512_TO_1023,
@@ -341,6 +372,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		struct tx_push_buffer *tx_push_buf = txr->tx_push;
 		struct tx_push_bd *tx_push = &tx_push_buf->push_bd;
 		struct tx_bd_ext *tx_push1 = &tx_push->txbd2;
+		void __iomem *db = txr->tx_db.doorbell;
 		void *pdata = tx_push_buf->data;
 		u64 *end;
 		int j, push_len;
@@ -398,12 +430,11 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 		push_len = (length + sizeof(*tx_push) + 7) / 8;
 		if (push_len > 16) {
-			__iowrite64_copy(txr->tx_doorbell, tx_push_buf, 16);
-			__iowrite32_copy(txr->tx_doorbell + 4, tx_push_buf + 1,
+			__iowrite64_copy(db, tx_push_buf, 16);
+			__iowrite32_copy(db + 4, tx_push_buf + 1,
 					 (push_len - 16) << 1);
 		} else {
-			__iowrite64_copy(txr->tx_doorbell, tx_push_buf,
-					 push_len);
+			__iowrite64_copy(db, tx_push_buf, push_len);
 		}
 
 		goto tx_done;
@@ -505,7 +536,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	txr->tx_prod = prod;
 
 	if (!skb->xmit_more || netif_xmit_stopped(txq))
-		bnxt_db_write(bp, txr->tx_doorbell, DB_KEY_TX | prod);
+		bnxt_db_write(bp, &txr->tx_db, prod);
 
 tx_done:
 
@@ -513,7 +544,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	if (unlikely(bnxt_tx_avail(bp, txr) <= MAX_SKB_FRAGS + 1)) {
 		if (skb->xmit_more && !tx_buf->is_push)
-			bnxt_db_write(bp, txr->tx_doorbell, DB_KEY_TX | prod);
+
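The unified helpers all follow the same shape: mask the consumer index, then either OR it into a precomputed 64-bit key plus a type field (new chips) or into the 32-bit flag word (legacy chips). The sketch below models only the value computation in user space, with no MMIO. Every name prefixed `demo_`/`DEMO_` is hypothetical, and the numeric constants are placeholders chosen for illustration, not the values from the driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values for illustration only; not taken from the driver. */
#define DEMO_DB_CP_FLAGS	0x38000000u	/* stands in for DB_CP_FLAGS */
#define DEMO_DBR_TYPE_NQ	(0xaULL << 60)	/* stands in for DBR_TYPE_NQ */

struct demo_db {
	uint64_t key64;	/* precomputed 64-bit doorbell key (new chips) */
};

/*
 * Value that would be written to the doorbell for an NQ/CP ring update.
 * On new chips it is a 64-bit word (written with writeq() in the patch);
 * on legacy chips it is a 32-bit word (written with writel()).
 */
static uint64_t demo_db_nq_value(const struct demo_db *db, bool chip_p5,
				 uint32_t idx, uint32_t ring_mask)
{
	uint32_t cons = idx & ring_mask;	/* plays the role of RING_CMP() */

	if (chip_p5)
		return db->key64 | DEMO_DBR_TYPE_NQ | cons;
	return (uint32_t)(DEMO_DB_CP_FLAGS | cons);
}
```

Keeping the key in a per-ring structure means the hot path only has to OR in the index, which appears to be the motivation for setting the keys up ahead of time.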
[PATCH net-next 12/23] bnxt_en: Modify the ring reservation functions for 57500 series chips.
The ring reservation functions have to be modified for P5 chips in the following ways: - bnxt_cp_ring_info structs map to internal NQs as well as CP rings. - Ring groups are not used. - 1 CP ring must be available for each RX or TX ring. - number of RSS contexts to reserve is multiples of 64 RX rings. - RFS currently not supported. Also, RX AGG rings are only used for jumbo frames, so we need to unconditionally call bnxt_reserve_rings() in __bnxt_open_nic() to see if we need to reserve AGG rings in case MTU has changed. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 127 +++--- 1 file changed, 97 insertions(+), 30 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 427eb82..a0d7237 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -4330,7 +4330,8 @@ static int bnxt_hwrm_vnic_qcaps(struct bnxt *bp) if (!rc) { u32 flags = le32_to_cpu(resp->flags); - if (flags & VNIC_QCAPS_RESP_FLAGS_RSS_DFLT_CR_CAP) + if (!(bp->flags & BNXT_FLAG_CHIP_P5) && + (flags & VNIC_QCAPS_RESP_FLAGS_RSS_DFLT_CR_CAP)) bp->flags |= BNXT_FLAG_NEW_RSS_CAP; if (flags & VNIC_QCAPS_RESP_FLAGS_ROCE_MIRRORING_CAPABLE_VNIC_CAP) @@ -4713,6 +4714,9 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path) } } +static int bnxt_trim_rings(struct bnxt *bp, int *rx, int *tx, int max, + bool shared); + static int bnxt_hwrm_get_rings(struct bnxt *bp) { struct hwrm_func_qcfg_output *resp = bp->hwrm_cmd_resp_addr; @@ -4743,6 +4747,22 @@ static int bnxt_hwrm_get_rings(struct bnxt *bp) cp = le16_to_cpu(resp->alloc_cmpl_rings); stats = le16_to_cpu(resp->alloc_stat_ctx); cp = min_t(u16, cp, stats); + if (bp->flags & BNXT_FLAG_CHIP_P5) { + int rx = hw_resc->resv_rx_rings; + int tx = hw_resc->resv_tx_rings; + + if (bp->flags & BNXT_FLAG_AGG_RINGS) + rx >>= 1; + if (cp < (rx + tx)) { + bnxt_trim_rings(bp, , , cp, false); + if (bp->flags & BNXT_FLAG_AGG_RINGS) + 
rx <<= 1; + hw_resc->resv_rx_rings = rx; + hw_resc->resv_tx_rings = tx; + } + cp = le16_to_cpu(resp->alloc_msix); + hw_resc->resv_hw_ring_grps = rx; + } hw_resc->resv_cp_rings = cp; } mutex_unlock(>hwrm_cmd_lock); @@ -4768,6 +4788,8 @@ int __bnxt_hwrm_get_tx_rings(struct bnxt *bp, u16 fid, int *tx_rings) return rc; } +static bool bnxt_rfs_supported(struct bnxt *bp); + static void __bnxt_hwrm_reserve_pf_rings(struct bnxt *bp, struct hwrm_func_cfg_input *req, int tx_rings, int rx_rings, int ring_grps, @@ -4781,15 +4803,38 @@ __bnxt_hwrm_reserve_pf_rings(struct bnxt *bp, struct hwrm_func_cfg_input *req, req->num_tx_rings = cpu_to_le16(tx_rings); if (BNXT_NEW_RM(bp)) { enables |= rx_rings ? FUNC_CFG_REQ_ENABLES_NUM_RX_RINGS : 0; - enables |= cp_rings ? FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS | - FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0; - enables |= ring_grps ? - FUNC_CFG_REQ_ENABLES_NUM_HW_RING_GRPS : 0; + if (bp->flags & BNXT_FLAG_CHIP_P5) { + enables |= cp_rings ? FUNC_CFG_REQ_ENABLES_NUM_MSIX : 0; + enables |= tx_rings + ring_grps ? + FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS | + FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0; + enables |= rx_rings ? + FUNC_CFG_REQ_ENABLES_NUM_RSSCOS_CTXS : 0; + } else { + enables |= cp_rings ? + FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS | + FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0; + enables |= ring_grps ? + FUNC_CFG_REQ_ENABLES_NUM_HW_RING_GRPS | + FUNC_CFG_REQ_ENABLES_NUM_RSSCOS_CTXS : 0; + } enables |= vnics ? FUNC_CFG_REQ_ENABLES_NUM_VNICS : 0; req->num_rx_rings = cpu_to_le16(rx_rings); - req->num_hw_ring_grps = cpu_to_le16(ring_grps); - req->num_cmpl_rings = cpu_to_le16(
[PATCH net-next 11/23] bnxt_en: Adjust MSIX and ring groups for 57500 series chips.
Store the maximum MSIX capability in PCIe config. space earlier. When we
call firmware to query capability, we need to compare the PCIe MSIX max
count with the firmware count and use the smaller one as the MSIX count
for 57500 (P5) chips.

The new chips don't use ring groups. But previous chips do and the
existing logic limits the available rings based on resource calculations
including ring groups. Setting the max ring groups to the max rx rings
will work on the new chips without changing the existing logic.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 56439a4..427eb82 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5677,6 +5677,13 @@ int bnxt_hwrm_func_resc_qcaps(struct bnxt *bp, bool all)
 	hw_resc->min_stat_ctxs = le16_to_cpu(resp->min_stat_ctx);
 	hw_resc->max_stat_ctxs = le16_to_cpu(resp->max_stat_ctx);
 
+	if (bp->flags & BNXT_FLAG_CHIP_P5) {
+		u16 max_msix = le16_to_cpu(resp->max_msix);
+
+		hw_resc->max_irqs = min_t(u16, hw_resc->max_irqs, max_msix);
+		hw_resc->max_hw_ring_grps = hw_resc->max_rx_rings;
+	}
+
 	if (BNXT_PF(bp)) {
 		struct bnxt_pf_info *pf = &bp->pf;
 
@@ -9382,6 +9389,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		return -ENOMEM;
 
 	bp = netdev_priv(dev);
+	bnxt_set_max_func_irqs(bp, max_irqs);
 
 	if (bnxt_vf_pciid(ent->driver_data))
 		bp->flags |= BNXT_FLAG_VF;
@@ -9513,7 +9521,6 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	bnxt_set_rx_skb_mode(bp, false);
 	bnxt_set_tpa_flags(bp);
 	bnxt_set_ring_params(bp);
-	bnxt_set_max_func_irqs(bp, max_irqs);
 	rc = bnxt_set_dflt_rings(bp, true);
 	if (rc) {
 		netdev_err(bp->dev, "Not enough rings available.\n");
-- 
2.5.1
[PATCH net-next 06/23] bnxt_en: Add new flags to setup new page table PTE bits on newer devices.
Newer chips require the PTU_PTE_VALID bit to be set for every page table
entry for context memory and rings. Additional bits are also required
for page table entries for all rings. Add a flags field to the
bnxt_ring_mem_info struct to specify these additional bits to be used
when setting up the page tables as needed.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 17 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 8
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 602dc09..f0da558 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2230,8 +2230,11 @@ static void bnxt_free_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
 static int bnxt_alloc_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
 {
 	struct pci_dev *pdev = bp->pdev;
+	u64 valid_bit = 0;
 	int i;
 
+	if (rmem->flags & (BNXT_RMEM_VALID_PTE_FLAG | BNXT_RMEM_RING_PTE_FLAG))
+		valid_bit = PTU_PTE_VALID;
 	if (rmem->nr_pages > 1) {
 		rmem->pg_tbl = dma_alloc_coherent(&pdev->dev,
 						  rmem->nr_pages * 8,
@@ -2242,6 +2245,8 @@ static int bnxt_alloc_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
 	}
 
 	for (i = 0; i < rmem->nr_pages; i++) {
+		u64 extra_bits = valid_bit;
+
 		rmem->pg_arr[i] = dma_alloc_coherent(&pdev->dev,
 						     rmem->page_size,
 						     &rmem->dma_arr[i],
@@ -2249,8 +2254,16 @@ static int bnxt_alloc_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
 		if (!rmem->pg_arr[i])
 			return -ENOMEM;
 
-		if (rmem->nr_pages > 1)
-			rmem->pg_tbl[i] = cpu_to_le64(rmem->dma_arr[i]);
+		if (rmem->nr_pages > 1) {
+			if (i == rmem->nr_pages - 2 &&
+			    (rmem->flags & BNXT_RMEM_RING_PTE_FLAG))
+				extra_bits |= PTU_PTE_NEXT_TO_LAST;
+			else if (i == rmem->nr_pages - 1 &&
+				 (rmem->flags & BNXT_RMEM_RING_PTE_FLAG))
+				extra_bits |= PTU_PTE_LAST;
+			rmem->pg_tbl[i] =
+				cpu_to_le64(rmem->dma_arr[i] | extra_bits);
+		}
 	}
 
 	if (rmem->vmem_size) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 2e4b621..5792e5c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -580,6 +580,10 @@ struct bnxt_sw_rx_agg_bd {
 struct bnxt_ring_mem_info {
 	int			nr_pages;
 	int			page_size;
+	u32			flags;
+#define BNXT_RMEM_VALID_PTE_FLAG	1
+#define BNXT_RMEM_RING_PTE_FLAG		2
+
 	void			**pg_arr;
 	dma_addr_t		*dma_arr;
 
@@ -1109,6 +1113,10 @@ struct bnxt_vf_rep {
 	struct bnxt_vf_rep_stats	tx_stats;
 };
 
+#define PTU_PTE_VALID		0x1UL
+#define PTU_PTE_LAST		0x2UL
+#define PTU_PTE_NEXT_TO_LAST	0x4UL
+
 struct bnxt {
 	void __iomem		*bar0;
 	void __iomem		*bar1;
-- 
2.5.1
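For readers following along, the per-entry bit selection in this patch can be modelled in userspace roughly as below. This is only a sketch: pte_extra_bits() is a hypothetical helper that does not exist in the driver, the flag and bit values are copied from the patch, and it assumes the caller only uses the result when nr_pages > 1, as bnxt_alloc_ring() does.

```c
#include <stdint.h>

/* Flag and bit values copied from the patch; everything else is illustrative. */
#define BNXT_RMEM_VALID_PTE_FLAG	1
#define BNXT_RMEM_RING_PTE_FLAG		2

#define PTU_PTE_VALID		0x1UL
#define PTU_PTE_LAST		0x2UL
#define PTU_PTE_NEXT_TO_LAST	0x4UL

/* Hypothetical helper: extra low bits OR'ed into page table entry i. */
static uint64_t pte_extra_bits(uint32_t flags, int nr_pages, int i)
{
	uint64_t bits = 0;

	/* Either flag requests the VALID bit on every entry. */
	if (flags & (BNXT_RMEM_VALID_PTE_FLAG | BNXT_RMEM_RING_PTE_FLAG))
		bits = PTU_PTE_VALID;

	/* Only ring memory marks the last two entries. */
	if (flags & BNXT_RMEM_RING_PTE_FLAG) {
		if (i == nr_pages - 2)
			bits |= PTU_PTE_NEXT_TO_LAST;
		else if (i == nr_pages - 1)
			bits |= PTU_PTE_LAST;
	}
	return bits;
}
```

With a 4-page ring, entries 0 and 1 get only VALID, entry 2 gets VALID plus NEXT_TO_LAST, and entry 3 gets VALID plus LAST. Context memory (VALID flag only) never gets the LAST markers.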
[PATCH net 2/4] bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request
From: Vasundhara Volam

In HWRM_QUEUE_COS2BW_CFG request, enables field should have the bits set
only for the queue ids which are having the valid parameters.

This causes firmware to return error when the TC to hardware CoS queue
mapping is not 1:1 during DCBNL ETS setup.

Fixes: 2e8ef77ee0ff ("bnxt_en: Add TC to hardware QoS queue mapping logic.")
Signed-off-by: Vasundhara Volam
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
index ddc98c3..a85d2be 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
@@ -98,13 +98,13 @@ static int bnxt_hwrm_queue_cos2bw_cfg(struct bnxt *bp, struct ieee_ets *ets,
 
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_QUEUE_COS2BW_CFG, -1, -1);
 	for (i = 0; i < max_tc; i++) {
-		u8 qidx;
+		u8 qidx = bp->tc_to_qidx[i];
 
 		req.enables |= cpu_to_le32(
-			QUEUE_COS2BW_CFG_REQ_ENABLES_COS_QUEUE_ID0_VALID << i);
+			QUEUE_COS2BW_CFG_REQ_ENABLES_COS_QUEUE_ID0_VALID <<
+			qidx);
 
 		memset(&cos2bw, 0, sizeof(cos2bw));
-		qidx = bp->tc_to_qidx[i];
 		cos2bw.queue_id = bp->q_info[qidx].queue_id;
 		if (ets->tc_tsa[i] == IEEE_8021QAZ_TSA_STRICT) {
 			cos2bw.tsa =
-- 
2.5.1
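To illustrate why the shift amount matters, here is a small userspace sketch comparing the two ways of building the enables mask. The names are hypothetical: COS_QUEUE_ID0_VALID stands in for QUEUE_COS2BW_CFG_REQ_ENABLES_COS_QUEUE_ID0_VALID and is assumed to be bit 0, and example_tc_to_qidx is a made-up non-1:1 mapping, not one taken from real hardware.

```c
#include <stdint.h>

/* Assumed stand-in for QUEUE_COS2BW_CFG_REQ_ENABLES_COS_QUEUE_ID0_VALID. */
#define COS_QUEUE_ID0_VALID	0x1u

/* Made-up TC -> hardware queue index mapping that is not 1:1. */
static const uint8_t example_tc_to_qidx[3] = { 0, 2, 3 };

/* Behaviour after the fix: one enable bit per queue index actually used. */
static uint32_t enables_by_qidx(const uint8_t *tc_to_qidx, int max_tc)
{
	uint32_t enables = 0;
	int i;

	for (i = 0; i < max_tc; i++)
		enables |= COS_QUEUE_ID0_VALID << tc_to_qidx[i];
	return enables;
}

/* Behaviour before the fix: one enable bit per TC index. */
static uint32_t enables_by_tc(int max_tc)
{
	uint32_t enables = 0;
	int i;

	for (i = 0; i < max_tc; i++)
		enables |= COS_QUEUE_ID0_VALID << i;
	return enables;
}
```

For the example mapping, the old code marks queues 0, 1 and 2 as valid (0x7) while the parameters are actually filled in for queues 0, 2 and 3 (0xd), which is the kind of mismatch the commit message describes.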
[PATCH net 1/4] bnxt_en: Fix VNIC reservations on the PF.
The enables bit for VNIC was set incorrectly when calling the
HWRM_FUNC_CFG firmware call to reserve VNICs. This causes the firmware
to keep a large number of VNICs for the PF, leaving very few for the
VFs. The DPDK driver running on the VFs, which requires more VNICs, may
not work properly as a result.

Fixes: 674f50a5b026 ("bnxt_en: Implement new method to reserve rings.")
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 0478e56..2564a92 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4650,7 +4650,7 @@ __bnxt_hwrm_reserve_pf_rings(struct bnxt *bp, struct hwrm_func_cfg_input *req,
 				FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
 		enables |= ring_grps ?
 			   FUNC_CFG_REQ_ENABLES_NUM_HW_RING_GRPS : 0;
-		enables |= vnics ? FUNC_VF_CFG_REQ_ENABLES_NUM_VNICS : 0;
+		enables |= vnics ? FUNC_CFG_REQ_ENABLES_NUM_VNICS : 0;
 
 		req->num_rx_rings = cpu_to_le16(rx_rings);
 		req->num_hw_ring_grps = cpu_to_le16(ring_grps);
-- 
2.5.1
[PATCH net 3/4] bnxt_en: free hwrm resources, if driver probe fails.
From: Venkat Duvvuru

When the driver probe fails, all the resources that were allocated prior
to the failure must be freed. However, hwrm dma response memory is not
getting freed.

This patch fixes the problem described above.

Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Venkat Duvvuru
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 2564a92..3718984 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3017,10 +3017,11 @@ static void bnxt_free_hwrm_resources(struct bnxt *bp)
 {
 	struct pci_dev *pdev = bp->pdev;
 
-	dma_free_coherent(&pdev->dev, PAGE_SIZE, bp->hwrm_cmd_resp_addr,
-			  bp->hwrm_cmd_resp_dma_addr);
-
-	bp->hwrm_cmd_resp_addr = NULL;
+	if (bp->hwrm_cmd_resp_addr) {
+		dma_free_coherent(&pdev->dev, PAGE_SIZE, bp->hwrm_cmd_resp_addr,
+				  bp->hwrm_cmd_resp_dma_addr);
+		bp->hwrm_cmd_resp_addr = NULL;
+	}
 }
 
 static int bnxt_alloc_hwrm_resources(struct bnxt *bp)
@@ -9057,6 +9058,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	bnxt_clear_int_mode(bp);
 
 init_err_pci_clean:
+	bnxt_free_hwrm_resources(bp);
 	bnxt_cleanup_pci(bp);
 
 init_err_free:
-- 
2.5.1
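The NULL check is what makes it safe to call the free helper from a shared error label regardless of how far probe got. A rough userspace analogue, with malloc()/free() standing in for the DMA allocator and all names (fake_dev, fake_free_resp, probe_failure_free_count) invented for illustration:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver state that owns the response buffer. */
struct fake_dev {
	void *resp_buf;		/* plays the role of bp->hwrm_cmd_resp_addr */
	int free_calls;		/* how many times memory was really released */
};

/* Release the buffer only if it is still allocated, then clear the pointer. */
static void fake_free_resp(struct fake_dev *d)
{
	if (d->resp_buf) {
		free(d->resp_buf);
		d->resp_buf = NULL;
		d->free_calls++;
	}
}

/* Simulate a probe failure in which the free helper runs more than once. */
static int probe_failure_free_count(void)
{
	struct fake_dev d = { .resp_buf = malloc(64), .free_calls = 0 };

	fake_free_resp(&d);	/* shared error label in probe */
	fake_free_resp(&d);	/* later cleanup path: must be a no-op */
	return d.free_calls;
}
```

The second call does nothing because the pointer was cleared, so the buffer is released exactly once (assuming the initial allocation succeeded).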
[PATCH net 0/4] bnxt_en: Misc. bug fixes.
4 small bug fixes related to setting firmware message enables bits,
possible memory leak when probe fails, and ring accounting when RDMA
driver is loaded.

Please queue these for -stable as well. Thanks.

Michael Chan (1):
  bnxt_en: Fix VNIC reservations on the PF.

Vasundhara Volam (2):
  bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request
  bnxt_en: get the reduced max_irqs by the ones used by RDMA

Venkat Duvvuru (1):
  bnxt_en: free hwrm resources, if driver probe fails.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 14 --
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 6 +++---
 2 files changed, 11 insertions(+), 9 deletions(-)

-- 
2.5.1
[PATCH net 4/4] bnxt_en: get the reduced max_irqs by the ones used by RDMA
From: Vasundhara Volam

When getting the max rings supported, reduce max_irqs by the number of
MSIX vectors used by RDMA. If the number of MSIX vectors is the limiting
factor, this bug may cause the max ring count to be higher than it
should be when the RDMA driver is loaded and may result in ring
allocation failures.

Fixes: 30f529473ec9 ("bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.")
Signed-off-by: Vasundhara Volam
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 3718984..e2d9254 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8622,7 +8622,7 @@ static void _bnxt_get_max_rings(struct bnxt *bp, int *max_rx, int *max_tx,
 	*max_tx = hw_resc->max_tx_rings;
 	*max_rx = hw_resc->max_rx_rings;
 	*max_cp = min_t(int, bnxt_get_max_func_cp_rings_for_en(bp),
-			hw_resc->max_irqs);
+			hw_resc->max_irqs - bnxt_get_ulp_msix_num(bp));
 	*max_cp = min_t(int, *max_cp, hw_resc->max_stat_ctxs);
 	max_ring_grps = hw_resc->max_hw_ring_grps;
 	if (BNXT_CHIP_TYPE_NITRO_A0(bp) && BNXT_PF(bp)) {
-- 
2.5.1
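The effect of the one-line change can be checked with plain arithmetic. The sketch below is a userspace model only: max_cp_rings() and min_int() are hypothetical names, and the four parameters stand in for the values returned by bnxt_get_max_func_cp_rings_for_en(), hw_resc->max_irqs, bnxt_get_ulp_msix_num() and hw_resc->max_stat_ctxs.

```c
/* Local helper standing in for the kernel's min_t(int, a, b). */
static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* Hypothetical model of the max_cp computation after the fix. */
static int max_cp_rings(int func_cp_rings_for_en, int max_irqs,
			int ulp_msix, int max_stat_ctxs)
{
	/* IRQs handed to the RDMA driver are not available to the L2 driver. */
	int max_cp = min_int(func_cp_rings_for_en, max_irqs - ulp_msix);

	return min_int(max_cp, max_stat_ctxs);
}
```

With 32 IRQs of which 8 go to RDMA, the limit becomes 24 completion rings rather than 32. When another resource is the tighter bound (for example only 16 completion rings), the result is unchanged by the fix.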
Re: [PATCH net 01/11] netpoll: do not test NAPI_STATE_SCHED in poll_one_napi()
On Thu, Sep 27, 2018 at 9:32 AM Eric Dumazet wrote:
>
> Since we do no longer require NAPI drivers to provide
> an ndo_poll_controller(), napi_schedule() has not been done
> before poll_one_napi() invocation.
>
> So testing NAPI_STATE_SCHED is likely to cause early returns.
>
> While we are at it, remove outdated comment.
>
> Note to future bisections : This change might surface prior
> bugs in drivers. See commit 73f21c653f93 ("bnxt_en: Fix TX
> timeout during netpoll.") for one occurrence.
>
> Fixes: ac3d9dd034e5 ("netpoll: make ndo_poll_controller() optional")
> Signed-off-by: Eric Dumazet
> Tested-by: Song Liu
> Cc: Michael Chan

Reviewed-and-tested-by: Michael Chan
[PATCH net v2] bnxt_en: Fix TX timeout during netpoll.
The current netpoll implementation in the bnxt_en driver has problems
that may miss TX completion events. bnxt_poll_work() in effect is only
handling at most 1 TX packet before exiting. In addition, there may be
in flight TX completions that ->poll() may miss even after we fix
bnxt_poll_work() to handle all visible TX completions. netpoll may not
call ->poll() again and HW may not generate IRQ because the driver does
not ARM the IRQ when the budget (0 for netpoll) is reached.

We fix it by handling all TX completions and by always ARMing the IRQ
when we exit ->poll() with 0 budget.

Also, the logic to ACK the completion ring in case it is almost filled
with TX completions needs to be adjusted to take care of the 0 budget
case, as discussed with Eric Dumazet.

Reported-by: Song Liu
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 61957b0..0478e56 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1884,8 +1884,11 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
 		if (TX_CMP_TYPE(txcmp) == CMP_TYPE_TX_L2_CMP) {
 			tx_pkts++;
 			/* return full budget so NAPI will complete. */
-			if (unlikely(tx_pkts > bp->tx_wake_thresh))
+			if (unlikely(tx_pkts > bp->tx_wake_thresh)) {
 				rx_pkts = budget;
+				raw_cons = NEXT_RAW_CMP(raw_cons);
+				break;
+			}
 		} else if ((TX_CMP_TYPE(txcmp) & 0x30) == 0x10) {
 			if (likely(budget))
 				rc = bnxt_rx_pkt(bp, bnapi, &raw_cons, &event);
@@ -1913,7 +1916,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
 		}
 		raw_cons = NEXT_RAW_CMP(raw_cons);
 
-		if (rx_pkts == budget)
+		if (rx_pkts && rx_pkts == budget)
 			break;
 	}
 
@@ -2027,8 +2030,12 @@ static int bnxt_poll(struct napi_struct *napi, int budget)
 	while (1) {
 		work_done += bnxt_poll_work(bp, bnapi, budget - work_done);
 
-		if (work_done >= budget)
+		if (work_done >= budget) {
+			if (!budget)
+				BNXT_CP_DB_REARM(cpr->cp_doorbell,
+						 cpr->cp_raw_cons);
 			break;
+		}
 
 		if (!bnxt_has_work(bp, cpr)) {
 			if (napi_complete_done(napi, work_done))
-- 
2.5.1
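The "at most 1 TX packet" behaviour comes purely from the loop exit test, which can be shown with a toy model. This is a simplified sketch and not the driver code: tx_completions_handled() is a hypothetical function, only TX completions are modelled (so rx_pkts never increases), and the tx_wake_thresh early exit is ignored.

```c
#include <stdbool.h>

/*
 * Count how many pending TX completions a poll loop consumes before it
 * breaks out.
 *   fixed == false: old exit test, rx_pkts == budget
 *   fixed == true:  new exit test, rx_pkts && rx_pkts == budget
 */
static int tx_completions_handled(int pending_tx, int budget, bool fixed)
{
	int rx_pkts = 0;	/* stays 0: only TX completions in this model */
	int handled = 0;

	while (handled < pending_tx) {
		handled++;	/* consume one TX completion */

		if (fixed ? (rx_pkts && rx_pkts == budget)
			  : (rx_pkts == budget))
			break;
	}
	return handled;
}
```

With netpoll's budget of 0, the old test is true after the first completion (0 == 0), so one entry is handled per call. The new test only fires once some RX work has actually been counted, so all visible TX completions are drained. With a normal non-zero budget both variants behave the same.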
Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
On Tue, Sep 25, 2018 at 7:25 PM Eric Dumazet wrote: > > On Tue, Sep 25, 2018 at 7:15 PM Michael Chan > wrote: > > > > On Tue, Sep 25, 2018 at 4:11 PM Michael Chan > > wrote: > > > > > > On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet > > > wrote: > > > > > > > > > > > It seems bnx2 should have a similar issue ? > > > > > > > > > > Yes, I think so. The MSIX mode in bnx2 is also auto-masking, meaning > > > that MSIX will only assert once after it is ARMed. If we return from > > > ->poll() when budget of 0 is reached without ARMing, we may not get > > > another MSIX. > > > > > > > On second thought, I think bnx2 is ok. If netpoll is polling on the > > TX packets and reaching budget of 0 and returning, the INT_ACK_CMD > > register is untouched. bnx2 uses the status block for events and the > > producers/consumers are cumulative. So there is no need to ACK the > > status block unless ARMing for interrupts. If there is an IRQ about > > to be fired, it won't be affected by the polling done by netpoll. > > > > In the case of bnxt, a completion ring is used for the events. The > > polling done by netpoll will cause the completion ring to be ACKed as > > entries are processed. ACKing the completion ring without ARMing may > > cause future IRQs to be disabled for that ring. > > About bnxt : Are you sure it is all about IRQ problems ? I'm pretty sure, because FB first reported TX timeouts followed by ring reset failures when running netconsole. These ring reset failures are caused by IRQs no longer working on some rings. > > What if the whole ring buffer is is filled, then all entries > are processed from netpoll. > > If cp_raw_cons becomes too high without the NIC knowing its (updated) > value, maybe no IRQ can be generated anymore because > of some wrapping issue (based on ring size) Good point. We have logic to handle that. We will ACK the ring at least once every tp->tx_wake_thresh TX packets. 
But this logic fails when the budget is 0, so I need to send a revised patch to take care of this one case.

> I guess that in order to test this, we would need something bursting
> 16000 messages while holding napi->poll_owner.
> The (single) IRQ would set/grab the SCHED bit but the cpu responsible
> to service this (soft)irq would spin for the whole test,
> and no more IRQ should be fired really.

Right, not easy to hit. But it should be handled by my v2 patch.

Thanks.
Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
On Tue, Sep 25, 2018 at 4:11 PM Michael Chan wrote:
>
> On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet wrote:
> >
> > It seems bnx2 should have a similar issue ?
> >
>
> Yes, I think so. The MSIX mode in bnx2 is also auto-masking, meaning
> that MSIX will only assert once after it is ARMed. If we return from
> ->poll() when budget of 0 is reached without ARMing, we may not get
> another MSIX.
>

On second thought, I think bnx2 is ok. If netpoll is polling on the TX packets and reaching budget of 0 and returning, the INT_ACK_CMD register is untouched. bnx2 uses the status block for events and the producers/consumers are cumulative. So there is no need to ACK the status block unless ARMing for interrupts. If there is an IRQ about to be fired, it won't be affected by the polling done by netpoll.

In the case of bnxt, a completion ring is used for the events. The polling done by netpoll will cause the completion ring to be ACKed as entries are processed. ACKing the completion ring without ARMing may cause future IRQs to be disabled for that ring.
Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet wrote:
>
> It seems bnx2 should have a similar issue ?
>

Yes, I think so. The MSIX mode in bnx2 is also auto-masking, meaning that MSIX will only assert once after it is ARMed. If we return from ->poll() when budget of 0 is reached without ARMing, we may not get another MSIX.

I can work on a similar patch but I don't have bnx2 cards to test with anymore.

Thanks.
[PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.
The current netpoll implementation in the bnxt_en driver has problems
that may miss TX completion events. bnxt_poll_work() in effect is only
handling at most 1 TX packet before exiting. In addition, there may be
in flight TX completions that ->poll() may miss even after we fix
bnxt_poll_work() to handle all visible TX completions. netpoll may not
call ->poll() again and HW may not generate IRQ because the driver does
not ARM the IRQ when the budget (0 for netpoll) is reached.

We fix it by handling all TX completions and by always ARMing the IRQ
when we exit ->poll() with 0 budget.

Reported-by: Song Liu
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 61957b0..c981b53 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1913,7 +1913,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct bnxt_napi *bnapi, int budget)
 		}
 		raw_cons = NEXT_RAW_CMP(raw_cons);
 
-		if (rx_pkts == budget)
+		if (rx_pkts && rx_pkts == budget)
 			break;
 	}
 
@@ -2027,8 +2027,12 @@ static int bnxt_poll(struct napi_struct *napi, int budget)
 	while (1) {
 		work_done += bnxt_poll_work(bp, bnapi, budget - work_done);
 
-		if (work_done >= budget)
+		if (work_done >= budget) {
+			if (!budget)
+				BNXT_CP_DB_REARM(cpr->cp_doorbell,
+						 cpr->cp_raw_cons);
 			break;
+		}
 
 		if (!bnxt_has_work(bp, cpr)) {
 			if (napi_complete_done(napi, work_done))
-- 
2.5.1
Re: [PATCH net 00/15] netpoll: avoid capture effects for NAPI drivers
On Tue, Sep 25, 2018 at 11:25 AM Song Liu wrote:
>
> Hi Michael,
>
> This may not be related. But I am looking at this:
>
> bnxt_poll_work() {
>
>     while (1) {
>
>         if (rx_pkts == budget)
>             return
>     }
> }
>
> With budget of 0, the loop will terminate after processing one packet.
> But I think the expectation is to finish all tx packets. So it doesn't
> feel right. Could you please confirm?
>

Right, this in effect is processing only 1 TX packet so it will be inefficient at least. But I think fixing it here still will not fix all the issues, because even if we process all the TX packets here, we may still miss some that are in flight. When we exit poll, netpoll may not call us back again and there may be no interrupts because we don't ARM the IRQ when budget of 0 is reached.

I will send a test patch shortly for review and testing.

Thanks.
Re: [PATCH net 00/15] netpoll: avoid capture effects for NAPI drivers
On Tue, Sep 25, 2018 at 7:20 AM Eric Dumazet wrote: > > On Tue, Sep 25, 2018 at 7:02 AM Michael Chan > wrote: > > > > On Mon, Sep 24, 2018 at 2:18 PM Song Liu wrote: > > > > > > > > > > > > > On Sep 24, 2018, at 2:05 PM, Eric Dumazet wrote: > > > > > > > >> > > > >> Interesting, maybe a bnxt specific issue. > > > >> > > > >> It seems their model is to process TX/RX notification in the same > > > >> queue, > > > >> they throw away RX events if budget == 0 > > > >> > > > >> It means commit e7b9569102995ebc26821789628eef45bd9840d8 is wrong and > > > >> must be reverted. > > > >> > > > >> Otherwise, we have a possibility of blocking a queue under netpoll > > > >> pressure. > > > > > > > > Hmm, actually a revert might not be enough, since code at lines > > > > 2030-2031 > > > > would fire and we might not call napi_complete_done() anyway. > > > > > > > > Unfortunately this driver logic is quite complex. > > > > > > > > Could you test on other NIC eventually ? > > > > > > > > > > It actually runs OK on ixgbe. > > > > > > @Michael, could you please help us with this? > > > > > I've taken a quick look using today's net tree plus Eric's > > poll_one_napi() patch. The problem I'm seeing is that netpoll calls > > bnxt_poll() with budget 0. And since work_done >= budget of 0, we > > return without calling napi_complete_done() and without arming the > > interrupt. netpoll doesn't always call us back until we call > > napi_complete_done(), right? So I think if there are in-flight TX > > completions, we'll miss those. > > That's the whole point of netpoll : > > We drain the TX queues, without interrupts being involved at all, > by calling ->napi() with a zero budget. > > napi_complete(), even if called from ->napi() while budget was zero, > should do nothing but return early. > > budget==0 means that ->napi() should process all TX completions. All TX completions that we can see. We cannot see the in-flight ones. 
If budget is exceeded, I think the assumption is that poll will always be called again. > > So it looks like bnxt has a bug, that is showing up after the latest > poll_one_napi() patch. > This latest patch is needed otherwise the cpu attempting the > netpoll-TX-drain might drain nothing at all, > since it does not anymore call ndo_poll_controller() that was grabbing > SCHED bits on all queues (napi_schedule() like calls) I think the latest patch is preventing the normal interrupt -> NAPI path from coming in and cleaning the remaining TX completions and arming the interrupt.
Re: [PATCH net 00/15] netpoll: avoid capture effects for NAPI drivers
On Mon, Sep 24, 2018 at 2:18 PM Song Liu wrote: > > > > > On Sep 24, 2018, at 2:05 PM, Eric Dumazet wrote: > > > >> > >> Interesting, maybe a bnxt specific issue. > >> > >> It seems their model is to process TX/RX notification in the same queue, > >> they throw away RX events if budget == 0 > >> > >> It means commit e7b9569102995ebc26821789628eef45bd9840d8 is wrong and > >> must be reverted. > >> > >> Otherwise, we have a possibility of blocking a queue under netpoll > >> pressure. > > > > Hmm, actually a revert might not be enough, since code at lines 2030-2031 > > would fire and we might not call napi_complete_done() anyway. > > > > Unfortunately this driver logic is quite complex. > > > > Could you test on other NIC eventually ? > > > > It actually runs OK on ixgbe. > > @Michael, could you please help us with this? > I've taken a quick look using today's net tree plus Eric's poll_one_napi() patch. The problem I'm seeing is that netpoll calls bnxt_poll() with budget 0. And since work_done >= budget of 0, we return without calling napi_complete_done() and without arming the interrupt. netpoll doesn't always call us back until we call napi_complete_done(), right? So I think if there are in-flight TX completions, we'll miss those.
[PATCH net] bnxt_en: Fix VF mac address regression.
The recent commit to always forward the VF MAC address to the PF for
approval may not work if the PF driver or the firmware is older. This
will cause the VF driver to fail during probe:

  bnxt_en :00:03.0 (unnamed net_device) (uninitialized): hwrm req_type 0xf seq id 0x5 error 0x
  bnxt_en :00:03.0 (unnamed net_device) (uninitialized): VF MAC address 00:00:17:02:05:d0 not approved by the PF
  bnxt_en :00:03.0: Unable to initialize mac address.
  bnxt_en: probe of :00:03.0 failed with error -99

We fix it by treating the error as fatal only if the VF MAC address is
locally generated by the VF.

Fixes: 707e7e966026 ("bnxt_en: Always forward VF MAC address to the PF.")
Reported-by: Seth Forshee
Reported-by: Siwei Liu
Signed-off-by: Michael Chan
---
Please queue this for stable as well. Thanks.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c       | 9 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 9 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h | 2 +-
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index cecbb1d..177587f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8027,7 +8027,7 @@ static int bnxt_change_mac_addr(struct net_device *dev, void *p)
 	if (ether_addr_equal(addr->sa_data, dev->dev_addr))
 		return 0;
 
-	rc = bnxt_approve_mac(bp, addr->sa_data);
+	rc = bnxt_approve_mac(bp, addr->sa_data, true);
 	if (rc)
 		return rc;
 
@@ -8827,14 +8827,19 @@ static int bnxt_init_mac_addr(struct bnxt *bp)
 	} else {
 #ifdef CONFIG_BNXT_SRIOV
 		struct bnxt_vf_info *vf = &bp->vf;
+		bool strict_approval = true;
 
 		if (is_valid_ether_addr(vf->mac_addr)) {
 			/* overwrite netdev dev_addr with admin VF MAC */
 			memcpy(bp->dev->dev_addr, vf->mac_addr, ETH_ALEN);
+			/* Older PF driver or firmware may not approve this
+			 * correctly.
+			 */
+			strict_approval = false;
 		} else {
 			eth_hw_addr_random(bp->dev);
 		}
-		rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
+		rc = bnxt_approve_mac(bp, bp->dev->dev_addr, strict_approval);
 #endif
 	}
 	return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index fcd085a..3962f6f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -1104,7 +1104,7 @@ void bnxt_update_vf_mac(struct bnxt *bp)
 	mutex_unlock(&bp->hwrm_cmd_lock);
 }
 
-int bnxt_approve_mac(struct bnxt *bp, u8 *mac)
+int bnxt_approve_mac(struct bnxt *bp, u8 *mac, bool strict)
 {
 	struct hwrm_func_vf_cfg_input req = {0};
 	int rc = 0;
@@ -1122,12 +1122,13 @@ int bnxt_approve_mac(struct bnxt *bp, u8 *mac)
 	memcpy(req.dflt_mac_addr, mac, ETH_ALEN);
 	rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
 mac_done:
-	if (rc) {
+	if (rc && strict) {
 		rc = -EADDRNOTAVAIL;
 		netdev_warn(bp->dev, "VF MAC address %pM not approved by the PF\n",
 			    mac);
+		return rc;
 	}
-	return rc;
+	return 0;
 }
 #else
 
@@ -1144,7 +1145,7 @@ void bnxt_update_vf_mac(struct bnxt *bp)
 {
 }
 
-int bnxt_approve_mac(struct bnxt *bp, u8 *mac)
+int bnxt_approve_mac(struct bnxt *bp, u8 *mac, bool strict)
 {
 	return 0;
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
index e9b20cd..2eed9ed 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
@@ -39,5 +39,5 @@ int bnxt_sriov_configure(struct pci_dev *pdev, int num_vfs);
 void bnxt_sriov_disable(struct bnxt *);
 void bnxt_hwrm_exec_fwd_req(struct bnxt *);
 void bnxt_update_vf_mac(struct bnxt *);
-int bnxt_approve_mac(struct bnxt *, u8 *);
+int bnxt_approve_mac(struct bnxt *, u8 *, bool);
 #endif
-- 
2.5.1
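The resulting return-value policy is small enough to state as a truth table. The sketch below is a userspace model only: approve_mac_result() is a hypothetical name, hwrm_rc stands in for whatever the firmware message returned, and the logging is omitted.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the return value of bnxt_approve_mac() after this patch.
 * hwrm_rc: result of forwarding the MAC to the PF (0 means approved).
 * strict:  true for user-requested or locally generated addresses,
 *          false for an address the PF admin already assigned.
 */
static int approve_mac_result(int hwrm_rc, bool strict)
{
	if (hwrm_rc && strict)
		return -EADDRNOTAVAIL;
	return 0;
}
```

On Linux, EADDRNOTAVAIL is 99, which matches the "failed with error -99" probe message quoted in the commit log. An admin-assigned MAC (strict == false) no longer fails probe when an older PF driver or firmware rejects the request.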
Re: [PATCH net 0/3] bnxt_en: Bug fixes.
On Mon, Sep 3, 2018 at 10:50 PM, Michael Chan wrote:
> On Mon, Sep 3, 2018 at 10:01 PM, David Miller wrote:
>>
>> From: Michael Chan
>> Date: Mon, 3 Sep 2018 04:23:16 -0400
>>
>> > This short series fixes resource related logic in the driver, mostly
>> > affecting the RDMA driver under corner cases.
>>
>> Series applied, thanks Michael.
>>
>> Do you want patch #3 queued up for -stable?
>
> Yes, please go ahead. Thanks.

But there is a dependency on patch #2 though. So #2 needs to be queued as well.
Re: [PATCH net 0/3] bnxt_en: Bug fixes.
On Mon, Sep 3, 2018 at 10:01 PM, David Miller wrote: > > From: Michael Chan > Date: Mon, 3 Sep 2018 04:23:16 -0400 > > > This short series fixes resource related logic in the driver, mostly > > affecting the RDMA driver under corner cases. > > Series applied, thanks Michael. > > Do you want patch #3 queued up for -stable? Yes, please go ahead. Thanks.
[PATCH net 1/3] bnxt_en: Fix firmware signaled resource change logic in open.
When the driver detects that resources have changed during open, it should reset the rx and tx rings to 0. This will properly setup the init sequence to initialize the default rings again. We also need to signal the RDMA driver to stop and clear its interrupts. We then call the RoCE driver to restart if a new set of default rings is successfully reserved. Fixes: 25e1acd6b92b ("bnxt_en: Notify firmware about IF state changes.") Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 8bb1e38..6a1baf3 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -6684,6 +6684,8 @@ static int bnxt_hwrm_if_change(struct bnxt *bp, bool up) hw_resc->resv_rx_rings = 0; hw_resc->resv_hw_ring_grps = 0; hw_resc->resv_vnics = 0; + bp->tx_nr_rings = 0; + bp->rx_nr_rings = 0; } return rc; } @@ -8769,20 +8771,25 @@ static int bnxt_init_dflt_ring_mode(struct bnxt *bp) if (bp->tx_nr_rings) return 0; + bnxt_ulp_irq_stop(bp); + bnxt_clear_int_mode(bp); rc = bnxt_set_dflt_rings(bp, true); if (rc) { netdev_err(bp->dev, "Not enough rings available.\n"); - return rc; + goto init_dflt_ring_err; } rc = bnxt_init_int_mode(bp); if (rc) - return rc; + goto init_dflt_ring_err; + bp->tx_nr_rings_per_tc = bp->tx_nr_rings; if (bnxt_rfs_supported(bp) && bnxt_rfs_capable(bp)) { bp->flags |= BNXT_FLAG_RFS; bp->dev->features |= NETIF_F_NTUPLE; } - return 0; +init_dflt_ring_err: + bnxt_ulp_irq_restart(bp, rc); + return rc; } int bnxt_restore_pf_fw_resources(struct bnxt *bp) -- 2.5.1
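The error-path change in this patch follows a common pattern: once the upper-layer (RDMA) interrupts have been stopped, every exit must go through the restart call, and the restart receives the error code so it knows whether new rings were reserved. A minimal sketch of that control flow follows. All names here (struct ulp_sketch, init_default_rings() and its parameters) are hypothetical and only model the flow, not the driver.

```c
/* Hypothetical model of the upper-layer driver state. */
struct ulp_sketch {
    int stopped;      /* 1 while its interrupts are stopped */
    int restart_err;  /* error code passed to the last restart */
    int restarts;     /* number of restart calls seen */
};

/*
 * rings_present: nonzero if default rings already exist (nothing to do).
 * reserve_rc / irq_rc: simulated results of the two steps that can fail.
 */
static int init_default_rings(struct ulp_sketch *ulp, int rings_present,
                              int reserve_rc, int irq_rc)
{
    int rc;

    if (rings_present)
        return 0;

    ulp->stopped = 1;           /* stop upper-layer IRQs first */

    rc = reserve_rc;            /* step 1: reserve default rings */
    if (rc)
        goto out;

    rc = irq_rc;                /* step 2: set up interrupt mode */
    if (rc)
        goto out;

    /* success-only setup would go here */
out:
    /* Single exit: always restart, telling the upper layer the outcome. */
    ulp->stopped = 0;
    ulp->restart_err = rc;
    ulp->restarts++;
    return rc;
}
```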
[PATCH net 3/3] bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA.
Currently, the driver adjusts the bp->hw_resc.max_cp_rings by the number of MSIX vectors used by RDMA. There is one code path in open that needs to check the true max_cp_rings including any used by RDMA. This code is now checking for the reduced max_cp_rings which will fail when the number of cp rings is very small. To fix this in a clean way, we don't adjust max_cp_rings anymore. Instead, we add a helper bnxt_get_max_func_cp_rings_for_en() to get the reduced max_cp_rings when appropriate. Fixes: ec86f14ea506 ("bnxt_en: Add ULP calls to stop and restart IRQs.") Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 7 --- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 +- drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 7 --- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 5 - 4 files changed, 9 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 6472ce4..cecbb1d 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -5913,9 +5913,9 @@ unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp) return bp->hw_resc.max_cp_rings; } -void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max) +unsigned int bnxt_get_max_func_cp_rings_for_en(struct bnxt *bp) { - bp->hw_resc.max_cp_rings = max; + return bp->hw_resc.max_cp_rings - bnxt_get_ulp_msix_num(bp); } static unsigned int bnxt_get_max_func_irqs(struct bnxt *bp) @@ -8631,7 +8631,8 @@ static void _bnxt_get_max_rings(struct bnxt *bp, int *max_rx, int *max_tx, *max_tx = hw_resc->max_tx_rings; *max_rx = hw_resc->max_rx_rings; - *max_cp = min_t(int, hw_resc->max_irqs, hw_resc->max_cp_rings); + *max_cp = min_t(int, bnxt_get_max_func_cp_rings_for_en(bp), + hw_resc->max_irqs); *max_cp = min_t(int, *max_cp, hw_resc->max_stat_ctxs); max_ring_grps = hw_resc->max_hw_ring_grps; if (BNXT_CHIP_TYPE_NITRO_A0(bp) && BNXT_PF(bp)) { diff --git 
a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index c4c77b9..bde3846 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1481,7 +1481,7 @@ int bnxt_hwrm_set_coal(struct bnxt *); unsigned int bnxt_get_max_func_stat_ctxs(struct bnxt *bp); void bnxt_set_max_func_stat_ctxs(struct bnxt *bp, unsigned int max); unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp); -void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max); +unsigned int bnxt_get_max_func_cp_rings_for_en(struct bnxt *bp); int bnxt_get_avail_msix(struct bnxt *bp, int num); int bnxt_reserve_rings(struct bnxt *bp); void bnxt_tx_disable(struct bnxt *bp); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c index 6d583bc..fcd085a 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c @@ -451,7 +451,7 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs) bnxt_hwrm_cmd_hdr_init(bp, , HWRM_FUNC_VF_RESOURCE_CFG, -1, -1); - vf_cp_rings = hw_resc->max_cp_rings - bp->cp_nr_rings; + vf_cp_rings = bnxt_get_max_func_cp_rings_for_en(bp) - bp->cp_nr_rings; vf_stat_ctx = hw_resc->max_stat_ctxs - bp->num_stat_ctxs; if (bp->flags & BNXT_FLAG_AGG_RINGS) vf_rx_rings = hw_resc->max_rx_rings - bp->rx_nr_rings * 2; @@ -549,7 +549,8 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int num_vfs) max_stat_ctxs = hw_resc->max_stat_ctxs; /* Remaining rings are distributed equally amongs VF's for now */ - vf_cp_rings = (hw_resc->max_cp_rings - bp->cp_nr_rings) / num_vfs; + vf_cp_rings = (bnxt_get_max_func_cp_rings_for_en(bp) - + bp->cp_nr_rings) / num_vfs; vf_stat_ctx = (max_stat_ctxs - bp->num_stat_ctxs) / num_vfs; if (bp->flags & BNXT_FLAG_AGG_RINGS) vf_rx_rings = (hw_resc->max_rx_rings - bp->rx_nr_rings * 2) / @@ -643,7 +644,7 @@ static int bnxt_sriov_enable(struct bnxt *bp, int *num_vfs) */ 
vfs_supported = *num_vfs; - avail_cp = hw_resc->max_cp_rings - bp->cp_nr_rings; + avail_cp = bnxt_get_max_func_cp_rings_for_en(bp) - bp->cp_nr_rings; avail_stat = hw_resc->max_stat_ctxs - bp->num_stat_ctxs; avail_cp = min_t(int, avail_cp, avail_stat); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c index deac73e..beee612 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c @@ -169,7 +169,6 @@ static int bnxt_req_msix_vecs(struct bnxt_en_dev *edev, int
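The idea behind patch 3/3 is to keep the true maximum untouched and derive the reduced value on demand, so that both views stay available. A small sketch under simplified assumptions: struct hw_resc_sketch and both helpers are hypothetical, and ulp_msix stands in for whatever bnxt_get_ulp_msix_num() returns in the driver.

```c
/* Hypothetical, simplified stand-in for the driver's resource bookkeeping. */
struct hw_resc_sketch {
    unsigned int max_cp_rings;  /* true maximum, never adjusted */
    unsigned int ulp_msix;      /* vectors currently used by RDMA */
};

/* True maximum, including any rings used by RDMA. */
static unsigned int max_cp_rings_all(const struct hw_resc_sketch *r)
{
    return r->max_cp_rings;
}

/* Reduced maximum that the Ethernet function may use. */
static unsigned int max_cp_rings_for_en(const struct hw_resc_sketch *r)
{
    return r->max_cp_rings - r->ulp_msix;
}
```

Code paths that must check against the true maximum can call the first helper, while ring sizing and VF distribution use the second, which is the split the diff makes at each call site.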
[PATCH net 0/3] bnxt_en: Bug fixes.
This short series fixes resource related logic in the driver, mostly affecting the RDMA driver under corner cases. Michael Chan (3): bnxt_en: Fix firmware signaled resource change logic in open. bnxt_en: Clean up unused functions. bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA. drivers/net/ethernet/broadcom/bnxt/bnxt.c | 22 +++--- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 3 +-- drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 7 --- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 20 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h | 1 - 5 files changed, 20 insertions(+), 33 deletions(-) -- 2.5.1
[PATCH net 2/3] bnxt_en: Clean up unused functions.
Remove unused bnxt_subtract_ulp_resources(). Change bnxt_get_max_func_irqs() to static since it is only locally used. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 - drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 15 --- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h | 1 - 4 files changed, 1 insertion(+), 18 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 6a1baf3..6472ce4 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -5918,7 +5918,7 @@ void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max) bp->hw_resc.max_cp_rings = max; } -unsigned int bnxt_get_max_func_irqs(struct bnxt *bp) +static unsigned int bnxt_get_max_func_irqs(struct bnxt *bp) { struct bnxt_hw_resc *hw_resc = >hw_resc; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index fefa011..c4c77b9 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1482,7 +1482,6 @@ unsigned int bnxt_get_max_func_stat_ctxs(struct bnxt *bp); void bnxt_set_max_func_stat_ctxs(struct bnxt *bp, unsigned int max); unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp); void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max); -unsigned int bnxt_get_max_func_irqs(struct bnxt *bp); int bnxt_get_avail_msix(struct bnxt *bp, int num); int bnxt_reserve_rings(struct bnxt *bp); void bnxt_tx_disable(struct bnxt *bp); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c index c37b284..deac73e 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c @@ -220,21 +220,6 @@ int bnxt_get_ulp_msix_base(struct bnxt *bp) return 0; } -void bnxt_subtract_ulp_resources(struct bnxt *bp, int ulp_id) -{ - ASSERT_RTNL(); - 
if (bnxt_ulp_registered(bp->edev, ulp_id)) { - struct bnxt_en_dev *edev = bp->edev; - unsigned int msix_req, max; - - msix_req = edev->ulp_tbl[ulp_id].msix_requested; - max = bnxt_get_max_func_cp_rings(bp); - bnxt_set_max_func_cp_rings(bp, max - msix_req); - max = bnxt_get_max_func_stat_ctxs(bp); - bnxt_set_max_func_stat_ctxs(bp, max - 1); - } -} - static int bnxt_send_msg(struct bnxt_en_dev *edev, int ulp_id, struct bnxt_fw_msg *fw_msg) { diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h index df48ac7..d9bea37 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h @@ -90,7 +90,6 @@ static inline bool bnxt_ulp_registered(struct bnxt_en_dev *edev, int ulp_id) int bnxt_get_ulp_msix_num(struct bnxt *bp); int bnxt_get_ulp_msix_base(struct bnxt *bp); -void bnxt_subtract_ulp_resources(struct bnxt *bp, int ulp_id); void bnxt_ulp_stop(struct bnxt *bp); void bnxt_ulp_start(struct bnxt *bp); void bnxt_ulp_sriov_cfg(struct bnxt *bp, int num_vfs); -- 2.5.1
Re: bnxt: card intermittently hanging and dropping link
On Thu, Aug 16, 2018 at 2:09 AM, Daniel Axtens wrote: > Hi Michael, > >> The main issue is the TX timeout. >> . >> >>> [ 2682.911693] bnxt_en :3b:00.0 eth4: TX timeout detected, starting >>> reset task! >>> [ 2683.782496] bnxt_en :3b:00.0 eth4: Resp cmpl intr err msg: 0x51 >>> [ 2683.783061] bnxt_en :3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1 >>> [ 2684.634557] bnxt_en :3b:00.0 eth4: Resp cmpl intr err msg: 0x51 >>> [ 2684.635120] bnxt_en :3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1 >> >> and it is not recovering. >> >> Please provide ethtool -i eth4 which will show the firmware version on >> the NIC. Let's see if the firmware is too old. > > driver: bnxt_en > version: 1.8.0 > firmware-version: 20.6.151.0/pkg 20.06.05.11 I believe the firmware should be updated. My colleague will contact you on how to proceed. Thanks.
Re: bnxt: card intermittently hanging and dropping link
On Wed, Aug 15, 2018 at 10:29 PM, Daniel Axtens wrote: > [ 2682.911295] [ cut here ] > [ 2682.911319] NETDEV WATCHDOG: eth4 (bnxt_en): transmit queue 0 timed out The main issue is the TX timeout. . > [ 2682.911693] bnxt_en :3b:00.0 eth4: TX timeout detected, starting reset > task! > [ 2683.782496] bnxt_en :3b:00.0 eth4: Resp cmpl intr err msg: 0x51 > [ 2683.783061] bnxt_en :3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1 > [ 2684.634557] bnxt_en :3b:00.0 eth4: Resp cmpl intr err msg: 0x51 > [ 2684.635120] bnxt_en :3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1 and it is not recovering. Please provide ethtool -i eth4 which will show the firmware version on the NIC. Let's see if the firmware is too old. Thanks.
[PATCH net-next v2] bnxt_en: Fix strcpy() warnings in bnxt_ethtool.c
From: Vasundhara Volam This patch fixes the following smatch warnings: drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2826 bnxt_fill_coredump_seg_hdr() error: strcpy() '"sEgM"' too large for 'seg_hdr->signature' (5 vs 4) drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2858 bnxt_fill_coredump_record() error: strcpy() '"cOrE"' too large for 'record->signature' (5 vs 4) drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2879 bnxt_fill_coredump_record() error: strcpy() 'utsname()->sysname' too large for 'record->os_name' (65 vs 32) Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.") Reported-by: Dan Carpenter Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index b6dbc3f..9c929cd 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -2823,7 +2823,7 @@ bnxt_fill_coredump_seg_hdr(struct bnxt *bp, int status, u32 duration, u32 instance) { memset(seg_hdr, 0, sizeof(*seg_hdr)); - strcpy(seg_hdr->signature, "sEgM"); + memcpy(seg_hdr->signature, "sEgM", 4); if (seg_rec) { seg_hdr->component_id = (__force __le32)seg_rec->component_id; seg_hdr->segment_id = (__force __le32)seg_rec->segment_id; @@ -2855,7 +2855,7 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct bnxt_coredump_record *record, time64_to_tm(start, 0, &tm); memset(record, 0, sizeof(*record)); - strcpy(record->signature, "cOrE"); + memcpy(record->signature, "cOrE", 4); record->flags = 0; record->low_version = 0; record->high_version = 1; @@ -2876,7 +2876,7 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct bnxt_coredump_record *record, record->os_ver_major = cpu_to_le32(os_ver_major); record->os_ver_minor = cpu_to_le32(os_ver_minor); - strcpy(record->os_name, utsname()->sysname); +
strlcpy(record->os_name, utsname()->sysname, 32); time64_to_tm(end, 0, &tm); record->end_year = cpu_to_le16(tm.tm_year + 1900); record->end_month = cpu_to_le16(tm.tm_mon + 1); -- 2.5.1
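The two kinds of fix in this patch can be illustrated outside the kernel. A 4-byte signature field has no room for a terminating NUL, so it is filled with memcpy(); a name field that must stay a valid string needs a copy that truncates and always terminates. The sketch below is userspace C under stated assumptions: struct record_sketch, bounded_copy() and fill_record() are hypothetical, and bounded_copy() is a hand-written stand-in for the truncating behaviour of strlcpy(), not the kernel function.

```c
#include <string.h>

/* Hypothetical record with the same field sizes as in the warnings. */
struct record_sketch {
    char signature[4];  /* exactly four bytes, not NUL-terminated */
    char os_name[32];   /* NUL-terminated, may be truncated */
};

/*
 * Copy src into dst, which has room for cap bytes (cap > 0).
 * Truncates if needed and always NUL-terminates.
 */
static void bounded_copy(char *dst, const char *src, size_t cap)
{
    size_t n = strlen(src);

    if (n >= cap)
        n = cap - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

static void fill_record(struct record_sketch *rec, const char *sysname)
{
    memset(rec, 0, sizeof(*rec));
    /* Fixed-size tag: copy the four characters only, no terminator. */
    memcpy(rec->signature, "cOrE", sizeof(rec->signature));
    /* String field: bounded, always terminated. */
    bounded_copy(rec->os_name, sysname, sizeof(rec->os_name));
}
```

Using sizeof() on the destination field rather than a literal 4 or 32 keeps the copy length tied to the declaration if the struct ever changes.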
Re: [PATCH net-next] bnxt_en: Fix strcpy() warnings in bnxt_ethtool.c
On Fri, Aug 10, 2018 at 2:37 PM, David Miller wrote: > From: David Miller > Date: Fri, 10 Aug 2018 14:35:45 -0700 (PDT) > >> From: Michael Chan >> Date: Fri, 10 Aug 2018 17:02:12 -0400 >> >>> From: Vasundhara Volam >>> >>> This patch fixes following smatch warnings: >>> >>> drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2826 >>> bnxt_fill_coredump_seg_hdr() error: strcpy() '"sEgM"' too large for >>> 'seg_hdr->signature' (5 vs 4) >>> drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2858 >>> bnxt_fill_coredump_record() error: strcpy() '"cOrE"' too large for >>> 'record->signature' (5 vs 4) >>> drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2879 >>> bnxt_fill_coredump_record() error: strcpy() 'utsname()->sysname' too large >>> for 'record->os_name' (65 vs 32) >>> >>> Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.") >>> Reported-by: Dan Carpenter >>> Signed-off-by: Vasundhara Volam >>> Signed-off-by: Michael Chan >> >> Applied, thanks Michael. > > Actually, I'm reverting, this may fix those three warnings, but they are > replaced with > a new one: > > ./include/linux/string.h:246:9: warning: ‘__builtin_strncpy’ output may be > truncated copying 32 bytes from a string of length 64 [-Wstringop-truncation] > OK. I'm guessing strlcpy() is the right variant to use here. I will repost v2 using strlcpy(). Thanks.
[PATCH net-next] bnxt_en: Fix strcpy() warnings in bnxt_ethtool.c
From: Vasundhara Volam This patch fixes the following smatch warnings: drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2826 bnxt_fill_coredump_seg_hdr() error: strcpy() '"sEgM"' too large for 'seg_hdr->signature' (5 vs 4) drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2858 bnxt_fill_coredump_record() error: strcpy() '"cOrE"' too large for 'record->signature' (5 vs 4) drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2879 bnxt_fill_coredump_record() error: strcpy() 'utsname()->sysname' too large for 'record->os_name' (65 vs 32) Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.") Reported-by: Dan Carpenter Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index b6dbc3f..d6f3289 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -2823,7 +2823,7 @@ bnxt_fill_coredump_seg_hdr(struct bnxt *bp, int status, u32 duration, u32 instance) { memset(seg_hdr, 0, sizeof(*seg_hdr)); - strcpy(seg_hdr->signature, "sEgM"); + memcpy(seg_hdr->signature, "sEgM", 4); if (seg_rec) { seg_hdr->component_id = (__force __le32)seg_rec->component_id; seg_hdr->segment_id = (__force __le32)seg_rec->segment_id; @@ -2855,7 +2855,7 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct bnxt_coredump_record *record, time64_to_tm(start, 0, &tm); memset(record, 0, sizeof(*record)); - strcpy(record->signature, "cOrE"); + memcpy(record->signature, "cOrE", 4); record->flags = 0; record->low_version = 0; record->high_version = 1; @@ -2876,7 +2876,7 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct bnxt_coredump_record *record, record->os_ver_major = cpu_to_le32(os_ver_major); record->os_ver_minor = cpu_to_le32(os_ver_minor); - strcpy(record->os_name, utsname()->sysname); +
strncpy(record->os_name, utsname()->sysname, 32); time64_to_tm(end, 0, &tm); record->end_year = cpu_to_le16(tm.tm_year + 1900); record->end_month = cpu_to_le16(tm.tm_mon + 1); -- 2.5.1
[PATCH net-next 02/13] bnxt_en: Adjust timer based on ethtool stats-block-usecs settings.
The driver gathers statistics using 2 mechanisms. Some stats are DMA'ed directly from hardware and others are polled from the driver's timer. Currently, we only adjust the DMA frequency based on the ethtool stats-block-usecs setting. This patch adjusts the driver's timer frequency as well to make everything consistent. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index 3d40e49..1f626af 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -112,6 +112,11 @@ static int bnxt_set_coalesce(struct net_device *dev, BNXT_MAX_STATS_COAL_TICKS); stats_ticks = rounddown(stats_ticks, BNXT_MIN_STATS_COAL_TICKS); bp->stats_coal_ticks = stats_ticks; + if (bp->stats_coal_ticks) + bp->current_interval = + bp->stats_coal_ticks * HZ / 100; + else + bp->current_interval = BNXT_TIMER_INTERVAL; update_stats = true; } -- 2.5.1
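The timer adjustment described here is a unit conversion from the ethtool setting to timer ticks. A minimal sketch, assuming the setting is in microseconds (as the name stats-block-usecs suggests) and that a tick rate of HZ ticks per second applies. HZ_SKETCH, DEFAULT_INTERVAL_SKETCH and timer_interval() are hypothetical names, and the values chosen are illustrative, not the driver's.

```c
/* Assumed tick rate for illustration; the real HZ is a kernel config value. */
#define HZ_SKETCH 250UL
/* Assumed default polling interval of one second, in ticks. */
#define DEFAULT_INTERVAL_SKETCH HZ_SKETCH

/*
 * Convert a statistics interval in microseconds to timer ticks.
 * A setting of 0 means "use the default interval".
 */
static unsigned long timer_interval(unsigned long stats_usecs)
{
    if (stats_usecs)
        return stats_usecs * HZ_SKETCH / 1000000UL;
    return DEFAULT_INTERVAL_SKETCH;
}
```

Multiplying before dividing keeps sub-second settings from rounding to zero too early, at the cost of needing the intermediate product to fit in an unsigned long.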
[PATCH net-next 05/13] bnxt_en: Add new VF resource allocation strategy mode.
The new mode is "minimal-static" to be used when resources are more limited to support a large number of VFs, for example. The PF driver will provision guaranteed minimum resources of 0. Each VF has no guaranteed resources until it tries to reserve resources during device open. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 + drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 23 ++- 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index fd936c5..e0e3b4b 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -5162,7 +5162,7 @@ int bnxt_hwrm_func_resc_qcaps(struct bnxt *bp, bool all) pf->vf_resv_strategy = le16_to_cpu(resp->vf_reservation_strategy); - if (pf->vf_resv_strategy > BNXT_VF_RESV_STRATEGY_MINIMAL) + if (pf->vf_resv_strategy > BNXT_VF_RESV_STRATEGY_MINIMAL_STATIC) pf->vf_resv_strategy = BNXT_VF_RESV_STRATEGY_MAXIMAL; } hwrm_func_resc_qcaps_exit: diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 47eec14..b44a758 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -862,6 +862,7 @@ struct bnxt_pf_info { u8 vf_resv_strategy; #define BNXT_VF_RESV_STRATEGY_MAXIMAL 0 #define BNXT_VF_RESV_STRATEGY_MINIMAL 1 +#define BNXT_VF_RESV_STRATEGY_MINIMAL_STATIC 2 void*hwrm_cmd_req_addr[4]; dma_addr_t hwrm_cmd_req_dma_addr[4]; struct bnxt_vf_info *vf; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c index f560845..b896a52 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c @@ -447,7 +447,7 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs) u16 vf_tx_rings, vf_rx_rings, vf_cp_rings; u16 vf_stat_ctx,
vf_vnics, vf_ring_grps; struct bnxt_pf_info *pf = &bp->pf; - int i, rc = 0; + int i, rc = 0, min = 1; bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_VF_RESOURCE_CFG, -1, -1); @@ -464,14 +464,19 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs) req.min_rsscos_ctx = cpu_to_le16(BNXT_VF_MIN_RSS_CTX); req.max_rsscos_ctx = cpu_to_le16(BNXT_VF_MAX_RSS_CTX); - if (pf->vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MINIMAL) { - req.min_cmpl_rings = cpu_to_le16(1); - req.min_tx_rings = cpu_to_le16(1); - req.min_rx_rings = cpu_to_le16(1); - req.min_l2_ctxs = cpu_to_le16(BNXT_VF_MIN_L2_CTX); - req.min_vnics = cpu_to_le16(1); - req.min_stat_ctx = cpu_to_le16(1); - req.min_hw_ring_grps = cpu_to_le16(1); + if (pf->vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MINIMAL_STATIC) { + min = 0; + req.min_rsscos_ctx = cpu_to_le16(min); + } + if (pf->vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MINIMAL || + pf->vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MINIMAL_STATIC) { + req.min_cmpl_rings = cpu_to_le16(min); + req.min_tx_rings = cpu_to_le16(min); + req.min_rx_rings = cpu_to_le16(min); + req.min_l2_ctxs = cpu_to_le16(min); + req.min_vnics = cpu_to_le16(min); + req.min_stat_ctx = cpu_to_le16(min); + req.min_hw_ring_grps = cpu_to_le16(min); } else { vf_cp_rings /= num_vfs; vf_tx_rings /= num_vfs; -- 2.5.1
[PATCH net-next 04/13] bnxt_en: Add PHY retry logic.
During hotplug, the driver's open function can be called almost immediately after power on reset. The PHY may not be ready and the firmware may return failure when the driver tries to update PHY settings. Add retry logic fired from the driver's timer to retry the operation for 5 seconds. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 31 ++- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 2 files changed, 34 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index d9fc905..fd936c5 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -6898,8 +6898,14 @@ static int __bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init) mutex_lock(>link_lock); rc = bnxt_update_phy_setting(bp); mutex_unlock(>link_lock); - if (rc) + if (rc) { netdev_warn(bp->dev, "failed to update phy settings\n"); + if (BNXT_SINGLE_PF(bp)) { + bp->link_info.phy_retry = true; + bp->link_info.phy_retry_expires = + jiffies + 5 * HZ; + } + } } if (irq_re_init) @@ -7583,6 +7589,16 @@ static void bnxt_timer(struct timer_list *t) set_bit(BNXT_FLOW_STATS_SP_EVENT, >sp_event); bnxt_queue_sp_work(bp); } + + if (bp->link_info.phy_retry) { + if (time_after(jiffies, bp->link_info.phy_retry_expires)) { + bp->link_info.phy_retry = 0; + netdev_warn(bp->dev, "failed to update phy settings after maximum retries.\n"); + } else { + set_bit(BNXT_UPDATE_PHY_SP_EVENT, >sp_event); + bnxt_queue_sp_work(bp); + } + } bnxt_restart_timer: mod_timer(>timer, jiffies + bp->current_interval); } @@ -7670,6 +7686,19 @@ static void bnxt_sp_task(struct work_struct *work) netdev_err(bp->dev, "SP task can't update link (rc: %x)\n", rc); } + if (test_and_clear_bit(BNXT_UPDATE_PHY_SP_EVENT, >sp_event)) { + int rc; + + mutex_lock(>link_lock); + rc = bnxt_update_phy_setting(bp); + mutex_unlock(>link_lock); + if (rc) { + netdev_warn(bp->dev, "update phy settings retry 
failed\n"); + } else { + bp->link_info.phy_retry = false; + netdev_info(bp->dev, "update phy settings retry succeeded\n"); + } + } if (test_and_clear_bit(BNXT_HWRM_PORT_MODULE_SP_EVENT, >sp_event)) { mutex_lock(>link_lock); bnxt_get_port_module_status(bp); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 0d49fe0..47eec14 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -959,6 +959,9 @@ struct bnxt_link_info { u16 advertising;/* user adv setting */ boolforce_link_chng; + boolphy_retry; + unsigned long phy_retry_expires; + /* a copy of phy_qcfg output used to report link * info to VF */ @@ -1344,6 +1347,7 @@ struct bnxt { #define BNXT_GENEVE_DEL_PORT_SP_EVENT 13 #define BNXT_LINK_SPEED_CHNG_SP_EVENT 14 #define BNXT_FLOW_STATS_SP_EVENT 15 +#define BNXT_UPDATE_PHY_SP_EVENT 16 struct bnxt_hw_resc hw_resc; struct bnxt_pf_info pf; -- 2.5.1
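The retry scheme in this patch is a deadline armed at open time and checked from a periodic timer: while the deadline has not passed, each timer tick schedules another attempt; a successful attempt or an expired deadline ends the retries. The sketch below models only that state machine in plain C. All names are hypothetical, tick_after() is a hand-written wraparound-safe comparison in the spirit of the kernel's time_after(), and it relies on the usual two's-complement conversion from unsigned long to long.

```c
#include <stdbool.h>

/* Hypothetical retry state, analogous to the two fields the patch adds. */
struct phy_retry_sketch {
    bool active;
    unsigned long expires;  /* tick value after which we give up */
};

/* True if tick a is later than tick b, even across counter wraparound. */
static bool tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Arm the retry window when the first attempt fails. */
static void retry_arm(struct phy_retry_sketch *s, unsigned long now,
                      unsigned long window)
{
    s->active = true;
    s->expires = now + window;
}

/*
 * Called from the periodic timer. Returns true if another attempt
 * should be scheduled, false if idle or if the window has expired.
 */
static bool retry_tick(struct phy_retry_sketch *s, unsigned long now)
{
    if (!s->active)
        return false;
    if (tick_after(now, s->expires)) {
        s->active = false;  /* give up after the window */
        return false;
    }
    return true;
}

/* Called with the result of an attempt; success stops further retries. */
static void retry_done(struct phy_retry_sketch *s, int rc)
{
    if (rc == 0)
        s->active = false;
}
```

Comparing via the signed difference rather than "now > expires" matters because a free-running tick counter eventually wraps.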
[PATCH net-next 13/13] bnxt_en: Do not use the CNP CoS queue for networking traffic.
The CNP CoS queue is reserved for internal RDMA Congestion Notification Packets (CNP) and should not be used for a TC. Modify the CoS queue discovery code to skip over the CNP CoS queue and to reduce bp->max_tc accordingly. However, if RDMA is disabled in NVRAM, the CNP CoS queue can be used for a TC. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 22 ++ drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h | 4 2 files changed, 18 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index dde904b..d7f51ab 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -5281,7 +5281,8 @@ static int bnxt_hwrm_queue_qportcfg(struct bnxt *bp) int rc = 0; struct hwrm_queue_qportcfg_input req = {0}; struct hwrm_queue_qportcfg_output *resp = bp->hwrm_cmd_resp_addr; - u8 i, *qptr; + u8 i, j, *qptr; + bool no_rdma; bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_QUEUE_QPORTCFG, -1, -1); @@ -5299,19 +5300,24 @@ static int bnxt_hwrm_queue_qportcfg(struct bnxt *bp) if (bp->max_tc > BNXT_MAX_QUEUE) bp->max_tc = BNXT_MAX_QUEUE; + no_rdma = !(bp->flags & BNXT_FLAG_ROCE_CAP); + qptr = &resp->queue_id0; + for (i = 0, j = 0; i < bp->max_tc; i++) { + bp->q_info[j].queue_id = *qptr++; + bp->q_info[j].queue_profile = *qptr++; + bp->tc_to_qidx[j] = j; + if (!BNXT_CNPQ(bp->q_info[j].queue_profile) || + (no_rdma && BNXT_PF(bp))) + j++; + } + bp->max_tc = max_t(u8, j, 1); + if (resp->queue_cfg_info & QUEUE_QPORTCFG_RESP_QUEUE_CFG_INFO_ASYM_CFG) bp->max_tc = 1; if (bp->max_lltc > bp->max_tc) bp->max_lltc = bp->max_tc; - qptr = &resp->queue_id0; - for (i = 0; i < bp->max_tc; i++) { - bp->q_info[i].queue_id = *qptr++; - bp->q_info[i].queue_profile = *qptr++; - bp->tc_to_qidx[i] = i; - } - qportcfg_exit: mutex_unlock(&bp->hwrm_cmd_lock); return rc; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h index c0e16c0..6eed231 100644 
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h @@ -43,6 +43,10 @@ struct bnxt_dscp2pri_entry { ((q_profile) == \ QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_LOSSLESS_ROCE) +#define BNXT_CNPQ(q_profile) \ + ((q_profile) == \ +QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_LOSSY_ROCE_CNP) + #define HWRM_STRUCT_DATA_SUBTYPE_HOST_OPERATIONAL 0x0300 void bnxt_dcb_init(struct bnxt *bp); -- 2.5.1
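The discovery loop in this patch uses two indices: i walks every queue the firmware reports, while j only advances when the queue is kept, so a skipped CNP entry is overwritten by the next one. The sketch below reproduces that compaction in standalone C. The types, the PROFILE_CNP_SKETCH value and discover_queues() are hypothetical; raw stands in for the packed (id, profile) byte pairs in the firmware response, and keep_cnp corresponds to the "RDMA disabled" case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical profile value marking the CNP queue. */
#define PROFILE_CNP_SKETCH 3

struct queue_info_sketch {
    unsigned char id;
    unsigned char profile;
};

/*
 * raw holds n (id, profile) byte pairs; out must have room for n entries.
 * Returns the number of usable queues, never less than 1.
 */
static size_t discover_queues(const unsigned char *raw, size_t n,
                              struct queue_info_sketch *out, bool keep_cnp)
{
    size_t i, j = 0;

    for (i = 0; i < n; i++) {
        /* Always write into slot j; it is reused if this entry is skipped. */
        out[j].id = raw[2 * i];
        out[j].profile = raw[2 * i + 1];
        if (out[j].profile != PROFILE_CNP_SKETCH || keep_cnp)
            j++;
    }
    return j ? j : 1;
}
```

Returning at least 1 mirrors the max_t(u8, j, 1) clamp in the diff, so the caller always has one traffic class to work with.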
[PATCH net-next 10/13] bnxt_en: Notify firmware about IF state changes.
Use latest firmware API to notify firmware about IF state changes. Firmware has the option to clean up resources during IF down and to require the driver to reserve resources again during IF up. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 53 +-- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 + 2 files changed, 52 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 1659940..56bd097 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -3638,7 +3638,9 @@ int bnxt_hwrm_func_rgtr_async_events(struct bnxt *bp, unsigned long *bmap, static int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp) { + struct hwrm_func_drv_rgtr_output *resp = bp->hwrm_cmd_resp_addr; struct hwrm_func_drv_rgtr_input req = {0}; + int rc; bnxt_hwrm_cmd_hdr_init(bp, , HWRM_FUNC_DRV_RGTR, -1, -1); @@ -3676,7 +3678,15 @@ static int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp) cpu_to_le32(FUNC_DRV_RGTR_REQ_ENABLES_VF_REQ_FWD); } - return hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT); + mutex_lock(>hwrm_cmd_lock); + rc = _hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT); + if (rc) + rc = -EIO; + else if (resp->flags & +cpu_to_le32(FUNC_DRV_RGTR_RESP_FLAGS_IF_CHANGE_SUPPORTED)) + bp->fw_cap |= BNXT_FW_CAP_IF_CHANGE; + mutex_unlock(>hwrm_cmd_lock); + return rc; } static int bnxt_hwrm_func_drv_unrgtr(struct bnxt *bp) @@ -6637,6 +6647,39 @@ static int bnxt_hwrm_shutdown_link(struct bnxt *bp) return hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT); } +static int bnxt_hwrm_if_change(struct bnxt *bp, bool up) +{ + struct hwrm_func_drv_if_change_output *resp = bp->hwrm_cmd_resp_addr; + struct hwrm_func_drv_if_change_input req = {0}; + bool resc_reinit = false; + int rc; + + if (!(bp->fw_cap & BNXT_FW_CAP_IF_CHANGE)) + return 0; + + bnxt_hwrm_cmd_hdr_init(bp, , HWRM_FUNC_DRV_IF_CHANGE, -1, -1); + if (up) + req.flags = 
cpu_to_le32(FUNC_DRV_IF_CHANGE_REQ_FLAGS_UP); + mutex_lock(>hwrm_cmd_lock); + rc = _hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT); + if (!rc && (resp->flags & + cpu_to_le32(FUNC_DRV_IF_CHANGE_RESP_FLAGS_RESC_CHANGE))) + resc_reinit = true; + mutex_unlock(>hwrm_cmd_lock); + + if (up && resc_reinit && BNXT_NEW_RM(bp)) { + struct bnxt_hw_resc *hw_resc = >hw_resc; + + rc = bnxt_hwrm_func_resc_qcaps(bp, true); + hw_resc->resv_cp_rings = 0; + hw_resc->resv_tx_rings = 0; + hw_resc->resv_rx_rings = 0; + hw_resc->resv_hw_ring_grps = 0; + hw_resc->resv_vnics = 0; + } + return rc; +} + static int bnxt_hwrm_port_led_qcaps(struct bnxt *bp) { struct hwrm_port_led_qcaps_output *resp = bp->hwrm_cmd_resp_addr; @@ -6991,8 +7034,13 @@ void bnxt_half_close_nic(struct bnxt *bp) static int bnxt_open(struct net_device *dev) { struct bnxt *bp = netdev_priv(dev); + int rc; - return __bnxt_open_nic(bp, true, true); + bnxt_hwrm_if_change(bp, true); + rc = __bnxt_open_nic(bp, true, true); + if (rc) + bnxt_hwrm_if_change(bp, false); + return rc; } static bool bnxt_drv_busy(struct bnxt *bp) @@ -7056,6 +7104,7 @@ static int bnxt_close(struct net_device *dev) bnxt_close_nic(bp, true, true); bnxt_hwrm_shutdown_link(bp); + bnxt_hwrm_if_change(bp, false); return 0; } diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index ded2aff..6c40b257 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1290,6 +1290,7 @@ struct bnxt { #define BNXT_FW_CAP_LLDP_AGENT 0x0002 #define BNXT_FW_CAP_DCBX_AGENT 0x0004 #define BNXT_FW_CAP_NEW_RM 0x0008 + #define BNXT_FW_CAP_IF_CHANGE 0x0010 #define BNXT_NEW_RM(bp)((bp)->fw_cap & BNXT_FW_CAP_NEW_RM) u32 hwrm_spec_code; -- 2.5.1
[PATCH net-next 06/13] bnxt_en: Update RSS setup and GRO-HW logic according to the latest spec.
Set the default hash mode flag in HWRM_VNIC_RSS_CFG to signal to the
firmware that the driver is compliant with the latest spec.  With
that, the firmware can return expanded RSS profile IDs that the driver
checks to set up the proper gso_type for GRO-HW packets.  But instead
of checking for the new profile IDs, we check the IP_TYPE flag in
TPA_START, which is more straightforward than checking a list of
profile IDs.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index e0e3b4b..1714850 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1115,7 +1115,7 @@ static void bnxt_tpa_start(struct bnxt *bp, struct bnxt_rx_ring_info *rxr,
 	tpa_info->hash_type = PKT_HASH_TYPE_L4;
 	tpa_info->gso_type = SKB_GSO_TCPV4;
 	/* RSS profiles 1 and 3 with extract code 0 for inner 4-tuple */
-	if (hash_type == 3)
+	if (hash_type == 3 || TPA_START_IS_IPV6(tpa_start1))
 		tpa_info->gso_type = SKB_GSO_TCPV6;
 	tpa_info->rss_hash =
 		le32_to_cpu(tpa_start->rx_tpa_start_cmp_rss_hash);
@@ -3981,6 +3981,7 @@ static int bnxt_hwrm_vnic_set_rss(struct bnxt *bp, u16 vnic_id, bool set_rss)
 	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_VNIC_RSS_CFG, -1, -1);
 	if (set_rss) {
 		req.hash_type = cpu_to_le32(bp->rss_hash_cfg);
+		req.hash_mode_flags = VNIC_RSS_CFG_REQ_HASH_MODE_FLAGS_DEFAULT;
 		if (vnic->flags & BNXT_VNIC_RSS_FLAG) {
 			if (BNXT_CHIP_TYPE_NITRO_A0(bp))
 				max_rings = bp->rx_nr_rings - 1;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index b44a758..7ea022d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -326,6 +326,10 @@ struct rx_tpa_start_cmp_ext {
 	((le32_to_cpu((rx_tpa_start)->rx_tpa_start_cmp_cfa_code_v2) & \
 	 RX_TPA_START_CMP_CFA_CODE) >> RX_TPA_START_CMPL_CFA_CODE_SHIFT)
 
+#define TPA_START_IS_IPV6(rx_tpa_start)	\
+	(!!((rx_tpa_start)->rx_tpa_start_cmp_flags2 &	\
+	    cpu_to_le32(RX_TPA_START_CMP_FLAGS2_IP_TYPE)))
+
 struct rx_tpa_end_cmp {
 	__le32 rx_tpa_end_cmp_len_flags_type;
 	#define RX_TPA_END_CMP_TYPE	(0x3f << 0)
-- 
2.5.1
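To illustrate the gso_type decision in this patch, here is a small stand-alone sketch. It assumes the flags2 word has already been converted to host byte order, and the FLAGS2_IP_TYPE bit position and the enum gso_kind values are placeholders for illustration, not the definitions from bnxt.h or skbuff.h.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit; the real RX_TPA_START_CMP_FLAGS2_IP_TYPE lives in bnxt.h. */
#define FLAGS2_IP_TYPE (1u << 8)

/* Placeholder for SKB_GSO_TCPV4 / SKB_GSO_TCPV6. */
enum gso_kind { GSO_TCPV4, GSO_TCPV6 };

/*
 * Default to TCPv4; switch to TCPv6 if either the legacy RSS profile
 * (hash_type == 3) or the IP_TYPE flag says the aggregated flow is IPv6.
 */
static enum gso_kind pick_gso_kind(unsigned int hash_type, uint32_t flags2)
{
	if (hash_type == 3 || (flags2 & FLAGS2_IP_TYPE))
		return GSO_TCPV6;
	return GSO_TCPV4;
}

/* Small self-check showing the cases the patch is meant to cover. */
static void pick_gso_kind_selfcheck(void)
{
	assert(pick_gso_kind(1, 0) == GSO_TCPV4);
	assert(pick_gso_kind(3, 0) == GSO_TCPV6);
	/* New profile ID the driver does not know, but IP_TYPE is set. */
	assert(pick_gso_kind(7, FLAGS2_IP_TYPE) == GSO_TCPV6);
}
```

Checking the single flag means the driver does not need to track each expanded profile ID that newer firmware may return.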
[PATCH net-next 07/13] bnxt_en: Add support for ethtool get dump.
From: Vasundhara Volam Add support to collect live firmware coredump via ethtool. Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h | 66 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 333 + drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h | 37 +++ 3 files changed, 436 insertions(+) create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h new file mode 100644 index 000..09c22f8 --- /dev/null +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h @@ -0,0 +1,66 @@ +/* Broadcom NetXtreme-C/E network driver. + * + * Copyright (c) 2018 Broadcom Inc + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation. + */ + +#ifndef BNXT_COREDUMP_H +#define BNXT_COREDUMP_H + +struct bnxt_coredump_segment_hdr { + __u8 signature[4]; + __le32 component_id; + __le32 segment_id; + __le32 flags; + __u8 low_version; + __u8 high_version; + __le16 function_id; + __le32 offset; + __le32 length; + __le32 status; + __le32 duration; + __le32 data_offset; + __le32 instance; + __le32 rsvd[5]; +}; + +struct bnxt_coredump_record { + __u8 signature[4]; + __le32 flags; + __u8 low_version; + __u8 high_version; + __u8 asic_state; + __u8 rsvd0[5]; + char system_name[32]; + __le16 year; + __le16 month; + __le16 day; + __le16 hour; + __le16 minute; + __le16 second; + __le16 utc_bias; + __le16 rsvd1; + char commandline[256]; + __le32 total_segments; + __le32 os_ver_major; + __le32 os_ver_minor; + __le32 rsvd2; + char os_name[32]; + __le16 end_year; + __le16 end_month; + __le16 end_day; + __le16 end_hour; + __le16 end_minute; + __le16 end_second; + __le16 end_utc_bias; + __le32 asic_id1; + __le32 asic_id2; + __le32 coredump_status; + __u8 ioctl_low_version; + __u8 ioctl_high_version; + 
__le16 rsvd3[313]; +}; +#endif diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index 9517633..3fc7c74 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -16,12 +16,15 @@ #include #include #include +#include +#include #include "bnxt_hsi.h" #include "bnxt.h" #include "bnxt_xdp.h" #include "bnxt_ethtool.h" #include "bnxt_nvm_defs.h" /* NVRAM content constant and structure defs */ #include "bnxt_fw_hdr.h" /* Firmware hdr constant and structure defs */ +#include "bnxt_coredump.h" #define FLASH_NVRAM_TIMEOUT((HWRM_CMD_TIMEOUT) * 100) #define FLASH_PACKAGE_TIMEOUT ((HWRM_CMD_TIMEOUT) * 200) #define INSTALL_PACKAGE_TIMEOUT((HWRM_CMD_TIMEOUT) * 200) @@ -2685,6 +2688,334 @@ static int bnxt_reset(struct net_device *dev, u32 *flags) return rc; } +static int bnxt_hwrm_dbg_dma_data(struct bnxt *bp, void *msg, int msg_len, + struct bnxt_hwrm_dbg_dma_info *info) +{ + struct hwrm_dbg_cmn_output *cmn_resp = bp->hwrm_cmd_resp_addr; + struct hwrm_dbg_cmn_input *cmn_req = msg; + __le16 *seq_ptr = msg + info->seq_off; + u16 seq = 0, len, segs_off; + void *resp = cmn_resp; + dma_addr_t dma_handle; + int rc, off = 0; + void *dma_buf; + + dma_buf = dma_alloc_coherent(>pdev->dev, info->dma_len, _handle, +GFP_KERNEL); + if (!dma_buf) + return -ENOMEM; + + segs_off = offsetof(struct hwrm_dbg_coredump_list_output, + total_segments); + cmn_req->host_dest_addr = cpu_to_le64(dma_handle); + cmn_req->host_buf_len = cpu_to_le32(info->dma_len); + mutex_lock(>hwrm_cmd_lock); + while (1) { + *seq_ptr = cpu_to_le16(seq); + rc = _hwrm_send_message(bp, msg, msg_len, HWRM_CMD_TIMEOUT); + if (rc) + break; + + len = le16_to_cpu(*((__le16 *)(resp + info->data_len_off))); + if (!seq && + cmn_req->req_type == cpu_to_le16(HWRM_DBG_COREDUMP_LIST)) { + info->segs = le16_to_cpu(*((__le16 *)(resp + + segs_off))); + if (!info->segs) { + rc = -EIO; + break; + } + +
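As a side note on the segment header layout defined in bnxt_coredump.h above, the following user-space sketch restates the struct with fixed-width types and checks its size at compile time. It assumes a common ABI where uint16_t and uint32_t are naturally aligned. The struct name and the 64-byte figure are derived here from the field list, not stated anywhere in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space restatement of struct bnxt_coredump_segment_hdr (sketch only). */
struct coredump_segment_hdr {
	uint8_t  signature[4];
	uint32_t component_id;
	uint32_t segment_id;
	uint32_t flags;
	uint8_t  low_version;
	uint8_t  high_version;
	uint16_t function_id;
	uint32_t offset;
	uint32_t length;
	uint32_t status;
	uint32_t duration;
	uint32_t data_offset;
	uint32_t instance;
	uint32_t rsvd[5];
};

/* 4 + 3*4 + 2*1 + 2 + 6*4 + 5*4 = 64 bytes, with no padding needed. */
static_assert(sizeof(struct coredump_segment_hdr) == 64,
	      "segment header is expected to be 64 bytes");
static_assert(offsetof(struct coredump_segment_hdr, function_id) == 18,
	      "function_id follows the two version bytes");
```

A check like this is cheap insurance when a struct is written straight into a dump file that other tools parse.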
[PATCH net-next 03/13] bnxt_en: Add external loopback test to ethtool selftest.
Add code to detect firmware support for external loopback and the extra test entry for external loopback. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 +++ drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 ++ drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 32 ++- 3 files changed, 32 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index c612d74..d9fc905 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -6337,6 +6337,10 @@ static int bnxt_hwrm_phy_qcaps(struct bnxt *bp) bp->lpi_tmr_hi = le32_to_cpu(resp->valid_tx_lpi_timer_high) & PORT_PHY_QCAPS_RESP_TX_LPI_TIMER_HIGH_MASK; } + if (resp->flags & PORT_PHY_QCAPS_RESP_FLAGS_EXTERNAL_LPBK_SUPPORTED) { + if (bp->test_info) + bp->test_info->flags |= BNXT_TEST_FL_EXT_LPBK; + } if (resp->supported_speeds_auto_mode) link_info->support_auto_speeds = le16_to_cpu(resp->supported_speeds_auto_mode); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 3b5a55c..0d49fe0 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -990,6 +990,8 @@ struct bnxt_led_info { struct bnxt_test_info { u8 offline_mask; + u8 flags; +#define BNXT_TEST_FL_EXT_LPBK 0x1 u16 timeout; char string[BNXT_MAX_TEST][ETH_GSTRING_LEN]; }; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index 1f626af..9517633 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -2397,7 +2397,7 @@ static int bnxt_disable_an_for_lpbk(struct bnxt *bp, return rc; } -static int bnxt_hwrm_phy_loopback(struct bnxt *bp, bool enable) +static int bnxt_hwrm_phy_loopback(struct bnxt *bp, bool enable, bool ext) { struct hwrm_port_phy_cfg_input req = {0}; @@ -2405,7 +2405,10 @@ static int 
bnxt_hwrm_phy_loopback(struct bnxt *bp, bool enable) if (enable) { bnxt_disable_an_for_lpbk(bp, ); - req.lpbk = PORT_PHY_CFG_REQ_LPBK_LOCAL; + if (ext) + req.lpbk = PORT_PHY_CFG_REQ_LPBK_EXTERNAL; + else + req.lpbk = PORT_PHY_CFG_REQ_LPBK_LOCAL; } else { req.lpbk = PORT_PHY_CFG_REQ_LPBK_NONE; } @@ -2538,15 +2541,17 @@ static int bnxt_run_fw_tests(struct bnxt *bp, u8 test_mask, u8 *test_results) return rc; } -#define BNXT_DRV_TESTS 3 +#define BNXT_DRV_TESTS 4 #define BNXT_MACLPBK_TEST_IDX (bp->num_tests - BNXT_DRV_TESTS) #define BNXT_PHYLPBK_TEST_IDX (BNXT_MACLPBK_TEST_IDX + 1) -#define BNXT_IRQ_TEST_IDX (BNXT_MACLPBK_TEST_IDX + 2) +#define BNXT_EXTLPBK_TEST_IDX (BNXT_MACLPBK_TEST_IDX + 2) +#define BNXT_IRQ_TEST_IDX (BNXT_MACLPBK_TEST_IDX + 3) static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest, u64 *buf) { struct bnxt *bp = netdev_priv(dev); + bool do_ext_lpbk = false; bool offline = false; u8 test_results = 0; u8 test_mask = 0; @@ -2560,6 +2565,10 @@ static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest, return; } + if ((etest->flags & ETH_TEST_FL_EXTERNAL_LB) && + (bp->test_info->flags & BNXT_TEST_FL_EXT_LPBK)) + do_ext_lpbk = true; + if (etest->flags & ETH_TEST_FL_OFFLINE) { if (bp->pf.active_vfs) { etest->flags |= ETH_TEST_FL_FAILED; @@ -2600,13 +2609,22 @@ static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest, buf[BNXT_MACLPBK_TEST_IDX] = 0; bnxt_hwrm_mac_loopback(bp, false); - bnxt_hwrm_phy_loopback(bp, true); + bnxt_hwrm_phy_loopback(bp, true, false); msleep(1000); if (bnxt_run_loopback(bp)) { buf[BNXT_PHYLPBK_TEST_IDX] = 1; etest->flags |= ETH_TEST_FL_FAILED; } - bnxt_hwrm_phy_loopback(bp, false); + if (do_ext_lpbk) { + etest->flags |= ETH_TEST_FL_EXTERNAL_LB_DONE; + bnxt_hwrm_phy_loopback(bp, true, true); + msleep(1000); + if (bnxt_run_loopback(bp)) { + buf[BNXT_EXTLPBK
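The renumbered test indices in this patch can be sketched as below. The functions are hypothetical user-space equivalents of the BNXT_*_TEST_IDX macros, taking num_tests as a parameter instead of reading bp->num_tests; only the count of 4 driver tests and the +1/+2/+3 offsets come from the patch.

```c
#include <assert.h>

#define DRV_TESTS 4 /* MAC loopback, PHY loopback, external loopback, IRQ */

/* The driver tests occupy the last DRV_TESTS slots of the result buffer. */
static int maclpbk_idx(int num_tests)
{
	assert(num_tests >= DRV_TESTS);
	return num_tests - DRV_TESTS;
}

static int phylpbk_idx(int num_tests)
{
	return maclpbk_idx(num_tests) + 1;
}

static int extlpbk_idx(int num_tests)
{
	return maclpbk_idx(num_tests) + 2;
}

static int irq_idx(int num_tests)
{
	return maclpbk_idx(num_tests) + 3;
}
```

With this layout the IRQ test always lands in the final slot, so adding the external loopback entry shifts only the IRQ index.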
[PATCH net-next 01/13] bnxt_en: Update firmware interface version to 1.9.2.25.
New interface has firmware core dump support, new extended port statistics, and IF state change notifications to the firmware. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.h |4 +- drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c |8 +- drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |6 +- drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h | 1227 +++-- 4 files changed, 924 insertions(+), 321 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 934aa11..3b5a55c 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -12,11 +12,11 @@ #define BNXT_H #define DRV_MODULE_NAME"bnxt_en" -#define DRV_MODULE_VERSION "1.9.1" +#define DRV_MODULE_VERSION "1.9.2" #define DRV_VER_MAJ1 #define DRV_VER_MIN9 -#define DRV_VER_UPD1 +#define DRV_VER_UPD2 #include #include diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c index 7bd96ab..f3b9fbc 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c @@ -29,7 +29,7 @@ static const struct bnxt_dl_nvm_param nvm_params[] = { static int bnxt_hwrm_nvm_req(struct bnxt *bp, u32 param_id, void *msg, int msg_len, union devlink_param_value *val) { - struct hwrm_nvm_variable_input *req = msg; + struct hwrm_nvm_get_variable_input *req = msg; void *data_addr = NULL, *buf = NULL; struct bnxt_dl_nvm_param nvm_param; int bytesize, idx = 0, rc, i; @@ -60,18 +60,18 @@ static int bnxt_hwrm_nvm_req(struct bnxt *bp, u32 param_id, void *msg, if (!data_addr) return -ENOMEM; - req->data_addr = cpu_to_le64(data_dma_addr); + req->dest_data_addr = cpu_to_le64(data_dma_addr); req->data_len = cpu_to_le16(nvm_param.num_bits); req->option_num = cpu_to_le16(nvm_param.offset); req->index_0 = cpu_to_le16(idx); if (idx) req->dimensions = cpu_to_le16(1); - if (req->req_type == HWRM_NVM_SET_VARIABLE) + if 
(req->req_type == cpu_to_le16(HWRM_NVM_SET_VARIABLE)) memcpy(data_addr, buf, bytesize); rc = hwrm_send_message(bp, msg, msg_len, HWRM_CMD_TIMEOUT); - if (!rc && req->req_type == HWRM_NVM_GET_VARIABLE) + if (!rc && req->req_type == cpu_to_le16(HWRM_NVM_GET_VARIABLE)) memcpy(buf, data_addr, bytesize); dma_free_coherent(>pdev->dev, bytesize, data_addr, data_dma_addr); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index 7270c8b..3d40e49 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -162,7 +162,7 @@ static const struct { BNXT_RX_STATS_ENTRY(rx_128b_255b_frames), BNXT_RX_STATS_ENTRY(rx_256b_511b_frames), BNXT_RX_STATS_ENTRY(rx_512b_1023b_frames), - BNXT_RX_STATS_ENTRY(rx_1024b_1518_frames), + BNXT_RX_STATS_ENTRY(rx_1024b_1518b_frames), BNXT_RX_STATS_ENTRY(rx_good_vlan_frames), BNXT_RX_STATS_ENTRY(rx_1519b_2047b_frames), BNXT_RX_STATS_ENTRY(rx_2048b_4095b_frames), @@ -205,9 +205,9 @@ static const struct { BNXT_TX_STATS_ENTRY(tx_128b_255b_frames), BNXT_TX_STATS_ENTRY(tx_256b_511b_frames), BNXT_TX_STATS_ENTRY(tx_512b_1023b_frames), - BNXT_TX_STATS_ENTRY(tx_1024b_1518_frames), + BNXT_TX_STATS_ENTRY(tx_1024b_1518b_frames), BNXT_TX_STATS_ENTRY(tx_good_vlan_frames), - BNXT_TX_STATS_ENTRY(tx_1519b_2047_frames), + BNXT_TX_STATS_ENTRY(tx_1519b_2047b_frames), BNXT_TX_STATS_ENTRY(tx_2048b_4095b_frames), BNXT_TX_STATS_ENTRY(tx_4096b_9216b_frames), BNXT_TX_STATS_ENTRY(tx_9217b_16383b_frames), diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h index c75d7fa..971ace5d 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h @@ -96,6 +96,7 @@ struct hwrm_short_input { struct cmd_nums { __le16 req_type; #define HWRM_VER_GET 0x0UL + #define HWRM_FUNC_DRV_IF_CHANGE 0xdUL #define HWRM_FUNC_BUF_UNRGTR 0xeUL #define HWRM_FUNC_VF_CFG 0xfUL 
#define HWRM_RESERVED10x10UL @@ -159,6 +160,7 @@ struct cmd_nums { #define HWRM_RING_FREE0x51UL #define HWRM_RING_CMPL_RING_QAGGINT_PARAMS0x52UL #define HWRM_RING_CMPL_RING_CFG_AGGINT_PARAMS 0x53UL + #defin
[PATCH net-next 12/13] bnxt_en: Add DCBNL DSCP application protocol support.
Expand the .ieee_setapp() and ieee_delapp() DCBNL methods to support DSCP. This allows DSCP values to user priority mappings instead of using VLAN priorities. Each DSCP mapping is added or deleted one entry at a time using the firmware API. The firmware call can only be made from a PF. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 + drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 83 ++- drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h | 6 ++ 3 files changed, 89 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 006726c..fefa011 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1281,6 +1281,7 @@ struct bnxt { struct ieee_ets *ieee_ets; u8 dcbx_cap; u8 default_pri; + u8 max_dscp_value; #endif /* CONFIG_BNXT_DCB */ u32 msg_enable; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c index 00dd26d..ddc98c3 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c @@ -385,6 +385,61 @@ static int bnxt_hwrm_set_dcbx_app(struct bnxt *bp, struct dcb_app *app, return rc; } +static int bnxt_hwrm_queue_dscp_qcaps(struct bnxt *bp) +{ + struct hwrm_queue_dscp_qcaps_output *resp = bp->hwrm_cmd_resp_addr; + struct hwrm_queue_dscp_qcaps_input req = {0}; + int rc; + + if (bp->hwrm_spec_code < 0x10800 || BNXT_VF(bp)) + return 0; + + bnxt_hwrm_cmd_hdr_init(bp, , HWRM_QUEUE_DSCP_QCAPS, -1, -1); + mutex_lock(>hwrm_cmd_lock); + rc = _hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT); + if (!rc) { + bp->max_dscp_value = (1 << resp->num_dscp_bits) - 1; + if (bp->max_dscp_value < 0x3f) + bp->max_dscp_value = 0; + } + + mutex_unlock(>hwrm_cmd_lock); + return rc; +} + +static int bnxt_hwrm_queue_dscp2pri_cfg(struct bnxt *bp, struct dcb_app *app, + bool add) +{ + struct hwrm_queue_dscp2pri_cfg_input req = {0}; + 
struct bnxt_dscp2pri_entry *dscp2pri; + dma_addr_t mapping; + int rc; + + if (bp->hwrm_spec_code < 0x10800) + return 0; + + bnxt_hwrm_cmd_hdr_init(bp, , HWRM_QUEUE_DSCP2PRI_CFG, -1, -1); + dscp2pri = dma_alloc_coherent(>pdev->dev, sizeof(*dscp2pri), + , GFP_KERNEL); + if (!dscp2pri) + return -ENOMEM; + + req.src_data_addr = cpu_to_le64(mapping); + dscp2pri->dscp = app->protocol; + if (add) + dscp2pri->mask = 0x3f; + else + dscp2pri->mask = 0; + dscp2pri->pri = app->priority; + req.entry_cnt = cpu_to_le16(1); + rc = hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT); + if (rc) + rc = -EIO; + dma_free_coherent(>pdev->dev, sizeof(*dscp2pri), dscp2pri, + mapping); + return rc; +} + static int bnxt_ets_validate(struct bnxt *bp, struct ieee_ets *ets, u8 *tc) { int total_ets_bw = 0; @@ -551,15 +606,30 @@ static int bnxt_dcbnl_ieee_setpfc(struct net_device *dev, struct ieee_pfc *pfc) return rc; } +static int bnxt_dcbnl_ieee_dscp_app_prep(struct bnxt *bp, struct dcb_app *app) +{ + if (app->selector == IEEE_8021QAZ_APP_SEL_DSCP) { + if (!bp->max_dscp_value) + return -ENOTSUPP; + if (app->protocol > bp->max_dscp_value) + return -EINVAL; + } + return 0; +} + static int bnxt_dcbnl_ieee_setapp(struct net_device *dev, struct dcb_app *app) { struct bnxt *bp = netdev_priv(dev); - int rc = -EINVAL; + int rc; if (!(bp->dcbx_cap & DCB_CAP_DCBX_VER_IEEE) || !(bp->dcbx_cap & DCB_CAP_DCBX_HOST)) return -EINVAL; + rc = bnxt_dcbnl_ieee_dscp_app_prep(bp, app); + if (rc) + return rc; + rc = dcb_ieee_setapp(dev, app); if (rc) return rc; @@ -570,6 +640,9 @@ static int bnxt_dcbnl_ieee_setapp(struct net_device *dev, struct dcb_app *app) app->protocol == ROCE_V2_UDP_DPORT)) rc = bnxt_hwrm_set_dcbx_app(bp, app, true); + if (app->selector == IEEE_8021QAZ_APP_SEL_DSCP) + rc = bnxt_hwrm_queue_dscp2pri_cfg(bp, app, true); + return rc; } @@ -582,6 +655,10 @@ static int bnxt_dcbnl_ieee_delapp(struct net_device *dev, struct dcb_app *app) !(bp->dcbx_cap & DCB_CAP_DCBX_HOST))
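The DSCP capability and validation logic in this patch can be sketched in user space as follows. The function names and the enum dscp_check return codes are hypothetical stand-ins (the patch returns -ENOTSUPP and -EINVAL), and the guard for more than 8 bits is added only for this sketch so the shift stays well defined.

```c
#include <assert.h>
#include <stdint.h>

enum dscp_check { DSCP_OK, DSCP_UNSUPPORTED, DSCP_INVALID };

/*
 * Mirrors bnxt_hwrm_queue_dscp_qcaps(): derive the largest DSCP value from
 * the number of DSCP bits firmware reports, and treat anything narrower
 * than the full 6-bit DSCP field as "not supported" (0).
 */
static uint8_t max_dscp_from_bits(uint8_t num_dscp_bits)
{
	unsigned int max;

	if (num_dscp_bits > 8) /* sketch-only guard, not in the patch */
		return 0;
	max = (1u << num_dscp_bits) - 1;
	if (max < 0x3f)
		return 0;
	return (uint8_t)max;
}

/* Mirrors bnxt_dcbnl_ieee_dscp_app_prep() for a DSCP-selector app entry. */
static enum dscp_check check_dscp(uint8_t max_dscp_value, uint16_t protocol)
{
	if (!max_dscp_value)
		return DSCP_UNSUPPORTED;
	if (protocol > max_dscp_value)
		return DSCP_INVALID;
	return DSCP_OK;
}
```

Running the validation before dcb_ieee_setapp() matters because it keeps an unsupported entry out of the kernel's app table in the first place.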
[PATCH net-next 11/13] bnxt_en: Add hwmon sysfs support to read temperature
From: Vasundhara Volam Export temperature sensor reading via hwmon sysfs. Signed-off-by: Vasundhara Volam Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/Kconfig | 8 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 62 +++ drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 + 3 files changed, 71 insertions(+) diff --git a/drivers/net/ethernet/broadcom/Kconfig b/drivers/net/ethernet/broadcom/Kconfig index b7aa8ad..c1d3ee9b 100644 --- a/drivers/net/ethernet/broadcom/Kconfig +++ b/drivers/net/ethernet/broadcom/Kconfig @@ -230,4 +230,12 @@ config BNXT_DCB If unsure, say N. +config BNXT_HWMON + bool "Broadcom NetXtreme-C/E HWMON support" + default y + depends on BNXT && HWMON && !(BNXT=y && HWMON=m) + ---help--- + Say Y if you want to expose the thermal sensor data on NetXtreme-C/E + devices, via the hwmon sysfs interface. + endif # NET_VENDOR_BROADCOM diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 56bd097..dde904b 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -51,6 +51,8 @@ #include #include #include +#include +#include #include "bnxt_hsi.h" #include "bnxt.h" @@ -6789,6 +6791,62 @@ static void bnxt_get_wol_settings(struct bnxt *bp) } while (handle && handle != 0x); } +#ifdef CONFIG_BNXT_HWMON +static ssize_t bnxt_show_temp(struct device *dev, + struct device_attribute *devattr, char *buf) +{ + struct hwrm_temp_monitor_query_input req = {0}; + struct hwrm_temp_monitor_query_output *resp; + struct bnxt *bp = dev_get_drvdata(dev); + u32 temp = 0; + + resp = bp->hwrm_cmd_resp_addr; + bnxt_hwrm_cmd_hdr_init(bp, , HWRM_TEMP_MONITOR_QUERY, -1, -1); + mutex_lock(>hwrm_cmd_lock); + if (!_hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT)) + temp = resp->temp * 1000; /* display millidegree */ + mutex_unlock(>hwrm_cmd_lock); + + return sprintf(buf, "%u\n", temp); +} +static SENSOR_DEVICE_ATTR(temp1_input, 0444, bnxt_show_temp, NULL, 0); + +static struct 
attribute *bnxt_attrs[] = { + _dev_attr_temp1_input.dev_attr.attr, + NULL +}; +ATTRIBUTE_GROUPS(bnxt); + +static void bnxt_hwmon_close(struct bnxt *bp) +{ + if (bp->hwmon_dev) { + hwmon_device_unregister(bp->hwmon_dev); + bp->hwmon_dev = NULL; + } +} + +static void bnxt_hwmon_open(struct bnxt *bp) +{ + struct pci_dev *pdev = bp->pdev; + + bp->hwmon_dev = hwmon_device_register_with_groups(>dev, + DRV_MODULE_NAME, bp, + bnxt_groups); + if (IS_ERR(bp->hwmon_dev)) { + bp->hwmon_dev = NULL; + dev_warn(>dev, "Cannot register hwmon device\n"); + } +} +#else +static void bnxt_hwmon_close(struct bnxt *bp) +{ +} + +static void bnxt_hwmon_open(struct bnxt *bp) +{ +} +#endif + static bool bnxt_eee_config_ok(struct bnxt *bp) { struct ethtool_eee *eee = >eee; @@ -7040,6 +7098,9 @@ static int bnxt_open(struct net_device *dev) rc = __bnxt_open_nic(bp, true, true); if (rc) bnxt_hwrm_if_change(bp, false); + + bnxt_hwmon_open(bp); + return rc; } @@ -7102,6 +7163,7 @@ static int bnxt_close(struct net_device *dev) { struct bnxt *bp = netdev_priv(dev); + bnxt_hwmon_close(bp); bnxt_close_nic(bp, true, true); bnxt_hwrm_shutdown_link(bp); bnxt_hwrm_if_change(bp, false); diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 6c40b257..006726c 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1411,6 +1411,7 @@ struct bnxt { struct bnxt_tc_info *tc_info; struct dentry *debugfs_pdev; struct dentry *debugfs_dim; + struct device *hwmon_dev; }; #define BNXT_RX_STATS_OFFSET(counter) \ -- 2.5.1
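The unit conversion in bnxt_show_temp() can be sketched like this. The helper name and its query_failed parameter are hypothetical; the sketch only assumes, as the patch's comment indicates, that firmware reports whole degrees Celsius and that the attribute is shown in millidegrees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a temperature for a temp1_input style attribute.
 * temp_c is the firmware reading in degrees Celsius. As in the patch,
 * a failed firmware query leaves the reported value at 0.
 */
static int format_temp(char *buf, size_t len, unsigned int temp_c,
		       int query_failed)
{
	unsigned int milli = query_failed ? 0 : temp_c * 1000;

	assert(buf != NULL);
	return snprintf(buf, len, "%u\n", milli);
}
```

A reading of 45 degrees is therefore exposed as 45000, which is what tools built on libsensors expect from temp1_input.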
[PATCH net-next 00/13] bnxt_en: Updates for net-next.
This series includes the usual firmware spec update.  The driver adds
an external PHY loopback test and PHY setup retry logic that is needed
during hotplug.  In the SRIOV space, the driver adds a new VF resource
allocation mode that requires the VF driver to reserve resources during
IFUP.  IF state changes are now propagated to firmware so that firmware
can release some resources during IFDOWN.  An ethtool method to get the
firmware core dump and a hwmon temperature reading have been added.
DSCP to user priority support has been added to the driver's DCBNL
interface, and the CoS queue logic has been refined to make sure that
the special RDMA Congestion Notification hardware CoS queue will not be
used for networking traffic.

Michael Chan (11):
  bnxt_en: Update firmware interface version to 1.9.2.25.
  bnxt_en: Adjust timer based on ethtool stats-block-usecs settings.
  bnxt_en: Add external loopback test to ethtool selftest.
  bnxt_en: Add PHY retry logic.
  bnxt_en: Add new VF resource allocation strategy mode.
  bnxt_en: Update RSS setup and GRO-HW logic according to the latest spec.
  bnxt_en: Add BNXT_NEW_RM() macro.
  bnxt_en: Move firmware related flags to a new fw_cap field in struct bnxt.
  bnxt_en: Notify firmware about IF state changes.
  bnxt_en: Add DCBNL DSCP application protocol support.
  bnxt_en: Do not use the CNP CoS queue for networking traffic.

Vasundhara Volam (2):
  bnxt_en: Add support for ethtool get dump.
  bnxt_en: Add hwmon sysfs support to read temperature

 drivers/net/ethernet/broadcom/Kconfig              |    8 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.c          |  216 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h          |   30 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h |   66 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c      |   89 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h      |   10 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c  |    8 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c  |  378 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h  |   37 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h      | 1227 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c    |   25 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c      |    4 +-
 12 files changed, 1716 insertions(+), 382 deletions(-)
 create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h

-- 
2.5.1
[PATCH net-next 08/13] bnxt_en: Add BNXT_NEW_RM() macro.
The BNXT_FLAG_NEW_RM flag is checked a lot in the code to determine if the new resource manager is in effect. Define a macro to perform this check. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 27 +++ drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 + drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 2 +- drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 2 +- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 4 ++-- 5 files changed, 18 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 1714850..5c9ee3c 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -4579,7 +4579,7 @@ static int bnxt_hwrm_get_rings(struct bnxt *bp) } hw_resc->resv_tx_rings = le16_to_cpu(resp->alloc_tx_rings); - if (bp->flags & BNXT_FLAG_NEW_RM) { + if (BNXT_NEW_RM(bp)) { u16 cp, stats; hw_resc->resv_rx_rings = le16_to_cpu(resp->alloc_rx_rings); @@ -4625,7 +4625,7 @@ __bnxt_hwrm_reserve_pf_rings(struct bnxt *bp, struct hwrm_func_cfg_input *req, req->fid = cpu_to_le16(0x); enables |= tx_rings ? FUNC_CFG_REQ_ENABLES_NUM_TX_RINGS : 0; req->num_tx_rings = cpu_to_le16(tx_rings); - if (bp->flags & BNXT_FLAG_NEW_RM) { + if (BNXT_NEW_RM(bp)) { enables |= rx_rings ? FUNC_CFG_REQ_ENABLES_NUM_RX_RINGS : 0; enables |= cp_rings ? 
FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS | FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0; @@ -4698,7 +4698,7 @@ bnxt_hwrm_reserve_vf_rings(struct bnxt *bp, int tx_rings, int rx_rings, struct hwrm_func_vf_cfg_input req = {0}; int rc; - if (!(bp->flags & BNXT_FLAG_NEW_RM)) { + if (!BNXT_NEW_RM(bp)) { bp->hw_resc.resv_tx_rings = tx_rings; return 0; } @@ -4758,7 +4758,7 @@ static bool bnxt_need_reserve_rings(struct bnxt *bp) vnic = rx + 1; if (bp->flags & BNXT_FLAG_AGG_RINGS) rx <<= 1; - if ((bp->flags & BNXT_FLAG_NEW_RM) && + if (BNXT_NEW_RM(bp) && (hw_resc->resv_rx_rings != rx || hw_resc->resv_cp_rings != cp || hw_resc->resv_hw_ring_grps != grp || hw_resc->resv_vnics != vnic)) return true; @@ -4794,7 +4794,7 @@ static int __bnxt_reserve_rings(struct bnxt *bp) return rc; tx = hw_resc->resv_tx_rings; - if (bp->flags & BNXT_FLAG_NEW_RM) { + if (BNXT_NEW_RM(bp)) { rx = hw_resc->resv_rx_rings; cp = hw_resc->resv_cp_rings; grp = hw_resc->resv_hw_ring_grps; @@ -4838,7 +4838,7 @@ static int bnxt_hwrm_check_vf_rings(struct bnxt *bp, int tx_rings, int rx_rings, u32 flags; int rc; - if (!(bp->flags & BNXT_FLAG_NEW_RM)) + if (!BNXT_NEW_RM(bp)) return 0; __bnxt_hwrm_reserve_vf_rings(bp, , tx_rings, rx_rings, ring_grps, @@ -4867,7 +4867,7 @@ static int bnxt_hwrm_check_pf_rings(struct bnxt *bp, int tx_rings, int rx_rings, __bnxt_hwrm_reserve_pf_rings(bp, , tx_rings, rx_rings, ring_grps, cp_rings, vnics); flags = FUNC_CFG_REQ_FLAGS_TX_ASSETS_TEST; - if (bp->flags & BNXT_FLAG_NEW_RM) + if (BNXT_NEW_RM(bp)) flags |= FUNC_CFG_REQ_FLAGS_RX_ASSETS_TEST | FUNC_CFG_REQ_FLAGS_CMPL_ASSETS_TEST | FUNC_CFG_REQ_FLAGS_RING_GRP_ASSETS_TEST | @@ -5921,7 +5921,7 @@ int bnxt_get_avail_msix(struct bnxt *bp, int num) max_idx = min_t(int, bp->total_irqs, max_cp); avail_msix = max_idx - bp->cp_nr_rings; - if (!(bp->flags & BNXT_FLAG_NEW_RM) || avail_msix >= num) + if (!BNXT_NEW_RM(bp) || avail_msix >= num) return avail_msix; if (max_irq < total_req) { @@ -5934,7 +5934,7 @@ int bnxt_get_avail_msix(struct bnxt *bp, 
int num) static int bnxt_get_num_msix(struct bnxt *bp) { - if (!(bp->flags & BNXT_FLAG_NEW_RM)) + if (!BNXT_NEW_RM(bp)) return bnxt_get_max_func_irqs(bp); return bnxt_cp_rings_in_use(bp); @@ -6057,8 +6057,7 @@ int bnxt_reserve_rings(struct bnxt *bp) netdev_err(bp->dev, "ring reservation failure rc: %d\n", rc); return rc; } - if ((bp->flags & BNXT_FLAG_NEW_RM) && - (bnxt_get_num_msix(bp) != bp->total_irqs)) { + if (BNXT_NEW_RM(bp) && (bnxt_get_num_msix(bp) != bp->total_irqs)) { bnxt_ulp_irq_stop(bp); bnxt_clear_int_mode(bp); rc = bnxt_init_int_mode(b
[PATCH net-next 09/13] bnxt_en: Move firmware related flags to a new fw_cap field in struct bnxt.
The flags field is almost getting full. Move firmware capability flags to a new fw_cap field to better organize these firmware flags. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 12 ++-- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 13 +++-- drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 6 +++--- 3 files changed, 16 insertions(+), 15 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 5c9ee3c..1659940 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -3445,7 +3445,7 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len, cp_ring_id = le16_to_cpu(req->cmpl_ring); intr_process = (cp_ring_id == INVALID_HW_RING_ID) ? 0 : 1; - if (bp->flags & BNXT_FLAG_SHORT_CMD) { + if (bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) { void *short_cmd_req = bp->hwrm_short_cmd_req_addr; memcpy(short_cmd_req, req, msg_len); @@ -5089,9 +5089,9 @@ static int bnxt_hwrm_func_qcfg(struct bnxt *bp) flags = le16_to_cpu(resp->flags); if (flags & (FUNC_QCFG_RESP_FLAGS_FW_DCBX_AGENT_ENABLED | FUNC_QCFG_RESP_FLAGS_FW_LLDP_AGENT_ENABLED)) { - bp->flags |= BNXT_FLAG_FW_LLDP_AGENT; + bp->fw_cap |= BNXT_FW_CAP_LLDP_AGENT; if (flags & FUNC_QCFG_RESP_FLAGS_FW_DCBX_AGENT_ENABLED) - bp->flags |= BNXT_FLAG_FW_DCBX_AGENT; + bp->fw_cap |= BNXT_FW_CAP_DCBX_AGENT; } if (BNXT_PF(bp) && (flags & FUNC_QCFG_RESP_FLAGS_MULTI_HOST)) bp->flags |= BNXT_FLAG_MULTI_HOST; @@ -5249,7 +5249,7 @@ static int bnxt_hwrm_func_qcaps(struct bnxt *bp) if (bp->hwrm_spec_code >= 0x10803) { rc = bnxt_hwrm_func_resc_qcaps(bp, true); if (!rc) - bp->flags |= BNXT_FLAG_NEW_RM; + bp->fw_cap |= BNXT_FW_CAP_NEW_RM; } return 0; } @@ -5352,7 +5352,7 @@ static int bnxt_hwrm_ver_get(struct bnxt *bp) dev_caps_cfg = le32_to_cpu(resp->dev_caps_cfg); if ((dev_caps_cfg & VER_GET_RESP_DEV_CAPS_CFG_SHORT_CMD_SUPPORTED) && (dev_caps_cfg & VER_GET_RESP_DEV_CAPS_CFG_SHORT_CMD_REQUIRED)) - 
bp->flags |= BNXT_FLAG_SHORT_CMD; + bp->fw_cap |= BNXT_FW_CAP_SHORT_CMD; hwrm_ver_get_exit: mutex_unlock(>hwrm_cmd_lock); @@ -8760,7 +8760,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) if (rc) goto init_err_pci_clean; - if (bp->flags & BNXT_FLAG_SHORT_CMD) { + if (bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) { rc = bnxt_alloc_hwrm_short_cmd_req(bp); if (rc) goto init_err_pci_clean; diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 37dc896..ded2aff 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1144,7 +1144,6 @@ struct bnxt { atomic_tintr_sem; u32 flags; - #define BNXT_FLAG_DCB_ENABLED 0x1 #define BNXT_FLAG_VF0x2 #define BNXT_FLAG_LRO 0x4 #ifdef CONFIG_INET @@ -1173,15 +1172,11 @@ struct bnxt { BNXT_FLAG_ROCEV2_CAP) #define BNXT_FLAG_NO_AGG_RINGS 0x2 #define BNXT_FLAG_RX_PAGE_MODE 0x4 - #define BNXT_FLAG_FW_LLDP_AGENT 0x8 #define BNXT_FLAG_MULTI_HOST0x10 - #define BNXT_FLAG_SHORT_CMD 0x20 #define BNXT_FLAG_DOUBLE_DB 0x40 - #define BNXT_FLAG_FW_DCBX_AGENT 0x80 #define BNXT_FLAG_CHIP_NITRO_A0 0x100 #define BNXT_FLAG_DIM 0x200 #define BNXT_FLAG_ROCE_MIRROR_CAP 0x400 - #define BNXT_FLAG_NEW_RM0x800 #define BNXT_FLAG_PORT_STATS_EXT0x1000 #define BNXT_FLAG_ALL_CONFIG_FEATS (BNXT_FLAG_TPA | \ @@ -1195,7 +1190,6 @@ struct bnxt { #define BNXT_SINGLE_PF(bp) (BNXT_PF(bp) && !BNXT_NPAR(bp) && !BNXT_MH(bp)) #define BNXT_CHIP_TYPE_NITRO_A0(bp) ((bp)->flags & BNXT_FLAG_CHIP_NITRO_A0) #define BNXT_RX_PAGE_MODE(bp) ((bp)->flags & BNXT_FLAG_RX_PAGE_MODE) -#define BNXT_NEW_RM(bp)((bp)->flags & BNXT_FLAG_NEW_RM) /* Chip class phase 4 and later */ #define BNXT_CHIP_P4_PLUS(bp) \ @@ -1291,6 +1285,13 @@ struct bnxt { u32 msg_enable; + u32 fw_cap; + #define BNXT_FW_CAP_SHORT_CMD 0x0001 + #define BNXT_FW_CAP_L
[PATCH net 1/6] bnxt_en: Fix the vlan_tci exact match check.
From: Venkat Duvvuru It is possible that OVS may set don’t care for DEI/CFI bit in vlan_tci mask. Hence, checking for vlan_tci exact match will endup in a vlan flow rejection. This patch fixes the problem by checking for vlan_pcp and vid separately, instead of checking for the entire vlan_tci. Fixes: e85a9be93cf1 (bnxt_en: do not allow wildcard matches for L2 flows) Signed-off-by: Venkat Duvvuru Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 30 +--- 1 file changed, 27 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c index 795f450..491bd40 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c @@ -27,6 +27,15 @@ #define BNXT_FID_INVALID 0x #define VLAN_TCI(vid, prio)((vid) | ((prio) << VLAN_PRIO_SHIFT)) +#define is_vlan_pcp_wildcarded(vlan_tci_mask) \ + ((ntohs(vlan_tci_mask) & VLAN_PRIO_MASK) == 0x) +#define is_vlan_pcp_exactmatch(vlan_tci_mask) \ + ((ntohs(vlan_tci_mask) & VLAN_PRIO_MASK) == VLAN_PRIO_MASK) +#define is_vlan_pcp_zero(vlan_tci) \ + ((ntohs(vlan_tci) & VLAN_PRIO_MASK) == 0x) +#define is_vid_exactmatch(vlan_tci_mask) \ + ((ntohs(vlan_tci_mask) & VLAN_VID_MASK) == VLAN_VID_MASK) + /* Return the dst fid of the func for flow forwarding * For PFs: src_fid is the fid of the PF * For VF-reps: src_fid the fid of the VF @@ -389,6 +398,21 @@ static bool is_exactmatch(void *mask, int len) return true; } +static bool is_vlan_tci_allowed(__be16 vlan_tci_mask, + __be16 vlan_tci) +{ + /* VLAN priority must be either exactly zero or fully wildcarded and +* VLAN id must be exact match. 
+*/ + if (is_vid_exactmatch(vlan_tci_mask) && + ((is_vlan_pcp_exactmatch(vlan_tci_mask) && + is_vlan_pcp_zero(vlan_tci)) || +is_vlan_pcp_wildcarded(vlan_tci_mask))) + return true; + + return false; +} + static bool bits_set(void *key, int len) { const u8 *p = key; @@ -803,9 +827,9 @@ static bool bnxt_tc_can_offload(struct bnxt *bp, struct bnxt_tc_flow *flow) /* Currently VLAN fields cannot be partial wildcard */ if (bits_set(>l2_key.inner_vlan_tci, sizeof(flow->l2_key.inner_vlan_tci)) && - !is_exactmatch(>l2_mask.inner_vlan_tci, - sizeof(flow->l2_mask.inner_vlan_tci))) { - netdev_info(bp->dev, "Wildcard match unsupported for VLAN TCI\n"); + !is_vlan_tci_allowed(flow->l2_mask.inner_vlan_tci, +flow->l2_key.inner_vlan_tci)) { + netdev_info(bp->dev, "Unsupported VLAN TCI\n"); return false; } if (bits_set(>l2_key.inner_vlan_tpid, -- 1.8.3.1
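For readers who want to try the rule outside the kernel tree, here is a minimal user-space sketch of the check described in the commit message above: VID must be an exact match, and PCP must be either fully wildcarded or an exact match on zero. The function name vlan_tci_allowed() is made up for illustration, and unlike the patch it takes host-order values (the driver applies ntohs() to __be16 fields). The two mask constants mirror the values in the kernel's <linux/if_vlan.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same values as the kernel's <linux/if_vlan.h>, redefined so the
 * sketch builds in user space. */
#define VLAN_PRIO_MASK 0xe000u	/* PCP, bits 15..13 */
#define VLAN_VID_MASK  0x0fffu	/* VID, bits 11..0 */

/* Hypothetical helper; arguments are in host byte order (assumption). */
static bool vlan_tci_allowed(uint16_t tci_mask, uint16_t tci)
{
	bool vid_exact = (tci_mask & VLAN_VID_MASK) == VLAN_VID_MASK;
	bool pcp_exact = (tci_mask & VLAN_PRIO_MASK) == VLAN_PRIO_MASK;
	bool pcp_wild  = (tci_mask & VLAN_PRIO_MASK) == 0;
	bool pcp_zero  = (tci & VLAN_PRIO_MASK) == 0;

	/* The DEI/CFI bit (0x1000) is deliberately not examined. */
	return vid_exact && ((pcp_exact && pcp_zero) || pcp_wild);
}
```

With this shape, a mask of 0xefff (DEI set to don't care) is accepted for a flow with PCP 0, which is the case the old whole-field exact-match test rejected.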
[PATCH net 6/6] bnxt_en: Fix for system hang if request_irq fails
From: Vikas Gupta

Fix bug in the error code path when bnxt_request_irq() returns failure.
bnxt_disable_napi() should not be called in this error path because
NAPI has not been enabled yet.

Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Vikas Gupta
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 11b21ad..4394c11 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6890,7 +6890,7 @@ static int __bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
 		rc = bnxt_request_irq(bp);
 		if (rc) {
 			netdev_err(bp->dev, "bnxt_request_irq err: %x\n", rc);
-			goto open_err;
+			goto open_err_irq;
 		}
 	}
 
@@ -6930,6 +6930,8 @@ static int __bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
 open_err:
 	bnxt_debug_dev_exit(bp);
 	bnxt_disable_napi(bp);
+
+open_err_irq:
 	bnxt_del_napi(bp);
 
 open_err_free_mem:
-- 
1.8.3.1
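The fix relies on the usual kernel pattern of stacked error labels, where each label undoes only what was set up before the failing step. A small user-space sketch of that pattern follows; struct demo_nic, demo_open() and the helper names are hypothetical stand-ins, not driver code, and the bad_disables counter exists only so the mistake the patch removes can be observed.

```c
#include <stdbool.h>

/* Hypothetical device state, for illustration only. */
struct demo_nic {
	bool napi_added;
	bool napi_enabled;
	int  bad_disables;	/* disable calls made while NAPI was not enabled */
};

static void demo_del_napi(struct demo_nic *n)
{
	n->napi_added = false;
}

static void demo_disable_napi(struct demo_nic *n)
{
	if (!n->napi_enabled)
		n->bad_disables++;	/* the situation the patch avoids */
	n->napi_enabled = false;
}

/* irq_fails / late_fails simulate a failure at each stage. */
static int demo_open(struct demo_nic *n, bool irq_fails, bool late_fails)
{
	int rc = 0;

	n->napi_added = true;		/* NAPI contexts added */
	if (irq_fails) {		/* request_irq step failed */
		rc = -1;
		goto open_err_irq;	/* NAPI was never enabled: skip disable */
	}
	n->napi_enabled = true;
	if (late_fails) {
		rc = -1;
		goto open_err;
	}
	return 0;

open_err:
	demo_disable_napi(n);
open_err_irq:
	demo_del_napi(n);
	return rc;
}
```

Jumping to the later label on an IRQ failure means demo_disable_napi() never runs against a NAPI context that was not enabled.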
[PATCH net 4/6] bnxt_en: Support clearing of the IFF_BROADCAST flag.
Currently, the driver assumes IFF_BROADCAST is always set and always sets
the broadcast filter.  Modify the code to set or clear the broadcast
filter according to the IFF_BROADCAST flag.

Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5a47607..fac1285 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5712,7 +5712,9 @@ static int bnxt_init_chip(struct bnxt *bp, bool irq_re_init)
 	}
 	vnic->uc_filter_count = 1;
 
-	vnic->rx_mask = CFA_L2_SET_RX_MASK_REQ_MASK_BCAST;
+	vnic->rx_mask = 0;
+	if (bp->dev->flags & IFF_BROADCAST)
+		vnic->rx_mask |= CFA_L2_SET_RX_MASK_REQ_MASK_BCAST;
 
 	if ((bp->dev->flags & IFF_PROMISC) && bnxt_promisc_ok(bp))
 		vnic->rx_mask |= CFA_L2_SET_RX_MASK_REQ_MASK_PROMISCUOUS;
@@ -7214,13 +7216,16 @@ static void bnxt_set_rx_mode(struct net_device *dev)
 
 	mask &= ~(CFA_L2_SET_RX_MASK_REQ_MASK_PROMISCUOUS |
 		  CFA_L2_SET_RX_MASK_REQ_MASK_MCAST |
-		  CFA_L2_SET_RX_MASK_REQ_MASK_ALL_MCAST);
+		  CFA_L2_SET_RX_MASK_REQ_MASK_ALL_MCAST |
+		  CFA_L2_SET_RX_MASK_REQ_MASK_BCAST);
 
 	if ((dev->flags & IFF_PROMISC) && bnxt_promisc_ok(bp))
 		mask |= CFA_L2_SET_RX_MASK_REQ_MASK_PROMISCUOUS;
 
 	uc_update = bnxt_uc_list_updated(bp);
 
+	if (dev->flags & IFF_BROADCAST)
+		mask |= CFA_L2_SET_RX_MASK_REQ_MASK_BCAST;
 	if (dev->flags & IFF_ALLMULTI) {
 		mask |= CFA_L2_SET_RX_MASK_REQ_MASK_ALL_MCAST;
 		vnic->mc_list_count = 0;
-- 
1.8.3.1
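The clear-then-rebuild logic in bnxt_set_rx_mode() can be sketched in user space roughly as below. All DEMO_* and RX_MASK_* names and their bit values are placeholders chosen for this example (the driver uses the CFA_L2_SET_RX_MASK_REQ_MASK_* definitions and the real IFF_* flags), and the bnxt_promisc_ok() check is left out for brevity.

```c
#include <stdint.h>

/* Placeholder interface flags (assumed values, illustration only). */
enum {
	DEMO_IFF_BROADCAST = 0x002,
	DEMO_IFF_PROMISC   = 0x100,
	DEMO_IFF_ALLMULTI  = 0x200,
};

/* Placeholder RX mask bits (assumed values, illustration only). */
enum {
	RX_MASK_MCAST     = 0x02,
	RX_MASK_ALL_MCAST = 0x04,
	RX_MASK_BCAST     = 0x08,
	RX_MASK_PROMISC   = 0x10,
};

/* Rebuild the filter bits from the current interface flags. */
static uint32_t rx_mask_update(uint32_t mask, unsigned int dev_flags)
{
	/* Clear every bit we own, now including the broadcast bit. */
	mask &= ~(uint32_t)(RX_MASK_PROMISC | RX_MASK_MCAST |
			    RX_MASK_ALL_MCAST | RX_MASK_BCAST);

	if (dev_flags & DEMO_IFF_PROMISC)
		mask |= RX_MASK_PROMISC;
	if (dev_flags & DEMO_IFF_BROADCAST)
		mask |= RX_MASK_BCAST;
	if (dev_flags & DEMO_IFF_ALLMULTI)
		mask |= RX_MASK_ALL_MCAST;
	return mask;
}
```

Because the broadcast bit is cleared first and only set again when the flag is present, a previously set broadcast filter does not survive after the flag is removed.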
[PATCH net 3/6] bnxt_en: Always set output parameters in bnxt_get_max_rings().
The current code returns -ENOMEM and does not bother to set the output
parameters to 0 when no rings are available.  Some callers, such as
bnxt_get_channels() will display garbage ring numbers when that happens.
Fix it by always setting the output parameters.

Fixes: 6e6c5a57fbe1 ("bnxt_en: Modify bnxt_get_max_rings() to support shared or non shared rings.")
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5d95d78..5a47607 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8502,11 +8502,11 @@ int bnxt_get_max_rings(struct bnxt *bp, int *max_rx, int *max_tx, bool shared)
 	int rx, tx, cp;
 
 	_bnxt_get_max_rings(bp, &rx, &tx, &cp);
+	*max_rx = rx;
+	*max_tx = tx;
 	if (!rx || !tx || !cp)
 		return -ENOMEM;
 
-	*max_rx = rx;
-	*max_tx = tx;
 	return bnxt_trim_rings(bp, max_rx, max_tx, cp, shared);
 }
 
-- 
1.8.3.1
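The general rule behind this fix is to write output parameters before any early return, so callers that ignore the error code still see defined values. A minimal stand-alone sketch, assuming a hypothetical demo_get_max_rings() that receives the ring counts as arguments instead of querying hardware:

```c
#include <errno.h>

/* Hypothetical stand-in for the driver function: rx/tx/cp are passed
 * in directly rather than read from the device. */
static int demo_get_max_rings(int rx, int tx, int cp,
			      int *max_rx, int *max_tx)
{
	/* Outputs are assigned unconditionally, before the error check. */
	*max_rx = rx;
	*max_tx = tx;
	if (!rx || !tx || !cp)
		return -ENOMEM;
	return 0;
}
```

A caller that prints *max_rx and *max_tx without checking the return value now reports the actual counts, including zeros, rather than whatever happened to be in its local variables.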
[PATCH net 0/6] bnxt_en: Bug fixes.
This series contains bug fixes for error code paths, a fix for TC Flower
VLAN TCI flow checking, proper filtering of broadcast packets if
IFF_BROADCAST is not set, and a fix in bnxt_get_max_rings() to return
0 ring parameters when the return value is -ENOMEM.

Michael Chan (4):
  bnxt_en: Fix inconsistent BNXT_FLAG_AGG_RINGS logic.
  bnxt_en: Always set output parameters in bnxt_get_max_rings().
  bnxt_en: Support clearing of the IFF_BROADCAST flag.
  bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.

Venkat Duvvuru (1):
  bnxt_en: Fix the vlan_tci exact match check.

Vikas Gupta (1):
  bnxt_en: Fix for system hang if request_irq fails

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 24 ++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c  | 30 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c |  2 --
 4 files changed, 44 insertions(+), 13 deletions(-)

-- 
1.8.3.1
[PATCH net 5/6] bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.
Calling bnxt_set_max_func_irqs() to modify the max IRQ count requested or
freed by the RDMA driver is flawed.  The max IRQ count is checked when
re-initializing the IRQ vectors and this can happen multiple times
during ifup or ethtool -L.  If the max IRQ is reduced and the RDMA
driver is operational, we may not initialize IRQs correctly.  This
problem shows up on VFs with a very small number of MSIX.

There is no other logic that relies on the IRQ count excluding the ones
used by RDMA.  So we fix it by just removing the call to subtract or
add the IRQs used by RDMA.

Fixes: a588e4580a7e ("bnxt_en: Add interface to support RDMA driver.")
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     | 1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 2 --
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index fac1285..11b21ad 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5919,7 +5919,7 @@ unsigned int bnxt_get_max_func_irqs(struct bnxt *bp)
 	return min_t(unsigned int, hw_resc->max_irqs, hw_resc->max_cp_rings);
 }
 
-void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max_irqs)
+static void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max_irqs)
 {
 	bp->hw_resc.max_irqs = max_irqs;
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 9b14eb6..91575ef 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1470,7 +1470,6 @@ int bnxt_hwrm_func_rgtr_async_events(struct bnxt *bp, unsigned long *bmap,
 unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp);
 void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max);
 unsigned int bnxt_get_max_func_irqs(struct bnxt *bp);
-void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max);
 int bnxt_get_avail_msix(struct bnxt *bp, int num);
 int bnxt_reserve_rings(struct bnxt *bp);
 void bnxt_tx_disable(struct bnxt *bp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index 347e4f9..840f6e5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -169,7 +169,6 @@ static int bnxt_req_msix_vecs(struct bnxt_en_dev *edev, int ulp_id,
 		edev->ulp_tbl[ulp_id].msix_requested = avail_msix;
 	}
 	bnxt_fill_msix_vecs(bp, ent);
-	bnxt_set_max_func_irqs(bp, bnxt_get_max_func_irqs(bp) - avail_msix);
 	bnxt_set_max_func_cp_rings(bp, max_cp_rings - avail_msix);
 	edev->flags |= BNXT_EN_FLAG_MSIX_REQUESTED;
 	return avail_msix;
@@ -192,7 +191,6 @@ static int bnxt_free_msix_vecs(struct bnxt_en_dev *edev, int ulp_id)
 	msix_requested = edev->ulp_tbl[ulp_id].msix_requested;
 	bnxt_set_max_func_cp_rings(bp, max_cp_rings + msix_requested);
 	edev->ulp_tbl[ulp_id].msix_requested = 0;
-	bnxt_set_max_func_irqs(bp, bnxt_get_max_func_irqs(bp) + msix_requested);
 	edev->flags &= ~BNXT_EN_FLAG_MSIX_REQUESTED;
 	if (netif_running(dev)) {
 		bnxt_close_nic(bp, true, false);
-- 
1.8.3.1
[PATCH net 2/6] bnxt_en: Fix inconsistent BNXT_FLAG_AGG_RINGS logic.
If there aren't enough RX rings available, the driver will attempt to
use a single RX ring without the aggregation ring.  If that also
fails, the BNXT_FLAG_AGG_RINGS flag is cleared but the other ring
parameters are not set consistently to reflect that.  If more RX
rings become available at the next open, the RX rings will be in
an inconsistent state and may crash when freeing the RX rings.

Fix it by restoring the BNXT_FLAG_AGG_RINGS if not enough RX rings are
available to run without aggregation rings.

Fixes: bdbd1eb59c56 ("bnxt_en: Handle no aggregation ring gracefully.")
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 176fc9f..5d95d78 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8520,8 +8520,11 @@ static int bnxt_get_dflt_rings(struct bnxt *bp, int *max_rx, int *max_tx,
 		/* Not enough rings, try disabling agg rings. */
 		bp->flags &= ~BNXT_FLAG_AGG_RINGS;
 		rc = bnxt_get_max_rings(bp, max_rx, max_tx, shared);
-		if (rc)
+		if (rc) {
+			/* set BNXT_FLAG_AGG_RINGS back for consistency */
+			bp->flags |= BNXT_FLAG_AGG_RINGS;
 			return rc;
+		}
 		bp->flags |= BNXT_FLAG_NO_AGG_RINGS;
 		bp->dev->hw_features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
 		bp->dev->features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
-- 
1.8.3.1
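The pattern here is "try a fallback mode, and undo the tentative flag change if the fallback fails too". A rough user-space sketch is below. The struct, the DEMO_* flags and the rule that an RX ring needs two hardware rings when aggregation is on are assumptions made up for the example; only the restore-on-failure step corresponds to the patch.

```c
/* Placeholder flag bits (illustration only). */
#define DEMO_FLAG_AGG_RINGS	0x1u
#define DEMO_FLAG_NO_AGG_RINGS	0x2u

struct demo_dev {
	unsigned int flags;
	int avail_hw_rings;	/* hypothetical pool of hardware rings */
};

/* Assumed cost model: 2 hardware rings with aggregation, 1 without. */
static int demo_get_max_rings(const struct demo_dev *d)
{
	int need = (d->flags & DEMO_FLAG_AGG_RINGS) ? 2 : 1;

	return d->avail_hw_rings >= need ? 0 : -1;
}

static int demo_get_dflt_rings(struct demo_dev *d)
{
	int rc = demo_get_max_rings(d);

	if (!rc)
		return 0;

	/* Not enough rings, try again without aggregation rings. */
	d->flags &= ~DEMO_FLAG_AGG_RINGS;
	rc = demo_get_max_rings(d);
	if (rc) {
		/* Fallback failed as well: put the flag back. */
		d->flags |= DEMO_FLAG_AGG_RINGS;
		return rc;
	}
	d->flags |= DEMO_FLAG_NO_AGG_RINGS;
	return 0;
}
```

After a failed call the flags are exactly what they were on entry, so a later retry with more rings available starts from a consistent state.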
Re: [PATCH net-next 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.
On Tue, May 29, 2018 at 11:33 PM, Jakub Kicinski wrote:
>
> At some point you (Broadcom) were working on a whole bunch of devlink
> configuration options for the PCIe side of the ASIC.  The number of
> queues relates to things like number of allocated MSI-X vectors, which
> if memory serves me was in your devlink patch set.  In an ideal world
> we would try to keep all those in one place :)

Yeah, another colleague is now working with Mellanox on something similar.

One difference between those devlink parameters and these queue
parameters is that the former are more permanent and global settings.
For example, number of VFs or number of MSIX per VF are persistent
settings once they are set and after PCIe reset.  On the other hand,
these queue settings are pure run-time settings and may be unique for
each VF.  These are not stored as there is no room in NVRAM to store
128 sets or more of these parameters.

Anyway, let me discuss this with my colleague to see if there is a
natural fit for these queue parameters in the devlink infrastructure
that they are working on.

>
> For PCIe config there is always the question of what can be configured
> at runtime, and what requires a HW reset.  Therefore that devlink API
> which could configure current as well as persistent device settings was
> quite nice.  I'm not sure if reallocating queues would ever require
> PCIe block reset but maybe...  Certainly it seems the notion of min
> queues would make more sense in PCIe configuration devlink API than
> ethtool channel API to me as well.
>
> Queues are in the grey area between netdev and non-netdev constructs.
> They make sense both from PCIe resource allocation perspective (i.e.
> devlink PCIe settings) and netdev perspective (ethtool) because they
> feed into things like qdisc offloads, maybe per-queue stats etc.
>
> So yes...  IMHO it would be nice to add this to a devlink SR-IOV config
> API and/or switchdev representors.  But neither of those are really an
> option for you today so IDK :)
Re: [PATCH net-next 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.
On Tue, May 29, 2018 at 10:56 PM, Jakub Kicinski wrote:
> On Tue, 29 May 2018 20:19:54 -0700, Michael Chan wrote:
>> On Tue, May 29, 2018 at 1:46 PM, Samudrala, Sridhar wrote:
>> > Isn't ndo_set_vf_xxx() considered a legacy interface and not planned to be
>> > extended?
>
> +1 it's painful to see this feature being added to the legacy
> API :(  Another duplicated configuration knob.
>
>> I didn't know about that.
>>
>> > Shouldn't we enable this via ethtool on the port representor netdev?
>>
>> We discussed about this.  ethtool on the VF representor will only work
>> in switchdev mode and also will not support min/max values.
>
> Ethtool channel API may be overdue a rewrite in devlink anyway, but I
> feel like implementing switchdev mode and rewriting features in devlink
> may be too much to ask.

Totally agreed.  And switchdev mode doesn't seem to be that widely
used at the moment.  Do you have other suggestions besides NDO?
Re: [PATCH net-next 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.
On Tue, May 29, 2018 at 1:46 PM, Samudrala, Sridhar wrote:
>
> Isn't ndo_set_vf_xxx() considered a legacy interface and not planned to be
> extended?

I didn't know about that.

> Shouldn't we enable this via ethtool on the port representor netdev?
>

We discussed this.  ethtool on the VF representor will only work
in switchdev mode and also will not support min/max values.
[PATCH net-next 0/3] net: Add support to configure SR-IOV VF queues.
VF Queue resources are always limited and there is currently no
infrastructure to allow the admin. on the host to add or reduce queue
resources for any particular VF.  This series adds the infrastructure
to do that and adds the functionality to the bnxt_en driver.

The "ip link set" command will subsequently be patched to support the new
operation.

v1:
- Changed the meaning of the min parameters to be strictly the minimum
  guaranteed value, suggested by Jakub Kicinski.
- More complete implementation in the bnxt_en driver.

Michael Chan (3):
  net: Add support to configure SR-IOV VF minimum and maximum queues.
  bnxt_en: Store min/max tx/rx rings for individual VFs.
  bnxt_en: Implement .ndo_set_vf_queues().

 drivers/net/ethernet/broadcom/bnxt/bnxt.c       |   1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.h       |   9 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 157 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h |   2 +
 include/linux/if_link.h                         |   4 +
 include/linux/netdevice.h                       |   6 +
 include/uapi/linux/if_link.h                    |   9 ++
 net/core/rtnetlink.c                            |  32 -
 8 files changed, 213 insertions(+), 7 deletions(-)

-- 
1.8.3.1
[PATCH net-next 2/3] bnxt_en: Store min/max tx/rx rings for individual VFs.
With new infrastructure to configure queues differently for each VF, we need to store the current min/max rx/tx rings and other resources for each VF. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 9 + drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 27 + 2 files changed, 32 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 9b14eb6..531c77d 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -837,6 +837,14 @@ struct bnxt_vf_info { u32 func_flags; /* func cfg flags */ u32 min_tx_rate; u32 max_tx_rate; + u16 min_tx_rings; + u16 max_tx_rings; + u16 min_rx_rings; + u16 max_rx_rings; + u16 min_cp_rings; + u16 min_stat_ctxs; + u16 min_ring_grps; + u16 min_vnics; void*hwrm_cmd_req_addr; dma_addr_t hwrm_cmd_req_dma_addr; }; @@ -1351,6 +1359,7 @@ struct bnxt { #ifdef CONFIG_BNXT_SRIOV int nr_vfs; struct bnxt_vf_info vf; + struct hwrm_func_vf_resource_cfg_input vf_resc_cfg_input; wait_queue_head_t sriov_cfg_wait; boolsriov_cfg; #define BNXT_SRIOV_CFG_WAIT_TMOmsecs_to_jiffies(1) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c index a649108..7a92125 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c @@ -171,6 +171,10 @@ int bnxt_get_vf_config(struct net_device *dev, int vf_id, ivi->linkstate = IFLA_VF_LINK_STATE_ENABLE; else ivi->linkstate = IFLA_VF_LINK_STATE_DISABLE; + ivi->min_tx_queues = vf->min_tx_rings; + ivi->max_tx_queues = vf->max_tx_rings; + ivi->min_rx_queues = vf->min_rx_rings; + ivi->max_rx_queues = vf->max_rx_rings; return 0; } @@ -498,6 +502,8 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs) mutex_lock(>hwrm_cmd_lock); for (i = 0; i < num_vfs; i++) { + struct bnxt_vf_info *vf = >vf[i]; + req.vf_id = cpu_to_le16(pf->first_vf_id + i); rc = 
_hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT); @@ -506,7 +512,15 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs) break; } pf->active_vfs = i + 1; - pf->vf[i].fw_fid = pf->first_vf_id + i; + vf->fw_fid = pf->first_vf_id + i; + vf->min_tx_rings = le16_to_cpu(req.min_tx_rings); + vf->max_tx_rings = vf_tx_rings; + vf->min_rx_rings = le16_to_cpu(req.min_rx_rings); + vf->max_rx_rings = vf_rx_rings; + vf->min_cp_rings = le16_to_cpu(req.min_cmpl_rings); + vf->min_stat_ctxs = le16_to_cpu(req.min_stat_ctx); + vf->min_ring_grps = le16_to_cpu(req.min_hw_ring_grps); + vf->min_vnics = le16_to_cpu(req.min_vnics); } mutex_unlock(>hwrm_cmd_lock); if (pf->active_vfs) { @@ -521,6 +535,7 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs) hw_resc->max_stat_ctxs -= le16_to_cpu(req.min_stat_ctx) * n; hw_resc->max_vnics -= le16_to_cpu(req.min_vnics) * n; + memcpy(>vf_resc_cfg_input, , sizeof(req)); rc = pf->active_vfs; } return rc; @@ -585,6 +600,7 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int num_vfs) mutex_lock(>hwrm_cmd_lock); for (i = 0; i < num_vfs; i++) { + struct bnxt_vf_info *vf = >vf[i]; int vf_tx_rsvd = vf_tx_rings; req.fid = cpu_to_le16(pf->first_vf_id + i); @@ -593,12 +609,15 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int num_vfs) if (rc) break; pf->active_vfs = i + 1; - pf->vf[i].fw_fid = le16_to_cpu(req.fid); - rc = __bnxt_hwrm_get_tx_rings(bp, pf->vf[i].fw_fid, - _tx_rsvd); + vf->fw_fid = le16_to_cpu(req.fid); + rc = __bnxt_hwrm_get_tx_rings(bp, vf->fw_fid, _tx_rsvd); if (rc) break; total_vf_tx_rings += vf_tx_rsvd; + vf->min_tx_rings = vf_tx_rsvd; + vf->max_tx_rings = vf_tx_rsvd; + vf->min_rx_rings = vf_rx_rings; + vf->max_rx_rings = vf_rx_rings; } mutex_unlock(>hwrm_cmd_lock); if (rc) -- 1.8.3.1
[PATCH net-next 3/3] bnxt_en: Implement .ndo_set_vf_queues().
Implement .ndo_set_vf_queues() on the PF driver to configure the queues parameters for individual VFs. This allows the admin. on the host to increase or decrease queues for individual VFs. Signed-off-by: Michael Chan --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 1 + drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 130 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h | 2 + 3 files changed, 133 insertions(+) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index dfa0839..2ce9779 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -8373,6 +8373,7 @@ static int bnxt_swdev_port_attr_get(struct net_device *dev, .ndo_set_vf_link_state = bnxt_set_vf_link_state, .ndo_set_vf_spoofchk= bnxt_set_vf_spoofchk, .ndo_set_vf_trust = bnxt_set_vf_trust, + .ndo_set_vf_queues = bnxt_set_vf_queues, #endif #ifdef CONFIG_NET_POLL_CONTROLLER .ndo_poll_controller= bnxt_poll_controller, diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c index 7a92125..a34a32f 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c @@ -138,6 +138,136 @@ int bnxt_set_vf_trust(struct net_device *dev, int vf_id, bool trusted) return 0; } +static bool bnxt_param_ok(int new, u16 curr, u16 avail) +{ + int delta; + + if (new <= curr) + return true; + + delta = new - curr; + if (delta <= avail) + return true; + return false; +} + +static void bnxt_adjust_ring_resc(struct bnxt *bp, struct bnxt_vf_info *vf, + struct hwrm_func_vf_resource_cfg_input *req) +{ + struct bnxt_hw_resc *hw_resc = >hw_resc; + u16 avail_cp_rings, avail_stat_ctx; + u16 avail_vnics, avail_ring_grps; + u16 cp, grp, stat, vnic; + u16 min_tx, min_rx; + + min_tx = le16_to_cpu(req->min_tx_rings); + min_rx = le16_to_cpu(req->min_rx_rings); + avail_cp_rings = hw_resc->max_cp_rings - bp->cp_nr_rings; + avail_stat_ctx = 
hw_resc->max_stat_ctxs - bp->num_stat_ctxs; + avail_ring_grps = hw_resc->max_hw_ring_grps - bp->rx_nr_rings; + avail_vnics = hw_resc->max_vnics - bp->nr_vnics; + + cp = max_t(u16, 2 * min_tx, min_rx); + if (cp > vf->min_cp_rings) + cp = min_t(u16, cp, avail_cp_rings + vf->min_cp_rings); + grp = min_tx; + if (grp > vf->min_ring_grps) + grp = min_t(u16, grp, avail_ring_grps + vf->min_ring_grps); + stat = min_rx; + if (stat > vf->min_stat_ctxs) + stat = min_t(u16, stat, avail_stat_ctx + vf->min_stat_ctxs); + vnic = min_rx; + if (vnic > vf->min_vnics) + vnic = min_t(u16, vnic, avail_vnics + vf->min_vnics); + + req->min_cmpl_rings = req->max_cmpl_rings = cpu_to_le16(cp); + req->min_hw_ring_grps = req->max_hw_ring_grps = cpu_to_le16(grp); + req->min_stat_ctx = req->max_stat_ctx = cpu_to_le16(stat); + req->min_vnics = req->max_vnics = cpu_to_le16(vnic); +} + +static void bnxt_record_ring_resc(struct bnxt *bp, struct bnxt_vf_info *vf, + struct hwrm_func_vf_resource_cfg_input *req) +{ + struct bnxt_hw_resc *hw_resc = >hw_resc; + + hw_resc->max_tx_rings += vf->min_tx_rings; + hw_resc->max_rx_rings += vf->min_rx_rings; + vf->min_tx_rings = le16_to_cpu(req->min_tx_rings); + vf->max_tx_rings = le16_to_cpu(req->max_tx_rings); + vf->min_rx_rings = le16_to_cpu(req->min_rx_rings); + vf->max_rx_rings = le16_to_cpu(req->max_rx_rings); + hw_resc->max_tx_rings -= vf->min_tx_rings; + hw_resc->max_rx_rings -= vf->min_rx_rings; + if (bp->pf.vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MAXIMAL) { + hw_resc->max_cp_rings += vf->min_cp_rings; + hw_resc->max_hw_ring_grps += vf->min_ring_grps; + hw_resc->max_stat_ctxs += vf->min_stat_ctxs; + hw_resc->max_vnics += vf->min_vnics; + vf->min_cp_rings = le16_to_cpu(req->min_cmpl_rings); + vf->min_ring_grps = le16_to_cpu(req->min_hw_ring_grps); + vf->min_stat_ctxs = le16_to_cpu(req->min_stat_ctx); + vf->min_vnics = le16_to_cpu(req->min_vnics); + hw_resc->max_cp_rings -= vf->min_cp_rings; + hw_resc->max_hw_ring_grps -= vf->min_ring_grps; + 
hw_resc->max_stat_ctxs -= vf->min_stat_ctxs; + hw_resc->max_vnics -= vf->min_vnics; + } +} + +int bnxt_set_vf_queues(struct net_devic
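To illustrate how a check like bnxt_param_ok() in this patch combines with the give-back-then-take accounting in bnxt_record_ring_resc(), here is a small user-space sketch. param_ok() follows the logic of the patch; struct demo_pool and demo_set_min_tx() are hypothetical and track only a single resource.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same logic as bnxt_param_ok(): shrinking is always fine, growing is
 * fine only if the increase fits in what is still available. */
static bool param_ok(int new_val, uint16_t curr, uint16_t avail)
{
	if (new_val <= curr)
		return true;
	return (new_val - curr) <= avail;
}

/* Hypothetical PF-side pool holding one resource type. */
struct demo_pool {
	uint16_t free_tx;
};

/* Hypothetical helper: change a VF's guaranteed TX rings if possible. */
static bool demo_set_min_tx(struct demo_pool *p, uint16_t *vf_min_tx,
			    int new_min)
{
	if (new_min < 0 || new_min > UINT16_MAX)
		return false;
	if (!param_ok(new_min, *vf_min_tx, p->free_tx))
		return false;

	/* Return the old reservation to the pool, then take the new one. */
	p->free_tx = (uint16_t)(p->free_tx + *vf_min_tx - new_min);
	*vf_min_tx = (uint16_t)new_min;
	return true;
}
```

Checking the delta rather than the absolute value means a VF can always be shrunk, even when the pool is empty, and the freed rings immediately become available to other VFs.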
[PATCH net-next 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.
VF Queue resources are always limited and there is currently no infrastructure to allow the admin. on the host to add or reduce queue resources for any particular VF. With ever increasing number of VFs being supported, it is desirable to allow the admin. to configure queue resources differently for the VFs. Some VFs may require more or fewer queues due to different bandwidth requirements or different number of vCPUs in the VM. This patch adds the infrastructure to do that by adding IFLA_VF_QUEUES netlink attribute and a new .ndo_set_vf_queues() to the net_device_ops. Four parameters are exposed for each VF: o min_tx_queues - Guaranteed tx queues available to the VF. o max_tx_queues - Maximum but not necessarily guaranteed tx queues available to the VF. o min_rx_queues - Guaranteed rx queues available to the VF. o max_rx_queues - Maximum but not necessarily guaranteed rx queues available to the VF. The "ip link set" command will subsequently be patched to support the new operation to set the above parameters. After the admin. makes a change to the above parameters, the corresponding VF will have a new range of channels to set using ethtool -L. The VF may have to go through IF down/up before the new queues will take effect. Up to the min values are guaranteed. Up to the max values are possible but not guaranteed. 
Signed-off-by: Michael Chan --- include/linux/if_link.h | 4 include/linux/netdevice.h| 6 ++ include/uapi/linux/if_link.h | 9 + net/core/rtnetlink.c | 32 +--- 4 files changed, 48 insertions(+), 3 deletions(-) diff --git a/include/linux/if_link.h b/include/linux/if_link.h index 622658d..8e81121 100644 --- a/include/linux/if_link.h +++ b/include/linux/if_link.h @@ -29,5 +29,9 @@ struct ifla_vf_info { __u32 rss_query_en; __u32 trusted; __be16 vlan_proto; + __u32 min_tx_queues; + __u32 max_tx_queues; + __u32 min_rx_queues; + __u32 max_rx_queues; }; #endif /* _LINUX_IF_LINK_H */ diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 8452f72..17f5892 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1023,6 +1023,8 @@ struct dev_ifalias { * with PF and querying it may introduce a theoretical security risk. * int (*ndo_set_vf_rss_query_en)(struct net_device *dev, int vf, bool setting); * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb); + * int (*ndo_set_vf_queues)(struct net_device *dev, int vf, int min_txq, + * int max_txq, int min_rxq, int max_rxq); * int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type, *void *type_data); * Called to setup any 'tc' scheduler, classifier or action on @dev. 
@@ -1276,6 +1278,10 @@ struct net_device_ops { int (*ndo_set_vf_rss_query_en)( struct net_device *dev, int vf, bool setting); + int (*ndo_set_vf_queues)(struct net_device *dev, +int vf, +int min_txq, int max_txq, +int min_rxq, int max_rxq); int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type, void *type_data); diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index cf01b68..81bbc4e 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -659,6 +659,7 @@ enum { IFLA_VF_IB_NODE_GUID, /* VF Infiniband node GUID */ IFLA_VF_IB_PORT_GUID, /* VF Infiniband port GUID */ IFLA_VF_VLAN_LIST, /* nested list of vlans, option for QinQ */ + IFLA_VF_QUEUES, /* Min and Max TX/RX queues */ __IFLA_VF_MAX, }; @@ -749,6 +750,14 @@ struct ifla_vf_trust { __u32 setting; }; +struct ifla_vf_queues { + __u32 vf; + __u32 min_tx_queues;/* min guaranteed tx queues */ + __u32 max_tx_queues;/* max non guaranteed tx queues */ + __u32 min_rx_queues;/* min guaranteed rx queues */ + __u32 max_rx_queues;/* max non guaranteed rx queues */ +}; + /* VF ports management section * * Nested layout of set/get msg is: diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 8080254..e21ab8a 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -921,7 +921,8 @@ static inline int rtnl_vfinfo_size(const struct net_device *dev, nla_total_size_64bit(sizeof(__u64)) + /* IFLA_VF_STATS_TX_DROPPED */ nla_total_size_64bit(sizeof(__u64)) + -nla_total_s
Re: [PATCH net-next RFC 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.
On Wed, May 9, 2018 at 6:10 PM, Jakub Kicinski <jakub.kicin...@netronome.com> wrote: > On Wed, 9 May 2018 17:22:50 -0700, Michael Chan wrote: >> On Wed, May 9, 2018 at 4:15 PM, Jakub Kicinski wrote: >> > On Wed, 9 May 2018 07:21:41 -0400, Michael Chan wrote: >> >> VF Queue resources are always limited and there is currently no >> >> infrastructure to allow the admin. on the host to add or reduce queue >> >> resources for any particular VF. With ever increasing number of VFs >> >> being supported, it is desirable to allow the admin. to configure queue >> >> resources differently for the VFs. Some VFs may require more or fewer >> >> queues due to different bandwidth requirements or different number of >> >> vCPUs in the VM. This patch adds the infrastructure to do that by >> >> adding IFLA_VF_QUEUES netlink attribute and a new .ndo_set_vf_queues() >> >> to the net_device_ops. >> >> >> >> Four parameters are exposed for each VF: >> >> >> >> o min_tx_queues - Guaranteed or current tx queues assigned to the VF. >> > >> > This muxing of semantics may be a little awkward and unnecessary, would >> > it make sense for struct ifla_vf_info to have a separate fields for >> > current number of queues and the admin-set guaranteed min? >> >> The loose semantics is mainly to allow some flexibility in >> implementation. Sure, we can tighten the definitions or add >> additional fields. > > I would appreciate that, if others don't disagree. I personally don't > see the need for flexibility (AKA per-vendor behaviour) here, quite the > opposite, min/max/current number of queues seems quite self-explanatory. > > Or at least don't allow min to mean current? Otherwise the API gets a > bit asymmetrical :( Sure, will do. > >> > Is there a real world use case for the min value or are you trying to >> > make the API feature complete? >> >> In this proposal, these parameters are mainly viewed as the bounds for >> the queues that each VF can potentially allocate. 
The actual number >> of queues chosen by the VF driver or modified by the VF user can be >> any number within the bounds. > > Perhaps you have misspoken here - these are not allowed bounds, right? > min is the guarantee that queues will be available, not a requirement. > Similar to bandwidth allocation. > > IOW if the bounds are set [4, 16] the VF may still choose to use 1 > queue, even though that's not within bounds. Yes, you are absolutely right. The VF can allocate 1 queue. Up to min is guaranteed. Up to max is not guaranteed.
Re: [PATCH net-next RFC 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.
On Wed, May 9, 2018 at 4:15 PM, Jakub Kicinski <jakub.kicin...@netronome.com> wrote:
> On Wed, 9 May 2018 07:21:41 -0400, Michael Chan wrote:
>> VF Queue resources are always limited and there is currently no
>> infrastructure to allow the admin. on the host to add or reduce queue
>> resources for any particular VF.  With ever increasing number of VFs
>> being supported, it is desirable to allow the admin. to configure queue
>> resources differently for the VFs.  Some VFs may require more or fewer
>> queues due to different bandwidth requirements or different number of
>> vCPUs in the VM.  This patch adds the infrastructure to do that by
>> adding IFLA_VF_QUEUES netlink attribute and a new .ndo_set_vf_queues()
>> to the net_device_ops.
>>
>> Four parameters are exposed for each VF:
>>
>> o min_tx_queues - Guaranteed or current tx queues assigned to the VF.
>
> This muxing of semantics may be a little awkward and unnecessary, would
> it make sense for struct ifla_vf_info to have a separate fields for
> current number of queues and the admin-set guaranteed min?

The loose semantics is mainly to allow some flexibility in
implementation.  Sure, we can tighten the definitions or add
additional fields.

>
> Is there a real world use case for the min value or are you trying to
> make the API feature complete?

In this proposal, these parameters are mainly viewed as the bounds for
the queues that each VF can potentially allocate.  The actual number
of queues chosen by the VF driver or modified by the VF user can be
any number within the bounds.

We currently need to have min and max parameters to support the
different modes we use to distribute the queue resources to the VFs.
In one mode, for example, resources are statically divided and each VF
has a small number of guaranteed queues (min = max).  In a different
mode, we allow more flexible resource allocation with each VF having a
small number of guaranteed queues but a higher number of
non-guaranteed queues (min < max).  Some VFs may be able to allocate
queues much higher than min when resources are still available, while
others may only be able to allocate min queues when resources are used
up.

With min and max exposed, the PF user can properly tweak the resources
for each VF described above.

>
>> o max_tx_queues - Maximum but not necessarily guaranteed tx queues
>> available to the VF.
>>
>> o min_rx_queues - Guaranteed or current rx queues assigned to the VF.
>>
>> o max_rx_queues - Maximum but not necessarily guaranteed rx queues
>> available to the VF.
>>
>> The "ip link set" command will subsequently be patched to support the new
>> operation to set the above parameters.
>>
>> After the admin. makes a change to the above parameters, the corresponding
>> VF will have a new range of channels to set using ethtool -L.
>>
>> Signed-off-by: Michael Chan <michael.c...@broadcom.com>
>
> In switchdev mode we can use number of queues on the representor as a
> proxy for max number of queues allowed for the ASIC port.  This works
> better when representors are muxed in the first place than when they
> have actual queues backing them.  WDYT about such scheme, Or?  A very
> pleasant side-effect is that one can configure qdiscs and get stats
> per-HW queue.

This is an interesting approach.  But it doesn't have the min and max
for each VF, and also only works in switchdev mode.
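The two distribution modes described in the reply (min = max versus min < max) can be illustrated with a small user-space sketch. This is not driver code: `struct vf_queue_bounds` and `vf_pick_queues()` are hypothetical names, and the sketch assumes that a VF's request is clamped to max, that anything up to min is always granted, and that anything above min is limited by whatever remains in a shared pool.

```c
#include <stdint.h>

/* Hypothetical per-VF bounds: min is guaranteed, max is best effort. */
struct vf_queue_bounds {
	uint16_t min_queues;
	uint16_t max_queues;
};

/*
 * Return the number of queues a VF ends up with when it asks for
 * 'wanted' queues and 'pool_free' non-guaranteed queues remain.
 */
static uint16_t vf_pick_queues(const struct vf_queue_bounds *b,
			       uint16_t wanted, uint16_t pool_free)
{
	uint16_t granted = wanted;

	if (granted > b->max_queues)
		granted = b->max_queues;
	if (granted <= b->min_queues)
		return granted;		/* covered by the guarantee */
	if (granted - b->min_queues > pool_free)
		granted = b->min_queues + pool_free;
	return granted;
}
```

With bounds of {4, 4} a VF asking for 8 queues gets 4 regardless of the pool; with bounds of {2, 16} the same request yields 8 when the pool is large and falls back toward 2 as the pool drains.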
[PATCH net-next RFC 2/3] bnxt_en: Store min/max tx/rx rings for individual VFs.
With new infrastructure to configure queues differently for each VF,
we need to store the current min/max rx/tx rings for each VF.

Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h       |  5 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 23 +++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 9b14eb6..2f5a23c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -837,6 +837,10 @@ struct bnxt_vf_info {
 	u32	func_flags; /* func cfg flags */
 	u32	min_tx_rate;
 	u32	max_tx_rate;
+	u16	min_tx_rings;
+	u16	max_tx_rings;
+	u16	min_rx_rings;
+	u16	max_rx_rings;
 	void	*hwrm_cmd_req_addr;
 	dma_addr_t	hwrm_cmd_req_dma_addr;
 };
@@ -1351,6 +1355,7 @@ struct bnxt {
 #ifdef CONFIG_BNXT_SRIOV
 	int			nr_vfs;
 	struct bnxt_vf_info	vf;
+	struct hwrm_func_vf_resource_cfg_input vf_resc_cfg_input;
 	wait_queue_head_t	sriov_cfg_wait;
 	bool			sriov_cfg;
 #define BNXT_SRIOV_CFG_WAIT_TMO	msecs_to_jiffies(1)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index a649108..489e534 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -171,6 +171,10 @@ int bnxt_get_vf_config(struct net_device *dev, int vf_id,
 		ivi->linkstate = IFLA_VF_LINK_STATE_ENABLE;
 	else
 		ivi->linkstate = IFLA_VF_LINK_STATE_DISABLE;
+	ivi->min_tx_queues = vf->min_tx_rings;
+	ivi->max_tx_queues = vf->max_tx_rings;
+	ivi->min_rx_queues = vf->min_rx_rings;
+	ivi->max_rx_queues = vf->max_rx_rings;

 	return 0;
 }
@@ -498,6 +502,8 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs)

 	mutex_lock(&bp->hwrm_cmd_lock);
 	for (i = 0; i < num_vfs; i++) {
+		struct bnxt_vf_info *vf = &pf->vf[i];
+
 		req.vf_id = cpu_to_le16(pf->first_vf_id + i);
 		rc = _hwrm_send_message(bp, &req, sizeof(req),
 					HWRM_CMD_TIMEOUT);
@@ -506,7 +512,11 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs)
 			break;
 		}
 		pf->active_vfs = i + 1;
-		pf->vf[i].fw_fid = pf->first_vf_id + i;
+		vf->fw_fid = pf->first_vf_id + i;
+		vf->min_tx_rings = le16_to_cpu(req.min_tx_rings);
+		vf->max_tx_rings = vf_tx_rings;
+		vf->min_rx_rings = le16_to_cpu(req.min_rx_rings);
+		vf->max_rx_rings = vf_rx_rings;
 	}
 	mutex_unlock(&bp->hwrm_cmd_lock);
 	if (pf->active_vfs) {
@@ -521,6 +531,7 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int num_vfs)
 		hw_resc->max_stat_ctxs -= le16_to_cpu(req.min_stat_ctx) * n;
 		hw_resc->max_vnics -= le16_to_cpu(req.min_vnics) * n;

+		memcpy(&bp->vf_resc_cfg_input, &req, sizeof(req));
 		rc = pf->active_vfs;
 	}
 	return rc;
@@ -585,6 +596,7 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int num_vfs)

 	mutex_lock(&bp->hwrm_cmd_lock);
 	for (i = 0; i < num_vfs; i++) {
+		struct bnxt_vf_info *vf = &pf->vf[i];
 		int vf_tx_rsvd = vf_tx_rings;

 		req.fid = cpu_to_le16(pf->first_vf_id + i);
@@ -593,12 +605,15 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int num_vfs)
 		if (rc)
 			break;
 		pf->active_vfs = i + 1;
-		pf->vf[i].fw_fid = le16_to_cpu(req.fid);
-		rc = __bnxt_hwrm_get_tx_rings(bp, pf->vf[i].fw_fid,
-					      &vf_tx_rsvd);
+		vf->fw_fid = le16_to_cpu(req.fid);
+		rc = __bnxt_hwrm_get_tx_rings(bp, vf->fw_fid, &vf_tx_rsvd);
 		if (rc)
 			break;
 		total_vf_tx_rings += vf_tx_rsvd;
+		vf->min_tx_rings = vf_tx_rsvd;
+		vf->max_tx_rings = vf_tx_rsvd;
+		vf->min_rx_rings = vf_rx_rings;
+		vf->max_rx_rings = vf_rx_rings;
 	}
 	mutex_unlock(&bp->hwrm_cmd_lock);
 	if (rc)
-- 
1.8.3.1
[PATCH net-next RFC 3/3] bnxt_en: Implement .ndo_set_vf_queues().
Implement .ndo_set_vf_queues() on the PF driver to configure the queue
parameters for individual VFs.  This allows the admin. on the host to
increase or decrease queues for individual VFs.

Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c       |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 67 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h |  2 +
 3 files changed, 70 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index dfa0839..2ce9779 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8373,6 +8373,7 @@ static int bnxt_swdev_port_attr_get(struct net_device *dev,
 	.ndo_set_vf_link_state	= bnxt_set_vf_link_state,
 	.ndo_set_vf_spoofchk	= bnxt_set_vf_spoofchk,
 	.ndo_set_vf_trust	= bnxt_set_vf_trust,
+	.ndo_set_vf_queues	= bnxt_set_vf_queues,
 #endif
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= bnxt_poll_controller,
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index 489e534..f0d938c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -138,6 +138,73 @@ int bnxt_set_vf_trust(struct net_device *dev, int vf_id, bool trusted)
 	return 0;
 }

+static bool bnxt_param_ok(int new, u16 curr, u16 avail)
+{
+	int delta;
+
+	if (new <= curr)
+		return true;
+
+	delta = new - curr;
+	if (delta <= avail)
+		return true;
+	return false;
+}
+
+int bnxt_set_vf_queues(struct net_device *dev, int vf_id, int min_txq,
+		       int max_txq, int min_rxq, int max_rxq)
+{
+	struct hwrm_func_vf_resource_cfg_input req = {0};
+	struct bnxt *bp = netdev_priv(dev);
+	u16 avail_tx_rings, avail_rx_rings;
+	struct bnxt_hw_resc *hw_resc;
+	struct bnxt_vf_info *vf;
+	int rc;
+
+	if (bnxt_vf_ndo_prep(bp, vf_id))
+		return -EINVAL;
+
+	if (!(bp->flags & BNXT_FLAG_NEW_RM))
+		return -EOPNOTSUPP;
+
+	vf = &bp->pf.vf[vf_id];
+	hw_resc = &bp->hw_resc;
+
+	avail_tx_rings = hw_resc->max_tx_rings - bp->tx_nr_rings;
+	if (bp->flags & BNXT_FLAG_AGG_RINGS)
+		avail_rx_rings = hw_resc->max_rx_rings - bp->rx_nr_rings * 2;
+	else
+		avail_rx_rings = hw_resc->max_rx_rings - bp->rx_nr_rings;
+	if (!bnxt_param_ok(min_txq, vf->min_tx_rings, avail_tx_rings))
+		return -ENOBUFS;
+	if (!bnxt_param_ok(min_rxq, vf->min_rx_rings, avail_rx_rings))
+		return -ENOBUFS;
+	if (!bnxt_param_ok(max_txq, vf->max_tx_rings, avail_tx_rings))
+		return -ENOBUFS;
+	if (!bnxt_param_ok(max_rxq, vf->max_rx_rings, avail_rx_rings))
+		return -ENOBUFS;
+
+	bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_VF_RESOURCE_CFG, -1, -1);
+	memcpy(&req, &bp->vf_resc_cfg_input, sizeof(req));
+	req.min_tx_rings = cpu_to_le16(min_txq);
+	req.min_rx_rings = cpu_to_le16(min_rxq);
+	req.max_tx_rings = cpu_to_le16(max_txq);
+	req.max_rx_rings = cpu_to_le16(max_rxq);
+	rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+	if (rc)
+		return -EIO;
+
+	hw_resc->max_tx_rings += vf->min_tx_rings;
+	hw_resc->max_rx_rings += vf->min_rx_rings;
+	vf->min_tx_rings = min_txq;
+	vf->max_tx_rings = max_txq;
+	vf->min_rx_rings = min_rxq;
+	vf->max_rx_rings = max_rxq;
+	hw_resc->max_tx_rings -= vf->min_tx_rings;
+	hw_resc->max_rx_rings -= vf->min_rx_rings;
+	return 0;
+}
+
 int bnxt_get_vf_config(struct net_device *dev, int vf_id,
 		       struct ifla_vf_info *ivi)
 {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
index e9b20cd..325b412 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
@@ -35,6 +35,8 @@
 int bnxt_set_vf_link_state(struct net_device *, int, int);
 int bnxt_set_vf_spoofchk(struct net_device *, int, bool);
 int bnxt_set_vf_trust(struct net_device *dev, int vf_id, bool trust);
+int bnxt_set_vf_queues(struct net_device *dev, int vf_id, int min_txq,
+		       int max_txq, int min_rxq, int max_rxq);
 int bnxt_sriov_configure(struct pci_dev *pdev, int num_vfs);
 void bnxt_sriov_disable(struct bnxt *);
 void bnxt_hwrm_exec_fwd_req(struct bnxt *);
-- 
1.8.3.1
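The admission check and the pool accounting in bnxt_set_vf_queues() can be tried outside the kernel with a short user-space sketch. `param_ok()` follows the same rule as bnxt_param_ok() in the patch; `struct pool`, `struct vf` and `rebalance_tx()` are hypothetical stand-ins for the PF's hw_resc and per-VF state, and the sketch covers tx rings only. It leaves out the firmware message and how the available count is derived.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same rule as bnxt_param_ok() in the patch: shrinking is always allowed,
 * growing is allowed only if the increase fits in what is available.
 */
static bool param_ok(int new_val, uint16_t curr, uint16_t avail)
{
	if (new_val <= curr)
		return true;
	return (new_val - curr) <= avail;
}

/* Hypothetical stand-ins for the PF's ring pool and one VF's guarantee. */
struct pool {
	uint16_t max_tx_rings;
};

struct vf {
	uint16_t min_tx_rings;
};

/* Return the old guarantee to the pool, then carve out the new one. */
static void rebalance_tx(struct pool *p, struct vf *v, uint16_t new_min)
{
	p->max_tx_rings += v->min_tx_rings;
	v->min_tx_rings = new_min;
	p->max_tx_rings -= v->min_tx_rings;
}
```

For example, raising a VF's guaranteed tx rings from 4 to 6 passes the check when at least 2 rings are available, and the pool shrinks by exactly that difference once the change is applied.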
[PATCH net-next RFC 0/3] net: Add support to configure SR-IOV VF queues.
VF Queue resources are always limited and there is currently no
infrastructure to allow the admin. on the host to add or reduce queue
resources for any particular VF.  This RFC series adds the
infrastructure to do that and adds the functionality to the bnxt_en
driver.

The "ip link set" command will subsequently be patched to support the
new operation.

Michael Chan (3):
  net: Add support to configure SR-IOV VF minimum and maximum queues.
  bnxt_en: Store min/max tx/rx rings for individual VFs.
  bnxt_en: Implement .ndo_set_vf_queues().

 drivers/net/ethernet/broadcom/bnxt/bnxt.c       |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.h       |  5 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 90 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h |  2 +
 include/linux/if_link.h                         |  4 ++
 include/linux/netdevice.h                       |  6 ++
 include/uapi/linux/if_link.h                    |  9 +++
 net/core/rtnetlink.c                            | 28 +++-
 8 files changed, 138 insertions(+), 7 deletions(-)

-- 
1.8.3.1
[PATCH net-next RFC 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.
VF Queue resources are always limited and there is currently no
infrastructure to allow the admin. on the host to add or reduce queue
resources for any particular VF.  With ever increasing number of VFs
being supported, it is desirable to allow the admin. to configure queue
resources differently for the VFs.  Some VFs may require more or fewer
queues due to different bandwidth requirements or different number of
vCPUs in the VM.  This patch adds the infrastructure to do that by
adding IFLA_VF_QUEUES netlink attribute and a new .ndo_set_vf_queues()
to the net_device_ops.

Four parameters are exposed for each VF:

o min_tx_queues - Guaranteed or current tx queues assigned to the VF.

o max_tx_queues - Maximum but not necessarily guaranteed tx queues
  available to the VF.

o min_rx_queues - Guaranteed or current rx queues assigned to the VF.

o max_rx_queues - Maximum but not necessarily guaranteed rx queues
  available to the VF.

The "ip link set" command will subsequently be patched to support the new
operation to set the above parameters.

After the admin. makes a change to the above parameters, the corresponding
VF will have a new range of channels to set using ethtool -L.

Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 include/linux/if_link.h      |  4
 include/linux/netdevice.h    |  6 ++
 include/uapi/linux/if_link.h |  9 +
 net/core/rtnetlink.c         | 28 +---
 4 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 622658d..8e81121 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -29,5 +29,9 @@ struct ifla_vf_info {
 	__u32 rss_query_en;
 	__u32 trusted;
 	__be16 vlan_proto;
+	__u32 min_tx_queues;
+	__u32 max_tx_queues;
+	__u32 min_rx_queues;
+	__u32 max_rx_queues;
 };
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 03ed492..30a3caf 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1023,6 +1023,8 @@ struct dev_ifalias {
  *	with PF and querying it may introduce a theoretical security risk.
  * int (*ndo_set_vf_rss_query_en)(struct net_device *dev, int vf, bool setting);
  * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
+ * int (*ndo_set_vf_queues)(struct net_device *dev, int vf, int min_txq,
+ *			    int max_txq, int min_rxq, int max_rxq);
  * int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type,
  *		       void *type_data);
  *	Called to setup any 'tc' scheduler, classifier or action on @dev.
@@ -1272,6 +1274,10 @@ struct net_device_ops {
 	int			(*ndo_set_vf_rss_query_en)(
 						   struct net_device *dev,
 						   int vf, bool setting);
+	int			(*ndo_set_vf_queues)(struct net_device *dev,
+						     int vf,
+						     int min_txq, int max_txq,
+						     int min_rxq, int max_rxq);
 	int			(*ndo_setup_tc)(struct net_device *dev,
 						enum tc_setup_type type,
 						void *type_data);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index b852664..fc56a47 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -658,6 +658,7 @@ enum {
 	IFLA_VF_IB_NODE_GUID,	/* VF Infiniband node GUID */
 	IFLA_VF_IB_PORT_GUID,	/* VF Infiniband port GUID */
 	IFLA_VF_VLAN_LIST,	/* nested list of vlans, option for QinQ */
+	IFLA_VF_QUEUES,		/* Min and Max TX/RX queues */
 	__IFLA_VF_MAX,
 };

@@ -748,6 +749,14 @@ struct ifla_vf_trust {
 	__u32 setting;
 };

+struct ifla_vf_queues {
+	__u32 vf;
+	__u32 min_tx_queues;
+	__u32 max_tx_queues;
+	__u32 min_rx_queues;
+	__u32 max_rx_queues;
+};
+
 /* VF ports management section
  *
  * Nested layout of set/get msg is:
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 8080254..7cf3582 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -921,7 +921,8 @@ static inline int rtnl_vfinfo_size(const struct net_device *dev,
 			 nla_total_size_64bit(sizeof(__u64)) +
 			 /* IFLA_VF_STATS_TX_DROPPED */
 			 nla_total_size_64bit(sizeof(__u64)) +
-			 nla_total_size(sizeof(struct ifla_vf_trust)));
+			 nla_total_size(sizeof(struct ifla_vf_trust)) +
+			 nla_total_size(sizeof(struct ifla_vf_queues)));
 		return size;
 	} else
 		return 0;
@@ -1181,6 +
[PATCH net-next 0/4] bnxt_en: Fixes for net-next.
This series includes a bug fix for a regression in firmware message
polling introduced recently on net-next.  There are 3 additional minor
fixes for unsupported link speed checking, VF MAC address handling, and
setting PHY eeprom length.

Michael Chan (3):
  bnxt_en: Fix firmware message delay loop regression.
  bnxt_en: Check unsupported speeds in bnxt_update_link() on PF only.
  bnxt_en: Always forward VF MAC address to the PF.

Vasundhara Volam (1):
  bnxt_en: Read phy eeprom A2h address only when optical diagnostics is
    supported.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 17 -
 drivers/net/ethernet/broadcom/bnxt/bnxt.h         | 10 --
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 20
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c   |  3 ++-
 4 files changed, 30 insertions(+), 20 deletions(-)

-- 
1.8.3.1
[PATCH net-next 1/4] bnxt_en: Fix firmware message delay loop regression.
A recent change to reduce delay granularity waiting for firmware
response has caused a regression.  With a tighter delay loop,
the driver may see the beginning part of the response faster.
The original 5 usec delay to wait for the rest of the message
is not long enough and some messages are detected as invalid.

Increase the maximum wait time from 5 usec to 20 usec.  Also, fix
the debug message that shows the total delay time for the response
when the message times out.  With the new logic, the delay time
is not fixed per iteration of the loop, so we define a macro to
show the total delay time.

Fixes: 9751e8e71487 ("bnxt_en: reduce timeout on initial HWRM calls")
Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 12
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  7 +++
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index efe5c72..168342a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3530,6 +3530,8 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 			      HWRM_RESP_LEN_SFT;
 		valid = bp->hwrm_cmd_resp_addr + len - 1;
 	} else {
+		int j;
+
 		/* Check if response len is updated */
 		for (i = 0; i < tmo_count; i++) {
 			len = (le32_to_cpu(*resp_len) & HWRM_RESP_LEN_MASK) >>
@@ -3547,14 +3549,15 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,

 		if (i >= tmo_count) {
 			netdev_err(bp->dev, "Error (timeout: %d) msg {0x%x 0x%x} len:%d\n",
-				   timeout, le16_to_cpu(req->req_type),
+				   HWRM_TOTAL_TIMEOUT(i),
+				   le16_to_cpu(req->req_type),
 				   le16_to_cpu(req->seq_id), len);
 			return -1;
 		}

 		/* Last byte of resp contains valid bit */
 		valid = bp->hwrm_cmd_resp_addr + len - 1;
-		for (i = 0; i < 5; i++) {
+		for (j = 0; j < HWRM_VALID_BIT_DELAY_USEC; j++) {
 			/* make sure we read from updated DMA memory */
 			dma_rmb();
 			if (*valid)
@@ -3562,9 +3565,10 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void *msg, u32 msg_len,
 			udelay(1);
 		}

-		if (i >= 5) {
+		if (j >= HWRM_VALID_BIT_DELAY_USEC) {
 			netdev_err(bp->dev, "Error (timeout: %d) msg {0x%x 0x%x} len:%d v:%d\n",
-				   timeout, le16_to_cpu(req->req_type),
+				   HWRM_TOTAL_TIMEOUT(i),
+				   le16_to_cpu(req->req_type),
 				   le16_to_cpu(req->seq_id), len, *valid);
 			return -1;
 		}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 8df1d8b..a9c210e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -539,6 +539,13 @@ struct rx_tpa_end_cmp_ext {
 #define HWRM_MIN_TIMEOUT		25
 #define HWRM_MAX_TIMEOUT		40

+#define HWRM_TOTAL_TIMEOUT(n)	(((n) <= HWRM_SHORT_TIMEOUT_COUNTER) ?	\
+	((n) * HWRM_SHORT_MIN_TIMEOUT) :				\
+	(HWRM_SHORT_TIMEOUT_COUNTER * HWRM_SHORT_MIN_TIMEOUT +		\
+	 ((n) - HWRM_SHORT_TIMEOUT_COUNTER) * HWRM_MIN_TIMEOUT))
+
+#define HWRM_VALID_BIT_DELAY_USEC	20
+
 #define BNXT_RX_EVENT	1
 #define BNXT_AGG_EVENT	2
 #define BNXT_TX_EVENT	4
-- 
1.8.3.1
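To see what HWRM_TOTAL_TIMEOUT(n) reports, the macro can be compared against a plain loop that adds up the delay of each polling iteration. The sketch below is user-space only. The values of HWRM_SHORT_MIN_TIMEOUT and HWRM_SHORT_TIMEOUT_COUNTER are assumptions chosen for illustration, since the patch context shows only HWRM_MIN_TIMEOUT (25); `total_delay_by_loop()` is a hypothetical helper.

```c
/*
 * Assumed values for illustration; only HWRM_MIN_TIMEOUT (25) appears in
 * the patch context.
 */
#define HWRM_SHORT_MIN_TIMEOUT		3
#define HWRM_SHORT_TIMEOUT_COUNTER	5
#define HWRM_MIN_TIMEOUT		25

/* Macro as added by the patch. */
#define HWRM_TOTAL_TIMEOUT(n)	(((n) <= HWRM_SHORT_TIMEOUT_COUNTER) ?	\
	((n) * HWRM_SHORT_MIN_TIMEOUT) :				\
	(HWRM_SHORT_TIMEOUT_COUNTER * HWRM_SHORT_MIN_TIMEOUT +		\
	 ((n) - HWRM_SHORT_TIMEOUT_COUNTER) * HWRM_MIN_TIMEOUT))

/* Equivalent loop form: add up the minimum delay of every iteration. */
static int total_delay_by_loop(int iterations)
{
	int i, total = 0;

	for (i = 0; i < iterations; i++)
		total += (i < HWRM_SHORT_TIMEOUT_COUNTER) ?
			 HWRM_SHORT_MIN_TIMEOUT : HWRM_MIN_TIMEOUT;
	return total;
}
```

Under these assumed constants the first five iterations contribute the short delay and every later iteration contributes HWRM_MIN_TIMEOUT, so the macro is a closed form of the loop rather than a fixed per-iteration multiple.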
[PATCH net-next 4/4] bnxt_en: Always forward VF MAC address to the PF.
The current code already forwards the VF MAC address to the PF, except
in one case.  If the VF driver gets a valid MAC address from the firmware
during probe time, it will not forward the MAC address to the PF,
incorrectly assuming that the PF already knows the MAC address.  This
causes "ip link show" to show zero VF MAC addresses for this case.

This assumption is not correct.  Newer firmware remembers the VF MAC
address last used by the VF and provides it to the VF driver during
probe.  So we need to always forward the VF MAC address to the PF.

The forwarded MAC address may now be the PF assigned MAC address and so we
need to make sure we approve it for this case.

Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c       | 2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index cd3ab78..dfa0839 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8678,8 +8678,8 @@ static int bnxt_init_mac_addr(struct bnxt *bp)
 			memcpy(bp->dev->dev_addr, vf->mac_addr, ETH_ALEN);
 		} else {
 			eth_hw_addr_random(bp->dev);
-			rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
 		}
+		rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
 #endif
 	}
 	return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index cc21d87..a649108 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -923,7 +923,8 @@ static int bnxt_vf_configure_mac(struct bnxt *bp, struct bnxt_vf_info *vf)
 	if (req->enables & cpu_to_le32(FUNC_VF_CFG_REQ_ENABLES_DFLT_MAC_ADDR)) {
 		if (is_valid_ether_addr(req->dflt_mac_addr) &&
 		    ((vf->flags & BNXT_VF_TRUST) ||
-		     (!is_valid_ether_addr(vf->mac_addr)))) {
+		     !is_valid_ether_addr(vf->mac_addr) ||
+		     ether_addr_equal(req->dflt_mac_addr, vf->mac_addr))) {
 			ether_addr_copy(vf->vf_mac_addr, req->dflt_mac_addr);
 			return bnxt_hwrm_exec_fwd_resp(bp, vf, msg_size);
 		}
-- 
1.8.3.1
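The approval condition after this patch can be exercised in user space with the sketch below. `is_valid_ether_addr()` and `ether_addr_equal()` are simplified stand-ins for the kernel helpers of the same name (an address is treated as valid when it is neither multicast nor all-zero), and `pf_accepts_vf_mac()` is a hypothetical wrapper around the condition in bnxt_vf_configure_mac(), not a function in the driver.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* User-space stand-in: valid means not multicast and not all-zero. */
static bool is_valid_ether_addr(const uint8_t *a)
{
	static const uint8_t zero[ETH_ALEN];

	return !(a[0] & 1) && memcmp(a, zero, ETH_ALEN) != 0;
}

/* User-space stand-in: byte-wise comparison of two addresses. */
static bool ether_addr_equal(const uint8_t *a, const uint8_t *b)
{
	return memcmp(a, b, ETH_ALEN) == 0;
}

/*
 * Hypothetical helper mirroring the condition after the patch: accept the
 * forwarded address if the VF is trusted, if the PF has not assigned an
 * address, or if the forwarded address is the PF-assigned one.
 */
static bool pf_accepts_vf_mac(const uint8_t *requested,
			      const uint8_t *pf_assigned, bool trusted)
{
	return is_valid_ether_addr(requested) &&
	       (trusted ||
		!is_valid_ether_addr(pf_assigned) ||
		ether_addr_equal(requested, pf_assigned));
}
```

The third clause is the one the patch adds: before it, an untrusted VF forwarding exactly the address the PF had assigned would have been rejected.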