Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it

2018-12-06 Thread Michael Chan
On Thu, Dec 6, 2018 at 2:37 AM Jakub Kicinski
 wrote:
>
> On Thu, 6 Dec 2018 00:57:05 -0800, Michael Chan wrote:
> > On Wed, Dec 5, 2018 at 11:11 PM Jakub Kicinski wrote:
> > >
> > > On Wed, 5 Dec 2018 22:41:43 -0800, Michael Chan wrote:
> > > >
> > > > It will be in the BIOS only for a LOM, I think.  For a NIC, it should
> > > > be in the NIC's NVRAM.
> > >
> > > This is all vague.  Could you please clearly state the use case.
> > >
> > Well, the WoL setting's use case should be quite simple, right?  If
> > the card's NVRAM WoL setting is ON, when you plug the card in a slot
> > that has Vaux power, it will assert PME# when a magic packet is
> > received.  Again, the WoL setting in this context is similar to other
> > power up settings such as PCIe Gen2 or Gen3.
>
> If there was some configuration of PME# involved, maybe, but
> basic networking configuration has its APIs already.
>
> > Let's say the power up setting is ON and it boots up to Linux for the
> > first time after receiving a magic packet.  The Linux user can then
> > run ethtool -s to set the driver's non persistent WoL setting.  It can
> > be the same as the NVRAM's power up setting, or different.  Ethtool
> > may support additional WoL packet types that the power up setting does
> > not support.  Let's say the Linux user sets the ethtool WoL setting to
> > OFF and shuts down the system.  That card now will not wake up the
> > system.  But if there is a power failure and power comes back on
> > later, the card will lose the ethtool setting and go back to the power
> > up WoL setting, which is ON in this example.
>
> So in your example there is a machine with a 25/40/100G NIC that
> doesn't have any remote BMC control, and connected to a L2 network
> where a magic packet can be received.
>
> In my experience machines are either low end/embedded and they just
> boot on power on fully (to Linux), or they are proper machines which
> support IPMI etc.
>
> If you could illuminate the use case some more I'd really appreciate
> that.  In your hypothetical scenario you still have to get the link
> up, so if we apply this patch a logical extension would be to add all
> ethtool link settings as devlink parameters as well.  Florian recently
> added an option to wake based on a packet that matched an n-tuple
> filter.  If your use case is legit, doing the same thing with n-tuple
> filters instead of Magic Packets is very much legit, too.  So we will
> poke n-tuple filters via devlink params?

We only store a magic packet WoL bit in the NVRAM for basic power up
WoL setting.  I doubt that people will store the entire n-tuple WoL
pattern in NVRAM for basic power up WoL.  The whole idea is to have a
basic method to wake up the machine after power up with Vaux.  If the
cable is connected, the NIC will autoneg to some lower speed that Vaux
can support.  I think we've been supporting this since the tg3 days.


Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it

2018-12-06 Thread Michael Chan
On Wed, Dec 5, 2018 at 11:11 PM Jakub Kicinski
 wrote:
>
> On Wed, 5 Dec 2018 22:41:43 -0800, Michael Chan wrote:
> >
> > It will be in the BIOS only for a LOM, I think.  For a NIC, it should
> > be in the NIC's NVRAM.
>
> This is all vague.  Could you please clearly state the use case.
>
Well, the WoL setting's use case should be quite simple, right?  If
the card's NVRAM WoL setting is ON, when you plug the card in a slot
that has Vaux power, it will assert PME# when a magic packet is
received.  Again, the WoL setting in this context is similar to other
power up settings such as PCIe Gen2 or Gen3.

Let's say the power up setting is ON and it boots up to Linux for the
first time after receiving a magic packet.  The Linux user can then
run ethtool -s to set the driver's non persistent WoL setting.  It can
be the same as the NVRAM's power up setting, or different.  Ethtool
may support additional WoL packet types that the power up setting does
not support.  Let's say the Linux user sets the ethtool WoL setting to
OFF and shuts down the system.  That card now will not wake up the
system.  But if there is a power failure and power comes back on
later, the card will lose the ethtool setting and go back to the power
up WoL setting, which is ON in this example.


Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it

2018-12-05 Thread Michael Chan
On Wed, Dec 5, 2018 at 10:00 PM Jakub Kicinski
 wrote:
>
> On Wed, 5 Dec 2018 17:18:52 -0800, Michael Chan wrote:
> > On Wed, Dec 5, 2018 at 4:42 PM Jakub Kicinski wrote:
> > > On Wed, 5 Dec 2018 16:01:08 -0800, Michael Chan wrote:
> > > > On Wed, Dec 5, 2018 at 3:33 PM Jakub Kicinski wrote:
> > > > > On Wed,  5 Dec 2018 11:27:00 +0530, Vasundhara Volam wrote:
> > > > > > Register devlink_port with devlink and create initial port params
> > > > > > table for bnxt_en. The table consists of a generic parameter:
> > > > > >
> > > > > > wake-on-lan: Enables Wake on Lan for this port when magic packet
> > > > > > is received with this port's MAC address using ACPI pattern.
> > > > > > If enabled, the controller asserts a wake pin upon reception of
> > > > > > WoL packet.  ACPI (Advanced Configuration and Power Interface) is
> > > > > > an industry specification for the efficient handling of power
> > > > > > consumption in desktop and mobile computers.
> > > > > >
> > > > > > Cc: Michael Chan 
> > > > > > Signed-off-by: Vasundhara Volam 
> > > > >
> > > > > Why do we need a WoL as a devlink parameter (rather than ethtool -s)?
> > > >
> > > > I believe ethtool -s for WoL is a non-persistent setting, meaning that
> > > > if you power cycle the system, the WoL setting will go back to
> > > > default.
> > > >
> > > > devlink on the other hand is a permanent setting.  ethtool should
> > > > initially report the default WoL setting and it can then be changed
> > > > (in a non permanent way) using ethtool -s.
> > >
> > > All network configuration settings in Linux are non-persistent AFAIK.
> > > That's why network configuration daemons exist:
> > >
> > > https://wiki.debian.org/WakeOnLan
> > >
> > > Perhaps the objective to move more of the network configuration into the
> > > firmware?  That'd be a bleak scenario, so probably not..
> > >
> > > My understanding was the persistent devlink settings are for things
> > > which have to be set at device init time.  Like say PCI endpoint
> > > configuration.  FW loading configuration.
> > >
> > > Besides, the parameter you add is just true/false, when ethtool has
> > > multiple options.
> > >
> > > It feels to me like we moved from ioctls to Netlink, and now even
> > > before ethtool was converted to Netlink we may move to unstructured
> > > strings.  That's not a step forward, if you ask me.
> >
> > We do have a parameter in NVRAM that controls default WoL.  I think
> > this is to expose that parameter so it can be set one way or the
> > other. There are scenarios where Linux has not booted yet (and so
> > there is no opportunity to run ethtool -s or any daemons yet) and this
> > parameter will control whether the machine will wake up or not.
>
> Isn't that set in BIOS/setup?  The config before any OS boots?  Because
> the BMC or whatnot has to actually configure the board to power
> appropriate things up.  Please clarify.

It will be in the BIOS only for a LOM, I think.  For a NIC, it should
be in the NIC's NVRAM.

>
> And *if* it is proven this config is more than just setting the default
> IMHO the setting belongs in the ethtool API.  We can't just add devlink
> params for all existing config APIs just because it has persistence.

I'm not sure I understand your point.  I believe the NIC firmware will
set up the NIC's WoL setting right after power up based on this NVRAM
parameter.  Similar to how the firmware will set up PCIe Gen2 or Gen3
right after power up, for example.  So why would this belong to
ethtool?  I understand the confusion that ethtool -s has a similar WoL
setting.  But again, that's different.  This one is the power up
setting that impacts whether a magic packet can or cannot wake up the
system right after power up (before booting up to Linux or other OS).


Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it

2018-12-05 Thread Michael Chan
On Wed, Dec 5, 2018 at 4:42 PM Jakub Kicinski
 wrote:
>
> On Wed, 5 Dec 2018 16:01:08 -0800, Michael Chan wrote:
> > On Wed, Dec 5, 2018 at 3:33 PM Jakub Kicinski
> >  wrote:
> > >
> > > On Wed,  5 Dec 2018 11:27:00 +0530, Vasundhara Volam wrote:
> > > > Register devlink_port with devlink and create initial port params
> > > > table for bnxt_en. The table consists of a generic parameter:
> > > >
> > > > wake-on-lan: Enables Wake on Lan for this port when magic packet
> > > > is received with this port's MAC address using ACPI pattern.
> > > > If enabled, the controller asserts a wake pin upon reception of
> > > > WoL packet.  ACPI (Advanced Configuration and Power Interface) is
> > > > an industry specification for the efficient handling of power
> > > > consumption in desktop and mobile computers.
> > > >
> > > > Cc: Michael Chan 
> > > > Signed-off-by: Vasundhara Volam 
> > >
> > > Why do we need a WoL as a devlink parameter (rather than ethtool -s)?
> >
> > I believe ethtool -s for WoL is a non-persistent setting, meaning that
> > if you power cycle the system, the WoL setting will go back to
> > default.
> >
> > devlink on the other hand is a permanent setting.  ethtool should
> > initially report the default WoL setting and it can then be changed
> > (in a non permanent way) using ethtool -s.
>
> All network configuration settings in Linux are non-persistent AFAIK.
> That's why network configuration daemons exist:
>
> https://wiki.debian.org/WakeOnLan
>
> Perhaps the objective to move more of the network configuration into the
> firmware?  That'd be a bleak scenario, so probably not..
>
> My understanding was the persistent devlink settings are for things
> which have to be set at device init time.  Like say PCI endpoint
> configuration.  FW loading configuration.
>
> Besides, the parameter you add is just true/false, when ethtool has
> multiple options.
>
> It feels to me like we moved from ioctls to Netlink, and now even
> before ethtool was converted to Netlink we may move to unstructured
> strings.  That's not a step forward, if you ask me.

We do have a parameter in NVRAM that controls default WoL.  I think
this is to expose that parameter so it can be set one way or the
other. There are scenarios where Linux has not booted yet (and so
there is no opportunity to run ethtool -s or any daemons yet) and this
parameter will control whether the machine will wake up or not.


Re: [PATCH net-next RFC 7/7] bnxt_en: Add bnxt_en initial port params table and register it

2018-12-05 Thread Michael Chan
On Wed, Dec 5, 2018 at 3:33 PM Jakub Kicinski
 wrote:
>
> On Wed,  5 Dec 2018 11:27:00 +0530, Vasundhara Volam wrote:
> > Register devlink_port with devlink and create initial port params
> > table for bnxt_en. The table consists of a generic parameter:
> >
> > wake-on-lan: Enables Wake on Lan for this port when magic packet
> > is received with this port's MAC address using ACPI pattern.
> > If enabled, the controller asserts a wake pin upon reception of
> > WoL packet.  ACPI (Advanced Configuration and Power Interface) is
> > an industry specification for the efficient handling of power
> > consumption in desktop and mobile computers.
> >
> > Cc: Michael Chan 
> > Signed-off-by: Vasundhara Volam 
>
> Why do we need a WoL as a devlink parameter (rather than ethtool -s)?

I believe ethtool -s for WoL is a non-persistent setting, meaning that
if you power cycle the system, the WoL setting will go back to
default.

devlink on the other hand is a permanent setting.  ethtool should
initially report the default WoL setting and it can then be changed
(in a non permanent way) using ethtool -s.
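
For context on the API being discussed here: below is a rough sketch of what
a boolean devlink parameter with a PERMANENT (NVRAM-backed) config mode looks
like on the driver side.  This is illustrative only, written against the
devlink parameter API of this period; the my_drv_* names and the NVRAM helper
calls are made up, and the RFC itself registers a per-port parameter through
the port-params support added by the rest of the series rather than a
device-level one.

#include <net/devlink.h>

/* Illustrative sketch only -- not the RFC's actual table.  A driver can
 * expose a value that lives in NVRAM by declaring a parameter with the
 * PERMANENT config mode and wiring get/set callbacks that read and write
 * the NVRAM item (my_drv_nvram_get_wol/set_wol are hypothetical helpers).
 */
enum { MY_DRV_PARAM_ID_WOL = DEVLINK_PARAM_GENERIC_ID_MAX + 1 };

static int my_drv_wol_get(struct devlink *dl, u32 id,
			  struct devlink_param_gset_ctx *ctx)
{
	ctx->val.vbool = my_drv_nvram_get_wol(devlink_priv(dl));
	return 0;
}

static int my_drv_wol_set(struct devlink *dl, u32 id,
			  struct devlink_param_gset_ctx *ctx)
{
	return my_drv_nvram_set_wol(devlink_priv(dl), ctx->val.vbool);
}

static const struct devlink_param my_drv_params[] = {
	DEVLINK_PARAM_DRIVER(MY_DRV_PARAM_ID_WOL, "wake-on-lan",
			     DEVLINK_PARAM_TYPE_BOOL,
			     BIT(DEVLINK_PARAM_CMODE_PERMANENT),
			     my_drv_wol_get, my_drv_wol_set, NULL),
};

/* Called from the driver's devlink init path. */
static int my_drv_register_params(struct devlink *dl)
{
	return devlink_params_register(dl, my_drv_params,
				       ARRAY_SIZE(my_drv_params));
}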


[PATCH net 2/6] bnxt_en: Fix rx_l4_csum_errors counter on 57500 devices.

2018-11-15 Thread Michael Chan
The software counter structure is defined in both the CP ring's structure
and the NQ ring's structure on the new devices.  The legacy code increments
the counter in the CP ring's structure, but the counter never gets displayed
because the ethtool code reads it from the NQ ring's structure.

Since all other counters are contained in the NQ ring's structure, it
makes more sense to count rx_l4_csum_errors in the NQ.

Fixes: 50e3ab7836b5 ("bnxt_en: Allocate completion ring structures for 57500 
series chips.")
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 4a45a2b..5856099 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1675,7 +1675,7 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct 
bnxt_cp_ring_info *cpr,
} else {
if (rxcmp1->rx_cmp_cfa_code_errors_v2 & RX_CMP_L4_CS_ERR_BITS) {
if (dev->features & NETIF_F_RXCSUM)
-   cpr->rx_l4_csum_errors++;
+   bnapi->cp_ring.rx_l4_csum_errors++;
}
}
 
-- 
2.5.1



[PATCH net 0/6] bnxt_en: Bug fixes.

2018-11-15 Thread Michael Chan
Most of the bug fixes are related to the new 57500 chips, including some
initialization and counter fixes, disabling RDMA support, and a
workaround for occasional missing interrupts.  The last patch from
Vasundhara fixes the year/month parameters for firmware coredump.

Michael Chan (5):
  bnxt_en: Fix RSS context allocation.
  bnxt_en: Fix rx_l4_csum_errors counter on 57500 devices.
  bnxt_en: Disable RDMA support on the 57500 chips.
  bnxt_en: Workaround occasional TX timeout on 57500 A0.
  bnxt_en: Add software "missed_irqs" counter.

Vasundhara Volam (1):
  bnxt_en: Fix filling time in bnxt_fill_coredump_record()

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 70 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  4 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  9 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c |  3 +
 4 files changed, 81 insertions(+), 5 deletions(-)

-- 
2.5.1



[PATCH net 4/6] bnxt_en: Workaround occasional TX timeout on 57500 A0.

2018-11-15 Thread Michael Chan
The hardware can sometimes fail to generate the NQ MSIX when there is a
single pending CP ring entry.  This seems to always happen at the last
entry of the CP ring before it wraps.  Add logic to check all the CP rings
for pending entries whose consumer index has not advanced.  Calling
HWRM_DBG_RING_INFO_GET to read the context of such a CP ring will flush
out the missing NQ entry and MSIX.
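
To make the detection step above concrete, here is a minimal stand-alone
sketch of the "stuck ring" test (illustrative only; the real driver change
is in the diff below, and the names here are simplified):

#include <stdbool.h>
#include <stdio.h>

/* A completion ring is suspected of a missed notification when it has
 * work pending but its consumer index has not moved since the previous
 * timer tick.  In the driver this condition triggers an
 * HWRM_DBG_RING_INFO_GET query, which flushes out the missing NQ entry.
 */
struct ring_state {
	unsigned int raw_cons;		/* current consumer index */
	unsigned int last_cons;		/* consumer index at the last tick */
};

static bool ring_needs_kick(struct ring_state *r, bool has_work)
{
	if (!has_work)
		return false;
	if (r->raw_cons != r->last_cons) {
		r->last_cons = r->raw_cons;	/* progress was made */
		return false;
	}
	return true;	/* pending work but no progress: poke the firmware */
}

int main(void)
{
	struct ring_state r = { .raw_cons = 5, .last_cons = 5 };

	printf("kick needed: %d\n", ring_needs_kick(&r, true));
	return 0;
}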

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 65 +++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  3 ++
 2 files changed, 68 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5856099..5d4147a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8714,6 +8714,26 @@ static int bnxt_set_features(struct net_device *dev, 
netdev_features_t features)
return rc;
 }
 
+static int bnxt_dbg_hwrm_ring_info_get(struct bnxt *bp, u8 ring_type,
+  u32 ring_id, u32 *prod, u32 *cons)
+{
+   struct hwrm_dbg_ring_info_get_output *resp = bp->hwrm_cmd_resp_addr;
+   struct hwrm_dbg_ring_info_get_input req = {0};
+   int rc;
+
+   bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_DBG_RING_INFO_GET, -1, -1);
+   req.ring_type = ring_type;
+   req.fw_ring_id = cpu_to_le32(ring_id);
+   mutex_lock(&bp->hwrm_cmd_lock);
+   rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   if (!rc) {
+   *prod = le32_to_cpu(resp->producer_index);
+   *cons = le32_to_cpu(resp->consumer_index);
+   }
+   mutex_unlock(&bp->hwrm_cmd_lock);
+   return rc;
+}
+
 static void bnxt_dump_tx_sw_state(struct bnxt_napi *bnapi)
 {
struct bnxt_tx_ring_info *txr = bnapi->tx_ring;
@@ -8821,6 +8841,11 @@ static void bnxt_timer(struct timer_list *t)
bnxt_queue_sp_work(bp);
}
}
+
+   if ((bp->flags & BNXT_FLAG_CHIP_P5) && netif_carrier_ok(dev)) {
+   set_bit(BNXT_RING_COAL_NOW_SP_EVENT, &bp->sp_event);
+   bnxt_queue_sp_work(bp);
+   }
 bnxt_restart_timer:
mod_timer(&bp->timer, jiffies + bp->current_interval);
 }
@@ -8851,6 +8876,43 @@ static void bnxt_reset(struct bnxt *bp, bool silent)
bnxt_rtnl_unlock_sp(bp);
 }
 
+static void bnxt_chk_missed_irq(struct bnxt *bp)
+{
+   int i;
+
+   if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+   return;
+
+   for (i = 0; i < bp->cp_nr_rings; i++) {
+   struct bnxt_napi *bnapi = bp->bnapi[i];
+   struct bnxt_cp_ring_info *cpr;
+   u32 fw_ring_id;
+   int j;
+
+   if (!bnapi)
+   continue;
+
+   cpr = &bnapi->cp_ring;
+   for (j = 0; j < 2; j++) {
+   struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[j];
+   u32 val[2];
+
+   if (!cpr2 || cpr2->has_more_work ||
+   !bnxt_has_work(bp, cpr2))
+   continue;
+
+   if (cpr2->cp_raw_cons != cpr2->last_cp_raw_cons) {
+   cpr2->last_cp_raw_cons = cpr2->cp_raw_cons;
+   continue;
+   }
+   fw_ring_id = cpr2->cp_ring_struct.fw_ring_id;
+   bnxt_dbg_hwrm_ring_info_get(bp,
+   DBG_RING_INFO_GET_REQ_RING_TYPE_L2_CMPL,
+   fw_ring_id, &val[0], &val[1]);
+   }
+   }
+}
+
 static void bnxt_cfg_ntp_filters(struct bnxt *);
 
 static void bnxt_sp_task(struct work_struct *work)
@@ -8930,6 +8992,9 @@ static void bnxt_sp_task(struct work_struct *work)
if (test_and_clear_bit(BNXT_FLOW_STATS_SP_EVENT, &bp->sp_event))
bnxt_tc_flow_stats_work(bp);
 
+   if (test_and_clear_bit(BNXT_RING_COAL_NOW_SP_EVENT, &bp->sp_event))
+   bnxt_chk_missed_irq(bp);
+
/* These functions below will clear BNXT_STATE_IN_SP_TASK.  They
 * must be the last functions to be called before exiting.
 */
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 498b373..00bd17e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -798,6 +798,8 @@ struct bnxt_cp_ring_info {
u8  had_work_done:1;
u8  has_more_work:1;
 
+   u32 last_cp_raw_cons;
+
struct bnxt_coal  rx_ring_coal;
u64 rx_packets;
u64 rx_bytes;
@@ -1527,6 +1529,7 @@ struct bnxt {
 #define BNXT_LINK_SPEED_CHNG_SP_EVENT  14
 #define BNXT_FLOW_STATS_SP_EVENT   15
 #define BNXT_UPDATE_PHY_SP_EVENT   16
+#define BNXT_RING_COAL_NOW_SP_EVENT    17

[PATCH net 1/6] bnxt_en: Fix RSS context allocation.

2018-11-15 Thread Michael Chan
A recent commit added reservation of RSS contexts, which requires
bnxt_hwrm_vnic_qcaps() to be called before any RSS contexts are reserved.
The bnxt_hwrm_vnic_qcaps() call sets up the flags that determine how many
RSS contexts to reserve to support NTUPLE.

Without that call, too many RSS contexts are reserved, causing a resource
shortage when enabling many VFs.  Fix the regression by calling
bnxt_hwrm_vnic_qcaps() earlier.

Fixes: 41e8d7983752 ("bnxt_en: Modify the ring reservation functions for 57500 
series chips.")
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index dd85d79..4a45a2b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -10087,6 +10087,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
}
 
bnxt_hwrm_func_qcfg(bp);
+   bnxt_hwrm_vnic_qcaps(bp);
bnxt_hwrm_port_led_qcaps(bp);
bnxt_ethtool_init(bp);
bnxt_dcb_init(bp);
@@ -10120,7 +10121,6 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
VNIC_RSS_CFG_REQ_HASH_TYPE_UDP_IPV6;
}
 
-   bnxt_hwrm_vnic_qcaps(bp);
if (bnxt_rfs_supported(bp)) {
dev->hw_features |= NETIF_F_NTUPLE;
if (bnxt_rfs_capable(bp)) {
-- 
2.5.1



[PATCH net 3/6] bnxt_en: Disable RDMA support on the 57500 chips.

2018-11-15 Thread Michael Chan
There is no RDMA support on 57500 chips yet, so prevent bnxt_re from
registering on these chips.  There is intermittent failure if bnxt_re
is allowed to register and proceed with RDMA operations.

Fixes: 1ab968d2f1d6 ("bnxt_en: Add PCI ID for BCM57508 device.")
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index beee612..b59b382 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -43,6 +43,9 @@ static int bnxt_register_dev(struct bnxt_en_dev *edev, int 
ulp_id,
if (ulp_id == BNXT_ROCE_ULP) {
unsigned int max_stat_ctxs;
 
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   return -EOPNOTSUPP;
+
max_stat_ctxs = bnxt_get_max_func_stat_ctxs(bp);
if (max_stat_ctxs <= BNXT_MIN_ROCE_STAT_CTXS ||
bp->num_stat_ctxs == max_stat_ctxs)
-- 
2.5.1



[PATCH net 5/6] bnxt_en: Add software "missed_irqs" counter.

2018-11-15 Thread Michael Chan
Add a per-NQ counter to keep track of the number of times the workaround
code for 57500 A0 has been triggered.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 5 -
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5d4147a..d4c3001 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8909,6 +8909,7 @@ static void bnxt_chk_missed_irq(struct bnxt *bp)
bnxt_dbg_hwrm_ring_info_get(bp,
DBG_RING_INFO_GET_REQ_RING_TYPE_L2_CMPL,
fw_ring_id, &val[0], &val[1]);
+   cpr->missed_irqs++;
}
}
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 00bd17e..9e99d4a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -818,6 +818,7 @@ struct bnxt_cp_ring_info {
dma_addr_t  hw_stats_map;
u32 hw_stats_ctx_id;
u64 rx_l4_csum_errors;
+   u64 missed_irqs;
 
struct bnxt_ring_struct cp_ring_struct;
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 4807856..4b734cd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -137,7 +137,7 @@ static int bnxt_set_coalesce(struct net_device *dev,
return rc;
 }
 
-#define BNXT_NUM_STATS 21
+#define BNXT_NUM_STATS 22
 
 #define BNXT_RX_STATS_ENTRY(counter)   \
{ BNXT_RX_STATS_OFFSET(counter), __stringify(counter) }
@@ -384,6 +384,7 @@ static void bnxt_get_ethtool_stats(struct net_device *dev,
for (k = 0; k < stat_fields; j++, k++)
buf[j] = le64_to_cpu(hw_stats[k]);
buf[j++] = cpr->rx_l4_csum_errors;
+   buf[j++] = cpr->missed_irqs;
 
bnxt_sw_func_stats[RX_TOTAL_DISCARDS].counter +=
le64_to_cpu(cpr->hw_stats->rx_discard_pkts);
@@ -468,6 +469,8 @@ static void bnxt_get_strings(struct net_device *dev, u32 
stringset, u8 *buf)
buf += ETH_GSTRING_LEN;
sprintf(buf, "[%d]: rx_l4_csum_errors", i);
buf += ETH_GSTRING_LEN;
+   sprintf(buf, "[%d]: missed_irqs", i);
+   buf += ETH_GSTRING_LEN;
}
for (i = 0; i < BNXT_NUM_SW_FUNC_STATS; i++) {
strcpy(buf, bnxt_sw_func_stats[i].string);
-- 
2.5.1



[PATCH net 6/6] bnxt_en: Fix filling time in bnxt_fill_coredump_record()

2018-11-15 Thread Michael Chan
From: Vasundhara Volam 

Fix the year and month offset while storing it in
bnxt_fill_coredump_record().

Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.")
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 4b734cd..6cc69a5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2945,8 +2945,8 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct 
bnxt_coredump_record *record,
record->asic_state = 0;
strlcpy(record->system_name, utsname()->nodename,
sizeof(record->system_name));
-   record->year = cpu_to_le16(tm.tm_year);
-   record->month = cpu_to_le16(tm.tm_mon);
+   record->year = cpu_to_le16(tm.tm_year + 1900);
+   record->month = cpu_to_le16(tm.tm_mon + 1);
record->day = cpu_to_le16(tm.tm_mday);
record->hour = cpu_to_le16(tm.tm_hour);
record->minute = cpu_to_le16(tm.tm_min);
-- 
2.5.1



Re: [PATCH net-next] bnxt_en: Copy and paste bug in extended tx_stats

2018-10-18 Thread Michael Chan
On Thu, Oct 18, 2018 at 1:02 AM Dan Carpenter  wrote:
>
> The struct type was copied from the line before but it should be "tx"
> instead of "rx".  I have reviewed the code and I can't immediately see
> that this bug causes a runtime issue.
>
> Fixes: 36e53349b60b ("bnxt_en: Add additional extended port statistics.")
> Signed-off-by: Dan Carpenter 

Thanks.  Luckily, we did not use sizeof(*bp->hw_tx_port_stats_ext) to
allocate the memory, so there is no run-time issue.

Acked-by: Michael Chan 


[PATCH net-next 22/23] bnxt_en: Add new NAPI poll function for 57500 chips.

2018-10-14 Thread Michael Chan
Add a new poll function that polls for NQ events.  If the NQ event is
a CQ notification, we locate the CP ring from the cq_handle and call
__bnxt_poll_work() to handle RX/TX events on the CP ring.

Add a new has_more_work field in struct bnxt_cp_ring_info to indicate that
budget has been reached.  __bnxt_poll_cqs_done() is called to update or ARM
the CP rings depending on whether budget has been reached.  If budget has
been reached, the next bnxt_poll_p5() call will continue to poll from the
CQ rings directly.  Otherwise, the NQ will be ARMed for the next IRQ.
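
A toy model of the budget bookkeeping described above (my own illustration
of the idea, not the driver's code; the actual implementation is in the
diff below):

#include <stdbool.h>
#include <stdio.h>

/* An NQ owns up to two CP (completion) rings.  Each poll pass handles up
 * to "budget" packets across them; a ring that still had events when the
 * budget ran out records has_more_work so the next poll can resume on the
 * CQ rings directly instead of re-reading the NQ.
 */
struct cp_ring {
	int pending;		/* events still queued on this ring */
	bool has_more_work;
};

static int poll_cp_rings(struct cp_ring *rings, int n, int budget)
{
	int done = 0;

	for (int i = 0; i < n; i++) {
		int take = rings[i].pending;

		if (take > budget - done)
			take = budget - done;
		rings[i].pending -= take;
		done += take;
		rings[i].has_more_work = rings[i].pending > 0;
	}
	return done;	/* done == budget means NAPI will poll again */
}

int main(void)
{
	struct cp_ring rings[2] = { { .pending = 40 }, { .pending = 40 } };
	int done = poll_cp_rings(rings, 2, 64);

	printf("done=%d more=%d/%d\n", done,
	       rings[0].has_more_work, rings[1].has_more_work);
	return 0;
}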

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 114 --
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |   4 ++
 2 files changed, 114 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 10d713aa..f518119 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1900,6 +1900,7 @@ static int __bnxt_poll_work(struct bnxt *bp, struct 
bnxt_cp_ring_info *cpr,
u8 event = 0;
struct tx_cmp *txcmp;
 
+   cpr->has_more_work = 0;
while (1) {
int rc;
 
@@ -1920,6 +1921,8 @@ static int __bnxt_poll_work(struct bnxt *bp, struct 
bnxt_cp_ring_info *cpr,
if (unlikely(tx_pkts > bp->tx_wake_thresh)) {
rx_pkts = budget;
raw_cons = NEXT_RAW_CMP(raw_cons);
+   if (budget)
+   cpr->has_more_work = 1;
break;
}
} else if ((TX_CMP_TYPE(txcmp) & 0x30) == 0x10) {
@@ -1949,8 +1952,10 @@ static int __bnxt_poll_work(struct bnxt *bp, struct 
bnxt_cp_ring_info *cpr,
}
raw_cons = NEXT_RAW_CMP(raw_cons);
 
-   if (rx_pkts && rx_pkts == budget)
+   if (rx_pkts && rx_pkts == budget) {
+   cpr->has_more_work = 1;
break;
+   }
}
 
if (event & BNXT_TX_EVENT) {
@@ -2106,6 +2111,104 @@ static int bnxt_poll(struct napi_struct *napi, int 
budget)
return work_done;
 }
 
+static int __bnxt_poll_cqs(struct bnxt *bp, struct bnxt_napi *bnapi, int 
budget)
+{
+   struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+   int i, work_done = 0;
+
+   for (i = 0; i < 2; i++) {
+   struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[i];
+
+   if (cpr2) {
+   work_done += __bnxt_poll_work(bp, cpr2,
+ budget - work_done);
+   cpr->has_more_work |= cpr2->has_more_work;
+   }
+   }
+   return work_done;
+}
+
+static void __bnxt_poll_cqs_done(struct bnxt *bp, struct bnxt_napi *bnapi,
+u64 dbr_type, bool all)
+{
+   struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+   int i;
+
+   for (i = 0; i < 2; i++) {
+   struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[i];
+   struct bnxt_db_info *db;
+
+   if (cpr2 && (all || cpr2->had_work_done)) {
+   db = &cpr2->cp_db;
+   writeq(db->db_key64 | dbr_type |
+  RING_CMP(cpr2->cp_raw_cons), db->doorbell);
+   cpr2->had_work_done = 0;
+   }
+   }
+   __bnxt_poll_work_done(bp, bnapi);
+}
+
+static int bnxt_poll_p5(struct napi_struct *napi, int budget)
+{
+   struct bnxt_napi *bnapi = container_of(napi, struct bnxt_napi, napi);
+   struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+   u32 raw_cons = cpr->cp_raw_cons;
+   struct bnxt *bp = bnapi->bp;
+   struct nqe_cn *nqcmp;
+   int work_done = 0;
+   u32 cons;
+
+   if (cpr->has_more_work) {
+   cpr->has_more_work = 0;
+   work_done = __bnxt_poll_cqs(bp, bnapi, budget);
+   if (cpr->has_more_work) {
+   __bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ, false);
+   return work_done;
+   }
+   __bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ_ARMALL, true);
+   if (napi_complete_done(napi, work_done))
+   BNXT_DB_NQ_ARM_P5(&cpr->cp_db, cpr->cp_raw_cons);
+   return work_done;
+   }
+   while (1) {
+   cons = RING_CMP(raw_cons);
+   nqcmp = &cpr->nq_desc_ring[CP_RING(cons)][CP_IDX(cons)];
+
+   if (!NQ_CMP_VALID(nqcmp, raw_cons)) {
+   __bnxt_poll_cqs_done(bp, bnapi, DBR_TYPE_CQ_ARMALL,
+false);
+   cpr->cp_raw_cons = raw_cons;
+ 

[PATCH net-next 20/23] bnxt_en: Add coalescing setup for 57500 chips.

2018-10-14 Thread Michael Chan
On legacy chips, the CP ring may be shared between RX and TX, so we only
set up the RX coalescing parameters in that case.  On 57500 chips, we
always have a dedicated CP ring for TX so we can always set up the
TX coalescing parameters in bnxt_hwrm_set_coal().

Also, the min_timer coalescing parameter applies to the NQ on the new
chips and a separate firmware call needs to be made to set it up.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 46 +++
 1 file changed, 46 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5ec477f..065f4c2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5424,6 +5424,7 @@ static void bnxt_hwrm_coal_params_qcaps(struct bnxt *bp)
rc = _hwrm_send_message_silent(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
if (!rc) {
coal_cap->cmpl_params = le32_to_cpu(resp->cmpl_params);
+   coal_cap->nq_params = le32_to_cpu(resp->nq_params);
coal_cap->num_cmpl_dma_aggr_max =
le16_to_cpu(resp->num_cmpl_dma_aggr_max);
coal_cap->num_cmpl_dma_aggr_during_int_max =
@@ -5508,6 +5509,32 @@ static void bnxt_hwrm_set_coal_params(struct bnxt *bp,
req->enables |= cpu_to_le16(BNXT_COAL_CMPL_ENABLES);
 }
 
+/* Caller holds bp->hwrm_cmd_lock */
+static int __bnxt_hwrm_set_coal_nq(struct bnxt *bp, struct bnxt_napi *bnapi,
+  struct bnxt_coal *hw_coal)
+{
+   struct hwrm_ring_cmpl_ring_cfg_aggint_params_input req = {0};
+   struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+   struct bnxt_coal_cap *coal_cap = &bp->coal_cap;
+   u32 nq_params = coal_cap->nq_params;
+   u16 tmr;
+
+   if (!(nq_params & RING_AGGINT_QCAPS_RESP_NQ_PARAMS_INT_LAT_TMR_MIN))
+   return 0;
+
+   bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_RING_CMPL_RING_CFG_AGGINT_PARAMS,
+  -1, -1);
+   req.ring_id = cpu_to_le16(cpr->cp_ring_struct.fw_ring_id);
+   req.flags =
+   cpu_to_le16(RING_CMPL_RING_CFG_AGGINT_PARAMS_REQ_FLAGS_IS_NQ);
+
+   tmr = bnxt_usec_to_coal_tmr(bp, hw_coal->coal_ticks) / 2;
+   tmr = clamp_t(u16, tmr, 1, coal_cap->int_lat_tmr_min_max);
+   req.int_lat_tmr_min = cpu_to_le16(tmr);
+   req.enables |= cpu_to_le16(BNXT_COAL_CMPL_MIN_TMR_ENABLE);
+   return _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+}
+
 int bnxt_hwrm_set_ring_coal(struct bnxt *bp, struct bnxt_napi *bnapi)
 {
struct hwrm_ring_cmpl_ring_cfg_aggint_params_input req_rx = {0};
@@ -5553,6 +5580,7 @@ int bnxt_hwrm_set_coal(struct bnxt *bp)
mutex_lock(&bp->hwrm_cmd_lock);
for (i = 0; i < bp->cp_nr_rings; i++) {
struct bnxt_napi *bnapi = bp->bnapi[i];
+   struct bnxt_coal *hw_coal;
u16 ring_id;
 
req = &req_rx;
@@ -5568,6 +5596,24 @@ int bnxt_hwrm_set_coal(struct bnxt *bp)
HWRM_CMD_TIMEOUT);
if (rc)
break;
+
+   if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+   continue;
+
+   if (bnapi->rx_ring && bnapi->tx_ring) {
+   req = &req_tx;
+   ring_id = bnxt_cp_ring_for_tx(bp, bnapi->tx_ring);
+   req->ring_id = cpu_to_le16(ring_id);
+   rc = _hwrm_send_message(bp, req, sizeof(*req),
+   HWRM_CMD_TIMEOUT);
+   if (rc)
+   break;
+   }
+   if (bnapi->rx_ring)
+   hw_coal = &bp->rx_coal;
+   else
+   hw_coal = &bp->tx_coal;
+   __bnxt_hwrm_set_coal_nq(bp, bnapi, hw_coal);
}
mutex_unlock(&bp->hwrm_cmd_lock);
return rc;
-- 
2.5.1



[PATCH net-next 17/23] bnxt_en: Increase RSS context array count and skip ring groups on 57500 chips.

2018-10-14 Thread Michael Chan
On the new 57500 chips, we need to allocate one RSS context for every
64 RX rings.  In previous chips, only one RSS context per vnic is
required regardless of the number of RX rings.  So increase the max
RSS context array count to 8.

Hardware ring groups are not used on the new chips.  Note that the
software ring group structure is still maintained in the driver to
keep track of the rings associated with the vnic.
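
A quick back-of-the-envelope check of the sizing above (illustrative
arithmetic only, not driver code): one RSS context covers 64 RX rings, so
an array of 8 contexts per vnic covers up to 512 RX rings.

#include <stdio.h>

/* One RSS context per 64 RX rings on the 57500 chips; older chips need
 * only one per vnic regardless of ring count.
 */
#define RX_RINGS_PER_RSS_CTX	64
#define MAX_RSS_CTX_PER_VNIC	8

static int rss_ctxs_needed(int rx_rings)
{
	return (rx_rings + RX_RINGS_PER_RSS_CTX - 1) / RX_RINGS_PER_RSS_CTX;
}

int main(void)
{
	printf("128 RX rings -> %d contexts\n", rss_ctxs_needed(128)); /* 2 */
	printf("max coverage -> %d RX rings\n",
	       MAX_RSS_CTX_PER_VNIC * RX_RINGS_PER_RSS_CTX);          /* 512 */
	return 0;
}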

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 30 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  2 +-
 2 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7952100..1a31328 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2881,10 +2881,12 @@ static void bnxt_init_vnics(struct bnxt *bp)
 
for (i = 0; i < bp->nr_vnics; i++) {
struct bnxt_vnic_info *vnic = &bp->vnic_info[i];
+   int j;
 
vnic->fw_vnic_id = INVALID_HW_RING_ID;
-   vnic->fw_rss_cos_lb_ctx[0] = INVALID_HW_RING_ID;
-   vnic->fw_rss_cos_lb_ctx[1] = INVALID_HW_RING_ID;
+   for (j = 0; j < BNXT_MAX_CTX_PER_VNIC; j++)
+   vnic->fw_rss_cos_lb_ctx[j] = INVALID_HW_RING_ID;
+
vnic->fw_l2_ctx_id = INVALID_HW_RING_ID;
 
if (bp->vnic_info[i].rss_hash_key) {
@@ -3098,6 +3100,9 @@ static int bnxt_alloc_vnic_attributes(struct bnxt *bp)
}
}
 
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   goto vnic_skip_grps;
+
if (vnic->flags & BNXT_VNIC_RSS_FLAG)
max_rings = bp->rx_nr_rings;
else
@@ -3108,7 +3113,7 @@ static int bnxt_alloc_vnic_attributes(struct bnxt *bp)
rc = -ENOMEM;
goto out;
}
-
+vnic_skip_grps:
if ((bp->flags & BNXT_FLAG_NEW_RSS_CAP) &&
!(vnic->flags & BNXT_VNIC_RSS_FLAG))
continue;
@@ -4397,6 +4402,10 @@ static int bnxt_hwrm_vnic_alloc(struct bnxt *bp, u16 
vnic_id,
unsigned int i, j, grp_idx, end_idx = start_rx_ring_idx + nr_rings;
struct hwrm_vnic_alloc_input req = {0};
struct hwrm_vnic_alloc_output *resp = bp->hwrm_cmd_resp_addr;
+   struct bnxt_vnic_info *vnic = &bp->vnic_info[vnic_id];
+
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   goto vnic_no_ring_grps;
 
/* map ring groups to this vnic */
for (i = start_rx_ring_idx, j = 0; i < end_idx; i++, j++) {
@@ -4406,12 +4415,12 @@ static int bnxt_hwrm_vnic_alloc(struct bnxt *bp, u16 
vnic_id,
   j, nr_rings);
break;
}
-   bp->vnic_info[vnic_id].fw_grp_ids[j] =
-   bp->grp_info[grp_idx].fw_grp_id;
+   vnic->fw_grp_ids[j] = bp->grp_info[grp_idx].fw_grp_id;
}
 
-   bp->vnic_info[vnic_id].fw_rss_cos_lb_ctx[0] = INVALID_HW_RING_ID;
-   bp->vnic_info[vnic_id].fw_rss_cos_lb_ctx[1] = INVALID_HW_RING_ID;
+vnic_no_ring_grps:
+   for (i = 0; i < BNXT_MAX_CTX_PER_VNIC; i++)
+   vnic->fw_rss_cos_lb_ctx[i] = INVALID_HW_RING_ID;
if (vnic_id == 0)
req.flags = cpu_to_le32(VNIC_ALLOC_REQ_FLAGS_DEFAULT);
 
@@ -4420,7 +4429,7 @@ static int bnxt_hwrm_vnic_alloc(struct bnxt *bp, u16 
vnic_id,
mutex_lock(&bp->hwrm_cmd_lock);
rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
if (!rc)
-   bp->vnic_info[vnic_id].fw_vnic_id = le32_to_cpu(resp->vnic_id);
+   vnic->fw_vnic_id = le32_to_cpu(resp->vnic_id);
mutex_unlock(&bp->hwrm_cmd_lock);
return rc;
 }
@@ -4456,6 +4465,9 @@ static int bnxt_hwrm_ring_grp_alloc(struct bnxt *bp)
u16 i;
u32 rc = 0;
 
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   return 0;
+
mutex_lock(&bp->hwrm_cmd_lock);
for (i = 0; i < bp->rx_nr_rings; i++) {
struct hwrm_ring_grp_alloc_input req = {0};
@@ -4488,7 +4500,7 @@ static int bnxt_hwrm_ring_grp_free(struct bnxt *bp)
u32 rc = 0;
struct hwrm_ring_grp_free_input req = {0};
 
-   if (!bp->grp_info)
+   if (!bp->grp_info || (bp->flags & BNXT_FLAG_CHIP_P5))
return 0;
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_RING_GRP_FREE, -1, -1);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 560e8b7..50b129e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -862,7 +862,7 @@ struct bnxt_ri

[PATCH net-next 23/23] bnxt_en: Add PCI ID for BCM57508 device.

2018-10-14 Thread Michael Chan
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f518119..de987cc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -111,6 +111,7 @@ enum board_idx {
BCM57452,
BCM57454,
BCM5745x_NPAR,
+   BCM57508,
BCM58802,
BCM58804,
BCM58808,
@@ -152,6 +153,7 @@ static const struct {
[BCM57452] = { "Broadcom BCM57452 NetXtreme-E 10Gb/25Gb/40Gb/50Gb 
Ethernet" },
[BCM57454] = { "Broadcom BCM57454 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb 
Ethernet" },
[BCM5745x_NPAR] = { "Broadcom BCM5745x NetXtreme-E Ethernet Partition" 
},
+   [BCM57508] = { "Broadcom BCM57508 NetXtreme-E 
10Gb/25Gb/50Gb/100Gb/200Gb Ethernet" },
[BCM58802] = { "Broadcom BCM58802 NetXtreme-S 10Gb/25Gb/40Gb/50Gb 
Ethernet" },
[BCM58804] = { "Broadcom BCM58804 NetXtreme-S 10Gb/25Gb/40Gb/50Gb/100Gb 
Ethernet" },
[BCM58808] = { "Broadcom BCM58808 NetXtreme-S 10Gb/25Gb/40Gb/50Gb/100Gb 
Ethernet" },
@@ -196,6 +198,7 @@ static const struct pci_device_id bnxt_pci_tbl[] = {
{ PCI_VDEVICE(BROADCOM, 0x16ef), .driver_data = BCM57416_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x16f0), .driver_data = BCM58808 },
{ PCI_VDEVICE(BROADCOM, 0x16f1), .driver_data = BCM57452 },
+   { PCI_VDEVICE(BROADCOM, 0x1750), .driver_data = BCM57508 },
{ PCI_VDEVICE(BROADCOM, 0xd802), .driver_data = BCM58802 },
{ PCI_VDEVICE(BROADCOM, 0xd804), .driver_data = BCM58804 },
 #ifdef CONFIG_BNXT_SRIOV
-- 
2.5.1



[PATCH net-next 21/23] bnxt_en: Refactor bnxt_poll_work().

2018-10-14 Thread Michael Chan
Separate the CP ring polling logic in bnxt_poll_work() into 2 separate
functions __bnxt_poll_work() and __bnxt_poll_work_done().  Since the logic
is separated, we need to add tx_pkts and events fields to struct bnxt_napi
to keep track of the events to handle between the 2 functions.  We also
add had_work_done field to struct bnxt_cp_ring_info to indicate whether
some work was performed on the CP ring.

This is needed to better support the 57500 chips.  We need to poll up to
2 separate CP rings before we update or ARM the CP rings on the 57500 chips.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 44 +++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  5 
 2 files changed, 38 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 065f4c2..10d713aa 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1889,8 +1889,8 @@ static irqreturn_t bnxt_inta(int irq, void *dev_instance)
return IRQ_HANDLED;
 }
 
-static int bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
- int budget)
+static int __bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
+   int budget)
 {
struct bnxt_napi *bnapi = cpr->bnapi;
u32 raw_cons = cpr->cp_raw_cons;
@@ -1913,6 +1913,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct 
bnxt_cp_ring_info *cpr,
 * reading any further.
 */
dma_rmb();
+   cpr->had_work_done = 1;
if (TX_CMP_TYPE(txcmp) == CMP_TYPE_TX_L2_CMP) {
tx_pkts++;
/* return full budget so NAPI will complete. */
@@ -1963,22 +1964,43 @@ static int bnxt_poll_work(struct bnxt *bp, struct 
bnxt_cp_ring_info *cpr,
}
 
cpr->cp_raw_cons = raw_cons;
-   /* ACK completion ring before freeing tx ring and producing new
-* buffers in rx/agg rings to prevent overflowing the completion
-* ring.
-*/
-   bnxt_db_cq(bp, &cpr->cp_db, cpr->cp_raw_cons);
+   bnapi->tx_pkts += tx_pkts;
+   bnapi->events |= event;
+   return rx_pkts;
+}
 
-   if (tx_pkts)
-   bnapi->tx_int(bp, bnapi, tx_pkts);
+static void __bnxt_poll_work_done(struct bnxt *bp, struct bnxt_napi *bnapi)
+{
+   if (bnapi->tx_pkts) {
+   bnapi->tx_int(bp, bnapi, bnapi->tx_pkts);
+   bnapi->tx_pkts = 0;
+   }
 
-   if (event & BNXT_RX_EVENT) {
+   if (bnapi->events & BNXT_RX_EVENT) {
struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
 
bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
-   if (event & BNXT_AGG_EVENT)
+   if (bnapi->events & BNXT_AGG_EVENT)
bnxt_db_write(bp, &rxr->rx_agg_db, rxr->rx_agg_prod);
}
+   bnapi->events = 0;
+}
+
+static int bnxt_poll_work(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
+ int budget)
+{
+   struct bnxt_napi *bnapi = cpr->bnapi;
+   int rx_pkts;
+
+   rx_pkts = __bnxt_poll_work(bp, cpr, budget);
+
+   /* ACK completion ring before freeing tx ring and producing new
+* buffers in rx/agg rings to prevent overflowing the completion
+* ring.
+*/
+   bnxt_db_cq(bp, &cpr->cp_db, cpr->cp_raw_cons);
+
+   __bnxt_poll_work_done(bp, bnapi);
return rx_pkts;
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 50b129e..48cb2d5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -792,6 +792,8 @@ struct bnxt_cp_ring_info {
u32 cp_raw_cons;
struct bnxt_db_info cp_db;
 
+   u8  had_work_done:1;
+
struct bnxt_coal  rx_ring_coal;
u64 rx_packets;
u64 rx_bytes;
@@ -829,6 +831,9 @@ struct bnxt_napi {
 
void(*tx_int)(struct bnxt *, struct bnxt_napi *,
  int);
+   int tx_pkts;
+   u8  events;
+
u32 flags;
 #define BNXT_NAPI_FLAG_XDP 0x1
 
-- 
2.5.1



[PATCH net-next 16/23] bnxt_en: Allocate/Free CP rings for 57500 series chips.

2018-10-14 Thread Michael Chan
On the new 57500 chips, we allocate/free one CP ring for each RX ring or
TX ring separately.  Using separate CP rings for RX/TX is an improvement
as TX events will no longer be stuck behind RX events.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 71 ---
 1 file changed, 66 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index db1dbad..7952100 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2758,7 +2758,7 @@ static int bnxt_init_one_rx_ring(struct bnxt *bp, int 
ring_nr)
 
 static void bnxt_init_cp_rings(struct bnxt *bp)
 {
-   int i;
+   int i, j;
 
for (i = 0; i < bp->cp_nr_rings; i++) {
struct bnxt_cp_ring_info *cpr = &bp->bnapi[i]->cp_ring;
@@ -2767,6 +2767,17 @@ static void bnxt_init_cp_rings(struct bnxt *bp)
ring->fw_ring_id = INVALID_HW_RING_ID;
cpr->rx_ring_coal.coal_ticks = bp->rx_coal.coal_ticks;
cpr->rx_ring_coal.coal_bufs = bp->rx_coal.coal_bufs;
+   for (j = 0; j < 2; j++) {
+   struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[j];
+
+   if (!cpr2)
+   continue;
+
+   ring = &cpr2->cp_ring_struct;
+   ring->fw_ring_id = INVALID_HW_RING_ID;
+   cpr2->rx_ring_coal.coal_ticks = bp->rx_coal.coal_ticks;
+   cpr2->rx_ring_coal.coal_bufs = bp->rx_coal.coal_bufs;
+   }
}
 }
 
@@ -4711,9 +4722,28 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
type = HWRM_RING_ALLOC_TX;
for (i = 0; i < bp->tx_nr_rings; i++) {
struct bnxt_tx_ring_info *txr = &bp->tx_ring[i];
-   struct bnxt_ring_struct *ring = &txr->tx_ring_struct;
-   u32 map_idx = i;
+   struct bnxt_ring_struct *ring;
+   u32 map_idx;
 
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   struct bnxt_napi *bnapi = txr->bnapi;
+   struct bnxt_cp_ring_info *cpr, *cpr2;
+   u32 type2 = HWRM_RING_ALLOC_CMPL;
+
+   cpr = &bnapi->cp_ring;
+   cpr2 = cpr->cp_ring_arr[BNXT_TX_HDL];
+   ring = &cpr2->cp_ring_struct;
+   ring->handle = BNXT_TX_HDL;
+   map_idx = bnapi->index;
+   rc = hwrm_ring_alloc_send_msg(bp, ring, type2, map_idx);
+   if (rc)
+   goto err_out;
+   bnxt_set_db(bp, &cpr2->cp_db, type2, map_idx,
+   ring->fw_ring_id);
+   bnxt_db_cq(bp, &cpr2->cp_db, cpr2->cp_raw_cons);
+   }
+   ring = &txr->tx_ring_struct;
+   map_idx = i;
rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx);
if (rc)
goto err_out;
@@ -4724,7 +4754,8 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
for (i = 0; i < bp->rx_nr_rings; i++) {
struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
struct bnxt_ring_struct *ring = &rxr->rx_ring_struct;
-   u32 map_idx = rxr->bnapi->index;
+   struct bnxt_napi *bnapi = rxr->bnapi;
+   u32 map_idx = bnapi->index;
 
rc = hwrm_ring_alloc_send_msg(bp, ring, type, map_idx);
if (rc)
@@ -4732,6 +4763,21 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
bnxt_set_db(bp, &rxr->rx_db, type, map_idx, ring->fw_ring_id);
bnxt_db_write(bp, &rxr->rx_db, rxr->rx_prod);
bp->grp_info[map_idx].rx_fw_ring_id = ring->fw_ring_id;
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+   u32 type2 = HWRM_RING_ALLOC_CMPL;
+   struct bnxt_cp_ring_info *cpr2;
+
+   cpr2 = cpr->cp_ring_arr[BNXT_RX_HDL];
+   ring = &cpr2->cp_ring_struct;
+   ring->handle = BNXT_RX_HDL;
+   rc = hwrm_ring_alloc_send_msg(bp, ring, type2, map_idx);
+   if (rc)
+   goto err_out;
+   bnxt_set_db(bp, &cpr2->cp_db, type2, map_idx,
+   ring->fw_ring_id);
+   bnxt_db_cq(bp, &cpr2->cp_db, cpr2->cp_raw_cons);
+   }
}
 
if (bp->flags & BNXT_FLAG_AGG_RINGS) {
@@ -4858,8 +4904,23 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool 
close_path)
for (i = 0; i < bp->cp_nr_ri

[PATCH net-next 03/23] bnxt_en: Add maximum extended request length fw message support.

2018-10-14 Thread Michael Chan
Support the max_ext_req_len field from the HWRM_VER_GET_RESPONSE.
If this field is valid and greater than the mailbox size, use the
short command format to send firmware messages greater than the
mailbox size.  Newer devices use this method to send larger messages
to the firmware.
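
Restating the dispatch decision above in isolation (a sketch of the logic
only, with an illustrative mailbox size; the real implementation is in the
diff below):

#include <stdio.h>

/* Messages that fit in the mailbox go the normal way unless the firmware
 * asked for the short format; larger messages must use the short (DMA)
 * format and are rejected if they exceed the advertised max_ext_req_len.
 */
#define MAILBOX_MAX_REQ_LEN	128	/* illustrative value */

enum hwrm_path { HWRM_PATH_MAILBOX, HWRM_PATH_SHORT_CMD, HWRM_PATH_INVALID };

static enum hwrm_path pick_path(int fw_short_cap, unsigned int msg_len,
				unsigned int max_ext_req_len)
{
	if (msg_len > MAILBOX_MAX_REQ_LEN) {
		if (msg_len > max_ext_req_len)
			return HWRM_PATH_INVALID;
		return HWRM_PATH_SHORT_CMD;
	}
	return fw_short_cap ? HWRM_PATH_SHORT_CMD : HWRM_PATH_MAILBOX;
}

int main(void)
{
	printf("%d %d %d\n",
	       pick_path(0, 64, 4096),		/* mailbox */
	       pick_path(0, 512, 4096),		/* short cmd */
	       pick_path(0, 8192, 4096));	/* invalid */
	return 0;
}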

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 34 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 2 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 84c1e6c..4c068e6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3042,7 +3042,7 @@ static void bnxt_free_hwrm_short_cmd_req(struct bnxt *bp)
if (bp->hwrm_short_cmd_req_addr) {
struct pci_dev *pdev = bp->pdev;
 
-   dma_free_coherent(&pdev->dev, BNXT_HWRM_MAX_REQ_LEN,
+   dma_free_coherent(&pdev->dev, bp->hwrm_max_ext_req_len,
  bp->hwrm_short_cmd_req_addr,
  bp->hwrm_short_cmd_req_dma_addr);
bp->hwrm_short_cmd_req_addr = NULL;
@@ -3054,7 +3054,7 @@ static int bnxt_alloc_hwrm_short_cmd_req(struct bnxt *bp)
struct pci_dev *pdev = bp->pdev;
 
bp->hwrm_short_cmd_req_addr =
-   dma_alloc_coherent(&pdev->dev, BNXT_HWRM_MAX_REQ_LEN,
+   dma_alloc_coherent(&pdev->dev, bp->hwrm_max_ext_req_len,
   &bp->hwrm_short_cmd_req_dma_addr,
   GFP_KERNEL);
if (!bp->hwrm_short_cmd_req_addr)
@@ -3469,12 +3469,27 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void 
*msg, u32 msg_len,
cp_ring_id = le16_to_cpu(req->cmpl_ring);
intr_process = (cp_ring_id == INVALID_HW_RING_ID) ? 0 : 1;
 
-   if (bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) {
+   if (msg_len > BNXT_HWRM_MAX_REQ_LEN) {
+   if (msg_len > bp->hwrm_max_ext_req_len ||
+   !bp->hwrm_short_cmd_req_addr)
+   return -EINVAL;
+   }
+
+   if ((bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) ||
+   msg_len > BNXT_HWRM_MAX_REQ_LEN) {
void *short_cmd_req = bp->hwrm_short_cmd_req_addr;
+   u16 max_msg_len;
+
+   /* Set boundary for maximum extended request length for short
+* cmd format. If passed up from device use the max supported
+* internal req length.
+*/
+   max_msg_len = bp->hwrm_max_ext_req_len;
 
memcpy(short_cmd_req, req, msg_len);
-   memset(short_cmd_req + msg_len, 0, BNXT_HWRM_MAX_REQ_LEN -
-  msg_len);
+   if (msg_len < max_msg_len)
+   memset(short_cmd_req + msg_len, 0,
+  max_msg_len - msg_len);
 
short_input.req_type = req->req_type;
short_input.signature =
@@ -5381,8 +5396,12 @@ static int bnxt_hwrm_ver_get(struct bnxt *bp)
if (!bp->hwrm_cmd_timeout)
bp->hwrm_cmd_timeout = DFLT_HWRM_CMD_TIMEOUT;
 
-   if (resp->hwrm_intf_maj_8b >= 1)
+   if (resp->hwrm_intf_maj_8b >= 1) {
bp->hwrm_max_req_len = le16_to_cpu(resp->max_req_win_len);
+   bp->hwrm_max_ext_req_len = le16_to_cpu(resp->max_ext_req_len);
+   }
+   if (bp->hwrm_max_ext_req_len < HWRM_MAX_REQ_LEN)
+   bp->hwrm_max_ext_req_len = HWRM_MAX_REQ_LEN;
 
bp->chip_num = le16_to_cpu(resp->chip_num);
if (bp->chip_num == CHIP_NUM_58700 && !resp->chip_rev &&
@@ -8908,7 +8927,8 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
if (rc)
goto init_err_pci_clean;
 
-   if (bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) {
+   if ((bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) ||
+   bp->hwrm_max_ext_req_len > BNXT_HWRM_MAX_REQ_LEN) {
rc = bnxt_alloc_hwrm_short_cmd_req(bp);
if (rc)
goto init_err_pci_clean;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 2cd7ee5..8b6874c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1315,6 +1315,7 @@ struct bnxt {
u16 fw_tx_stats_ext_size;
 
u16 hwrm_max_req_len;
+   u16 hwrm_max_ext_req_len;
int hwrm_cmd_timeout;
struct mutex  hwrm_cmd_lock;  /* serialize hwrm messages */
struct hwrm_ver_get_output  ver_resp;
-- 
2.5.1



[PATCH net-next 04/23] bnxt_en: Update interrupt coalescing logic.

2018-10-14 Thread Michael Chan
The new firmware spec. allows interrupt coalescing parameters, such as
maximums, timer units, and supported features, to be queried.  Update
the driver to make use of the new call to query these parameters
and fall back to the legacy defaults if the call is not available.

Replace the hard-coded values with these parameters.
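
For reference, the unit conversion this introduces works out as follows
(a small stand-alone check using the legacy default of timer_units = 80,
i.e. 80 ns per unit, shown in the diff below):

#include <stdio.h>

/* tmr = usec * 1000 / timer_units, as added by the patch.  With
 * timer_units = 80 this works out to 12.5 timer units per usec.
 */
static unsigned int usec_to_coal_tmr(unsigned int usec, unsigned int units)
{
	return usec * 1000 / units;
}

int main(void)
{
	printf("12 usec -> %u units\n", usec_to_coal_tmr(12, 80));	/* 150 */
	printf("25 usec -> %u units\n", usec_to_coal_tmr(25, 80));	/* 312 */
	return 0;
}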

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 107 --
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  39 ++-
 2 files changed, 125 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 4c068e6..83b1313 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4944,46 +4944,113 @@ static int bnxt_hwrm_check_rings(struct bnxt *bp, int 
tx_rings, int rx_rings,
cp_rings, vnics);
 }
 
-static void bnxt_hwrm_set_coal_params(struct bnxt_coal *hw_coal,
+static void bnxt_hwrm_coal_params_qcaps(struct bnxt *bp)
+{
+   struct hwrm_ring_aggint_qcaps_output *resp = bp->hwrm_cmd_resp_addr;
+   struct bnxt_coal_cap *coal_cap = &bp->coal_cap;
+   struct hwrm_ring_aggint_qcaps_input req = {0};
+   int rc;
+
+   coal_cap->cmpl_params = BNXT_LEGACY_COAL_CMPL_PARAMS;
+   coal_cap->num_cmpl_dma_aggr_max = 63;
+   coal_cap->num_cmpl_dma_aggr_during_int_max = 63;
+   coal_cap->cmpl_aggr_dma_tmr_max = 65535;
+   coal_cap->cmpl_aggr_dma_tmr_during_int_max = 65535;
+   coal_cap->int_lat_tmr_min_max = 65535;
+   coal_cap->int_lat_tmr_max_max = 65535;
+   coal_cap->num_cmpl_aggr_int_max = 65535;
+   coal_cap->timer_units = 80;
+
+   if (bp->hwrm_spec_code < 0x10902)
+   return;
+
+   bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_RING_AGGINT_QCAPS, -1, -1);
+   mutex_lock(&bp->hwrm_cmd_lock);
+   rc = _hwrm_send_message_silent(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   if (!rc) {
+   coal_cap->cmpl_params = le32_to_cpu(resp->cmpl_params);
+   coal_cap->num_cmpl_dma_aggr_max =
+   le16_to_cpu(resp->num_cmpl_dma_aggr_max);
+   coal_cap->num_cmpl_dma_aggr_during_int_max =
+   le16_to_cpu(resp->num_cmpl_dma_aggr_during_int_max);
+   coal_cap->cmpl_aggr_dma_tmr_max =
+   le16_to_cpu(resp->cmpl_aggr_dma_tmr_max);
+   coal_cap->cmpl_aggr_dma_tmr_during_int_max =
+   le16_to_cpu(resp->cmpl_aggr_dma_tmr_during_int_max);
+   coal_cap->int_lat_tmr_min_max =
+   le16_to_cpu(resp->int_lat_tmr_min_max);
+   coal_cap->int_lat_tmr_max_max =
+   le16_to_cpu(resp->int_lat_tmr_max_max);
+   coal_cap->num_cmpl_aggr_int_max =
+   le16_to_cpu(resp->num_cmpl_aggr_int_max);
+   coal_cap->timer_units = le16_to_cpu(resp->timer_units);
+   }
+   mutex_unlock(&bp->hwrm_cmd_lock);
+}
+
+static u16 bnxt_usec_to_coal_tmr(struct bnxt *bp, u16 usec)
+{
+   struct bnxt_coal_cap *coal_cap = &bp->coal_cap;
+
+   return usec * 1000 / coal_cap->timer_units;
+}
+
+static void bnxt_hwrm_set_coal_params(struct bnxt *bp,
+   struct bnxt_coal *hw_coal,
struct hwrm_ring_cmpl_ring_cfg_aggint_params_input *req)
 {
-   u16 val, tmr, max, flags;
+   struct bnxt_coal_cap *coal_cap = &bp->coal_cap;
+   u32 cmpl_params = coal_cap->cmpl_params;
+   u16 val, tmr, max, flags = 0;
 
max = hw_coal->bufs_per_record * 128;
if (hw_coal->budget)
max = hw_coal->bufs_per_record * hw_coal->budget;
+   max = min_t(u16, max, coal_cap->num_cmpl_aggr_int_max);
 
val = clamp_t(u16, hw_coal->coal_bufs, 1, max);
req->num_cmpl_aggr_int = cpu_to_le16(val);
 
-   /* This is a 6-bit value and must not be 0, or we'll get non stop IRQ */
-   val = min_t(u16, val, 63);
+   val = min_t(u16, val, coal_cap->num_cmpl_dma_aggr_max);
req->num_cmpl_dma_aggr = cpu_to_le16(val);
 
-   /* This is a 6-bit value and must not be 0, or we'll get non stop IRQ */
-   val = clamp_t(u16, hw_coal->coal_bufs_irq, 1, 63);
+   val = clamp_t(u16, hw_coal->coal_bufs_irq, 1,
+ coal_cap->num_cmpl_dma_aggr_during_int_max);
req->num_cmpl_dma_aggr_during_int = cpu_to_le16(val);
 
-   tmr = BNXT_USEC_TO_COAL_TIMER(hw_coal->coal_ticks);
-   tmr = max_t(u16, tmr, 1);
+   tmr = bnxt_usec_to_coal_tmr(bp, hw_coal->coal_ticks);
+   tmr = clamp_t(u16, tmr, 1, coal_cap->int_lat_tmr_max_max);
req->int_lat_tmr_max = cpu_to_le16(tmr);
 
/* min timer set to 1/2 of interrupt timer */
-   val = tmr / 2;
-   req->int_lat_tmr_min = cpu_to_le16(val

[PATCH net-next 01/23] bnxt_en: Update firmware interface spec. to 1.10.0.3.

2018-10-14 Thread Michael Chan
Among the new changes are trusted VF support, 200Gbps support, and new
API to dump ring information on the new chips.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |   6 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h | 310 ++
 2 files changed, 224 insertions(+), 92 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index bde3846..766c50b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -12,11 +12,11 @@
 #define BNXT_H
 
 #define DRV_MODULE_NAME  "bnxt_en"
-#define DRV_MODULE_VERSION "1.9.2"
+#define DRV_MODULE_VERSION "1.10.0"
 
 #define DRV_VER_MAJ  1
-#define DRV_VER_MIN  9
-#define DRV_VER_UPD  2
+#define DRV_VER_MIN  10
+#define DRV_VER_UPD  0
 
 #include 
 #include 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
index 971ace5d..5dd0860 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
@@ -37,6 +37,8 @@ struct hwrm_resp_hdr {
 #define TLV_TYPE_HWRM_REQUEST0x1UL
 #define TLV_TYPE_HWRM_RESPONSE   0x2UL
 #define TLV_TYPE_ROCE_SP_COMMAND 0x3UL
+#define TLV_TYPE_QUERY_ROCE_CC_GEN1  0x4UL
+#define TLV_TYPE_MODIFY_ROCE_CC_GEN1 0x5UL
 #define TLV_TYPE_ENGINE_CKV_DEVICE_SERIAL_NUMBER 0x8001UL
 #define TLV_TYPE_ENGINE_CKV_NONCE0x8002UL
 #define TLV_TYPE_ENGINE_CKV_IV   0x8003UL
@@ -186,6 +188,7 @@ struct cmd_nums {
#define HWRM_TUNNEL_DST_PORT_QUERY0xa0UL
#define HWRM_TUNNEL_DST_PORT_ALLOC0xa1UL
#define HWRM_TUNNEL_DST_PORT_FREE 0xa2UL
+   #define HWRM_STAT_CTX_ENG_QUERY   0xafUL
#define HWRM_STAT_CTX_ALLOC   0xb0UL
#define HWRM_STAT_CTX_FREE0xb1UL
#define HWRM_STAT_CTX_QUERY   0xb2UL
@@ -235,6 +238,7 @@ struct cmd_nums {
#define HWRM_CFA_PAIR_INFO0x10fUL
#define HWRM_FW_IPC_MSG   0x110UL
#define HWRM_CFA_REDIRECT_TUNNEL_TYPE_INFO0x111UL
+   #define HWRM_CFA_REDIRECT_QUERY_TUNNEL_TYPE   0x112UL
#define HWRM_ENGINE_CKV_HELLO 0x12dUL
#define HWRM_ENGINE_CKV_STATUS0x12eUL
#define HWRM_ENGINE_CKV_CKEK_ADD  0x12fUL
@@ -295,6 +299,7 @@ struct cmd_nums {
#define HWRM_DBG_COREDUMP_RETRIEVE0xff19UL
#define HWRM_DBG_FW_CLI   0xff1aUL
#define HWRM_DBG_I2C_CMD  0xff1bUL
+   #define HWRM_DBG_RING_INFO_GET0xff1cUL
#define HWRM_NVM_FACTORY_DEFAULTS 0xffeeUL
#define HWRM_NVM_VALIDATE_OPTION  0xffefUL
#define HWRM_NVM_FLUSH0xfff0UL
@@ -320,20 +325,21 @@ struct cmd_nums {
 /* ret_codes (size:64b/8B) */
 struct ret_codes {
__le16  error_code;
-   #define HWRM_ERR_CODE_SUCCESS0x0UL
-   #define HWRM_ERR_CODE_FAIL   0x1UL
-   #define HWRM_ERR_CODE_INVALID_PARAMS 0x2UL
-   #define HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED 0x3UL
-   #define HWRM_ERR_CODE_RESOURCE_ALLOC_ERROR   0x4UL
-   #define HWRM_ERR_CODE_INVALID_FLAGS  0x5UL
-   #define HWRM_ERR_CODE_INVALID_ENABLES0x6UL
-   #define HWRM_ERR_CODE_UNSUPPORTED_TLV0x7UL
-   #define HWRM_ERR_CODE_NO_BUFFER  0x8UL
-   #define HWRM_ERR_CODE_UNSUPPORTED_OPTION_ERR 0x9UL
-   #define HWRM_ERR_CODE_HWRM_ERROR 0xfUL
-   #define HWRM_ERR_CODE_UNKNOWN_ERR0xfffeUL
-   #define HWRM_ERR_CODE_CMD_NOT_SUPPORTED  0xffffUL
-   #define HWRM_ERR_CODE_LAST  HWRM_ERR_CODE_CMD_NOT_SUPPORTED
+   #define HWRM_ERR_CODE_SUCCESS   0x0UL
+   #define HWRM_ERR_CODE_FAIL  0x1UL
+   #define HWRM_ERR_CODE_INVALID_PARAMS0x2UL
+   #define HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED0x3UL
+   #define HWRM_ERR_CODE_RESOURCE_ALLOC_ERROR  0x4UL
+   #define HWRM_ERR_CODE_INVALID_FLAGS 0x5UL
+   #define HWRM_ERR_CODE_INVALID_ENABLES   0x6UL
+   #define HWRM_ERR_CODE_UNSUPPORTED_TLV   0x7UL
+   #define HWRM_ERR_CODE_NO_BUFFER 0x8UL
+   #define HWRM_ERR_CODE_UNSUPPORTED_OPTION_ERR0x9UL
+   #define HWRM_ERR_CODE_HWRM_ERROR0xfUL
+   #define HWRM_ERR_CODE_TLV_ENCAPSULATED_RESPONSE 0x8000UL
+   #define HWRM_ERR_CODE_UNKNOWN_ERR   0xfffeUL
+   #define HWRM_ERR_CODE_CMD_

[PATCH net-next 19/23] bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX path.

2018-10-14 Thread Michael Chan
In the RX code path, we currently use the bnxt_napi struct pointer to
identify the associated RX/CP rings.  Change it to use the struct
bnxt_cp_ring_info pointer instead since there are now up to 2
CP rings per MSIX.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 69 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 19 ---
 2 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index d1f9130..5ec477f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -807,11 +807,11 @@ static inline int bnxt_alloc_rx_page(struct bnxt *bp,
return 0;
 }
 
-static void bnxt_reuse_rx_agg_bufs(struct bnxt_napi *bnapi, u16 cp_cons,
+static void bnxt_reuse_rx_agg_bufs(struct bnxt_cp_ring_info *cpr, u16 cp_cons,
   u32 agg_bufs)
 {
+   struct bnxt_napi *bnapi = cpr->bnapi;
struct bnxt *bp = bnapi->bp;
-   struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
u16 prod = rxr->rx_agg_prod;
u16 sw_prod = rxr->rx_sw_agg_prod;
@@ -934,12 +934,13 @@ static struct sk_buff *bnxt_rx_skb(struct bnxt *bp,
return skb;
 }
 
-static struct sk_buff *bnxt_rx_pages(struct bnxt *bp, struct bnxt_napi *bnapi,
+static struct sk_buff *bnxt_rx_pages(struct bnxt *bp,
+struct bnxt_cp_ring_info *cpr,
 struct sk_buff *skb, u16 cp_cons,
 u32 agg_bufs)
 {
+   struct bnxt_napi *bnapi = cpr->bnapi;
struct pci_dev *pdev = bp->pdev;
-   struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
u16 prod = rxr->rx_agg_prod;
u32 i;
@@ -986,7 +987,7 @@ static struct sk_buff *bnxt_rx_pages(struct bnxt *bp, 
struct bnxt_napi *bnapi,
 * allocated already.
 */
rxr->rx_agg_prod = prod;
-   bnxt_reuse_rx_agg_bufs(bnapi, cp_cons, agg_bufs - i);
+   bnxt_reuse_rx_agg_bufs(cpr, cp_cons, agg_bufs - i);
return NULL;
}
 
@@ -1043,10 +1044,9 @@ static inline struct sk_buff *bnxt_copy_skb(struct 
bnxt_napi *bnapi, u8 *data,
return skb;
 }
 
-static int bnxt_discard_rx(struct bnxt *bp, struct bnxt_napi *bnapi,
+static int bnxt_discard_rx(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
   u32 *raw_cons, void *cmp)
 {
-   struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
struct rx_cmp *rxcmp = cmp;
u32 tmp_raw_cons = *raw_cons;
u8 cmp_type, agg_bufs = 0;
@@ -1172,11 +1172,11 @@ static void bnxt_tpa_start(struct bnxt *bp, struct 
bnxt_rx_ring_info *rxr,
cons_rx_buf->data = NULL;
 }
 
-static void bnxt_abort_tpa(struct bnxt *bp, struct bnxt_napi *bnapi,
-  u16 cp_cons, u32 agg_bufs)
+static void bnxt_abort_tpa(struct bnxt_cp_ring_info *cpr, u16 cp_cons,
+  u32 agg_bufs)
 {
if (agg_bufs)
-   bnxt_reuse_rx_agg_bufs(bnapi, cp_cons, agg_bufs);
+   bnxt_reuse_rx_agg_bufs(cpr, cp_cons, agg_bufs);
 }
 
 static struct sk_buff *bnxt_gro_func_5731x(struct bnxt_tpa_info *tpa_info,
@@ -1370,13 +1370,13 @@ static struct net_device *bnxt_get_pkt_dev(struct bnxt 
*bp, u16 cfa_code)
 }
 
 static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
-  struct bnxt_napi *bnapi,
+  struct bnxt_cp_ring_info *cpr,
   u32 *raw_cons,
   struct rx_tpa_end_cmp *tpa_end,
   struct rx_tpa_end_cmp_ext *tpa_end1,
   u8 *event)
 {
-   struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
+   struct bnxt_napi *bnapi = cpr->bnapi;
struct bnxt_rx_ring_info *rxr = bnapi->rx_ring;
u8 agg_id = TPA_END_AGG_ID(tpa_end);
u8 *data_ptr, agg_bufs;
@@ -1388,7 +1388,7 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt 
*bp,
void *data;
 
if (unlikely(bnapi->in_reset)) {
-   int rc = bnxt_discard_rx(bp, bnapi, raw_cons, tpa_end);
+   int rc = bnxt_discard_rx(bp, cpr, raw_cons, tpa_end);
 
if (rc < 0)
return ERR_PTR(-EBUSY);
@@ -1414,7 +1414,7 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt 
*bp,
}
 
if (unlikely(agg_bufs > MAX_SKB_FRAGS || TPA_END_ERRORS(tpa_end1))) {
-   bnxt_abort_tpa(bp, bnapi, cp_cons, agg_bufs);
+   bnxt_abort_tpa(cpr, cp_cons, agg_bufs);
if (a

[PATCH net-next 09/23] bnxt_en: Add 57500 new chip ID and basic structures.

2018-10-14 Thread Michael Chan
57500 series is a new chip class (P5) that requires some driver changes
in the next several patches.  This adds basic chip ID, doorbells, and
the notification queue (NQ) structures.  Each MSIX is associated with an
NQ, instead of a CP ring as on legacy chips.  Each NQ has up to 2 associated
CP rings for RX and TX.  The same bnxt_cp_ring_info struct will be used
for the NQ.
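
A rough conceptual sketch of the relationship described above (the
cp_ring_arr[] indices come from a later patch in this series and are shown
here only for illustration):

/* 57500 (P5) chips, per MSIX vector:
 *
 *   MSIX vector
 *        |
 *        v
 *   NQ  (bnapi->cp_ring, a bnxt_cp_ring_info reused as the NQ)
 *        |-- cp_ring_arr[BNXT_RX_HDL] -> CP ring servicing the RX ring
 *        `-- cp_ring_arr[BNXT_TX_HDL] -> CP ring servicing the TX ring
 */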

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 48 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 55 +--
 2 files changed, 88 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index b0e2416..88ea8c7 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3322,6 +3322,13 @@ static int bnxt_alloc_mem(struct bnxt *bp, bool 
irq_re_init)
bp->bnapi[i] = bnapi;
bp->bnapi[i]->index = i;
bp->bnapi[i]->bp = bp;
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   struct bnxt_cp_ring_info *cpr =
+   &bp->bnapi[i]->cp_ring;
+
+   cpr->cp_ring_struct.ring_mem.flags =
+   BNXT_RMEM_RING_PTE_FLAG;
+   }
}
 
bp->rx_ring = kcalloc(bp->rx_nr_rings,
@@ -3331,7 +3338,15 @@ static int bnxt_alloc_mem(struct bnxt *bp, bool 
irq_re_init)
return -ENOMEM;
 
for (i = 0; i < bp->rx_nr_rings; i++) {
-   bp->rx_ring[i].bnapi = bp->bnapi[i];
+   struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
+
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   rxr->rx_ring_struct.ring_mem.flags =
+   BNXT_RMEM_RING_PTE_FLAG;
+   rxr->rx_agg_ring_struct.ring_mem.flags =
+   BNXT_RMEM_RING_PTE_FLAG;
+   }
+   rxr->bnapi = bp->bnapi[i];
bp->bnapi[i]->rx_ring = &bp->rx_ring[i];
}
 
@@ -3353,12 +3368,16 @@ static int bnxt_alloc_mem(struct bnxt *bp, bool 
irq_re_init)
j = bp->rx_nr_rings;
 
for (i = 0; i < bp->tx_nr_rings; i++, j++) {
-   bp->tx_ring[i].bnapi = bp->bnapi[j];
-   bp->bnapi[j]->tx_ring = &bp->tx_ring[i];
+   struct bnxt_tx_ring_info *txr = &bp->tx_ring[i];
+
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   txr->tx_ring_struct.ring_mem.flags =
+   BNXT_RMEM_RING_PTE_FLAG;
+   txr->bnapi = bp->bnapi[j];
+   bp->bnapi[j]->tx_ring = txr;
bp->tx_ring_map[i] = bp->tx_nr_rings_xdp + i;
if (i >= bp->tx_nr_rings_xdp) {
-   bp->tx_ring[i].txq_index = i -
-   bp->tx_nr_rings_xdp;
+   txr->txq_index = i - bp->tx_nr_rings_xdp;
bp->bnapi[j]->tx_int = bnxt_tx_int;
} else {
bp->bnapi[j]->flags |= BNXT_NAPI_FLAG_XDP;
@@ -9326,6 +9345,9 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
goto init_err_pci_clean;
}
 
+   if (BNXT_CHIP_P5(bp))
+   bp->flags |= BNXT_FLAG_CHIP_P5;
+
rc = bnxt_hwrm_func_reset(bp);
if (rc)
goto init_err_pci_clean;
@@ -9340,7 +9362,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
   NETIF_F_GSO_PARTIAL | NETIF_F_RXHASH |
   NETIF_F_RXCSUM | NETIF_F_GRO;
 
-   if (!BNXT_CHIP_TYPE_NITRO_A0(bp))
+   if (BNXT_SUPPORTS_TPA(bp))
dev->hw_features |= NETIF_F_LRO;
 
dev->hw_enc_features =
@@ -9354,7 +9376,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
dev->vlan_features = dev->hw_features | NETIF_F_HIGHDMA;
dev->hw_features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_STAG_RX | NETIF_F_HW_VLAN_STAG_TX;
-   if (!BNXT_CHIP_TYPE_NITRO_A0(bp))
+   if (BNXT_SUPPORTS_TPA(bp))
dev->hw_features |= NETIF_F_GRO_HW;
dev->features |= dev->hw_features | NETIF_F_HIGHDMA;
if (dev->features & NETIF_F_GRO_HW)
@@ -9365,10 +

[PATCH net-next 13/23] bnxt_en: Allocate completion ring structures for 57500 series chips.

2018-10-14 Thread Michael Chan
On 57500 chips, the original bnxt_cp_ring_info struct now refers to the
NQ.  bp->cp_nr_rings refer to the number of NQs on 57500 chips.  There
are now 2 pointers for the CP rings associated with RX and TX rings.
Modify bnxt_alloc_cp_rings() and bnxt_free_cp_rings() accordingly.

With multiple CP rings per NAPI, we need to add a pointer in
bnxt_cp_ring_info struct to point back to the bnxt_napi struct.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 64 +++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  3 ++
 2 files changed, 67 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index a0d7237..9af99dd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2482,6 +2482,7 @@ static void bnxt_free_cp_rings(struct bnxt *bp)
struct bnxt_napi *bnapi = bp->bnapi[i];
struct bnxt_cp_ring_info *cpr;
struct bnxt_ring_struct *ring;
+   int j;
 
if (!bnapi)
continue;
@@ -2490,11 +2491,50 @@ static void bnxt_free_cp_rings(struct bnxt *bp)
ring = &cpr->cp_ring_struct;
 
bnxt_free_ring(bp, &ring->ring_mem);
+
+   for (j = 0; j < 2; j++) {
+   struct bnxt_cp_ring_info *cpr2 = cpr->cp_ring_arr[j];
+
+   if (cpr2) {
+   ring = &cpr2->cp_ring_struct;
+   bnxt_free_ring(bp, &ring->ring_mem);
+   kfree(cpr2);
+   cpr->cp_ring_arr[j] = NULL;
+   }
+   }
}
 }
 
+static struct bnxt_cp_ring_info *bnxt_alloc_cp_sub_ring(struct bnxt *bp)
+{
+   struct bnxt_ring_mem_info *rmem;
+   struct bnxt_ring_struct *ring;
+   struct bnxt_cp_ring_info *cpr;
+   int rc;
+
+   cpr = kzalloc(sizeof(*cpr), GFP_KERNEL);
+   if (!cpr)
+   return NULL;
+
+   ring = &cpr->cp_ring_struct;
+   rmem = &ring->ring_mem;
+   rmem->nr_pages = bp->cp_nr_pages;
+   rmem->page_size = HW_CMPD_RING_SIZE;
+   rmem->pg_arr = (void **)cpr->cp_desc_ring;
+   rmem->dma_arr = cpr->cp_desc_mapping;
+   rmem->flags = BNXT_RMEM_RING_PTE_FLAG;
+   rc = bnxt_alloc_ring(bp, rmem);
+   if (rc) {
+   bnxt_free_ring(bp, rmem);
+   kfree(cpr);
+   cpr = NULL;
+   }
+   return cpr;
+}
+
 static int bnxt_alloc_cp_rings(struct bnxt *bp)
 {
+   bool sh = !!(bp->flags & BNXT_FLAG_SHARED_RINGS);
int i, rc, ulp_base_vec, ulp_msix;
 
ulp_msix = bnxt_get_ulp_msix_num(bp);
@@ -2508,6 +2548,7 @@ static int bnxt_alloc_cp_rings(struct bnxt *bp)
continue;
 
cpr = &bnapi->cp_ring;
+   cpr->bnapi = bnapi;
ring = &cpr->cp_ring_struct;
 
rc = bnxt_alloc_ring(bp, &ring->ring_mem);
@@ -2518,6 +2559,29 @@ static int bnxt_alloc_cp_rings(struct bnxt *bp)
ring->map_idx = i + ulp_msix;
else
ring->map_idx = i;
+
+   if (!(bp->flags & BNXT_FLAG_CHIP_P5))
+   continue;
+
+   if (i < bp->rx_nr_rings) {
+   struct bnxt_cp_ring_info *cpr2 =
+   bnxt_alloc_cp_sub_ring(bp);
+
+   cpr->cp_ring_arr[BNXT_RX_HDL] = cpr2;
+   if (!cpr2)
+   return -ENOMEM;
+   cpr2->bnapi = bnapi;
+   }
+   if ((sh && i < bp->tx_nr_rings) ||
+   (!sh && i >= bp->rx_nr_rings)) {
+   struct bnxt_cp_ring_info *cpr2 =
+   bnxt_alloc_cp_sub_ring(bp);
+
+   cpr->cp_ring_arr[BNXT_TX_HDL] = cpr2;
+   if (!cpr2)
+   return -ENOMEM;
+   cpr2->bnapi = bnapi;
+   }
}
return 0;
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 25d592d..589b0be 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -787,6 +787,7 @@ struct bnxt_rx_ring_info {
 };
 
 struct bnxt_cp_ring_info {
+   struct bnxt_napi*bnapi;
u32 cp_raw_cons;
struct bnxt_db_info cp_db;
 
@@ -812,6 +813,8 @@ struct bnxt_cp_ring_info {
struct bnxt_ring_struct cp_ring_struct;
 
struct bnxt_cp_ring_info *cp_ring_arr[2];
+#define BNXT_RX_HDL0
+#define BNXT_TX_HDL1
 };
 
 struct bnxt_napi {
-- 
2.5.1



[PATCH net-next 14/23] bnxt_en: Add helper functions to get firmware CP ring ID.

2018-10-14 Thread Michael Chan
On the new 57500 chips, getting the CP ring ID associated with an RX ring
or TX ring is different than before.  On the legacy chips,
we find the associated ring group and look up the CP ring ID.  On the
57500 chips, each RX ring and TX ring has a dedicated CP ring even if
they share the MSIX.  Use these helper functions at appropriate places
to get the CP ring ID.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 67 ++-
 1 file changed, 56 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 9af99dd..99af288 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2358,6 +2358,7 @@ static int bnxt_alloc_rx_rings(struct bnxt *bp)
if (rc)
return rc;
 
+   ring->grp_idx = i;
if (agg_rings) {
u16 mem_size;
 
@@ -4145,6 +4146,40 @@ static int bnxt_hwrm_vnic_set_tpa(struct bnxt *bp, u16 
vnic_id, u32 tpa_flags)
return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
 }
 
+static u16 bnxt_cp_ring_from_grp(struct bnxt *bp, struct bnxt_ring_struct 
*ring)
+{
+   struct bnxt_ring_grp_info *grp_info;
+
grp_info = &bp->grp_info[ring->grp_idx];
+   return grp_info->cp_fw_ring_id;
+}
+
+static u16 bnxt_cp_ring_for_rx(struct bnxt *bp, struct bnxt_rx_ring_info *rxr)
+{
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   struct bnxt_napi *bnapi = rxr->bnapi;
+   struct bnxt_cp_ring_info *cpr;
+
+   cpr = bnapi->cp_ring.cp_ring_arr[BNXT_RX_HDL];
+   return cpr->cp_ring_struct.fw_ring_id;
+   } else {
return bnxt_cp_ring_from_grp(bp, &rxr->rx_ring_struct);
+   }
+}
+
+static u16 bnxt_cp_ring_for_tx(struct bnxt *bp, struct bnxt_tx_ring_info *txr)
+{
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   struct bnxt_napi *bnapi = txr->bnapi;
+   struct bnxt_cp_ring_info *cpr;
+
+   cpr = bnapi->cp_ring.cp_ring_arr[BNXT_TX_HDL];
+   return cpr->cp_ring_struct.fw_ring_id;
+   } else {
return bnxt_cp_ring_from_grp(bp, &txr->tx_ring_struct);
+   }
+}
+
 static int bnxt_hwrm_vnic_set_rss(struct bnxt *bp, u16 vnic_id, bool set_rss)
 {
u32 i, j, max_rings;
@@ -4491,15 +4526,20 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
req.logical_id = cpu_to_le16(map_index);
 
switch (ring_type) {
-   case HWRM_RING_ALLOC_TX:
+   case HWRM_RING_ALLOC_TX: {
+   struct bnxt_tx_ring_info *txr;
+
+   txr = container_of(ring, struct bnxt_tx_ring_info,
+  tx_ring_struct);
req.ring_type = RING_ALLOC_REQ_RING_TYPE_TX;
/* Association of transmit ring with completion ring */
grp_info = &bp->grp_info[ring->grp_idx];
-   req.cmpl_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
+   req.cmpl_ring_id = cpu_to_le16(bnxt_cp_ring_for_tx(bp, txr));
req.length = cpu_to_le32(bp->tx_ring_mask + 1);
req.stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
req.queue_id = cpu_to_le16(ring->queue_id);
break;
+   }
case HWRM_RING_ALLOC_RX:
req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
req.length = cpu_to_le32(bp->rx_ring_mask + 1);
@@ -4711,9 +4751,9 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool 
close_path)
for (i = 0; i < bp->tx_nr_rings; i++) {
struct bnxt_tx_ring_info *txr = &bp->tx_ring[i];
struct bnxt_ring_struct *ring = &txr->tx_ring_struct;
-   u32 grp_idx = txr->bnapi->index;
-   u32 cmpl_ring_id = bp->grp_info[grp_idx].cp_fw_ring_id;
+   u32 cmpl_ring_id;
 
+   cmpl_ring_id = bnxt_cp_ring_for_tx(bp, txr);
if (ring->fw_ring_id != INVALID_HW_RING_ID) {
hwrm_ring_free_send_msg(bp, ring,
RING_FREE_REQ_RING_TYPE_TX,
@@ -4727,8 +4767,9 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool 
close_path)
struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
struct bnxt_ring_struct *ring = &rxr->rx_ring_struct;
u32 grp_idx = rxr->bnapi->index;
-   u32 cmpl_ring_id = bp->grp_info[grp_idx].cp_fw_ring_id;
+   u32 cmpl_ring_id;
 
+   cmpl_ring_id = bnxt_cp_ring_for_rx(bp, rxr);
if (ring->fw_ring_id != INVALID_HW_RING_ID) {
hwrm_ring_free_send_msg(bp, ring,
RING_FREE_REQ_RING_TYPE_RX,
@@ -4744,8 +4785,9 @@ static void 

[PATCH net-next 02/23] bnxt_en: Add additional extended port statistics.

2018-10-14 Thread Michael Chan
The latest firmware spec. adds some rx extended port stats and new
tx extended port stats.  We now need to check the size of the
returned rx and tx extended stats and determine how many counters are
valid.  New counters added include CoS byte and packet counts for rx
and tx.
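
As a hedged illustration of the size check (not part of the patch): the
returned sizes are in bytes and each counter is a u64, so a counter at a
given offset is only reported if its index is below the firmware-reported
count.  A minimal sketch, assuming the bp->fw_*_stats_ext_size fields
added by this patch:

static bool bnxt_rx_ext_counter_valid(struct bnxt *bp, u16 idx)
{
	/* idx is BNXT_RX_STATS_EXT_OFFSET(counter), i.e. byte offset / 8 */
	return idx < bp->fw_rx_stats_ext_size;
}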

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 30 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  7 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 91 +--
 3 files changed, 121 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index e2d9254..84c1e6c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3078,6 +3078,13 @@ static void bnxt_free_stats(struct bnxt *bp)
bp->hw_rx_port_stats = NULL;
}
 
+   if (bp->hw_tx_port_stats_ext) {
+   dma_free_coherent(&pdev->dev, sizeof(struct tx_port_stats_ext),
+ bp->hw_tx_port_stats_ext,
+ bp->hw_tx_port_stats_ext_map);
+   bp->hw_tx_port_stats_ext = NULL;
+   }
+
if (bp->hw_rx_port_stats_ext) {
dma_free_coherent(&pdev->dev, sizeof(struct rx_port_stats_ext),
  bp->hw_rx_port_stats_ext,
@@ -3152,6 +3159,13 @@ static int bnxt_alloc_stats(struct bnxt *bp)
if (!bp->hw_rx_port_stats_ext)
return 0;
 
+   if (bp->hwrm_spec_code >= 0x10902) {
+   bp->hw_tx_port_stats_ext =
+   dma_zalloc_coherent(&pdev->dev,
+   sizeof(struct tx_port_stats_ext),
+   &bp->hw_tx_port_stats_ext_map,
+   GFP_KERNEL);
+   }
bp->flags |= BNXT_FLAG_PORT_STATS_EXT;
}
return 0;
@@ -5425,8 +5439,10 @@ static int bnxt_hwrm_port_qstats(struct bnxt *bp)
 
 static int bnxt_hwrm_port_qstats_ext(struct bnxt *bp)
 {
+   struct hwrm_port_qstats_ext_output *resp = bp->hwrm_cmd_resp_addr;
struct hwrm_port_qstats_ext_input req = {0};
struct bnxt_pf_info *pf = &bp->pf;
+   int rc;
 
if (!(bp->flags & BNXT_FLAG_PORT_STATS_EXT))
return 0;
@@ -5435,7 +5451,19 @@ static int bnxt_hwrm_port_qstats_ext(struct bnxt *bp)
req.port_id = cpu_to_le16(pf->port_id);
req.rx_stat_size = cpu_to_le16(sizeof(struct rx_port_stats_ext));
req.rx_stat_host_addr = cpu_to_le64(bp->hw_rx_port_stats_ext_map);
-   return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   req.tx_stat_size = cpu_to_le16(sizeof(struct tx_port_stats_ext));
+   req.tx_stat_host_addr = cpu_to_le64(bp->hw_tx_port_stats_ext_map);
+   mutex_lock(&bp->hwrm_cmd_lock);
+   rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   if (!rc) {
+   bp->fw_rx_stats_ext_size = le16_to_cpu(resp->rx_stat_size) / 8;
+   bp->fw_tx_stats_ext_size = le16_to_cpu(resp->tx_stat_size) / 8;
+   } else {
+   bp->fw_rx_stats_ext_size = 0;
+   bp->fw_tx_stats_ext_size = 0;
+   }
+   mutex_unlock(&bp->hwrm_cmd_lock);
+   return rc;
 }
 
 static void bnxt_hwrm_free_tunnel_ports(struct bnxt *bp)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 766c50b..2cd7ee5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1305,10 +1305,14 @@ struct bnxt {
struct rx_port_stats*hw_rx_port_stats;
struct tx_port_stats*hw_tx_port_stats;
struct rx_port_stats_ext*hw_rx_port_stats_ext;
+   struct tx_port_stats_ext*hw_tx_port_stats_ext;
dma_addr_t  hw_rx_port_stats_map;
dma_addr_t  hw_tx_port_stats_map;
dma_addr_t  hw_rx_port_stats_ext_map;
+   dma_addr_t  hw_tx_port_stats_ext_map;
int hw_port_stats_size;
+   u16 fw_rx_stats_ext_size;
+   u16 fw_tx_stats_ext_size;
 
u16 hwrm_max_req_len;
int hwrm_cmd_timeout;
@@ -1425,6 +1429,9 @@ struct bnxt {
 #define BNXT_RX_STATS_EXT_OFFSET(counter)  \
(offsetof(struct rx_port_stats_ext, counter) / 8)
 
+#define BNXT_TX_STATS_EXT_OFFSET(counter)  \
+   (offsetof(struct tx_port_stats_ext, counter) / 8)
+
 #define I2C_DEV_ADDR_A00xa0
 #define I2C_DEV_ADDR_A20xa2
 #define SFF_DIAG_SUPPORT_OFFSET0x5c
diff --git a/drivers/net/et

[PATCH net-next 05/23] bnxt_en: Refactor bnxt_ring_struct.

2018-10-14 Thread Michael Chan
Move the DMA page table and vmem fields in bnxt_ring_struct to a new
bnxt_ring_mem_info struct.  This will allow context memory management
for a new device to re-use some of the existing infrastructure.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 138 --
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |   6 +-
 2 files changed, 77 insertions(+), 67 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 83b1313..602dc09 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2202,60 +2202,60 @@ static void bnxt_free_skbs(struct bnxt *bp)
bnxt_free_rx_skbs(bp);
 }
 
-static void bnxt_free_ring(struct bnxt *bp, struct bnxt_ring_struct *ring)
+static void bnxt_free_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
 {
struct pci_dev *pdev = bp->pdev;
int i;
 
-   for (i = 0; i < ring->nr_pages; i++) {
-   if (!ring->pg_arr[i])
+   for (i = 0; i < rmem->nr_pages; i++) {
+   if (!rmem->pg_arr[i])
continue;
 
-   dma_free_coherent(&pdev->dev, ring->page_size,
- ring->pg_arr[i], ring->dma_arr[i]);
+   dma_free_coherent(&pdev->dev, rmem->page_size,
+ rmem->pg_arr[i], rmem->dma_arr[i]);
 
-   ring->pg_arr[i] = NULL;
+   rmem->pg_arr[i] = NULL;
}
-   if (ring->pg_tbl) {
-   dma_free_coherent(&pdev->dev, ring->nr_pages * 8,
- ring->pg_tbl, ring->pg_tbl_map);
-   ring->pg_tbl = NULL;
+   if (rmem->pg_tbl) {
+   dma_free_coherent(&pdev->dev, rmem->nr_pages * 8,
+ rmem->pg_tbl, rmem->pg_tbl_map);
+   rmem->pg_tbl = NULL;
}
-   if (ring->vmem_size && *ring->vmem) {
-   vfree(*ring->vmem);
-   *ring->vmem = NULL;
+   if (rmem->vmem_size && *rmem->vmem) {
+   vfree(*rmem->vmem);
+   *rmem->vmem = NULL;
}
 }
 
-static int bnxt_alloc_ring(struct bnxt *bp, struct bnxt_ring_struct *ring)
+static int bnxt_alloc_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
 {
-   int i;
struct pci_dev *pdev = bp->pdev;
+   int i;
 
-   if (ring->nr_pages > 1) {
-   ring->pg_tbl = dma_alloc_coherent(&pdev->dev,
- ring->nr_pages * 8,
- &ring->pg_tbl_map,
+   if (rmem->nr_pages > 1) {
+   rmem->pg_tbl = dma_alloc_coherent(&pdev->dev,
+ rmem->nr_pages * 8,
+ &rmem->pg_tbl_map,
  GFP_KERNEL);
-   if (!ring->pg_tbl)
+   if (!rmem->pg_tbl)
return -ENOMEM;
}
 
-   for (i = 0; i < ring->nr_pages; i++) {
-   ring->pg_arr[i] = dma_alloc_coherent(&pdev->dev,
-ring->page_size,
-&ring->dma_arr[i],
+   for (i = 0; i < rmem->nr_pages; i++) {
+   rmem->pg_arr[i] = dma_alloc_coherent(&pdev->dev,
+rmem->page_size,
+&rmem->dma_arr[i],
 GFP_KERNEL);
-   if (!ring->pg_arr[i])
+   if (!rmem->pg_arr[i])
return -ENOMEM;
 
-   if (ring->nr_pages > 1)
-   ring->pg_tbl[i] = cpu_to_le64(ring->dma_arr[i]);
+   if (rmem->nr_pages > 1)
+   rmem->pg_tbl[i] = cpu_to_le64(rmem->dma_arr[i]);
}
 
-   if (ring->vmem_size) {
-   *ring->vmem = vzalloc(ring->vmem_size);
-   if (!(*ring->vmem))
+   if (rmem->vmem_size) {
+   *rmem->vmem = vzalloc(rmem->vmem_size);
+   if (!(*rmem->vmem))
return -ENOMEM;
}
return 0;
@@ -2285,10 +2285,10 @@ static void bnxt_free_rx_rings(struct bnxt *bp)
rxr->rx_agg_bmap = NULL;
 
ring = &rxr->rx_ring_struct;
-   bnxt_free_ring(bp, ring);
+   bnxt_free_ring(bp, &ring->ring_mem);

ring = &rxr->rx_agg_ring_struct;
-   bnxt_free_ring(bp, ring);
+   bnxt_free_ring(bp, &ring->ring_mem);
}
 }
 
@@ -2315,7 +2315,7 @@ static int bnxt_alloc_rx_rings(struct bnxt *bp)
 

[PATCH net-next 18/23] bnxt_en: Add RSS support for 57500 chips.

2018-10-14 Thread Michael Chan
RSS context allocation and RSS indirection table setup are very different
on the new chip.  Refactor bnxt_setup_vnic() to call 2 different functions
to set up RSS for the vnic based on chip type.  On the new chip, the
number of RSS contexts and the indirection table size depends on the
number of RX rings.  Each indirection table entry is also different
on the new chip since ring groups are no longer used.
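
A minimal sketch of the sizing rule, assuming the kernel's DIV_ROUND_UP()
helper (each RSS context on the new chip covers up to 64 RX rings, and
each indirection entry is an RX ring / CP ring ID pair, as in the hunk
below):

	nr_ctxs = DIV_ROUND_UP(bp->rx_nr_rings, 64);
	/* e.g. 1-64 RX rings -> 1 context, 65-128 -> 2 contexts, ... */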

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 113 --
 1 file changed, 109 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1a31328..d1f9130 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4202,7 +4202,8 @@ static int bnxt_hwrm_vnic_set_rss(struct bnxt *bp, u16 
vnic_id, bool set_rss)
struct bnxt_vnic_info *vnic = &bp->vnic_info[vnic_id];
struct hwrm_vnic_rss_cfg_input req = {0};
 
-   if (vnic->fw_rss_cos_lb_ctx[0] == INVALID_HW_RING_ID)
+   if ((bp->flags & BNXT_FLAG_CHIP_P5) ||
+   vnic->fw_rss_cos_lb_ctx[0] == INVALID_HW_RING_ID)
return 0;
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_VNIC_RSS_CFG, -1, -1);
@@ -4233,6 +4234,51 @@ static int bnxt_hwrm_vnic_set_rss(struct bnxt *bp, u16 
vnic_id, bool set_rss)
return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
 }
 
+static int bnxt_hwrm_vnic_set_rss_p5(struct bnxt *bp, u16 vnic_id, bool 
set_rss)
+{
+   struct bnxt_vnic_info *vnic = &bp->vnic_info[vnic_id];
+   u32 i, j, k, nr_ctxs, max_rings = bp->rx_nr_rings;
+   struct bnxt_rx_ring_info *rxr = &bp->rx_ring[0];
+   struct hwrm_vnic_rss_cfg_input req = {0};
+
+   bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_VNIC_RSS_CFG, -1, -1);
+   req.vnic_id = cpu_to_le16(vnic->fw_vnic_id);
+   if (!set_rss) {
+   hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   return 0;
+   }
+   req.hash_type = cpu_to_le32(bp->rss_hash_cfg);
+   req.hash_mode_flags = VNIC_RSS_CFG_REQ_HASH_MODE_FLAGS_DEFAULT;
+   req.ring_grp_tbl_addr = cpu_to_le64(vnic->rss_table_dma_addr);
+   req.hash_key_tbl_addr = cpu_to_le64(vnic->rss_hash_key_dma_addr);
+   nr_ctxs = DIV_ROUND_UP(bp->rx_nr_rings, 64);
+   for (i = 0, k = 0; i < nr_ctxs; i++) {
+   __le16 *ring_tbl = vnic->rss_table;
+   int rc;
+
+   req.ring_table_pair_index = i;
+   req.rss_ctx_idx = cpu_to_le16(vnic->fw_rss_cos_lb_ctx[i]);
+   for (j = 0; j < 64; j++) {
+   u16 ring_id;
+
+   ring_id = rxr->rx_ring_struct.fw_ring_id;
+   *ring_tbl++ = cpu_to_le16(ring_id);
+   ring_id = bnxt_cp_ring_for_rx(bp, rxr);
+   *ring_tbl++ = cpu_to_le16(ring_id);
+   rxr++;
+   k++;
+   if (k == max_rings) {
+   k = 0;
+   rxr = &bp->rx_ring[0];
+   }
+   }
+   rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   if (rc)
+   return -EIO;
+   }
+   return 0;
+}
+
 static int bnxt_hwrm_vnic_set_hds(struct bnxt *bp, u16 vnic_id)
 {
struct bnxt_vnic_info *vnic = &bp->vnic_info[vnic_id];
@@ -4316,6 +4362,18 @@ int bnxt_hwrm_vnic_cfg(struct bnxt *bp, u16 vnic_id)
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_VNIC_CFG, -1, -1);
 
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   struct bnxt_rx_ring_info *rxr = &bp->rx_ring[0];
+
+   req.default_rx_ring_id =
+   cpu_to_le16(rxr->rx_ring_struct.fw_ring_id);
+   req.default_cmpl_ring_id =
+   cpu_to_le16(bnxt_cp_ring_for_rx(bp, rxr));
+   req.enables =
+   cpu_to_le32(VNIC_CFG_REQ_ENABLES_DEFAULT_RX_RING_ID |
+   VNIC_CFG_REQ_ENABLES_DEFAULT_CMPL_RING_ID);
+   goto vnic_mru;
+   }
req.enables = cpu_to_le32(VNIC_CFG_REQ_ENABLES_DFLT_RING_GRP);
/* Only RSS support for now TBD: COS & LB */
if (vnic->fw_rss_cos_lb_ctx[0] != INVALID_HW_RING_ID) {
@@ -4348,13 +4406,13 @@ int bnxt_hwrm_vnic_cfg(struct bnxt *bp, u16 vnic_id)
ring = bp->rx_nr_rings - 1;
 
grp_idx = bp->rx_ring[ring].bnapi->index;
-   req.vnic_id = cpu_to_le16(vnic->fw_vnic_id);
req.dflt_ring_grp = cpu_to_le16(bp->grp_info[grp_idx].fw_grp_id);
-
req.lb_rule = cpu_to_le16(0x);
+vnic_mru:
req.mru = cpu_to_le16(bp->dev->mtu + ETH_HLEN + ETH_FCS_LEN +
  VLAN_HLEN);
 
+   req.vnic_id = cpu_to_le16(vnic->fw_vnic_id);
 #ifdef CONFIG_BNX

[PATCH net-next 07/23] bnxt_en: Check context memory requirements from firmware.

2018-10-14 Thread Michael Chan
New device requires host context memory as a backing store.  Call
firmware to check for context memory requirements and store the
parameters.  Allocate host pages accordingly.

We also need to move the call bnxt_hwrm_queue_qportcfg() earlier
so that all the supported hardware queues and the IDs are known
before checking and allocating context memory.
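
A hypothetical example of how one context type could be sized from the
parameters returned by HWRM_FUNC_BACKING_STORE_QCAPS (the field and macro
names are from the hunks below; the exact entry counts used by the driver
may differ):

	u32 mem_size = ctx->qp_max_l2_entries * ctx->qp_entry_size;
	u32 nr_pages = DIV_ROUND_UP(mem_size, BNXT_PAGE_SIZE);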

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 208 --
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  48 +++
 2 files changed, 248 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f0da558..83427da 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5255,6 +5255,187 @@ static int bnxt_hwrm_func_qcfg(struct bnxt *bp)
return rc;
 }
 
+static int bnxt_hwrm_func_backing_store_qcaps(struct bnxt *bp)
+{
+   struct hwrm_func_backing_store_qcaps_input req = {0};
+   struct hwrm_func_backing_store_qcaps_output *resp =
+   bp->hwrm_cmd_resp_addr;
+   int rc;
+
+   if (bp->hwrm_spec_code < 0x10902 || BNXT_VF(bp) || bp->ctx)
+   return 0;
+
+   bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_BACKING_STORE_QCAPS, -1, -1);
+   mutex_lock(&bp->hwrm_cmd_lock);
+   rc = _hwrm_send_message_silent(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   if (!rc) {
+   struct bnxt_ctx_pg_info *ctx_pg;
+   struct bnxt_ctx_mem_info *ctx;
+   int i;
+
+   ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+   if (!ctx) {
+   rc = -ENOMEM;
+   goto ctx_err;
+   }
+   ctx_pg = kzalloc(sizeof(*ctx_pg) * (bp->max_q + 1), GFP_KERNEL);
+   if (!ctx_pg) {
+   kfree(ctx);
+   rc = -ENOMEM;
+   goto ctx_err;
+   }
+   for (i = 0; i < bp->max_q + 1; i++, ctx_pg++)
+   ctx->tqm_mem[i] = ctx_pg;
+
+   bp->ctx = ctx;
+   ctx->qp_max_entries = le32_to_cpu(resp->qp_max_entries);
+   ctx->qp_min_qp1_entries = le16_to_cpu(resp->qp_min_qp1_entries);
+   ctx->qp_max_l2_entries = le16_to_cpu(resp->qp_max_l2_entries);
+   ctx->qp_entry_size = le16_to_cpu(resp->qp_entry_size);
+   ctx->srq_max_l2_entries = le16_to_cpu(resp->srq_max_l2_entries);
+   ctx->srq_max_entries = le32_to_cpu(resp->srq_max_entries);
+   ctx->srq_entry_size = le16_to_cpu(resp->srq_entry_size);
+   ctx->cq_max_l2_entries = le16_to_cpu(resp->cq_max_l2_entries);
+   ctx->cq_max_entries = le32_to_cpu(resp->cq_max_entries);
+   ctx->cq_entry_size = le16_to_cpu(resp->cq_entry_size);
+   ctx->vnic_max_vnic_entries =
+   le16_to_cpu(resp->vnic_max_vnic_entries);
+   ctx->vnic_max_ring_table_entries =
+   le16_to_cpu(resp->vnic_max_ring_table_entries);
+   ctx->vnic_entry_size = le16_to_cpu(resp->vnic_entry_size);
+   ctx->stat_max_entries = le32_to_cpu(resp->stat_max_entries);
+   ctx->stat_entry_size = le16_to_cpu(resp->stat_entry_size);
+   ctx->tqm_entry_size = le16_to_cpu(resp->tqm_entry_size);
+   ctx->tqm_min_entries_per_ring =
+   le32_to_cpu(resp->tqm_min_entries_per_ring);
+   ctx->tqm_max_entries_per_ring =
+   le32_to_cpu(resp->tqm_max_entries_per_ring);
+   ctx->tqm_entries_multiple = resp->tqm_entries_multiple;
+   if (!ctx->tqm_entries_multiple)
+   ctx->tqm_entries_multiple = 1;
+   ctx->mrav_max_entries = le32_to_cpu(resp->mrav_max_entries);
+   ctx->mrav_entry_size = le16_to_cpu(resp->mrav_entry_size);
+   ctx->tim_entry_size = le16_to_cpu(resp->tim_entry_size);
+   ctx->tim_max_entries = le32_to_cpu(resp->tim_max_entries);
+   } else {
+   rc = 0;
+   }
+ctx_err:
+   mutex_unlock(&bp->hwrm_cmd_lock);
+   return rc;
+}
+
+static int bnxt_alloc_ctx_mem_blk(struct bnxt *bp,
+ struct bnxt_ctx_pg_info *ctx_pg, u32 mem_size)
+{
+   struct bnxt_ring_mem_info *rmem = &ctx_pg->ring_mem;
+
+   if (!mem_size)
+   return 0;
+
+   rmem->nr_pages = DIV_ROUND_UP(mem_size, BNXT_PAGE_SIZE);
+   if (rmem->nr_pages > MAX_CTX_PAGES) {
+   rmem->nr_pages = 0;
+   return -EINVAL;
+   }
+   rmem->page_size = BNXT_PAGE_SIZE;
+   rmem->pg_arr = ctx_pg->ctx_pg_arr;
+   

[PATCH net-next 08/23] bnxt_en: Configure context memory on new devices.

2018-10-14 Thread Michael Chan
Call firmware to configure the DMA addresses of all context memory
pages on new devices requiring context memory.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 123 +-
 1 file changed, 120 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 83427da..b0e2416 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5325,6 +5325,114 @@ static int bnxt_hwrm_func_backing_store_qcaps(struct 
bnxt *bp)
return rc;
 }
 
+static void bnxt_hwrm_set_pg_attr(struct bnxt_ring_mem_info *rmem, u8 *pg_attr,
+ __le64 *pg_dir)
+{
+   u8 pg_size = 0;
+
+   if (BNXT_PAGE_SHIFT == 13)
+   pg_size = 1 << 4;
+   else if (BNXT_PAGE_SIZE == 16)
+   pg_size = 2 << 4;
+
+   *pg_attr = pg_size;
+   if (rmem->nr_pages > 1) {
+   *pg_attr |= 1;
+   *pg_dir = cpu_to_le64(rmem->pg_tbl_map);
+   } else {
+   *pg_dir = cpu_to_le64(rmem->dma_arr[0]);
+   }
+}
+
+#define FUNC_BACKING_STORE_CFG_REQ_DFLT_ENABLES\
+   (FUNC_BACKING_STORE_CFG_REQ_ENABLES_QP |\
+FUNC_BACKING_STORE_CFG_REQ_ENABLES_SRQ |   \
+FUNC_BACKING_STORE_CFG_REQ_ENABLES_CQ |\
+FUNC_BACKING_STORE_CFG_REQ_ENABLES_VNIC |  \
+FUNC_BACKING_STORE_CFG_REQ_ENABLES_STAT)
+
+static int bnxt_hwrm_func_backing_store_cfg(struct bnxt *bp, u32 enables)
+{
+   struct hwrm_func_backing_store_cfg_input req = {0};
+   struct bnxt_ctx_mem_info *ctx = bp->ctx;
+   struct bnxt_ctx_pg_info *ctx_pg;
+   __le32 *num_entries;
+   __le64 *pg_dir;
+   u8 *pg_attr;
+   int i, rc;
+   u32 ena;
+
+   if (!ctx)
+   return 0;
+
+   bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_BACKING_STORE_CFG, -1, -1);
+   req.enables = cpu_to_le32(enables);
+
+   if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_QP) {
+   ctx_pg = &ctx->qp_mem;
+   req.qp_num_entries = cpu_to_le32(ctx_pg->entries);
+   req.qp_num_qp1_entries = cpu_to_le16(ctx->qp_min_qp1_entries);
+   req.qp_num_l2_entries = cpu_to_le16(ctx->qp_max_l2_entries);
+   req.qp_entry_size = cpu_to_le16(ctx->qp_entry_size);
+   bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem,
+ &req.qpc_pg_size_qpc_lvl,
+ &req.qpc_page_dir);
+   }
+   if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_SRQ) {
+   ctx_pg = &ctx->srq_mem;
+   req.srq_num_entries = cpu_to_le32(ctx_pg->entries);
+   req.srq_num_l2_entries = cpu_to_le16(ctx->srq_max_l2_entries);
+   req.srq_entry_size = cpu_to_le16(ctx->srq_entry_size);
+   bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem,
+ &req.srq_pg_size_srq_lvl,
+ &req.srq_page_dir);
+   }
+   if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_CQ) {
+   ctx_pg = &ctx->cq_mem;
+   req.cq_num_entries = cpu_to_le32(ctx_pg->entries);
+   req.cq_num_l2_entries = cpu_to_le16(ctx->cq_max_l2_entries);
+   req.cq_entry_size = cpu_to_le16(ctx->cq_entry_size);
+   bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem, &req.cq_pg_size_cq_lvl,
+ &req.cq_page_dir);
+   }
+   if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_VNIC) {
+   ctx_pg = &ctx->vnic_mem;
+   req.vnic_num_vnic_entries =
+   cpu_to_le16(ctx->vnic_max_vnic_entries);
+   req.vnic_num_ring_table_entries =
+   cpu_to_le16(ctx->vnic_max_ring_table_entries);
+   req.vnic_entry_size = cpu_to_le16(ctx->vnic_entry_size);
+   bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem,
+ &req.vnic_pg_size_vnic_lvl,
+ &req.vnic_page_dir);
+   }
+   if (enables & FUNC_BACKING_STORE_CFG_REQ_ENABLES_STAT) {
+   ctx_pg = &ctx->stat_mem;
+   req.stat_num_entries = cpu_to_le32(ctx->stat_max_entries);
+   req.stat_entry_size = cpu_to_le16(ctx->stat_entry_size);
+   bnxt_hwrm_set_pg_attr(&ctx_pg->ring_mem,
+ &req.stat_pg_size_stat_lvl,
+ &req.stat_page_dir);
+   }
+   for (i = 0, num_entries = &req.tqm_sp_num_entries,
+pg_attr = &req.tqm_sp_pg_size_tqm_sp_lvl,
+pg_dir = &req.tqm_sp_page_dir,
+ena = FUNC_BACKING_STORE_CFG_REQ_ENABLES_TQM_SP;
+i < 9; i++, num_entries++, pg_attr++, pg_dir++, ena <<= 1) {
+   if (!(enables & ena))
+  

[PATCH net-next 00/23] bnxt_en: Add support for new 57500 chips.

2018-10-14 Thread Michael Chan
This patch-set is larger than normal because I wanted a complete series
to add basic support for the new 57500 chips.  The new chips have the
following main differences compared to legacy chips:

1. Requires the PF driver to allocate DMA context memory as a backing
store.
2. New NQ (notification queue) for interrupt events.
3. One or more CP rings can be associated with an NQ.
4. 64-bit doorbells.

Most other structures and firmware APIs are compatible with legacy
devices with some exceptions.  For example, ring groups are no longer
used and RSS table format has changed.

The patch-set includes the usual firmware spec. update, some refactoring
and restructuring, and adding the new code to add basic support for the
new class of devices.

Michael Chan (23):
  bnxt_en: Update firmware interface spec. to 1.10.0.3.
  bnxt_en: Add additional extended port statistics.
  bnxt_en: Add maximum extended request length fw message support.
  bnxt_en: Update interrupt coalescing logic.
  bnxt_en: Refactor bnxt_ring_struct.
  bnxt_en: Add new flags to setup new page table PTE bits on newer
devices.
  bnxt_en: Check context memory requirements from firmware.
  bnxt_en: Configure context memory on new devices.
  bnxt_en: Add 57500 new chip ID and basic structures.
  bnxt_en: Re-structure doorbells.
  bnxt_en: Adjust MSIX and ring groups for 57500 series chips.
  bnxt_en: Modify the ring reservation functions for 57500 series chips.
  bnxt_en: Allocate completion ring structures for 57500 series chips.
  bnxt_en: Add helper functions to get firmware CP ring ID.
  bnxt_en: Modify bnxt_ring_alloc_send_msg() to support 57500 chips.
  bnxt_en: Allocate/Free CP rings for 57500 series chips.
  bnxt_en: Increase RSS context array count and skip ring groups on
57500 chips.
  bnxt_en: Add RSS support for 57500 chips.
  bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX
path.
  bnxt_en: Add coalescing setup for 57500 chips.
  bnxt_en: Refactor bnxt_poll_work().
  bnxt_en: Add new NAPI poll function for 57500 chips.
  bnxt_en: Add PCI ID for BCM57508 device.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 1671 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  250 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  112 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h |  310 ++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |2 +-
 5 files changed, 1944 insertions(+), 401 deletions(-)

-- 
2.5.1



[PATCH net-next 15/23] bnxt_en: Modify bnxt_ring_alloc_send_msg() to support 57500 chips.

2018-10-14 Thread Michael Chan
Firmware ring allocation semantics are slightly different for most
ring types on 57500 chips.  Allocation/deallocation for NQ rings are
also added for the new chips.

A CP ring handle is also added so that from the NQ interrupt event,
we can locate the CP ring.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 61 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 2 files changed, 56 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 99af288..db1dbad 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4543,14 +4543,53 @@ static int hwrm_ring_alloc_send_msg(struct bnxt *bp,
case HWRM_RING_ALLOC_RX:
req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
req.length = cpu_to_le32(bp->rx_ring_mask + 1);
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   u16 flags = 0;
+
+   /* Association of rx ring with stats context */
+   grp_info = &bp->grp_info[ring->grp_idx];
+   req.rx_buf_size = cpu_to_le16(bp->rx_buf_use_size);
+   req.stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
+   req.enables |= cpu_to_le32(
+   RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+   if (NET_IP_ALIGN == 2)
+   flags = RING_ALLOC_REQ_FLAGS_RX_SOP_PAD;
+   req.flags = cpu_to_le16(flags);
+   }
break;
case HWRM_RING_ALLOC_AGG:
-   req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX_AGG;
+   /* Association of agg ring with rx ring */
+   grp_info = &bp->grp_info[ring->grp_idx];
+   req.rx_ring_id = cpu_to_le16(grp_info->rx_fw_ring_id);
+   req.rx_buf_size = cpu_to_le16(BNXT_RX_PAGE_SIZE);
+   req.stat_ctx_id = cpu_to_le32(grp_info->fw_stats_ctx);
+   req.enables |= cpu_to_le32(
+   RING_ALLOC_REQ_ENABLES_RX_RING_ID_VALID |
+   RING_ALLOC_REQ_ENABLES_RX_BUF_SIZE_VALID);
+   } else {
+   req.ring_type = RING_ALLOC_REQ_RING_TYPE_RX;
+   }
req.length = cpu_to_le32(bp->rx_agg_ring_mask + 1);
break;
case HWRM_RING_ALLOC_CMPL:
req.ring_type = RING_ALLOC_REQ_RING_TYPE_L2_CMPL;
req.length = cpu_to_le32(bp->cp_ring_mask + 1);
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   /* Association of cp ring with nq */
+   grp_info = &bp->grp_info[map_index];
+   req.nq_ring_id = cpu_to_le16(grp_info->cp_fw_ring_id);
+   req.cq_handle = cpu_to_le64(ring->handle);
+   req.enables |= cpu_to_le32(
+   RING_ALLOC_REQ_ENABLES_NQ_RING_ID_VALID);
+   } else if (bp->flags & BNXT_FLAG_USING_MSIX) {
+   req.int_mode = RING_ALLOC_REQ_INT_MODE_MSIX;
+   }
+   break;
+   case HWRM_RING_ALLOC_NQ:
+   req.ring_type = RING_ALLOC_REQ_RING_TYPE_NQ;
+   req.length = cpu_to_le32(bp->cp_ring_mask + 1);
if (bp->flags & BNXT_FLAG_USING_MSIX)
req.int_mode = RING_ALLOC_REQ_INT_MODE_MSIX;
break;
@@ -4645,7 +4684,10 @@ static int bnxt_hwrm_ring_alloc(struct bnxt *bp)
int i, rc = 0;
u32 type;
 
-   type = HWRM_RING_ALLOC_CMPL;
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   type = HWRM_RING_ALLOC_NQ;
+   else
+   type = HWRM_RING_ALLOC_CMPL;
for (i = 0; i < bp->cp_nr_rings; i++) {
struct bnxt_napi *bnapi = bp->bnapi[i];
struct bnxt_cp_ring_info *cpr = &bnapi->cp_ring;
@@ -4743,6 +4785,7 @@ static int hwrm_ring_free_send_msg(struct bnxt *bp,
 
 static void bnxt_hwrm_ring_free(struct bnxt *bp, bool close_path)
 {
+   u32 type;
int i;
 
if (!bp->bnapi)
@@ -4781,6 +4824,10 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool 
close_path)
}
}
 
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   type = RING_FREE_REQ_RING_TYPE_RX_AGG;
+   else
+   type = RING_FREE_REQ_RING_TYPE_RX;
for (i = 0; i < bp->rx_nr_rings; i++) {
struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
struct bnxt_ring_struct *ring = &rxr->rx_agg_ring_struct;

[PATCH net-next 10/23] bnxt_en: Re-structure doorbells.

2018-10-14 Thread Michael Chan
The 57500 series chips have a new 64-bit doorbell format.  Use a new
bnxt_db_info structure to unify the new and the old 32-bit doorbells.
Add a new bnxt_set_db() function to set up the doorbell addresses and
doorbell keys ahead of time.  Modify and introduce new doorbell
helpers to help abstract and unify the old and new doorbells.
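
A rough sketch of the new abstraction (field names taken from the macros
and helpers in the patch below; see bnxt.h in this series for the full
definition):

	struct bnxt_db_info {
		void __iomem	*doorbell;	/* mapped doorbell address */
		u64		db_key64;	/* pre-built 64-bit key, P5 chips */
		/* a 32-bit key is also kept for legacy chips */
	};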

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 164 +++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  65 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |   2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |   2 +-
 4 files changed, 171 insertions(+), 62 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 88ea8c7..56439a4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -241,15 +241,46 @@ static bool bnxt_vf_pciid(enum board_idx idx)
 #define DB_CP_FLAGS(DB_KEY_CP | DB_IDX_VALID | DB_IRQ_DIS)
 #define DB_CP_IRQ_DIS_FLAGS(DB_KEY_CP | DB_IRQ_DIS)
 
-#define BNXT_CP_DB_REARM(db, raw_cons) \
-   writel(DB_CP_REARM_FLAGS | RING_CMP(raw_cons), db)
-
-#define BNXT_CP_DB(db, raw_cons)   \
-   writel(DB_CP_FLAGS | RING_CMP(raw_cons), db)
-
 #define BNXT_CP_DB_IRQ_DIS(db) \
writel(DB_CP_IRQ_DIS_FLAGS, db)
 
+#define BNXT_DB_CQ(db, idx)\
+   writel(DB_CP_FLAGS | RING_CMP(idx), (db)->doorbell)
+
+#define BNXT_DB_NQ_P5(db, idx) \
+   writeq((db)->db_key64 | DBR_TYPE_NQ | RING_CMP(idx), (db)->doorbell)
+
+#define BNXT_DB_CQ_ARM(db, idx)\
+   writel(DB_CP_REARM_FLAGS | RING_CMP(idx), (db)->doorbell)
+
+#define BNXT_DB_NQ_ARM_P5(db, idx) \
+   writeq((db)->db_key64 | DBR_TYPE_NQ_ARM | RING_CMP(idx), (db)->doorbell)
+
+static void bnxt_db_nq(struct bnxt *bp, struct bnxt_db_info *db, u32 idx)
+{
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   BNXT_DB_NQ_P5(db, idx);
+   else
+   BNXT_DB_CQ(db, idx);
+}
+
+static void bnxt_db_nq_arm(struct bnxt *bp, struct bnxt_db_info *db, u32 idx)
+{
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   BNXT_DB_NQ_ARM_P5(db, idx);
+   else
+   BNXT_DB_CQ_ARM(db, idx);
+}
+
+static void bnxt_db_cq(struct bnxt *bp, struct bnxt_db_info *db, u32 idx)
+{
+   if (bp->flags & BNXT_FLAG_CHIP_P5)
+   writeq(db->db_key64 | DBR_TYPE_CQ_ARMALL | RING_CMP(idx),
+  db->doorbell);
+   else
+   BNXT_DB_CQ(db, idx);
+}
+
 const u16 bnxt_lhint_arr[] = {
TX_BD_FLAGS_LHINT_512_AND_SMALLER,
TX_BD_FLAGS_LHINT_512_TO_1023,
@@ -341,6 +372,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, 
struct net_device *dev)
struct tx_push_buffer *tx_push_buf = txr->tx_push;
struct tx_push_bd *tx_push = &tx_push_buf->push_bd;
struct tx_bd_ext *tx_push1 = &tx_push->txbd2;
+   void __iomem *db = txr->tx_db.doorbell;
void *pdata = tx_push_buf->data;
u64 *end;
int j, push_len;
@@ -398,12 +430,11 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, 
struct net_device *dev)
 
push_len = (length + sizeof(*tx_push) + 7) / 8;
if (push_len > 16) {
-   __iowrite64_copy(txr->tx_doorbell, tx_push_buf, 16);
-   __iowrite32_copy(txr->tx_doorbell + 4, tx_push_buf + 1,
+   __iowrite64_copy(db, tx_push_buf, 16);
+   __iowrite32_copy(db + 4, tx_push_buf + 1,
 (push_len - 16) << 1);
} else {
-   __iowrite64_copy(txr->tx_doorbell, tx_push_buf,
-push_len);
+   __iowrite64_copy(db, tx_push_buf, push_len);
}
 
goto tx_done;
@@ -505,7 +536,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, 
struct net_device *dev)
txr->tx_prod = prod;
 
if (!skb->xmit_more || netif_xmit_stopped(txq))
-   bnxt_db_write(bp, txr->tx_doorbell, DB_KEY_TX | prod);
+   bnxt_db_write(bp, &txr->tx_db, prod);
 
 tx_done:
 
@@ -513,7 +544,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, 
struct net_device *dev)
 
if (unlikely(bnxt_tx_avail(bp, txr) <= MAX_SKB_FRAGS + 1)) {
if (skb->xmit_more && !tx_buf->is_push)
-   bnxt_db_write(bp, txr->tx_doorbell, DB_KEY_TX | prod);
+  

[PATCH net-next 12/23] bnxt_en: Modify the ring reservation functions for 57500 series chips.

2018-10-14 Thread Michael Chan
The ring reservation functions have to be modified for P5 chips in the
following ways:

- bnxt_cp_ring_info structs map to internal NQs as well as CP rings.
- Ring groups are not used.
- 1 CP ring must be available for each RX or TX ring.
- Number of RSS contexts to reserve is in multiples of 64 RX rings.
- RFS is currently not supported.

Also, RX AGG rings are only used for jumbo frames, so we need to
unconditionally call bnxt_reserve_rings() in __bnxt_open_nic()
to see if we need to reserve AGG rings in case MTU has changed.
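
A back-of-the-envelope sketch of the new accounting on P5 chips (variable
names here are illustrative only):

	/* every RX and every TX ring needs its own CP ring */
	cp_rings_needed = rx_rings + tx_rings;
	/* RSS contexts are reserved in units of 64 RX rings */
	rss_ctxs_needed = DIV_ROUND_UP(rx_rings, 64);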

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 127 +++---
 1 file changed, 97 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 427eb82..a0d7237 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4330,7 +4330,8 @@ static int bnxt_hwrm_vnic_qcaps(struct bnxt *bp)
if (!rc) {
u32 flags = le32_to_cpu(resp->flags);
 
-   if (flags & VNIC_QCAPS_RESP_FLAGS_RSS_DFLT_CR_CAP)
+   if (!(bp->flags & BNXT_FLAG_CHIP_P5) &&
+   (flags & VNIC_QCAPS_RESP_FLAGS_RSS_DFLT_CR_CAP))
bp->flags |= BNXT_FLAG_NEW_RSS_CAP;
if (flags &
VNIC_QCAPS_RESP_FLAGS_ROCE_MIRRORING_CAPABLE_VNIC_CAP)
@@ -4713,6 +4714,9 @@ static void bnxt_hwrm_ring_free(struct bnxt *bp, bool 
close_path)
}
 }
 
+static int bnxt_trim_rings(struct bnxt *bp, int *rx, int *tx, int max,
+  bool shared);
+
 static int bnxt_hwrm_get_rings(struct bnxt *bp)
 {
struct hwrm_func_qcfg_output *resp = bp->hwrm_cmd_resp_addr;
@@ -4743,6 +4747,22 @@ static int bnxt_hwrm_get_rings(struct bnxt *bp)
cp = le16_to_cpu(resp->alloc_cmpl_rings);
stats = le16_to_cpu(resp->alloc_stat_ctx);
cp = min_t(u16, cp, stats);
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   int rx = hw_resc->resv_rx_rings;
+   int tx = hw_resc->resv_tx_rings;
+
+   if (bp->flags & BNXT_FLAG_AGG_RINGS)
+   rx >>= 1;
+   if (cp < (rx + tx)) {
+   bnxt_trim_rings(bp, &rx, &tx, cp, false);
+   if (bp->flags & BNXT_FLAG_AGG_RINGS)
+   rx <<= 1;
+   hw_resc->resv_rx_rings = rx;
+   hw_resc->resv_tx_rings = tx;
+   }
+   cp = le16_to_cpu(resp->alloc_msix);
+   hw_resc->resv_hw_ring_grps = rx;
+   }
hw_resc->resv_cp_rings = cp;
}
mutex_unlock(>hwrm_cmd_lock);
@@ -4768,6 +4788,8 @@ int __bnxt_hwrm_get_tx_rings(struct bnxt *bp, u16 fid, 
int *tx_rings)
return rc;
 }
 
+static bool bnxt_rfs_supported(struct bnxt *bp);
+
 static void
 __bnxt_hwrm_reserve_pf_rings(struct bnxt *bp, struct hwrm_func_cfg_input *req,
 int tx_rings, int rx_rings, int ring_grps,
@@ -4781,15 +4803,38 @@ __bnxt_hwrm_reserve_pf_rings(struct bnxt *bp, struct 
hwrm_func_cfg_input *req,
req->num_tx_rings = cpu_to_le16(tx_rings);
if (BNXT_NEW_RM(bp)) {
enables |= rx_rings ? FUNC_CFG_REQ_ENABLES_NUM_RX_RINGS : 0;
-   enables |= cp_rings ? FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS |
- FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
-   enables |= ring_grps ?
-  FUNC_CFG_REQ_ENABLES_NUM_HW_RING_GRPS : 0;
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   enables |= cp_rings ? FUNC_CFG_REQ_ENABLES_NUM_MSIX : 0;
+   enables |= tx_rings + ring_grps ?
+  FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS |
+  FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
+   enables |= rx_rings ?
+   FUNC_CFG_REQ_ENABLES_NUM_RSSCOS_CTXS : 0;
+   } else {
+   enables |= cp_rings ?
+  FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS |
+  FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
+   enables |= ring_grps ?
+  FUNC_CFG_REQ_ENABLES_NUM_HW_RING_GRPS |
+  FUNC_CFG_REQ_ENABLES_NUM_RSSCOS_CTXS : 0;
+   }
enables |= vnics ? FUNC_CFG_REQ_ENABLES_NUM_VNICS : 0;
 
req->num_rx_rings = cpu_to_le16(rx_rings);
-   req->num_hw_ring_grps = cpu_to_le16(ring_grps);
-   req->num_cmpl_rings = cpu_to_le16(

[PATCH net-next 11/23] bnxt_en: Adjust MSIX and ring groups for 57500 series chips.

2018-10-14 Thread Michael Chan
Store the maximum MSIX capability from PCIe config. space earlier.  When
we call firmware to query capability, we need to compare the PCIe
MSIX max count with the firmware count and use the smaller one as
the MSIX count for 57500 (P5) chips.

The new chips don't use ring groups.  But previous chips do and
the existing logic limits the available rings based on resource
calculations including ring groups.  Setting the max ring groups to
the max rx rings will work on the new chips without changing the
existing logic.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 56439a4..427eb82 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5677,6 +5677,13 @@ int bnxt_hwrm_func_resc_qcaps(struct bnxt *bp, bool all)
hw_resc->min_stat_ctxs = le16_to_cpu(resp->min_stat_ctx);
hw_resc->max_stat_ctxs = le16_to_cpu(resp->max_stat_ctx);
 
+   if (bp->flags & BNXT_FLAG_CHIP_P5) {
+   u16 max_msix = le16_to_cpu(resp->max_msix);
+
+   hw_resc->max_irqs = min_t(u16, hw_resc->max_irqs, max_msix);
+   hw_resc->max_hw_ring_grps = hw_resc->max_rx_rings;
+   }
+
if (BNXT_PF(bp)) {
struct bnxt_pf_info *pf = &bp->pf;
 
@@ -9382,6 +9389,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
return -ENOMEM;
 
bp = netdev_priv(dev);
+   bnxt_set_max_func_irqs(bp, max_irqs);
 
if (bnxt_vf_pciid(ent->driver_data))
bp->flags |= BNXT_FLAG_VF;
@@ -9513,7 +9521,6 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
bnxt_set_rx_skb_mode(bp, false);
bnxt_set_tpa_flags(bp);
bnxt_set_ring_params(bp);
-   bnxt_set_max_func_irqs(bp, max_irqs);
rc = bnxt_set_dflt_rings(bp, true);
if (rc) {
netdev_err(bp->dev, "Not enough rings available.\n");
-- 
2.5.1



[PATCH net-next 06/23] bnxt_en: Add new flags to setup new page table PTE bits on newer devices.

2018-10-14 Thread Michael Chan
Newer chips require the PTU_PTE_VALID bit to be set for every page
table entry for context memory and rings.  Additional bits are also
required for page table entries for all rings.  Add a flags field to
bnxt_ring_mem_info struct to specify these additional bits to be used
when setting up the page tables as needed.
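
As an illustration of the new flags (taken from the hunk below), a 4-page
ring with BNXT_RMEM_RING_PTE_FLAG set would end up with page-table entries
like:

	pg_tbl[0] = dma_arr[0] | PTU_PTE_VALID
	pg_tbl[1] = dma_arr[1] | PTU_PTE_VALID
	pg_tbl[2] = dma_arr[2] | PTU_PTE_VALID | PTU_PTE_NEXT_TO_LAST
	pg_tbl[3] = dma_arr[3] | PTU_PTE_VALID | PTU_PTE_LAST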

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 17 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  8 
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 602dc09..f0da558 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2230,8 +2230,11 @@ static void bnxt_free_ring(struct bnxt *bp, struct 
bnxt_ring_mem_info *rmem)
 static int bnxt_alloc_ring(struct bnxt *bp, struct bnxt_ring_mem_info *rmem)
 {
struct pci_dev *pdev = bp->pdev;
+   u64 valid_bit = 0;
int i;
 
+   if (rmem->flags & (BNXT_RMEM_VALID_PTE_FLAG | BNXT_RMEM_RING_PTE_FLAG))
+   valid_bit = PTU_PTE_VALID;
if (rmem->nr_pages > 1) {
rmem->pg_tbl = dma_alloc_coherent(&pdev->dev,
  rmem->nr_pages * 8,
@@ -2242,6 +2245,8 @@ static int bnxt_alloc_ring(struct bnxt *bp, struct 
bnxt_ring_mem_info *rmem)
}
 
for (i = 0; i < rmem->nr_pages; i++) {
+   u64 extra_bits = valid_bit;
+
rmem->pg_arr[i] = dma_alloc_coherent(&pdev->dev,
 rmem->page_size,
 &rmem->dma_arr[i],
@@ -2249,8 +2254,16 @@ static int bnxt_alloc_ring(struct bnxt *bp, struct 
bnxt_ring_mem_info *rmem)
if (!rmem->pg_arr[i])
return -ENOMEM;
 
-   if (rmem->nr_pages > 1)
-   rmem->pg_tbl[i] = cpu_to_le64(rmem->dma_arr[i]);
+   if (rmem->nr_pages > 1) {
+   if (i == rmem->nr_pages - 2 &&
+   (rmem->flags & BNXT_RMEM_RING_PTE_FLAG))
+   extra_bits |= PTU_PTE_NEXT_TO_LAST;
+   else if (i == rmem->nr_pages - 1 &&
+(rmem->flags & BNXT_RMEM_RING_PTE_FLAG))
+   extra_bits |= PTU_PTE_LAST;
+   rmem->pg_tbl[i] =
+   cpu_to_le64(rmem->dma_arr[i] | extra_bits);
+   }
}
 
if (rmem->vmem_size) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 2e4b621..5792e5c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -580,6 +580,10 @@ struct bnxt_sw_rx_agg_bd {
 struct bnxt_ring_mem_info {
int nr_pages;
int page_size;
+   u32 flags;
+#define BNXT_RMEM_VALID_PTE_FLAG   1
+#define BNXT_RMEM_RING_PTE_FLAG2
+
void**pg_arr;
dma_addr_t  *dma_arr;
 
@@ -1109,6 +1113,10 @@ struct bnxt_vf_rep {
struct bnxt_vf_rep_statstx_stats;
 };
 
+#define PTU_PTE_VALID 0x1UL
+#define PTU_PTE_LAST  0x2UL
+#define PTU_PTE_NEXT_TO_LAST  0x4UL
+
 struct bnxt {
void __iomem*bar0;
void __iomem*bar1;
-- 
2.5.1



[PATCH net 2/4] bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request

2018-10-04 Thread Michael Chan
From: Vasundhara Volam 

In the HWRM_QUEUE_COS2BW_CFG request, the enables field should have the
bits set only for the queue ids which have valid parameters.

Otherwise, firmware returns an error when the TC to hardware CoS queue
mapping is not 1:1 during DCBNL ETS setup.
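
A worked example (illustrative only): with max_tc = 2 and a non-1:1
mapping tc_to_qidx[] = { 0, 2 }, the corrected loop sets the enable bits
for queue ids 0 and 2 instead of 0 and 1:

	enables |= QUEUE_COS2BW_CFG_REQ_ENABLES_COS_QUEUE_ID0_VALID << 0;
	enables |= QUEUE_COS2BW_CFG_REQ_ENABLES_COS_QUEUE_ID0_VALID << 2;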

Fixes: 2e8ef77ee0ff ("bnxt_en: Add TC to hardware QoS queue mapping logic.")
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
index ddc98c3..a85d2be 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
@@ -98,13 +98,13 @@ static int bnxt_hwrm_queue_cos2bw_cfg(struct bnxt *bp, 
struct ieee_ets *ets,
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_QUEUE_COS2BW_CFG, -1, -1);
for (i = 0; i < max_tc; i++) {
-   u8 qidx;
+   u8 qidx = bp->tc_to_qidx[i];
 
req.enables |= cpu_to_le32(
-   QUEUE_COS2BW_CFG_REQ_ENABLES_COS_QUEUE_ID0_VALID << i);
+   QUEUE_COS2BW_CFG_REQ_ENABLES_COS_QUEUE_ID0_VALID <<
+   qidx);
 
memset(&cos2bw, 0, sizeof(cos2bw));
-   qidx = bp->tc_to_qidx[i];
cos2bw.queue_id = bp->q_info[qidx].queue_id;
if (ets->tc_tsa[i] == IEEE_8021QAZ_TSA_STRICT) {
cos2bw.tsa =
-- 
2.5.1



[PATCH net 1/4] bnxt_en: Fix VNIC reservations on the PF.

2018-10-04 Thread Michael Chan
The enables bit for VNIC was set wrong when calling the HWRM_FUNC_CFG
firmware call to reserve VNICs.  This has the effect that the firmware
will keep a large number of VNICs for the PF and leave very few for the
VFs.  A DPDK driver running on the VFs, which requires more VNICs, may
not work properly as a result.

Fixes: 674f50a5b026 ("bnxt_en: Implement new method to reserve rings.")
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 0478e56..2564a92 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4650,7 +4650,7 @@ __bnxt_hwrm_reserve_pf_rings(struct bnxt *bp, struct 
hwrm_func_cfg_input *req,
  FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
enables |= ring_grps ?
   FUNC_CFG_REQ_ENABLES_NUM_HW_RING_GRPS : 0;
-   enables |= vnics ? FUNC_VF_CFG_REQ_ENABLES_NUM_VNICS : 0;
+   enables |= vnics ? FUNC_CFG_REQ_ENABLES_NUM_VNICS : 0;
 
req->num_rx_rings = cpu_to_le16(rx_rings);
req->num_hw_ring_grps = cpu_to_le16(ring_grps);
-- 
2.5.1



[PATCH net 3/4] bnxt_en: free hwrm resources, if driver probe fails.

2018-10-04 Thread Michael Chan
From: Venkat Duvvuru 

When the driver probe fails, all the resources that were allocated prior
to the failure must be freed. However, hwrm dma response memory is not
getting freed.

This patch fixes the problem described above.

Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Venkat Duvvuru 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 2564a92..3718984 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3017,10 +3017,11 @@ static void bnxt_free_hwrm_resources(struct bnxt *bp)
 {
struct pci_dev *pdev = bp->pdev;
 
-   dma_free_coherent(&pdev->dev, PAGE_SIZE, bp->hwrm_cmd_resp_addr,
- bp->hwrm_cmd_resp_dma_addr);
-
-   bp->hwrm_cmd_resp_addr = NULL;
+   if (bp->hwrm_cmd_resp_addr) {
+   dma_free_coherent(&pdev->dev, PAGE_SIZE, bp->hwrm_cmd_resp_addr,
+ bp->hwrm_cmd_resp_dma_addr);
+   bp->hwrm_cmd_resp_addr = NULL;
+   }
 }
 
 static int bnxt_alloc_hwrm_resources(struct bnxt *bp)
@@ -9057,6 +9058,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
bnxt_clear_int_mode(bp);
 
 init_err_pci_clean:
+   bnxt_free_hwrm_resources(bp);
bnxt_cleanup_pci(bp);
 
 init_err_free:
-- 
2.5.1
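More generally, the pattern restored here is the usual probe error unwind: everything allocated before the failure point needs a matching release on the error path.  A minimal, hypothetical sketch (the example_* names are not real bnxt functions):

static int example_init_one(struct pci_dev *pdev)
{
	int rc;

	rc = example_alloc_hwrm_resources(pdev);	/* e.g. DMA response buffer */
	if (rc)
		return rc;

	rc = example_register_netdev(pdev);
	if (rc)
		goto init_err_free_hwrm;

	return 0;

init_err_free_hwrm:
	example_free_hwrm_resources(pdev);	/* safe even if only partially set up */
	return rc;
}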



[PATCH net 0/4] bnxt_en: Misc. bug fixes.

2018-10-04 Thread Michael Chan
4 small bug fixes related to setting firmware message enables bits, a
possible memory leak when probe fails, and ring accounting when the RDMA
driver is loaded.

Please queue these for -stable as well.  Thanks.

Michael Chan (1):
  bnxt_en: Fix VNIC reservations on the PF.

Vasundhara Volam (2):
  bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request
  bnxt_en: get the reduced max_irqs by the ones used by RDMA

Venkat Duvvuru (1):
  bnxt_en: free hwrm resources, if driver probe fails.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 14 --
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c |  6 +++---
 2 files changed, 11 insertions(+), 9 deletions(-)

-- 
2.5.1



[PATCH net 4/4] bnxt_en: get the reduced max_irqs by the ones used by RDMA

2018-10-04 Thread Michael Chan
From: Vasundhara Volam 

When getting the max rings supported, reduce max_irqs by the number of
vectors used by RDMA.

If the number MSIX is the limiting factor, this bug may cause the
max ring count to be higher than it should be when RDMA driver is
loaded and may result in ring allocation failures.

Fixes: 30f529473ec9 ("bnxt_en: Do not modify max IRQ count after RDMA driver 
requests/frees IRQs.")
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 3718984..e2d9254 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8622,7 +8622,7 @@ static void _bnxt_get_max_rings(struct bnxt *bp, int 
*max_rx, int *max_tx,
*max_tx = hw_resc->max_tx_rings;
*max_rx = hw_resc->max_rx_rings;
*max_cp = min_t(int, bnxt_get_max_func_cp_rings_for_en(bp),
-   hw_resc->max_irqs);
+   hw_resc->max_irqs - bnxt_get_ulp_msix_num(bp));
*max_cp = min_t(int, *max_cp, hw_resc->max_stat_ctxs);
max_ring_grps = hw_resc->max_hw_ring_grps;
if (BNXT_CHIP_TYPE_NITRO_A0(bp) && BNXT_PF(bp)) {
-- 
2.5.1



Re: [PATCH net 01/11] netpoll: do not test NAPI_STATE_SCHED in poll_one_napi()

2018-09-27 Thread Michael Chan
On Thu, Sep 27, 2018 at 9:32 AM Eric Dumazet  wrote:
>
> Since we do no longer require NAPI drivers to provide
> an ndo_poll_controller(), napi_schedule() has not been done
> before poll_one_napi() invocation.
>
> So testing NAPI_STATE_SCHED is likely to cause early returns.
>
> While we are at it, remove outdated comment.
>
> Note to future bisections : This change might surface prior
> bugs in drivers. See commit 73f21c653f93 ("bnxt_en: Fix TX
> timeout during netpoll.") for one occurrence.
>
> Fixes: ac3d9dd034e5 ("netpoll: make ndo_poll_controller() optional")
> Signed-off-by: Eric Dumazet 
> Tested-by: Song Liu 
> Cc: Michael Chan 

Reviewed-and-tested-by: Michael Chan 


[PATCH net v2] bnxt_en: Fix TX timeout during netpoll.

2018-09-25 Thread Michael Chan
The current netpoll implementation in the bnxt_en driver has problems
that may cause it to miss TX completion events.  bnxt_poll_work() in effect is
only handling at most 1 TX packet before exiting.  In addition,
there may be in flight TX completions that ->poll() may miss even
after we fix bnxt_poll_work() to handle all visible TX completions.
netpoll may not call ->poll() again and HW may not generate IRQ
because the driver does not ARM the IRQ when the budget (0 for netpoll)
is reached.

We fix it by handling all TX completions and by always ARMing the IRQ
when we exit ->poll() with a budget of 0.

Also, the logic to ACK the completion ring when it is almost filled
with TX completions needs to be adjusted to take care of the 0 budget
case, as discussed with Eric Dumazet 

Reported-by: Song Liu 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 61957b0..0478e56 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1884,8 +1884,11 @@ static int bnxt_poll_work(struct bnxt *bp, struct 
bnxt_napi *bnapi, int budget)
if (TX_CMP_TYPE(txcmp) == CMP_TYPE_TX_L2_CMP) {
tx_pkts++;
/* return full budget so NAPI will complete. */
-   if (unlikely(tx_pkts > bp->tx_wake_thresh))
+   if (unlikely(tx_pkts > bp->tx_wake_thresh)) {
rx_pkts = budget;
+   raw_cons = NEXT_RAW_CMP(raw_cons);
+   break;
+   }
} else if ((TX_CMP_TYPE(txcmp) & 0x30) == 0x10) {
if (likely(budget))
rc = bnxt_rx_pkt(bp, bnapi, &raw_cons, &event);
@@ -1913,7 +1916,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct 
bnxt_napi *bnapi, int budget)
}
raw_cons = NEXT_RAW_CMP(raw_cons);
 
-   if (rx_pkts == budget)
+   if (rx_pkts && rx_pkts == budget)
break;
}
 
@@ -2027,8 +2030,12 @@ static int bnxt_poll(struct napi_struct *napi, int 
budget)
while (1) {
work_done += bnxt_poll_work(bp, bnapi, budget - work_done);
 
-   if (work_done >= budget)
+   if (work_done >= budget) {
+   if (!budget)
+   BNXT_CP_DB_REARM(cpr->cp_doorbell,
+cpr->cp_raw_cons);
break;
+   }
 
if (!bnxt_has_work(bp, cpr)) {
if (napi_complete_done(napi, work_done))
-- 
2.5.1



Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.

2018-09-25 Thread Michael Chan
On Tue, Sep 25, 2018 at 7:25 PM Eric Dumazet  wrote:
>
> On Tue, Sep 25, 2018 at 7:15 PM Michael Chan  
> wrote:
> >
> > On Tue, Sep 25, 2018 at 4:11 PM Michael Chan  
> > wrote:
> > >
> > > On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet  
> > > wrote:
> > >
> > > >
> > > > It seems bnx2 should have a similar issue ?
> > > >
> > >
> > > Yes, I think so.  The MSIX mode in bnx2 is also auto-masking, meaning
> > > that MSIX will only assert once after it is ARMed.  If we return from
> > > ->poll() when budget of 0 is reached without ARMing, we may not get
> > > another MSIX.
> > >
> >
> > On second thought, I think bnx2 is ok.  If netpoll is polling on the
> > TX packets and reaching budget of 0 and returning, the INT_ACK_CMD
> > register is untouched.  bnx2 uses the status block for events and the
> > producers/consumers are cumulative.  So there is no need to ACK the
> > status block unless ARMing for interrupts.  If there is an IRQ about
> > to be fired, it won't be affected by the polling done by netpoll.
> >
> > In the case of bnxt, a completion ring is used for the events.  The
> > polling done by netpoll will cause the completion ring to be ACKed as
> > entries are processed.  ACKing the completion ring without ARMing may
> > cause future IRQs to be disabled for that ring.
>
> About bnxt : Are you sure it is all about IRQ problems ?

I'm pretty sure, because FB first reported TX timeouts followed by
ring reset failures when running netconsole.  These ring reset
failures are caused by IRQs no longer working on some rings.

>
> What if the whole ring buffer is is filled, then all entries
> are processed from netpoll.
>
> If cp_raw_cons becomes too high without the NIC knowing its (updated)
> value, maybe no IRQ can be generated anymore because
> of some wrapping issue (based on ring size)

Good point.  We have logic to handle that.  We will ACK the ring at
least once every tp->tx_wake_thresh TX packets.  But this logic fails
when the budget is 0, so I need to send a revised patch to take care of
this one case.

>
> I guess that in order to test this, we would need something bursting
> 16000 messages while holding napi->poll_owner.
> The (single) IRQ would set/grab the SCHED bit but the cpu responsible
> to service this (soft)irq would spin for the whole test,
> and no more IRQ should be fired really.

Right, not easy to hit.  But it should be handled by my v2 patch.  Thanks.


Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.

2018-09-25 Thread Michael Chan
On Tue, Sep 25, 2018 at 4:11 PM Michael Chan  wrote:
>
> On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet  wrote:
>
> >
> > It seems bnx2 should have a similar issue ?
> >
>
> Yes, I think so.  The MSIX mode in bnx2 is also auto-masking, meaning
> that MSIX will only assert once after it is ARMed.  If we return from
> ->poll() when budget of 0 is reached without ARMing, we may not get
> another MSIX.
>

On second thought, I think bnx2 is ok.  If netpoll is polling on the
TX packets and reaching budget of 0 and returning, the INT_ACK_CMD
register is untouched.  bnx2 uses the status block for events and the
producers/consumers are cumulative.  So there is no need to ACK the
status block unless ARMing for interrupts.  If there is an IRQ about
to be fired, it won't be affected by the polling done by netpoll.

In the case of bnxt, a completion ring is used for the events.  The
polling done by netpoll will cause the completion ring to be ACKed as
entries are processed.  ACKing the completion ring without ARMing may
cause future IRQs to be disabled for that ring.


Re: [PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.

2018-09-25 Thread Michael Chan
On Tue, Sep 25, 2018 at 3:15 PM Eric Dumazet  wrote:

>
> It seems bnx2 should have a similar issue ?
>

Yes, I think so.  The MSIX mode in bnx2 is also auto-masking, meaning
that MSIX will only assert once after it is ARMed.  If we return from
->poll() when budget of 0 is reached without ARMing, we may not get
another MSIX.

I can work on a similar patch but I don't have bnx2 cards to test with
anymore.  Thanks.


[PATCH net RFT] bnxt_en: Fix TX timeout during netpoll.

2018-09-25 Thread Michael Chan
The current netpoll implementation in the bnxt_en driver has problems
that may cause it to miss TX completion events.  bnxt_poll_work() in effect is
only handling at most 1 TX packet before exiting.  In addition,
there may be in flight TX completions that ->poll() may miss even
after we fix bnxt_poll_work() to handle all visible TX completions.
netpoll may not call ->poll() again and HW may not generate IRQ
because the driver does not ARM the IRQ when the budget (0 for netpoll)
is reached.

We fix it by handling all TX completions and by always ARMing the IRQ
when we exit ->poll() with a budget of 0.

Reported-by: Song Liu 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 61957b0..c981b53 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1913,7 +1913,7 @@ static int bnxt_poll_work(struct bnxt *bp, struct 
bnxt_napi *bnapi, int budget)
}
raw_cons = NEXT_RAW_CMP(raw_cons);
 
-   if (rx_pkts == budget)
+   if (rx_pkts && rx_pkts == budget)
break;
}
 
@@ -2027,8 +2027,12 @@ static int bnxt_poll(struct napi_struct *napi, int 
budget)
while (1) {
work_done += bnxt_poll_work(bp, bnapi, budget - work_done);
 
-   if (work_done >= budget)
+   if (work_done >= budget) {
+   if (!budget)
+   BNXT_CP_DB_REARM(cpr->cp_doorbell,
+cpr->cp_raw_cons);
break;
+   }
 
if (!bnxt_has_work(bp, cpr)) {
if (napi_complete_done(napi, work_done))
-- 
2.5.1



Re: [PATCH net 00/15] netpoll: avoid capture effects for NAPI drivers

2018-09-25 Thread Michael Chan
On Tue, Sep 25, 2018 at 11:25 AM Song Liu  wrote:

>
> Hi Michael,
>
> This may not be related. But I am looking at this:
>
> bnxt_poll_work() {
>
> while (1) {
> 
> if (rx_pkts == budget)
> return
> }
> }
>
> With budget of 0, the loop will terminate after processing one packet.
> But I think the expectation is to finish all tx packets. So it doesn't
> feel right. Could you please confirm?
>

Right, this in effect is processing only 1 TX packet so it will be
inefficient at least.

But I think fixing it here still will not fix all the issues, because
even if we process all the TX packets here, we may still miss some
that are in flight.  When we exit poll, netpoll may not call us back
again and there may be no interrupts because we don't ARM the IRQ when
budget of 0 is reached.  I will send a test patch shortly for review
and testing.  Thanks.


Re: [PATCH net 00/15] netpoll: avoid capture effects for NAPI drivers

2018-09-25 Thread Michael Chan
On Tue, Sep 25, 2018 at 7:20 AM Eric Dumazet  wrote:
>
> On Tue, Sep 25, 2018 at 7:02 AM Michael Chan  
> wrote:
> >
> > On Mon, Sep 24, 2018 at 2:18 PM Song Liu  wrote:
> > >
> > >
> > >
> > > > On Sep 24, 2018, at 2:05 PM, Eric Dumazet  wrote:
> > > >
> > > >>
> > > >> Interesting, maybe a bnxt specific issue.
> > > >>
> > > >> It seems their model is to process TX/RX notification in the same 
> > > >> queue,
> > > >> they throw away RX events if budget == 0
> > > >>
> > > >> It means commit e7b9569102995ebc26821789628eef45bd9840d8 is wrong and
> > > >> must be reverted.
> > > >>
> > > >> Otherwise, we have a possibility of blocking a queue under netpoll 
> > > >> pressure.
> > > >
> > > > Hmm, actually a revert might not be enough, since code at lines 
> > > > 2030-2031
> > > > would fire and we might not call napi_complete_done() anyway.
> > > >
> > > > Unfortunately this driver logic is quite complex.
> > > >
> > > > Could you test on other NIC eventually ?
> > > >
> > >
> > > It actually runs OK on ixgbe.
> > >
> > > @Michael, could you please help us with this?
> > >
> > I've taken a quick look using today's net tree plus Eric's
> > poll_one_napi() patch.  The problem I'm seeing is that netpoll calls
> > bnxt_poll() with budget 0.  And since work_done >= budget of 0, we
> > return without calling napi_complete_done() and without arming the
> > interrupt.  netpoll doesn't always call us back until we call
> > napi_complete_done(), right?  So I think if there are in-flight TX
> > completions, we'll miss those.
>
> That's the whole point of netpoll :
>
>  We drain the TX queues, without interrupts being involved at all,
> by calling ->napi() with a zero budget.
>
> napi_complete(), even if called from ->napi() while budget was zero,
> should do nothing but return early.
>
> budget==0 means that ->napi() should process all TX completions.

All TX completions that we can see.  We cannot see the in-flight ones.

If budget is exceeded, I think the assumption is that poll will always
be called again.

>
> So it looks like bnxt has a bug, that is showing up after the latest
> poll_one_napi() patch.
> This latest patch is needed otherwise the cpu attempting the
> netpoll-TX-drain might drain nothing at all,
> since it does not anymore call ndo_poll_controller() that was grabbing
> SCHED bits on all queues (napi_schedule() like calls)

I think the latest patch is preventing the normal interrupt -> NAPI
path from coming in and cleaning the remaining TX completions and
arming the interrupt.
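To summarize the contract being discussed, here is a minimal, hypothetical ->poll() sketch for a driver that handles TX and RX completions in the same NAPI context: with budget == 0 (the netpoll case) it drains every visible TX completion, does no RX work, skips napi_complete_done() and still re-arms the interrupt.  The example_* helpers and struct example_ring are stand-ins, not bnxt code.

struct example_ring {
	struct napi_struct napi;
	/* ring state ... */
};

static int example_poll(struct napi_struct *napi, int budget)
{
	struct example_ring *ring = container_of(napi, struct example_ring, napi);
	int work_done = 0;

	/* Always drain every visible TX completion, even when budget == 0. */
	example_clean_tx_completions(ring);

	/* RX work is only allowed when there is budget for it. */
	if (budget)
		work_done = example_clean_rx(ring, budget);

	if (!budget) {
		/* netpoll path: napi_complete_done() is not called, so re-arm
		 * the IRQ explicitly or no further interrupt will fire.
		 */
		example_irq_rearm(ring);
	} else if (work_done < budget && napi_complete_done(napi, work_done)) {
		example_irq_rearm(ring);
	}

	return work_done;
}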


Re: [PATCH net 00/15] netpoll: avoid capture effects for NAPI drivers

2018-09-25 Thread Michael Chan
On Mon, Sep 24, 2018 at 2:18 PM Song Liu  wrote:
>
>
>
> > On Sep 24, 2018, at 2:05 PM, Eric Dumazet  wrote:
> >
> >>
> >> Interesting, maybe a bnxt specific issue.
> >>
> >> It seems their model is to process TX/RX notification in the same queue,
> >> they throw away RX events if budget == 0
> >>
> >> It means commit e7b9569102995ebc26821789628eef45bd9840d8 is wrong and
> >> must be reverted.
> >>
> >> Otherwise, we have a possibility of blocking a queue under netpoll 
> >> pressure.
> >
> > Hmm, actually a revert might not be enough, since code at lines 2030-2031
> > would fire and we might not call napi_complete_done() anyway.
> >
> > Unfortunately this driver logic is quite complex.
> >
> > Could you test on other NIC eventually ?
> >
>
> It actually runs OK on ixgbe.
>
> @Michael, could you please help us with this?
>
I've taken a quick look using today's net tree plus Eric's
poll_one_napi() patch.  The problem I'm seeing is that netpoll calls
bnxt_poll() with budget 0.  And since work_done >= budget of 0, we
return without calling napi_complete_done() and without arming the
interrupt.  netpoll doesn't always call us back until we call
napi_complete_done(), right?  So I think if there are in-flight TX
completions, we'll miss those.


[PATCH net] bnxt_en: Fix VF mac address regression.

2018-09-14 Thread Michael Chan
The recent commit to always forward the VF MAC address to the PF for
approval may not work if the PF driver or the firmware is older.  This
will cause the VF driver to fail during probe:

  bnxt_en :00:03.0 (unnamed net_device) (uninitialized): hwrm req_type 0xf 
seq id 0x5 error 0x
  bnxt_en :00:03.0 (unnamed net_device) (uninitialized): VF MAC address 
00:00:17:02:05:d0 not approved by the PF
  bnxt_en :00:03.0: Unable to initialize mac address.
  bnxt_en: probe of :00:03.0 failed with error -99

We fix it by treating the error as fatal only if the VF MAC address is
locally generated by the VF.

Fixes: 707e7e966026 ("bnxt_en: Always forward VF MAC address to the PF.")
Reported-by: Seth Forshee 
Reported-by: Siwei Liu 
Signed-off-by: Michael Chan 
---
Please queue this for stable as well.  Thanks.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c   | 9 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 9 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h | 2 +-
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index cecbb1d..177587f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8027,7 +8027,7 @@ static int bnxt_change_mac_addr(struct net_device *dev, 
void *p)
if (ether_addr_equal(addr->sa_data, dev->dev_addr))
return 0;
 
-   rc = bnxt_approve_mac(bp, addr->sa_data);
+   rc = bnxt_approve_mac(bp, addr->sa_data, true);
if (rc)
return rc;
 
@@ -8827,14 +8827,19 @@ static int bnxt_init_mac_addr(struct bnxt *bp)
} else {
 #ifdef CONFIG_BNXT_SRIOV
struct bnxt_vf_info *vf = &bp->vf;
+   bool strict_approval = true;
 
if (is_valid_ether_addr(vf->mac_addr)) {
/* overwrite netdev dev_addr with admin VF MAC */
memcpy(bp->dev->dev_addr, vf->mac_addr, ETH_ALEN);
+   /* Older PF driver or firmware may not approve this
+* correctly.
+*/
+   strict_approval = false;
} else {
eth_hw_addr_random(bp->dev);
}
-   rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
+   rc = bnxt_approve_mac(bp, bp->dev->dev_addr, strict_approval);
 #endif
}
return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index fcd085a..3962f6f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -1104,7 +1104,7 @@ void bnxt_update_vf_mac(struct bnxt *bp)
mutex_unlock(&bp->hwrm_cmd_lock);
 }
 
-int bnxt_approve_mac(struct bnxt *bp, u8 *mac)
+int bnxt_approve_mac(struct bnxt *bp, u8 *mac, bool strict)
 {
struct hwrm_func_vf_cfg_input req = {0};
int rc = 0;
@@ -1122,12 +1122,13 @@ int bnxt_approve_mac(struct bnxt *bp, u8 *mac)
memcpy(req.dflt_mac_addr, mac, ETH_ALEN);
rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
 mac_done:
-   if (rc) {
+   if (rc && strict) {
rc = -EADDRNOTAVAIL;
netdev_warn(bp->dev, "VF MAC address %pM not approved by the 
PF\n",
mac);
+   return rc;
}
-   return rc;
+   return 0;
 }
 #else
 
@@ -1144,7 +1145,7 @@ void bnxt_update_vf_mac(struct bnxt *bp)
 {
 }
 
-int bnxt_approve_mac(struct bnxt *bp, u8 *mac)
+int bnxt_approve_mac(struct bnxt *bp, u8 *mac, bool strict)
 {
return 0;
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
index e9b20cd..2eed9ed 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
@@ -39,5 +39,5 @@ int bnxt_sriov_configure(struct pci_dev *pdev, int num_vfs);
 void bnxt_sriov_disable(struct bnxt *);
 void bnxt_hwrm_exec_fwd_req(struct bnxt *);
 void bnxt_update_vf_mac(struct bnxt *);
-int bnxt_approve_mac(struct bnxt *, u8 *);
+int bnxt_approve_mac(struct bnxt *, u8 *, bool);
 #endif
-- 
2.5.1



Re: [PATCH net 0/3] bnxt_en: Bug fixes.

2018-09-04 Thread Michael Chan
On Mon, Sep 3, 2018 at 10:50 PM, Michael Chan  wrote:
> On Mon, Sep 3, 2018 at 10:01 PM, David Miller  wrote:
>>
>> From: Michael Chan 
>> Date: Mon,  3 Sep 2018 04:23:16 -0400
>>
>> > This short series fixes resource related logic in the driver, mostly
>> > affecting the RDMA driver under corner cases.
>>
>> Series applied, thanks Michael.
>>
>> Do you want patch #3 queued up for -stable?
>
> Yes, please go ahead.  Thanks.

But there is a dependency on patch #2 though.  So #2 needs to be queued as well.


Re: [PATCH net 0/3] bnxt_en: Bug fixes.

2018-09-03 Thread Michael Chan
On Mon, Sep 3, 2018 at 10:01 PM, David Miller  wrote:
>
> From: Michael Chan 
> Date: Mon,  3 Sep 2018 04:23:16 -0400
>
> > This short series fixes resource related logic in the driver, mostly
> > affecting the RDMA driver under corner cases.
>
> Series applied, thanks Michael.
>
> Do you want patch #3 queued up for -stable?

Yes, please go ahead.  Thanks.


[PATCH net 1/3] bnxt_en: Fix firmware signaled resource change logic in open.

2018-09-03 Thread Michael Chan
When the driver detects that resources have changed during open, it
should reset the rx and tx rings to 0.  This will properly setup the
init sequence to initialize the default rings again.  We also need
to signal the RDMA driver to stop and clear its interrupts.  We then
call the RoCE driver to restart if a new set of default rings is
successfully reserved.

Fixes: 25e1acd6b92b ("bnxt_en: Notify firmware about IF state changes.")
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 8bb1e38..6a1baf3 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6684,6 +6684,8 @@ static int bnxt_hwrm_if_change(struct bnxt *bp, bool up)
hw_resc->resv_rx_rings = 0;
hw_resc->resv_hw_ring_grps = 0;
hw_resc->resv_vnics = 0;
+   bp->tx_nr_rings = 0;
+   bp->rx_nr_rings = 0;
}
return rc;
 }
@@ -8769,20 +8771,25 @@ static int bnxt_init_dflt_ring_mode(struct bnxt *bp)
if (bp->tx_nr_rings)
return 0;
 
+   bnxt_ulp_irq_stop(bp);
+   bnxt_clear_int_mode(bp);
rc = bnxt_set_dflt_rings(bp, true);
if (rc) {
netdev_err(bp->dev, "Not enough rings available.\n");
-   return rc;
+   goto init_dflt_ring_err;
}
rc = bnxt_init_int_mode(bp);
if (rc)
-   return rc;
+   goto init_dflt_ring_err;
+
bp->tx_nr_rings_per_tc = bp->tx_nr_rings;
if (bnxt_rfs_supported(bp) && bnxt_rfs_capable(bp)) {
bp->flags |= BNXT_FLAG_RFS;
bp->dev->features |= NETIF_F_NTUPLE;
}
-   return 0;
+init_dflt_ring_err:
+   bnxt_ulp_irq_restart(bp, rc);
+   return rc;
 }
 
 int bnxt_restore_pf_fw_resources(struct bnxt *bp)
-- 
2.5.1



[PATCH net 3/3] bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA.

2018-09-03 Thread Michael Chan
Currently, the driver adjusts the bp->hw_resc.max_cp_rings by the number
of MSIX vectors used by RDMA.  There is one code path in open that needs
to check the true max_cp_rings including any used by RDMA.  This code
is now checking for the reduced max_cp_rings which will fail when the
number of cp rings is very small.

To fix this in a clean way, we don't adjust max_cp_rings anymore.
Instead, we add a helper bnxt_get_max_func_cp_rings_for_en() to get the
reduced max_cp_rings when appropriate.

Fixes: ec86f14ea506 ("bnxt_en: Add ULP calls to stop and restart IRQs.")
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c   | 7 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h   | 2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 7 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c   | 5 -
 4 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6472ce4..cecbb1d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5913,9 +5913,9 @@ unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp)
return bp->hw_resc.max_cp_rings;
 }
 
-void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max)
+unsigned int bnxt_get_max_func_cp_rings_for_en(struct bnxt *bp)
 {
-   bp->hw_resc.max_cp_rings = max;
+   return bp->hw_resc.max_cp_rings - bnxt_get_ulp_msix_num(bp);
 }
 
 static unsigned int bnxt_get_max_func_irqs(struct bnxt *bp)
@@ -8631,7 +8631,8 @@ static void _bnxt_get_max_rings(struct bnxt *bp, int 
*max_rx, int *max_tx,
 
*max_tx = hw_resc->max_tx_rings;
*max_rx = hw_resc->max_rx_rings;
-   *max_cp = min_t(int, hw_resc->max_irqs, hw_resc->max_cp_rings);
+   *max_cp = min_t(int, bnxt_get_max_func_cp_rings_for_en(bp),
+   hw_resc->max_irqs);
*max_cp = min_t(int, *max_cp, hw_resc->max_stat_ctxs);
max_ring_grps = hw_resc->max_hw_ring_grps;
if (BNXT_CHIP_TYPE_NITRO_A0(bp) && BNXT_PF(bp)) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index c4c77b9..bde3846 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1481,7 +1481,7 @@ int bnxt_hwrm_set_coal(struct bnxt *);
 unsigned int bnxt_get_max_func_stat_ctxs(struct bnxt *bp);
 void bnxt_set_max_func_stat_ctxs(struct bnxt *bp, unsigned int max);
 unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp);
-void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max);
+unsigned int bnxt_get_max_func_cp_rings_for_en(struct bnxt *bp);
 int bnxt_get_avail_msix(struct bnxt *bp, int num);
 int bnxt_reserve_rings(struct bnxt *bp);
 void bnxt_tx_disable(struct bnxt *bp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index 6d583bc..fcd085a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -451,7 +451,7 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int 
num_vfs)
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_VF_RESOURCE_CFG, -1, -1);
 
-   vf_cp_rings = hw_resc->max_cp_rings - bp->cp_nr_rings;
+   vf_cp_rings = bnxt_get_max_func_cp_rings_for_en(bp) - bp->cp_nr_rings;
vf_stat_ctx = hw_resc->max_stat_ctxs - bp->num_stat_ctxs;
if (bp->flags & BNXT_FLAG_AGG_RINGS)
vf_rx_rings = hw_resc->max_rx_rings - bp->rx_nr_rings * 2;
@@ -549,7 +549,8 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int num_vfs)
max_stat_ctxs = hw_resc->max_stat_ctxs;
 
/* Remaining rings are distributed equally amongs VF's for now */
-   vf_cp_rings = (hw_resc->max_cp_rings - bp->cp_nr_rings) / num_vfs;
+   vf_cp_rings = (bnxt_get_max_func_cp_rings_for_en(bp) -
+  bp->cp_nr_rings) / num_vfs;
vf_stat_ctx = (max_stat_ctxs - bp->num_stat_ctxs) / num_vfs;
if (bp->flags & BNXT_FLAG_AGG_RINGS)
vf_rx_rings = (hw_resc->max_rx_rings - bp->rx_nr_rings * 2) /
@@ -643,7 +644,7 @@ static int bnxt_sriov_enable(struct bnxt *bp, int *num_vfs)
 */
vfs_supported = *num_vfs;
 
-   avail_cp = hw_resc->max_cp_rings - bp->cp_nr_rings;
+   avail_cp = bnxt_get_max_func_cp_rings_for_en(bp) - bp->cp_nr_rings;
avail_stat = hw_resc->max_stat_ctxs - bp->num_stat_ctxs;
avail_cp = min_t(int, avail_cp, avail_stat);
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index deac73e..beee612 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -169,7 +169,6 @@ static int bnxt_req_msix_vecs(struct bnxt_en_dev *edev, int

[PATCH net 0/3] bnxt_en: Bug fixes.

2018-09-03 Thread Michael Chan
This short series fixes resource related logic in the driver, mostly
affecting the RDMA driver under corner cases.

Michael Chan (3):
  bnxt_en: Fix firmware signaled resource change logic in open.
  bnxt_en: Clean up unused functions.
  bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c   | 22 +++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h   |  3 +--
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c |  7 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c   | 20 
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h   |  1 -
 5 files changed, 20 insertions(+), 33 deletions(-)

-- 
2.5.1



[PATCH net 2/3] bnxt_en: Clean up unused functions.

2018-09-03 Thread Michael Chan
Remove unused bnxt_subtract_ulp_resources().  Change
bnxt_get_max_func_irqs() to static since it is only locally used.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 15 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h |  1 -
 4 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6a1baf3..6472ce4 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5918,7 +5918,7 @@ void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned 
int max)
bp->hw_resc.max_cp_rings = max;
 }
 
-unsigned int bnxt_get_max_func_irqs(struct bnxt *bp)
+static unsigned int bnxt_get_max_func_irqs(struct bnxt *bp)
 {
struct bnxt_hw_resc *hw_resc = &bp->hw_resc;
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index fefa011..c4c77b9 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1482,7 +1482,6 @@ unsigned int bnxt_get_max_func_stat_ctxs(struct bnxt *bp);
 void bnxt_set_max_func_stat_ctxs(struct bnxt *bp, unsigned int max);
 unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp);
 void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max);
-unsigned int bnxt_get_max_func_irqs(struct bnxt *bp);
 int bnxt_get_avail_msix(struct bnxt *bp, int num);
 int bnxt_reserve_rings(struct bnxt *bp);
 void bnxt_tx_disable(struct bnxt *bp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index c37b284..deac73e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -220,21 +220,6 @@ int bnxt_get_ulp_msix_base(struct bnxt *bp)
return 0;
 }
 
-void bnxt_subtract_ulp_resources(struct bnxt *bp, int ulp_id)
-{
-   ASSERT_RTNL();
-   if (bnxt_ulp_registered(bp->edev, ulp_id)) {
-   struct bnxt_en_dev *edev = bp->edev;
-   unsigned int msix_req, max;
-
-   msix_req = edev->ulp_tbl[ulp_id].msix_requested;
-   max = bnxt_get_max_func_cp_rings(bp);
-   bnxt_set_max_func_cp_rings(bp, max - msix_req);
-   max = bnxt_get_max_func_stat_ctxs(bp);
-   bnxt_set_max_func_stat_ctxs(bp, max - 1);
-   }
-}
-
 static int bnxt_send_msg(struct bnxt_en_dev *edev, int ulp_id,
 struct bnxt_fw_msg *fw_msg)
 {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
index df48ac7..d9bea37 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.h
@@ -90,7 +90,6 @@ static inline bool bnxt_ulp_registered(struct bnxt_en_dev 
*edev, int ulp_id)
 
 int bnxt_get_ulp_msix_num(struct bnxt *bp);
 int bnxt_get_ulp_msix_base(struct bnxt *bp);
-void bnxt_subtract_ulp_resources(struct bnxt *bp, int ulp_id);
 void bnxt_ulp_stop(struct bnxt *bp);
 void bnxt_ulp_start(struct bnxt *bp);
 void bnxt_ulp_sriov_cfg(struct bnxt *bp, int num_vfs);
-- 
2.5.1



Re: bnxt: card intermittently hanging and dropping link

2018-08-16 Thread Michael Chan
On Thu, Aug 16, 2018 at 2:09 AM, Daniel Axtens  wrote:
> Hi Michael,
>
>> The main issue is the TX timeout.
>> .
>>
>>> [ 2682.911693] bnxt_en :3b:00.0 eth4: TX timeout detected, starting 
>>> reset task!
>>> [ 2683.782496] bnxt_en :3b:00.0 eth4: Resp cmpl intr err msg: 0x51
>>> [ 2683.783061] bnxt_en :3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
>>> [ 2684.634557] bnxt_en :3b:00.0 eth4: Resp cmpl intr err msg: 0x51
>>> [ 2684.635120] bnxt_en :3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
>>
>> and it is not recovering.
>>
>> Please provide ethtool -i eth4 which will show the firmware version on
>> the NIC.  Let's see if the firmware is too old.
>
> driver: bnxt_en
> version: 1.8.0
> firmware-version: 20.6.151.0/pkg 20.06.05.11

I believe the firmware should be updated.  My colleague will contact
you on how to proceed.

Thanks.


Re: bnxt: card intermittently hanging and dropping link

2018-08-16 Thread Michael Chan
On Wed, Aug 15, 2018 at 10:29 PM, Daniel Axtens  wrote:

> [ 2682.911295] [ cut here ]
> [ 2682.911319] NETDEV WATCHDOG: eth4 (bnxt_en): transmit queue 0 timed out

The main issue is the TX timeout.
.

> [ 2682.911693] bnxt_en :3b:00.0 eth4: TX timeout detected, starting reset 
> task!
> [ 2683.782496] bnxt_en :3b:00.0 eth4: Resp cmpl intr err msg: 0x51
> [ 2683.783061] bnxt_en :3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
> [ 2684.634557] bnxt_en :3b:00.0 eth4: Resp cmpl intr err msg: 0x51
> [ 2684.635120] bnxt_en :3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1

and it is not recovering.

Please provide ethtool -i eth4 which will show the firmware version on
the NIC.  Let's see if the firmware is too old.

Thanks.


[PATCH net-next v2] bnxt_en: Fix strcpy() warnings in bnxt_ethtool.c

2018-08-10 Thread Michael Chan
From: Vasundhara Volam 

This patch fixes the following smatch warnings:

drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2826 
bnxt_fill_coredump_seg_hdr() error: strcpy() '"sEgM"' too large for 
'seg_hdr->signature' (5 vs 4)
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2858 
bnxt_fill_coredump_record() error: strcpy() '"cOrE"' too large for 
'record->signature' (5 vs 4)
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2879 
bnxt_fill_coredump_record() error: strcpy() 'utsname()->sysname' too large for 
'record->os_name' (65 vs 32)

Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.")
Reported-by: Dan Carpenter 
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index b6dbc3f..9c929cd 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2823,7 +2823,7 @@ bnxt_fill_coredump_seg_hdr(struct bnxt *bp,
   int status, u32 duration, u32 instance)
 {
memset(seg_hdr, 0, sizeof(*seg_hdr));
-   strcpy(seg_hdr->signature, "sEgM");
+   memcpy(seg_hdr->signature, "sEgM", 4);
if (seg_rec) {
seg_hdr->component_id = (__force __le32)seg_rec->component_id;
seg_hdr->segment_id = (__force __le32)seg_rec->segment_id;
@@ -2855,7 +2855,7 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct 
bnxt_coredump_record *record,
 
time64_to_tm(start, 0, &tm);
memset(record, 0, sizeof(*record));
-   strcpy(record->signature, "cOrE");
+   memcpy(record->signature, "cOrE", 4);
record->flags = 0;
record->low_version = 0;
record->high_version = 1;
@@ -2876,7 +2876,7 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct 
bnxt_coredump_record *record,
record->os_ver_major = cpu_to_le32(os_ver_major);
record->os_ver_minor = cpu_to_le32(os_ver_minor);
 
-   strcpy(record->os_name, utsname()->sysname);
+   strlcpy(record->os_name, utsname()->sysname, 32);
time64_to_tm(end, 0, &tm);
record->end_year = cpu_to_le16(tm.tm_year + 1900);
record->end_month = cpu_to_le16(tm.tm_mon + 1);
-- 
2.5.1



Re: [PATCH net-next] bnxt_en: Fix strcpy() warnings in bnxt_ethtool.c

2018-08-10 Thread Michael Chan
On Fri, Aug 10, 2018 at 2:37 PM, David Miller  wrote:
> From: David Miller 
> Date: Fri, 10 Aug 2018 14:35:45 -0700 (PDT)
>
>> From: Michael Chan 
>> Date: Fri, 10 Aug 2018 17:02:12 -0400
>>
>>> From: Vasundhara Volam 
>>>
>>> This patch fixes following smatch warnings:
>>>
>>> drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2826 
>>> bnxt_fill_coredump_seg_hdr() error: strcpy() '"sEgM"' too large for 
>>> 'seg_hdr->signature' (5 vs 4)
>>> drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2858 
>>> bnxt_fill_coredump_record() error: strcpy() '"cOrE"' too large for 
>>> 'record->signature' (5 vs 4)
>>> drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2879 
>>> bnxt_fill_coredump_record() error: strcpy() 'utsname()->sysname' too large 
>>> for 'record->os_name' (65 vs 32)
>>>
>>> Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.")
>>> Reported-by: Dan Carpenter 
>>> Signed-off-by: Vasundhara Volam 
>>> Signed-off-by: Michael Chan 
>>
>> Applied, thanks Michael.
>
> Actually, I'm reverting, this may fix those three warnings, but they are 
> replaced with
> a new one:
>
> ./include/linux/string.h:246:9: warning: ‘__builtin_strncpy’ output may be 
> truncated copying 32 bytes from a string of length 64 [-Wstringop-truncation]
>

OK.  I'm guessing strlcpy() is the right variant to use here.  I will
repost v2 using strlcpy().  Thanks.
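For reference, a small hypothetical illustration of the difference when copying the 65-byte utsname()->sysname into a 32-byte field: strncpy() can leave the destination without a terminating NUL and now triggers -Wstringop-truncation, while strlcpy() truncates and always NUL-terminates within the given size.

	char os_name[32];

	/* may fill all 32 bytes with no terminating NUL, and warns at build time */
	strncpy(os_name, utsname()->sysname, sizeof(os_name));

	/* truncates if needed and always NUL-terminates */
	strlcpy(os_name, utsname()->sysname, sizeof(os_name));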


[PATCH net-next] bnxt_en: Fix strcpy() warnings in bnxt_ethtool.c

2018-08-10 Thread Michael Chan
From: Vasundhara Volam 

This patch fixes the following smatch warnings:

drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2826 
bnxt_fill_coredump_seg_hdr() error: strcpy() '"sEgM"' too large for 
'seg_hdr->signature' (5 vs 4)
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2858 
bnxt_fill_coredump_record() error: strcpy() '"cOrE"' too large for 
'record->signature' (5 vs 4)
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:2879 
bnxt_fill_coredump_record() error: strcpy() 'utsname()->sysname' too large for 
'record->os_name' (65 vs 32)

Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.")
Reported-by: Dan Carpenter 
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index b6dbc3f..d6f3289 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2823,7 +2823,7 @@ bnxt_fill_coredump_seg_hdr(struct bnxt *bp,
   int status, u32 duration, u32 instance)
 {
memset(seg_hdr, 0, sizeof(*seg_hdr));
-   strcpy(seg_hdr->signature, "sEgM");
+   memcpy(seg_hdr->signature, "sEgM", 4);
if (seg_rec) {
seg_hdr->component_id = (__force __le32)seg_rec->component_id;
seg_hdr->segment_id = (__force __le32)seg_rec->segment_id;
@@ -2855,7 +2855,7 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct 
bnxt_coredump_record *record,
 
time64_to_tm(start, 0, &tm);
memset(record, 0, sizeof(*record));
-   strcpy(record->signature, "cOrE");
+   memcpy(record->signature, "cOrE", 4);
record->flags = 0;
record->low_version = 0;
record->high_version = 1;
@@ -2876,7 +2876,7 @@ bnxt_fill_coredump_record(struct bnxt *bp, struct 
bnxt_coredump_record *record,
record->os_ver_major = cpu_to_le32(os_ver_major);
record->os_ver_minor = cpu_to_le32(os_ver_minor);
 
-   strcpy(record->os_name, utsname()->sysname);
+   strncpy(record->os_name, utsname()->sysname, 32);
time64_to_tm(end, 0, &tm);
record->end_year = cpu_to_le16(tm.tm_year + 1900);
record->end_month = cpu_to_le16(tm.tm_mon + 1);
-- 
2.5.1



[PATCH net-next 02/13] bnxt_en: Adjust timer based on ethtool stats-block-usecs settings.

2018-08-05 Thread Michael Chan
The driver gathers statistics using 2 mechanisms.  Some stats are DMA'ed
directly from hardware and others are polled from the driver's timer.
Currently, we only adjust the DMA frequency based on the ethtool
stats-block-usecs setting.  This patch adjusts the driver's timer
frequency as well to make everything consistent.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 3d40e49..1f626af 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -112,6 +112,11 @@ static int bnxt_set_coalesce(struct net_device *dev,
  BNXT_MAX_STATS_COAL_TICKS);
stats_ticks = rounddown(stats_ticks, BNXT_MIN_STATS_COAL_TICKS);
bp->stats_coal_ticks = stats_ticks;
+   if (bp->stats_coal_ticks)
+   bp->current_interval =
+   bp->stats_coal_ticks * HZ / 1000000;
+   else
+   bp->current_interval = BNXT_TIMER_INTERVAL;
update_stats = true;
}
 
-- 
2.5.1
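As a side note, the conversion above is simply microseconds to jiffies.  A minimal hypothetical sketch, assuming stats_ticks is in microseconds as the ethtool stats-block-usecs name suggests:

	unsigned long interval;

	if (stats_ticks)
		interval = stats_ticks * HZ / 1000000;	/* usecs -> jiffies */
	else
		interval = BNXT_TIMER_INTERVAL;		/* driver default period */

	/* the non-zero case could equivalently use the helper:
	 * interval = usecs_to_jiffies(stats_ticks);
	 */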



[PATCH net-next 05/13] bnxt_en: Add new VF resource allocation strategy mode.

2018-08-05 Thread Michael Chan
The new mode is "minimal-static", to be used when resources are more
limited, for example to support a large number of VFs.  The PF driver
will provision guaranteed minimum resources of 0.  Each VF has no
guaranteed resources until it tries to reserve resources during device
open.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c   |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h   |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 23 ++-
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index fd936c5..e0e3b4b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5162,7 +5162,7 @@ int bnxt_hwrm_func_resc_qcaps(struct bnxt *bp, bool all)
 
pf->vf_resv_strategy =
le16_to_cpu(resp->vf_reservation_strategy);
-   if (pf->vf_resv_strategy > BNXT_VF_RESV_STRATEGY_MINIMAL)
+   if (pf->vf_resv_strategy > BNXT_VF_RESV_STRATEGY_MINIMAL_STATIC)
pf->vf_resv_strategy = BNXT_VF_RESV_STRATEGY_MAXIMAL;
}
 hwrm_func_resc_qcaps_exit:
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 47eec14..b44a758 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -862,6 +862,7 @@ struct bnxt_pf_info {
u8  vf_resv_strategy;
 #define BNXT_VF_RESV_STRATEGY_MAXIMAL  0
 #define BNXT_VF_RESV_STRATEGY_MINIMAL  1
+#define BNXT_VF_RESV_STRATEGY_MINIMAL_STATIC   2
void*hwrm_cmd_req_addr[4];
dma_addr_t  hwrm_cmd_req_dma_addr[4];
struct bnxt_vf_info *vf;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index f560845..b896a52 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -447,7 +447,7 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int 
num_vfs)
u16 vf_tx_rings, vf_rx_rings, vf_cp_rings;
u16 vf_stat_ctx, vf_vnics, vf_ring_grps;
struct bnxt_pf_info *pf = &bp->pf;
-   int i, rc = 0;
+   int i, rc = 0, min = 1;
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_VF_RESOURCE_CFG, -1, -1);
 
@@ -464,14 +464,19 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, 
int num_vfs)
 
req.min_rsscos_ctx = cpu_to_le16(BNXT_VF_MIN_RSS_CTX);
req.max_rsscos_ctx = cpu_to_le16(BNXT_VF_MAX_RSS_CTX);
-   if (pf->vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MINIMAL) {
-   req.min_cmpl_rings = cpu_to_le16(1);
-   req.min_tx_rings = cpu_to_le16(1);
-   req.min_rx_rings = cpu_to_le16(1);
-   req.min_l2_ctxs = cpu_to_le16(BNXT_VF_MIN_L2_CTX);
-   req.min_vnics = cpu_to_le16(1);
-   req.min_stat_ctx = cpu_to_le16(1);
-   req.min_hw_ring_grps = cpu_to_le16(1);
+   if (pf->vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MINIMAL_STATIC) {
+   min = 0;
+   req.min_rsscos_ctx = cpu_to_le16(min);
+   }
+   if (pf->vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MINIMAL ||
+   pf->vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MINIMAL_STATIC) {
+   req.min_cmpl_rings = cpu_to_le16(min);
+   req.min_tx_rings = cpu_to_le16(min);
+   req.min_rx_rings = cpu_to_le16(min);
+   req.min_l2_ctxs = cpu_to_le16(min);
+   req.min_vnics = cpu_to_le16(min);
+   req.min_stat_ctx = cpu_to_le16(min);
+   req.min_hw_ring_grps = cpu_to_le16(min);
} else {
vf_cp_rings /= num_vfs;
vf_tx_rings /= num_vfs;
-- 
2.5.1



[PATCH net-next 04/13] bnxt_en: Add PHY retry logic.

2018-08-05 Thread Michael Chan
During hotplug, the driver's open function can be called almost
immediately after power on reset.  The PHY may not be ready and the
firmware may return failure when the driver tries to update PHY
settings.  Add retry logic fired from the driver's timer to retry
the operation for 5 seconds.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 31 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  4 
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index d9fc905..fd936c5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6898,8 +6898,14 @@ static int __bnxt_open_nic(struct bnxt *bp, bool 
irq_re_init, bool link_re_init)
mutex_lock(&bp->link_lock);
rc = bnxt_update_phy_setting(bp);
mutex_unlock(&bp->link_lock);
-   if (rc)
+   if (rc) {
netdev_warn(bp->dev, "failed to update phy settings\n");
+   if (BNXT_SINGLE_PF(bp)) {
+   bp->link_info.phy_retry = true;
+   bp->link_info.phy_retry_expires =
+   jiffies + 5 * HZ;
+   }
+   }
}
 
if (irq_re_init)
@@ -7583,6 +7589,16 @@ static void bnxt_timer(struct timer_list *t)
set_bit(BNXT_FLOW_STATS_SP_EVENT, &bp->sp_event);
bnxt_queue_sp_work(bp);
}
+
+   if (bp->link_info.phy_retry) {
+   if (time_after(jiffies, bp->link_info.phy_retry_expires)) {
+   bp->link_info.phy_retry = 0;
+   netdev_warn(bp->dev, "failed to update phy settings 
after maximum retries.\n");
+   } else {
+   set_bit(BNXT_UPDATE_PHY_SP_EVENT, &bp->sp_event);
+   bnxt_queue_sp_work(bp);
+   }
+   }
 bnxt_restart_timer:
mod_timer(&bp->timer, jiffies + bp->current_interval);
 }
@@ -7670,6 +7686,19 @@ static void bnxt_sp_task(struct work_struct *work)
netdev_err(bp->dev, "SP task can't update link (rc: 
%x)\n",
   rc);
}
+   if (test_and_clear_bit(BNXT_UPDATE_PHY_SP_EVENT, &bp->sp_event)) {
+   int rc;
+
+   mutex_lock(&bp->link_lock);
+   rc = bnxt_update_phy_setting(bp);
+   mutex_unlock(&bp->link_lock);
+   if (rc) {
+   netdev_warn(bp->dev, "update phy settings retry 
failed\n");
+   } else {
+   bp->link_info.phy_retry = false;
+   netdev_info(bp->dev, "update phy settings retry 
succeeded\n");
+   }
+   }
if (test_and_clear_bit(BNXT_HWRM_PORT_MODULE_SP_EVENT, &bp->sp_event)) {
mutex_lock(&bp->link_lock);
bnxt_get_port_module_status(bp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 0d49fe0..47eec14 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -959,6 +959,9 @@ struct bnxt_link_info {
u16 advertising;/* user adv setting */
boolforce_link_chng;
 
+   boolphy_retry;
+   unsigned long   phy_retry_expires;
+
/* a copy of phy_qcfg output used to report link
 * info to VF
 */
@@ -1344,6 +1347,7 @@ struct bnxt {
 #define BNXT_GENEVE_DEL_PORT_SP_EVENT  13
 #define BNXT_LINK_SPEED_CHNG_SP_EVENT  14
 #define BNXT_FLOW_STATS_SP_EVENT   15
+#define BNXT_UPDATE_PHY_SP_EVENT   16
 
struct bnxt_hw_resc hw_resc;
struct bnxt_pf_info pf;
-- 
2.5.1
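The retry window above is the standard jiffies deadline pattern.  A minimal hypothetical sketch (link_info mirrors the new fields, example_schedule_phy_update() is a stand-in for queuing the sp_task work):

	/* arm the retry window when the first update attempt fails */
	link_info->phy_retry = true;
	link_info->phy_retry_expires = jiffies + 5 * HZ;	/* 5 seconds */

	/* from the periodic timer; time_after() is wrap-around safe */
	if (link_info->phy_retry) {
		if (time_after(jiffies, link_info->phy_retry_expires))
			link_info->phy_retry = false;		/* give up */
		else
			example_schedule_phy_update(bp);	/* try again */
	}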



[PATCH net-next 13/13] bnxt_en: Do not use the CNP CoS queue for networking traffic.

2018-08-05 Thread Michael Chan
The CNP CoS queue is reserved for internal RDMA Congestion Notification
Packets (CNP) and should not be used for a TC.  Modify the CoS queue
discovery code to skip over the CNP CoS queue and to reduce
bp->max_tc accordingly.  However, if RDMA is disabled in NVRAM, the
CNP CoS queue can be used for a TC.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 22 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h |  4 
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index dde904b..d7f51ab 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5281,7 +5281,8 @@ static int bnxt_hwrm_queue_qportcfg(struct bnxt *bp)
int rc = 0;
struct hwrm_queue_qportcfg_input req = {0};
struct hwrm_queue_qportcfg_output *resp = bp->hwrm_cmd_resp_addr;
-   u8 i, *qptr;
+   u8 i, j, *qptr;
+   bool no_rdma;
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_QUEUE_QPORTCFG, -1, -1);
 
@@ -5299,19 +5300,24 @@ static int bnxt_hwrm_queue_qportcfg(struct bnxt *bp)
if (bp->max_tc > BNXT_MAX_QUEUE)
bp->max_tc = BNXT_MAX_QUEUE;
 
+   no_rdma = !(bp->flags & BNXT_FLAG_ROCE_CAP);
+   qptr = &resp->queue_id0;
+   for (i = 0, j = 0; i < bp->max_tc; i++) {
+   bp->q_info[j].queue_id = *qptr++;
+   bp->q_info[j].queue_profile = *qptr++;
+   bp->tc_to_qidx[j] = j;
+   if (!BNXT_CNPQ(bp->q_info[j].queue_profile) ||
+   (no_rdma && BNXT_PF(bp)))
+   j++;
+   }
+   bp->max_tc = max_t(u8, j, 1);
+
if (resp->queue_cfg_info & QUEUE_QPORTCFG_RESP_QUEUE_CFG_INFO_ASYM_CFG)
bp->max_tc = 1;
 
if (bp->max_lltc > bp->max_tc)
bp->max_lltc = bp->max_tc;
 
-   qptr = &resp->queue_id0;
-   for (i = 0; i < bp->max_tc; i++) {
-   bp->q_info[i].queue_id = *qptr++;
-   bp->q_info[i].queue_profile = *qptr++;
-   bp->tc_to_qidx[i] = i;
-   }
-
 qportcfg_exit:
mutex_unlock(&bp->hwrm_cmd_lock);
return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
index c0e16c0..6eed231 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h
@@ -43,6 +43,10 @@ struct bnxt_dscp2pri_entry {
((q_profile) == \
 QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_LOSSLESS_ROCE)
 
+#define BNXT_CNPQ(q_profile)   \
+   ((q_profile) == \
+QUEUE_QPORTCFG_RESP_QUEUE_ID0_SERVICE_PROFILE_LOSSY_ROCE_CNP)
+
 #define HWRM_STRUCT_DATA_SUBTYPE_HOST_OPERATIONAL  0x0300
 
 void bnxt_dcb_init(struct bnxt *bp);
-- 
2.5.1



[PATCH net-next 10/13] bnxt_en: Notify firmware about IF state changes.

2018-08-05 Thread Michael Chan
Use latest firmware API to notify firmware about IF state changes.
Firmware has the option to clean up resources during IF down and
to require the driver to reserve resources again during IF up.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 53 +--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 2 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1659940..56bd097 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3638,7 +3638,9 @@ int bnxt_hwrm_func_rgtr_async_events(struct bnxt *bp, 
unsigned long *bmap,
 
 static int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp)
 {
+   struct hwrm_func_drv_rgtr_output *resp = bp->hwrm_cmd_resp_addr;
struct hwrm_func_drv_rgtr_input req = {0};
+   int rc;
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_DRV_RGTR, -1, -1);
 
@@ -3676,7 +3678,15 @@ static int bnxt_hwrm_func_drv_rgtr(struct bnxt *bp)
cpu_to_le32(FUNC_DRV_RGTR_REQ_ENABLES_VF_REQ_FWD);
}
 
-   return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   mutex_lock(&bp->hwrm_cmd_lock);
+   rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   if (rc)
+   rc = -EIO;
+   else if (resp->flags &
+cpu_to_le32(FUNC_DRV_RGTR_RESP_FLAGS_IF_CHANGE_SUPPORTED))
+   bp->fw_cap |= BNXT_FW_CAP_IF_CHANGE;
+   mutex_unlock(&bp->hwrm_cmd_lock);
+   return rc;
 }
 
 static int bnxt_hwrm_func_drv_unrgtr(struct bnxt *bp)
@@ -6637,6 +6647,39 @@ static int bnxt_hwrm_shutdown_link(struct bnxt *bp)
return hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
 }
 
+static int bnxt_hwrm_if_change(struct bnxt *bp, bool up)
+{
+   struct hwrm_func_drv_if_change_output *resp = bp->hwrm_cmd_resp_addr;
+   struct hwrm_func_drv_if_change_input req = {0};
+   bool resc_reinit = false;
+   int rc;
+
+   if (!(bp->fw_cap & BNXT_FW_CAP_IF_CHANGE))
+   return 0;
+
+   bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_DRV_IF_CHANGE, -1, -1);
+   if (up)
+   req.flags = cpu_to_le32(FUNC_DRV_IF_CHANGE_REQ_FLAGS_UP);
+   mutex_lock(&bp->hwrm_cmd_lock);
+   rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   if (!rc && (resp->flags &
+   cpu_to_le32(FUNC_DRV_IF_CHANGE_RESP_FLAGS_RESC_CHANGE)))
+   resc_reinit = true;
+   mutex_unlock(&bp->hwrm_cmd_lock);
+
+   if (up && resc_reinit && BNXT_NEW_RM(bp)) {
+   struct bnxt_hw_resc *hw_resc = &bp->hw_resc;
+
+   rc = bnxt_hwrm_func_resc_qcaps(bp, true);
+   hw_resc->resv_cp_rings = 0;
+   hw_resc->resv_tx_rings = 0;
+   hw_resc->resv_rx_rings = 0;
+   hw_resc->resv_hw_ring_grps = 0;
+   hw_resc->resv_vnics = 0;
+   }
+   return rc;
+}
+
 static int bnxt_hwrm_port_led_qcaps(struct bnxt *bp)
 {
struct hwrm_port_led_qcaps_output *resp = bp->hwrm_cmd_resp_addr;
@@ -6991,8 +7034,13 @@ void bnxt_half_close_nic(struct bnxt *bp)
 static int bnxt_open(struct net_device *dev)
 {
struct bnxt *bp = netdev_priv(dev);
+   int rc;
 
-   return __bnxt_open_nic(bp, true, true);
+   bnxt_hwrm_if_change(bp, true);
+   rc = __bnxt_open_nic(bp, true, true);
+   if (rc)
+   bnxt_hwrm_if_change(bp, false);
+   return rc;
 }
 
 static bool bnxt_drv_busy(struct bnxt *bp)
@@ -7056,6 +7104,7 @@ static int bnxt_close(struct net_device *dev)
 
bnxt_close_nic(bp, true, true);
bnxt_hwrm_shutdown_link(bp);
+   bnxt_hwrm_if_change(bp, false);
return 0;
 }
 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index ded2aff..6c40b257 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1290,6 +1290,7 @@ struct bnxt {
#define BNXT_FW_CAP_LLDP_AGENT  0x0002
#define BNXT_FW_CAP_DCBX_AGENT  0x0004
#define BNXT_FW_CAP_NEW_RM  0x0008
+   #define BNXT_FW_CAP_IF_CHANGE   0x0010
 
 #define BNXT_NEW_RM(bp)((bp)->fw_cap & BNXT_FW_CAP_NEW_RM)
u32 hwrm_spec_code;
-- 
2.5.1
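The registration exchange above follows a common capability-flag pattern: record a feature bit from the firmware's response and gate the new command on it, so older firmware is handled gracefully.  A minimal hypothetical sketch (the EXAMPLE_* names and struct example_dev are stand-ins, not the real HSI definitions):

#define EXAMPLE_RESP_FLAGS_IF_CHANGE_SUPPORTED	0x0001

struct example_dev {
	u32 fw_cap;
#define EXAMPLE_FW_CAP_IF_CHANGE	0x0010
};

static void example_record_caps(struct example_dev *dev, u32 resp_flags)
{
	if (resp_flags & EXAMPLE_RESP_FLAGS_IF_CHANGE_SUPPORTED)
		dev->fw_cap |= EXAMPLE_FW_CAP_IF_CHANGE;
}

static int example_notify_if_change(struct example_dev *dev, bool up)
{
	if (!(dev->fw_cap & EXAMPLE_FW_CAP_IF_CHANGE))
		return 0;	/* older firmware: nothing to notify */

	return example_send_if_change(dev, up);
}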



[PATCH net-next 06/13] bnxt_en: Update RSS setup and GRO-HW logic according to the latest spec.

2018-08-05 Thread Michael Chan
Set the default hash mode flag in HWRM_VNIC_RSS_CFG to signal to the
firmware that the driver is compliant with the latest spec.  With
that, the firmware can return expanded RSS profile IDs that the driver
checks to set up the proper gso_type for GRO-HW packets.  But instead
of checking for the new profile IDs, we check the IP_TYPE flag
in TPA_START, which is more straightforward than checking a list of
profile IDs.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index e0e3b4b..1714850 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1115,7 +1115,7 @@ static void bnxt_tpa_start(struct bnxt *bp, struct 
bnxt_rx_ring_info *rxr,
tpa_info->hash_type = PKT_HASH_TYPE_L4;
tpa_info->gso_type = SKB_GSO_TCPV4;
/* RSS profiles 1 and 3 with extract code 0 for inner 4-tuple */
-   if (hash_type == 3)
+   if (hash_type == 3 || TPA_START_IS_IPV6(tpa_start1))
tpa_info->gso_type = SKB_GSO_TCPV6;
tpa_info->rss_hash =
le32_to_cpu(tpa_start->rx_tpa_start_cmp_rss_hash);
@@ -3981,6 +3981,7 @@ static int bnxt_hwrm_vnic_set_rss(struct bnxt *bp, u16 
vnic_id, bool set_rss)
bnxt_hwrm_cmd_hdr_init(bp, , HWRM_VNIC_RSS_CFG, -1, -1);
if (set_rss) {
req.hash_type = cpu_to_le32(bp->rss_hash_cfg);
+   req.hash_mode_flags = VNIC_RSS_CFG_REQ_HASH_MODE_FLAGS_DEFAULT;
if (vnic->flags & BNXT_VNIC_RSS_FLAG) {
if (BNXT_CHIP_TYPE_NITRO_A0(bp))
max_rings = bp->rx_nr_rings - 1;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index b44a758..7ea022d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -326,6 +326,10 @@ struct rx_tpa_start_cmp_ext {
((le32_to_cpu((rx_tpa_start)->rx_tpa_start_cmp_cfa_code_v2) &   \
 RX_TPA_START_CMP_CFA_CODE) >> RX_TPA_START_CMPL_CFA_CODE_SHIFT)
 
+#define TPA_START_IS_IPV6(rx_tpa_start)\
+   (!!((rx_tpa_start)->rx_tpa_start_cmp_flags2 &   \
+   cpu_to_le32(RX_TPA_START_CMP_FLAGS2_IP_TYPE)))
+
 struct rx_tpa_end_cmp {
__le32 rx_tpa_end_cmp_len_flags_type;
#define RX_TPA_END_CMP_TYPE (0x3f << 0)
-- 
2.5.1



[PATCH net-next 07/13] bnxt_en: Add support for ethtool get dump.

2018-08-05 Thread Michael Chan
From: Vasundhara Volam 

Add support to collect live firmware coredump via ethtool.
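
A hedged userspace sketch (not part of the patch): it assumes the retrieved
dump is a back-to-back sequence of segments, each starting with the
bnxt_coredump_segment_hdr defined below, that the "sEgM" signature marks a
segment header, and that data_offset/length give the byte offset to and size
of the segment data.  Those layout details are assumptions for illustration
only.

#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Userspace mirror of the segment header added below (kernel uses __le32). */
struct coredump_seg_hdr {
	uint8_t  signature[4];
	uint32_t component_id;
	uint32_t segment_id;
	uint32_t flags;
	uint8_t  low_version;
	uint8_t  high_version;
	uint16_t function_id;
	uint32_t offset;
	uint32_t length;
	uint32_t status;
	uint32_t duration;
	uint32_t data_offset;
	uint32_t instance;
	uint32_t rsvd[5];
};

/* Print one line per segment found in a dump buffer of 'len' bytes. */
static void print_segments(const uint8_t *buf, size_t len)
{
	size_t off = 0;

	while (off + sizeof(struct coredump_seg_hdr) <= len) {
		struct coredump_seg_hdr hdr;

		memcpy(&hdr, buf + off, sizeof(hdr));
		if (memcmp(hdr.signature, "sEgM", 4))	/* assumed magic */
			break;
		printf("component %u, segment %u: %u data bytes\n",
		       le32toh(hdr.component_id), le32toh(hdr.segment_id),
		       le32toh(hdr.length));
		off += le32toh(hdr.data_offset) + le32toh(hdr.length);
	}
}

The dump buffer itself would be pulled through ethtool's standard get-dump
interface (ethtool -W <dev> <flag> and ethtool -w <dev> data <file>),
assuming a stock ethtool binary.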

Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h |  66 
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c  | 333 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h  |  37 +++
 3 files changed, 436 insertions(+)
 create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h
new file mode 100644
index 000..09c22f8
--- /dev/null
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h
@@ -0,0 +1,66 @@
+/* Broadcom NetXtreme-C/E network driver.
+ *
+ * Copyright (c) 2018 Broadcom Inc
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ */
+
+#ifndef BNXT_COREDUMP_H
+#define BNXT_COREDUMP_H
+
+struct bnxt_coredump_segment_hdr {
+   __u8 signature[4];
+   __le32 component_id;
+   __le32 segment_id;
+   __le32 flags;
+   __u8 low_version;
+   __u8 high_version;
+   __le16 function_id;
+   __le32 offset;
+   __le32 length;
+   __le32 status;
+   __le32 duration;
+   __le32 data_offset;
+   __le32 instance;
+   __le32 rsvd[5];
+};
+
+struct bnxt_coredump_record {
+   __u8 signature[4];
+   __le32 flags;
+   __u8 low_version;
+   __u8 high_version;
+   __u8 asic_state;
+   __u8 rsvd0[5];
+   char system_name[32];
+   __le16 year;
+   __le16 month;
+   __le16 day;
+   __le16 hour;
+   __le16 minute;
+   __le16 second;
+   __le16 utc_bias;
+   __le16 rsvd1;
+   char commandline[256];
+   __le32 total_segments;
+   __le32 os_ver_major;
+   __le32 os_ver_minor;
+   __le32 rsvd2;
+   char os_name[32];
+   __le16 end_year;
+   __le16 end_month;
+   __le16 end_day;
+   __le16 end_hour;
+   __le16 end_minute;
+   __le16 end_second;
+   __le16 end_utc_bias;
+   __le32 asic_id1;
+   __le32 asic_id2;
+   __le32 coredump_status;
+   __u8 ioctl_low_version;
+   __u8 ioctl_high_version;
+   __le16 rsvd3[313];
+};
+#endif
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 9517633..3fc7c74 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -16,12 +16,15 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "bnxt_hsi.h"
 #include "bnxt.h"
 #include "bnxt_xdp.h"
 #include "bnxt_ethtool.h"
 #include "bnxt_nvm_defs.h" /* NVRAM content constant and structure defs */
 #include "bnxt_fw_hdr.h"   /* Firmware hdr constant and structure defs */
+#include "bnxt_coredump.h"
 #define FLASH_NVRAM_TIMEOUT((HWRM_CMD_TIMEOUT) * 100)
 #define FLASH_PACKAGE_TIMEOUT  ((HWRM_CMD_TIMEOUT) * 200)
 #define INSTALL_PACKAGE_TIMEOUT((HWRM_CMD_TIMEOUT) * 200)
@@ -2685,6 +2688,334 @@ static int bnxt_reset(struct net_device *dev, u32 
*flags)
return rc;
 }
 
+static int bnxt_hwrm_dbg_dma_data(struct bnxt *bp, void *msg, int msg_len,
+ struct bnxt_hwrm_dbg_dma_info *info)
+{
+   struct hwrm_dbg_cmn_output *cmn_resp = bp->hwrm_cmd_resp_addr;
+   struct hwrm_dbg_cmn_input *cmn_req = msg;
+   __le16 *seq_ptr = msg + info->seq_off;
+   u16 seq = 0, len, segs_off;
+   void *resp = cmn_resp;
+   dma_addr_t dma_handle;
+   int rc, off = 0;
+   void *dma_buf;
+
+   dma_buf = dma_alloc_coherent(>pdev->dev, info->dma_len, _handle,
+GFP_KERNEL);
+   if (!dma_buf)
+   return -ENOMEM;
+
+   segs_off = offsetof(struct hwrm_dbg_coredump_list_output,
+   total_segments);
+   cmn_req->host_dest_addr = cpu_to_le64(dma_handle);
+   cmn_req->host_buf_len = cpu_to_le32(info->dma_len);
+   mutex_lock(>hwrm_cmd_lock);
+   while (1) {
+   *seq_ptr = cpu_to_le16(seq);
+   rc = _hwrm_send_message(bp, msg, msg_len, HWRM_CMD_TIMEOUT);
+   if (rc)
+   break;
+
+   len = le16_to_cpu(*((__le16 *)(resp + info->data_len_off)));
+   if (!seq &&
+   cmn_req->req_type == cpu_to_le16(HWRM_DBG_COREDUMP_LIST)) {
+   info->segs = le16_to_cpu(*((__le16 *)(resp +
+ segs_off)));
+   if (!info->segs) {
+   rc = -EIO;
+   break;
+   }
+
+  

[PATCH net-next 03/13] bnxt_en: Add external loopback test to ethtool selftest.

2018-08-05 Thread Michael Chan
Add code to detect firmware support for external loopback and the extra
test entry for external loopback.
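
As a hedged usage note (not part of the patch): with a loopback module or
cable fitted on the port, the new entry is exercised through ethtool's
existing self-test selector, e.g. "ethtool -t eth0 external_lb", assuming an
ethtool build that understands the external_lb keyword.  When external
loopback is not requested or not supported by the firmware, only the internal
MAC and PHY loopback tests run, as before.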

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c |  4 +++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  2 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 32 ++-
 3 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index c612d74..d9fc905 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6337,6 +6337,10 @@ static int bnxt_hwrm_phy_qcaps(struct bnxt *bp)
bp->lpi_tmr_hi = le32_to_cpu(resp->valid_tx_lpi_timer_high) &
 PORT_PHY_QCAPS_RESP_TX_LPI_TIMER_HIGH_MASK;
}
+   if (resp->flags & PORT_PHY_QCAPS_RESP_FLAGS_EXTERNAL_LPBK_SUPPORTED) {
+   if (bp->test_info)
+   bp->test_info->flags |= BNXT_TEST_FL_EXT_LPBK;
+   }
if (resp->supported_speeds_auto_mode)
link_info->support_auto_speeds =
le16_to_cpu(resp->supported_speeds_auto_mode);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 3b5a55c..0d49fe0 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -990,6 +990,8 @@ struct bnxt_led_info {
 
 struct bnxt_test_info {
u8 offline_mask;
+   u8 flags;
+#define BNXT_TEST_FL_EXT_LPBK  0x1
u16 timeout;
char string[BNXT_MAX_TEST][ETH_GSTRING_LEN];
 };
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 1f626af..9517633 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2397,7 +2397,7 @@ static int bnxt_disable_an_for_lpbk(struct bnxt *bp,
return rc;
 }
 
-static int bnxt_hwrm_phy_loopback(struct bnxt *bp, bool enable)
+static int bnxt_hwrm_phy_loopback(struct bnxt *bp, bool enable, bool ext)
 {
struct hwrm_port_phy_cfg_input req = {0};
 
@@ -2405,7 +2405,10 @@ static int bnxt_hwrm_phy_loopback(struct bnxt *bp, bool 
enable)
 
if (enable) {
bnxt_disable_an_for_lpbk(bp, );
-   req.lpbk = PORT_PHY_CFG_REQ_LPBK_LOCAL;
+   if (ext)
+   req.lpbk = PORT_PHY_CFG_REQ_LPBK_EXTERNAL;
+   else
+   req.lpbk = PORT_PHY_CFG_REQ_LPBK_LOCAL;
} else {
req.lpbk = PORT_PHY_CFG_REQ_LPBK_NONE;
}
@@ -2538,15 +2541,17 @@ static int bnxt_run_fw_tests(struct bnxt *bp, u8 
test_mask, u8 *test_results)
return rc;
 }
 
-#define BNXT_DRV_TESTS 3
+#define BNXT_DRV_TESTS 4
 #define BNXT_MACLPBK_TEST_IDX  (bp->num_tests - BNXT_DRV_TESTS)
 #define BNXT_PHYLPBK_TEST_IDX  (BNXT_MACLPBK_TEST_IDX + 1)
-#define BNXT_IRQ_TEST_IDX  (BNXT_MACLPBK_TEST_IDX + 2)
+#define BNXT_EXTLPBK_TEST_IDX  (BNXT_MACLPBK_TEST_IDX + 2)
+#define BNXT_IRQ_TEST_IDX  (BNXT_MACLPBK_TEST_IDX + 3)
 
 static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest,
   u64 *buf)
 {
struct bnxt *bp = netdev_priv(dev);
+   bool do_ext_lpbk = false;
bool offline = false;
u8 test_results = 0;
u8 test_mask = 0;
@@ -2560,6 +2565,10 @@ static void bnxt_self_test(struct net_device *dev, 
struct ethtool_test *etest,
return;
}
 
+   if ((etest->flags & ETH_TEST_FL_EXTERNAL_LB) &&
+   (bp->test_info->flags & BNXT_TEST_FL_EXT_LPBK))
+   do_ext_lpbk = true;
+
if (etest->flags & ETH_TEST_FL_OFFLINE) {
if (bp->pf.active_vfs) {
etest->flags |= ETH_TEST_FL_FAILED;
@@ -2600,13 +2609,22 @@ static void bnxt_self_test(struct net_device *dev, 
struct ethtool_test *etest,
buf[BNXT_MACLPBK_TEST_IDX] = 0;
 
bnxt_hwrm_mac_loopback(bp, false);
-   bnxt_hwrm_phy_loopback(bp, true);
+   bnxt_hwrm_phy_loopback(bp, true, false);
msleep(1000);
if (bnxt_run_loopback(bp)) {
buf[BNXT_PHYLPBK_TEST_IDX] = 1;
etest->flags |= ETH_TEST_FL_FAILED;
}
-   bnxt_hwrm_phy_loopback(bp, false);
+   if (do_ext_lpbk) {
+   etest->flags |= ETH_TEST_FL_EXTERNAL_LB_DONE;
+   bnxt_hwrm_phy_loopback(bp, true, true);
+   msleep(1000);
+   if (bnxt_run_loopback(bp)) {
+   buf[BNXT_EXTLPBK

[PATCH net-next 01/13] bnxt_en: Update firmware interface version to 1.9.2.25.

2018-08-05 Thread Michael Chan
New interface has firmware core dump support, new extended port
statistics, and IF state change notifications to the firmware.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |4 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c |8 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |6 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h | 1227 +++--
 4 files changed, 924 insertions(+), 321 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 934aa11..3b5a55c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -12,11 +12,11 @@
 #define BNXT_H
 
 #define DRV_MODULE_NAME"bnxt_en"
-#define DRV_MODULE_VERSION "1.9.1"
+#define DRV_MODULE_VERSION "1.9.2"
 
 #define DRV_VER_MAJ1
 #define DRV_VER_MIN9
-#define DRV_VER_UPD1
+#define DRV_VER_UPD2
 
 #include 
 #include 
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index 7bd96ab..f3b9fbc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -29,7 +29,7 @@ static const struct bnxt_dl_nvm_param nvm_params[] = {
 static int bnxt_hwrm_nvm_req(struct bnxt *bp, u32 param_id, void *msg,
 int msg_len, union devlink_param_value *val)
 {
-   struct hwrm_nvm_variable_input *req = msg;
+   struct hwrm_nvm_get_variable_input *req = msg;
void *data_addr = NULL, *buf = NULL;
struct bnxt_dl_nvm_param nvm_param;
int bytesize, idx = 0, rc, i;
@@ -60,18 +60,18 @@ static int bnxt_hwrm_nvm_req(struct bnxt *bp, u32 param_id, 
void *msg,
if (!data_addr)
return -ENOMEM;
 
-   req->data_addr = cpu_to_le64(data_dma_addr);
+   req->dest_data_addr = cpu_to_le64(data_dma_addr);
req->data_len = cpu_to_le16(nvm_param.num_bits);
req->option_num = cpu_to_le16(nvm_param.offset);
req->index_0 = cpu_to_le16(idx);
if (idx)
req->dimensions = cpu_to_le16(1);
 
-   if (req->req_type == HWRM_NVM_SET_VARIABLE)
+   if (req->req_type == cpu_to_le16(HWRM_NVM_SET_VARIABLE))
memcpy(data_addr, buf, bytesize);
 
rc = hwrm_send_message(bp, msg, msg_len, HWRM_CMD_TIMEOUT);
-   if (!rc && req->req_type == HWRM_NVM_GET_VARIABLE)
+   if (!rc && req->req_type == cpu_to_le16(HWRM_NVM_GET_VARIABLE))
memcpy(buf, data_addr, bytesize);
 
dma_free_coherent(>pdev->dev, bytesize, data_addr, data_dma_addr);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 7270c8b..3d40e49 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -162,7 +162,7 @@ static const struct {
BNXT_RX_STATS_ENTRY(rx_128b_255b_frames),
BNXT_RX_STATS_ENTRY(rx_256b_511b_frames),
BNXT_RX_STATS_ENTRY(rx_512b_1023b_frames),
-   BNXT_RX_STATS_ENTRY(rx_1024b_1518_frames),
+   BNXT_RX_STATS_ENTRY(rx_1024b_1518b_frames),
BNXT_RX_STATS_ENTRY(rx_good_vlan_frames),
BNXT_RX_STATS_ENTRY(rx_1519b_2047b_frames),
BNXT_RX_STATS_ENTRY(rx_2048b_4095b_frames),
@@ -205,9 +205,9 @@ static const struct {
BNXT_TX_STATS_ENTRY(tx_128b_255b_frames),
BNXT_TX_STATS_ENTRY(tx_256b_511b_frames),
BNXT_TX_STATS_ENTRY(tx_512b_1023b_frames),
-   BNXT_TX_STATS_ENTRY(tx_1024b_1518_frames),
+   BNXT_TX_STATS_ENTRY(tx_1024b_1518b_frames),
BNXT_TX_STATS_ENTRY(tx_good_vlan_frames),
-   BNXT_TX_STATS_ENTRY(tx_1519b_2047_frames),
+   BNXT_TX_STATS_ENTRY(tx_1519b_2047b_frames),
BNXT_TX_STATS_ENTRY(tx_2048b_4095b_frames),
BNXT_TX_STATS_ENTRY(tx_4096b_9216b_frames),
BNXT_TX_STATS_ENTRY(tx_9217b_16383b_frames),
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
index c75d7fa..971ace5d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h
@@ -96,6 +96,7 @@ struct hwrm_short_input {
 struct cmd_nums {
__le16  req_type;
#define HWRM_VER_GET  0x0UL
+   #define HWRM_FUNC_DRV_IF_CHANGE   0xdUL
#define HWRM_FUNC_BUF_UNRGTR  0xeUL
#define HWRM_FUNC_VF_CFG  0xfUL
#define HWRM_RESERVED10x10UL
@@ -159,6 +160,7 @@ struct cmd_nums {
#define HWRM_RING_FREE0x51UL
#define HWRM_RING_CMPL_RING_QAGGINT_PARAMS0x52UL
#define HWRM_RING_CMPL_RING_CFG_AGGINT_PARAMS 0x53UL
+   #defin

[PATCH net-next 12/13] bnxt_en: Add DCBNL DSCP application protocol support.

2018-08-05 Thread Michael Chan
Expand the .ieee_setapp() and .ieee_delapp() DCBNL methods to support
DSCP.  This allows DSCP values to be mapped to user priorities instead
of using VLAN priorities.  Each DSCP mapping is added or deleted one
entry at a time using the firmware API.  The firmware call can only be
made from a PF.
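
As a hedged illustration (not part of the patch), this is roughly what a DSCP
application entry looks like by the time it reaches the expanded
.ieee_setapp(); the DSCP value and priority below are made-up examples, while
struct dcb_app and IEEE_8021QAZ_APP_SEL_DSCP are the standard dcbnl
definitions used in the diff:

/* Example only: map DSCP 26 (AF31) to user priority 3. */
struct dcb_app app = {
	.selector = IEEE_8021QAZ_APP_SEL_DSCP,
	.protocol = 26,		/* the DSCP code point */
	.priority = 3,		/* user priority it maps to */
};

The PF driver then programs each such mapping one entry at a time through the
new bnxt_hwrm_queue_dscp2pri_cfg() firmware call added below.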

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c | 83 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h |  6 ++
 3 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 006726c..fefa011 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1281,6 +1281,7 @@ struct bnxt {
struct ieee_ets *ieee_ets;
u8  dcbx_cap;
u8  default_pri;
+   u8  max_dscp_value;
 #endif /* CONFIG_BNXT_DCB */
 
u32 msg_enable;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
index 00dd26d..ddc98c3 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c
@@ -385,6 +385,61 @@ static int bnxt_hwrm_set_dcbx_app(struct bnxt *bp, struct 
dcb_app *app,
return rc;
 }
 
+static int bnxt_hwrm_queue_dscp_qcaps(struct bnxt *bp)
+{
+   struct hwrm_queue_dscp_qcaps_output *resp = bp->hwrm_cmd_resp_addr;
+   struct hwrm_queue_dscp_qcaps_input req = {0};
+   int rc;
+
+   if (bp->hwrm_spec_code < 0x10800 || BNXT_VF(bp))
+   return 0;
+
+   bnxt_hwrm_cmd_hdr_init(bp, , HWRM_QUEUE_DSCP_QCAPS, -1, -1);
+   mutex_lock(>hwrm_cmd_lock);
+   rc = _hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT);
+   if (!rc) {
+   bp->max_dscp_value = (1 << resp->num_dscp_bits) - 1;
+   if (bp->max_dscp_value < 0x3f)
+   bp->max_dscp_value = 0;
+   }
+
+   mutex_unlock(>hwrm_cmd_lock);
+   return rc;
+}
+
+static int bnxt_hwrm_queue_dscp2pri_cfg(struct bnxt *bp, struct dcb_app *app,
+   bool add)
+{
+   struct hwrm_queue_dscp2pri_cfg_input req = {0};
+   struct bnxt_dscp2pri_entry *dscp2pri;
+   dma_addr_t mapping;
+   int rc;
+
+   if (bp->hwrm_spec_code < 0x10800)
+   return 0;
+
+   bnxt_hwrm_cmd_hdr_init(bp, , HWRM_QUEUE_DSCP2PRI_CFG, -1, -1);
+   dscp2pri = dma_alloc_coherent(>pdev->dev, sizeof(*dscp2pri),
+ , GFP_KERNEL);
+   if (!dscp2pri)
+   return -ENOMEM;
+
+   req.src_data_addr = cpu_to_le64(mapping);
+   dscp2pri->dscp = app->protocol;
+   if (add)
+   dscp2pri->mask = 0x3f;
+   else
+   dscp2pri->mask = 0;
+   dscp2pri->pri = app->priority;
+   req.entry_cnt = cpu_to_le16(1);
+   rc = hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT);
+   if (rc)
+   rc = -EIO;
+   dma_free_coherent(>pdev->dev, sizeof(*dscp2pri), dscp2pri,
+ mapping);
+   return rc;
+}
+
 static int bnxt_ets_validate(struct bnxt *bp, struct ieee_ets *ets, u8 *tc)
 {
int total_ets_bw = 0;
@@ -551,15 +606,30 @@ static int bnxt_dcbnl_ieee_setpfc(struct net_device *dev, 
struct ieee_pfc *pfc)
return rc;
 }
 
+static int bnxt_dcbnl_ieee_dscp_app_prep(struct bnxt *bp, struct dcb_app *app)
+{
+   if (app->selector == IEEE_8021QAZ_APP_SEL_DSCP) {
+   if (!bp->max_dscp_value)
+   return -ENOTSUPP;
+   if (app->protocol > bp->max_dscp_value)
+   return -EINVAL;
+   }
+   return 0;
+}
+
 static int bnxt_dcbnl_ieee_setapp(struct net_device *dev, struct dcb_app *app)
 {
struct bnxt *bp = netdev_priv(dev);
-   int rc = -EINVAL;
+   int rc;
 
if (!(bp->dcbx_cap & DCB_CAP_DCBX_VER_IEEE) ||
!(bp->dcbx_cap & DCB_CAP_DCBX_HOST))
return -EINVAL;
 
+   rc = bnxt_dcbnl_ieee_dscp_app_prep(bp, app);
+   if (rc)
+   return rc;
+
rc = dcb_ieee_setapp(dev, app);
if (rc)
return rc;
@@ -570,6 +640,9 @@ static int bnxt_dcbnl_ieee_setapp(struct net_device *dev, 
struct dcb_app *app)
 app->protocol == ROCE_V2_UDP_DPORT))
rc = bnxt_hwrm_set_dcbx_app(bp, app, true);
 
+   if (app->selector == IEEE_8021QAZ_APP_SEL_DSCP)
+   rc = bnxt_hwrm_queue_dscp2pri_cfg(bp, app, true);
+
return rc;
 }
 
@@ -582,6 +655,10 @@ static int bnxt_dcbnl_ieee_delapp(struct net_device *dev, 
struct dcb_app *app)
!(bp->dcbx_cap & DCB_CAP_DCBX_HOST))
  

[PATCH net-next 11/13] bnxt_en: Add hwmon sysfs support to read temperature

2018-08-05 Thread Michael Chan
From: Vasundhara Volam 

Export temperature sensor reading via hwmon sysfs.
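
A hedged userspace sketch (not part of the patch): the hwmon index below is a
placeholder and system dependent, and temp1_input is in millidegrees Celsius
as written by bnxt_show_temp() in the diff:

#include <stdio.h>

int main(void)
{
	/* Placeholder path; locate the right hwmonX via its "name" attribute. */
	FILE *f = fopen("/sys/class/hwmon/hwmon0/temp1_input", "r");
	long mdeg;

	if (!f)
		return 1;
	if (fscanf(f, "%ld", &mdeg) != 1) {
		fclose(f);
		return 1;
	}
	fclose(f);
	printf("NIC temperature: %ld.%03ld C\n", mdeg / 1000, mdeg % 1000);
	return 0;
}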

Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/Kconfig |  8 
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 62 +++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 3 files changed, 71 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/Kconfig 
b/drivers/net/ethernet/broadcom/Kconfig
index b7aa8ad..c1d3ee9b 100644
--- a/drivers/net/ethernet/broadcom/Kconfig
+++ b/drivers/net/ethernet/broadcom/Kconfig
@@ -230,4 +230,12 @@ config BNXT_DCB
 
  If unsure, say N.
 
+config BNXT_HWMON
+   bool "Broadcom NetXtreme-C/E HWMON support"
+   default y
+   depends on BNXT && HWMON && !(BNXT=y && HWMON=m)
+   ---help---
+ Say Y if you want to expose the thermal sensor data on NetXtreme-C/E
+ devices, via the hwmon sysfs interface.
+
 endif # NET_VENDOR_BROADCOM
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 56bd097..dde904b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -51,6 +51,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "bnxt_hsi.h"
 #include "bnxt.h"
@@ -6789,6 +6791,62 @@ static void bnxt_get_wol_settings(struct bnxt *bp)
} while (handle && handle != 0xffff);
 }
 
+#ifdef CONFIG_BNXT_HWMON
+static ssize_t bnxt_show_temp(struct device *dev,
+ struct device_attribute *devattr, char *buf)
+{
+   struct hwrm_temp_monitor_query_input req = {0};
+   struct hwrm_temp_monitor_query_output *resp;
+   struct bnxt *bp = dev_get_drvdata(dev);
+   u32 temp = 0;
+
+   resp = bp->hwrm_cmd_resp_addr;
+   bnxt_hwrm_cmd_hdr_init(bp, , HWRM_TEMP_MONITOR_QUERY, -1, -1);
+   mutex_lock(>hwrm_cmd_lock);
+   if (!_hwrm_send_message(bp, , sizeof(req), HWRM_CMD_TIMEOUT))
+   temp = resp->temp * 1000; /* display millidegree */
+   mutex_unlock(>hwrm_cmd_lock);
+
+   return sprintf(buf, "%u\n", temp);
+}
+static SENSOR_DEVICE_ATTR(temp1_input, 0444, bnxt_show_temp, NULL, 0);
+
+static struct attribute *bnxt_attrs[] = {
+   _dev_attr_temp1_input.dev_attr.attr,
+   NULL
+};
+ATTRIBUTE_GROUPS(bnxt);
+
+static void bnxt_hwmon_close(struct bnxt *bp)
+{
+   if (bp->hwmon_dev) {
+   hwmon_device_unregister(bp->hwmon_dev);
+   bp->hwmon_dev = NULL;
+   }
+}
+
+static void bnxt_hwmon_open(struct bnxt *bp)
+{
+   struct pci_dev *pdev = bp->pdev;
+
+   bp->hwmon_dev = hwmon_device_register_with_groups(>dev,
+ DRV_MODULE_NAME, bp,
+ bnxt_groups);
+   if (IS_ERR(bp->hwmon_dev)) {
+   bp->hwmon_dev = NULL;
+   dev_warn(>dev, "Cannot register hwmon device\n");
+   }
+}
+#else
+static void bnxt_hwmon_close(struct bnxt *bp)
+{
+}
+
+static void bnxt_hwmon_open(struct bnxt *bp)
+{
+}
+#endif
+
 static bool bnxt_eee_config_ok(struct bnxt *bp)
 {
struct ethtool_eee *eee = >eee;
@@ -7040,6 +7098,9 @@ static int bnxt_open(struct net_device *dev)
rc = __bnxt_open_nic(bp, true, true);
if (rc)
bnxt_hwrm_if_change(bp, false);
+
+   bnxt_hwmon_open(bp);
+
return rc;
 }
 
@@ -7102,6 +7163,7 @@ static int bnxt_close(struct net_device *dev)
 {
struct bnxt *bp = netdev_priv(dev);
 
+   bnxt_hwmon_close(bp);
bnxt_close_nic(bp, true, true);
bnxt_hwrm_shutdown_link(bp);
bnxt_hwrm_if_change(bp, false);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 6c40b257..006726c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1411,6 +1411,7 @@ struct bnxt {
struct bnxt_tc_info *tc_info;
struct dentry   *debugfs_pdev;
struct dentry   *debugfs_dim;
+   struct device   *hwmon_dev;
 };
 
 #define BNXT_RX_STATS_OFFSET(counter)  \
-- 
2.5.1



[PATCH net-next 00/13] bnxt_en: Updates for net-next.

2018-08-05 Thread Michael Chan
This series includes the usual firmware spec update.  The driver has
added an external PHY loopback test and PHY setup retry logic that is
needed during hotplug.  In the SRIOV space, the driver has added a
new VF resource allocation mode that requires the VF driver to
reserve resources during IFUP.  IF state changes are now propagated
to firmware so that firmware can release some resources during IFDOWN.

An ethtool method to get the firmware core dump and hwmon temperature reading
have been added.  DSCP to user priority support has been added to
the driver's DCBNL interface, and the CoS queue logic has been refined
to make sure that the special RDMA Congestion Notification hardware CoS
queue will not be used for networking traffic.

Michael Chan (11):
  bnxt_en: Update firmware interface version to 1.9.2.25.
  bnxt_en: Adjust timer based on ethtool stats-block-usecs settings.
  bnxt_en: Add external loopback test to ethtool selftest.
  bnxt_en: Add PHY retry logic.
  bnxt_en: Add new VF resource allocation strategy mode.
  bnxt_en: Update RSS setup and GRO-HW logic according to the latest
spec.
  bnxt_en: Add BNXT_NEW_RM() macro.
  bnxt_en: Move firmware related flags to a new fw_cap field in struct
bnxt.
  bnxt_en: Notify firmware about IF state changes.
  bnxt_en: Add DCBNL DSCP application protocol support.
  bnxt_en: Do not use the CNP CoS queue for networking traffic.

Vasundhara Volam (2):
  bnxt_en: Add support for ethtool get dump.
  bnxt_en: Add hwmon sysfs support to read temperature

 drivers/net/ethernet/broadcom/Kconfig  |8 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.c  |  216 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h  |   30 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h |   66 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c  |   89 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h  |   10 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c  |8 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c  |  378 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h  |   37 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_hsi.h  | 1227 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c|   25 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c  |4 +-
 12 files changed, 1716 insertions(+), 382 deletions(-)
 create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.h

-- 
2.5.1



[PATCH net-next 08/13] bnxt_en: Add BNXT_NEW_RM() macro.

2018-08-05 Thread Michael Chan
The BNXT_FLAG_NEW_RM flag is checked a lot in the code to determine if
the new resource manager is in effect.  Define a macro to perform
this check.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 27 +++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c   |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c |  4 ++--
 5 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 1714850..5c9ee3c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4579,7 +4579,7 @@ static int bnxt_hwrm_get_rings(struct bnxt *bp)
}
 
hw_resc->resv_tx_rings = le16_to_cpu(resp->alloc_tx_rings);
-   if (bp->flags & BNXT_FLAG_NEW_RM) {
+   if (BNXT_NEW_RM(bp)) {
u16 cp, stats;
 
hw_resc->resv_rx_rings = le16_to_cpu(resp->alloc_rx_rings);
@@ -4625,7 +4625,7 @@ __bnxt_hwrm_reserve_pf_rings(struct bnxt *bp, struct 
hwrm_func_cfg_input *req,
req->fid = cpu_to_le16(0xffff);
enables |= tx_rings ? FUNC_CFG_REQ_ENABLES_NUM_TX_RINGS : 0;
req->num_tx_rings = cpu_to_le16(tx_rings);
-   if (bp->flags & BNXT_FLAG_NEW_RM) {
+   if (BNXT_NEW_RM(bp)) {
enables |= rx_rings ? FUNC_CFG_REQ_ENABLES_NUM_RX_RINGS : 0;
enables |= cp_rings ? FUNC_CFG_REQ_ENABLES_NUM_CMPL_RINGS |
  FUNC_CFG_REQ_ENABLES_NUM_STAT_CTXS : 0;
@@ -4698,7 +4698,7 @@ bnxt_hwrm_reserve_vf_rings(struct bnxt *bp, int tx_rings, 
int rx_rings,
struct hwrm_func_vf_cfg_input req = {0};
int rc;
 
-   if (!(bp->flags & BNXT_FLAG_NEW_RM)) {
+   if (!BNXT_NEW_RM(bp)) {
bp->hw_resc.resv_tx_rings = tx_rings;
return 0;
}
@@ -4758,7 +4758,7 @@ static bool bnxt_need_reserve_rings(struct bnxt *bp)
vnic = rx + 1;
if (bp->flags & BNXT_FLAG_AGG_RINGS)
rx <<= 1;
-   if ((bp->flags & BNXT_FLAG_NEW_RM) &&
+   if (BNXT_NEW_RM(bp) &&
(hw_resc->resv_rx_rings != rx || hw_resc->resv_cp_rings != cp ||
 hw_resc->resv_hw_ring_grps != grp || hw_resc->resv_vnics != vnic))
return true;
@@ -4794,7 +4794,7 @@ static int __bnxt_reserve_rings(struct bnxt *bp)
return rc;
 
tx = hw_resc->resv_tx_rings;
-   if (bp->flags & BNXT_FLAG_NEW_RM) {
+   if (BNXT_NEW_RM(bp)) {
rx = hw_resc->resv_rx_rings;
cp = hw_resc->resv_cp_rings;
grp = hw_resc->resv_hw_ring_grps;
@@ -4838,7 +4838,7 @@ static int bnxt_hwrm_check_vf_rings(struct bnxt *bp, int 
tx_rings, int rx_rings,
u32 flags;
int rc;
 
-   if (!(bp->flags & BNXT_FLAG_NEW_RM))
+   if (!BNXT_NEW_RM(bp))
return 0;
 
__bnxt_hwrm_reserve_vf_rings(bp, , tx_rings, rx_rings, ring_grps,
@@ -4867,7 +4867,7 @@ static int bnxt_hwrm_check_pf_rings(struct bnxt *bp, int 
tx_rings, int rx_rings,
__bnxt_hwrm_reserve_pf_rings(bp, , tx_rings, rx_rings, ring_grps,
 cp_rings, vnics);
flags = FUNC_CFG_REQ_FLAGS_TX_ASSETS_TEST;
-   if (bp->flags & BNXT_FLAG_NEW_RM)
+   if (BNXT_NEW_RM(bp))
flags |= FUNC_CFG_REQ_FLAGS_RX_ASSETS_TEST |
 FUNC_CFG_REQ_FLAGS_CMPL_ASSETS_TEST |
 FUNC_CFG_REQ_FLAGS_RING_GRP_ASSETS_TEST |
@@ -5921,7 +5921,7 @@ int bnxt_get_avail_msix(struct bnxt *bp, int num)
 
max_idx = min_t(int, bp->total_irqs, max_cp);
avail_msix = max_idx - bp->cp_nr_rings;
-   if (!(bp->flags & BNXT_FLAG_NEW_RM) || avail_msix >= num)
+   if (!BNXT_NEW_RM(bp) || avail_msix >= num)
return avail_msix;
 
if (max_irq < total_req) {
@@ -5934,7 +5934,7 @@ int bnxt_get_avail_msix(struct bnxt *bp, int num)
 
 static int bnxt_get_num_msix(struct bnxt *bp)
 {
-   if (!(bp->flags & BNXT_FLAG_NEW_RM))
+   if (!BNXT_NEW_RM(bp))
return bnxt_get_max_func_irqs(bp);
 
return bnxt_cp_rings_in_use(bp);
@@ -6057,8 +6057,7 @@ int bnxt_reserve_rings(struct bnxt *bp)
netdev_err(bp->dev, "ring reservation failure rc: %d\n", rc);
return rc;
}
-   if ((bp->flags & BNXT_FLAG_NEW_RM) &&
-   (bnxt_get_num_msix(bp) != bp->total_irqs)) {
+   if (BNXT_NEW_RM(bp) && (bnxt_get_num_msix(bp) != bp->total_irqs)) {
bnxt_ulp_irq_stop(bp);
bnxt_clear_int_mode(bp);
rc = bnxt_init_int_mode(b

[PATCH net-next 09/13] bnxt_en: Move firmware related flags to a new fw_cap field in struct bnxt.

2018-08-05 Thread Michael Chan
The flags field is almost getting full.  Move firmware capability flags
to a new fw_cap field to better organize these firmware flags.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 12 ++--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 13 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c |  6 +++---
 3 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5c9ee3c..1659940 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3445,7 +3445,7 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void 
*msg, u32 msg_len,
cp_ring_id = le16_to_cpu(req->cmpl_ring);
intr_process = (cp_ring_id == INVALID_HW_RING_ID) ? 0 : 1;
 
-   if (bp->flags & BNXT_FLAG_SHORT_CMD) {
+   if (bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) {
void *short_cmd_req = bp->hwrm_short_cmd_req_addr;
 
memcpy(short_cmd_req, req, msg_len);
@@ -5089,9 +5089,9 @@ static int bnxt_hwrm_func_qcfg(struct bnxt *bp)
flags = le16_to_cpu(resp->flags);
if (flags & (FUNC_QCFG_RESP_FLAGS_FW_DCBX_AGENT_ENABLED |
 FUNC_QCFG_RESP_FLAGS_FW_LLDP_AGENT_ENABLED)) {
-   bp->flags |= BNXT_FLAG_FW_LLDP_AGENT;
+   bp->fw_cap |= BNXT_FW_CAP_LLDP_AGENT;
if (flags & FUNC_QCFG_RESP_FLAGS_FW_DCBX_AGENT_ENABLED)
-   bp->flags |= BNXT_FLAG_FW_DCBX_AGENT;
+   bp->fw_cap |= BNXT_FW_CAP_DCBX_AGENT;
}
if (BNXT_PF(bp) && (flags & FUNC_QCFG_RESP_FLAGS_MULTI_HOST))
bp->flags |= BNXT_FLAG_MULTI_HOST;
@@ -5249,7 +5249,7 @@ static int bnxt_hwrm_func_qcaps(struct bnxt *bp)
if (bp->hwrm_spec_code >= 0x10803) {
rc = bnxt_hwrm_func_resc_qcaps(bp, true);
if (!rc)
-   bp->flags |= BNXT_FLAG_NEW_RM;
+   bp->fw_cap |= BNXT_FW_CAP_NEW_RM;
}
return 0;
 }
@@ -5352,7 +5352,7 @@ static int bnxt_hwrm_ver_get(struct bnxt *bp)
dev_caps_cfg = le32_to_cpu(resp->dev_caps_cfg);
if ((dev_caps_cfg & VER_GET_RESP_DEV_CAPS_CFG_SHORT_CMD_SUPPORTED) &&
(dev_caps_cfg & VER_GET_RESP_DEV_CAPS_CFG_SHORT_CMD_REQUIRED))
-   bp->flags |= BNXT_FLAG_SHORT_CMD;
+   bp->fw_cap |= BNXT_FW_CAP_SHORT_CMD;
 
 hwrm_ver_get_exit:
mutex_unlock(>hwrm_cmd_lock);
@@ -8760,7 +8760,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
if (rc)
goto init_err_pci_clean;
 
-   if (bp->flags & BNXT_FLAG_SHORT_CMD) {
+   if (bp->fw_cap & BNXT_FW_CAP_SHORT_CMD) {
rc = bnxt_alloc_hwrm_short_cmd_req(bp);
if (rc)
goto init_err_pci_clean;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 37dc896..ded2aff 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1144,7 +1144,6 @@ struct bnxt {
atomic_tintr_sem;
 
u32 flags;
-   #define BNXT_FLAG_DCB_ENABLED   0x1
#define BNXT_FLAG_VF0x2
#define BNXT_FLAG_LRO   0x4
 #ifdef CONFIG_INET
@@ -1173,15 +1172,11 @@ struct bnxt {
 BNXT_FLAG_ROCEV2_CAP)
#define BNXT_FLAG_NO_AGG_RINGS  0x2
#define BNXT_FLAG_RX_PAGE_MODE  0x4
-   #define BNXT_FLAG_FW_LLDP_AGENT 0x8
#define BNXT_FLAG_MULTI_HOST0x10
-   #define BNXT_FLAG_SHORT_CMD 0x20
#define BNXT_FLAG_DOUBLE_DB 0x40
-   #define BNXT_FLAG_FW_DCBX_AGENT 0x80
#define BNXT_FLAG_CHIP_NITRO_A0 0x100
#define BNXT_FLAG_DIM   0x200
#define BNXT_FLAG_ROCE_MIRROR_CAP   0x400
-   #define BNXT_FLAG_NEW_RM0x800
#define BNXT_FLAG_PORT_STATS_EXT0x1000
 
#define BNXT_FLAG_ALL_CONFIG_FEATS (BNXT_FLAG_TPA | \
@@ -1195,7 +1190,6 @@ struct bnxt {
 #define BNXT_SINGLE_PF(bp) (BNXT_PF(bp) && !BNXT_NPAR(bp) && !BNXT_MH(bp))
 #define BNXT_CHIP_TYPE_NITRO_A0(bp) ((bp)->flags & BNXT_FLAG_CHIP_NITRO_A0)
 #define BNXT_RX_PAGE_MODE(bp)  ((bp)->flags & BNXT_FLAG_RX_PAGE_MODE)
-#define BNXT_NEW_RM(bp)((bp)->flags & BNXT_FLAG_NEW_RM)
 
 /* Chip class phase 4 and later */
 #define BNXT_CHIP_P4_PLUS(bp)  \
@@ -1291,6 +1285,13 @@ struct bnxt {
 
u32 msg_enable;
 
+   u32 fw_cap;
+   #define BNXT_FW_CAP_SHORT_CMD   0x0001
+   #define BNXT_FW_CAP_L

[PATCH net 1/6] bnxt_en: Fix the vlan_tci exact match check.

2018-07-09 Thread Michael Chan
From: Venkat Duvvuru 

It is possible that OVS may set don’t care for the DEI/CFI bit in
the vlan_tci mask. Hence, checking for a vlan_tci exact match will end up
rejecting the VLAN flow.

This patch fixes the problem by checking for vlan_pcp and vid
separately, instead of checking for the entire vlan_tci.
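
As a hedged worked example (not part of the patch), the small program below
contrasts the old whole-TCI exact-match test with the new per-field test for
a flow keyed on VID 100, PCP 0, with the DEI/CFI bit left as don't-care.
VLAN_PRIO_MASK and VLAN_VID_MASK are the standard <linux/if_vlan.h> values,
and the per-field check is simplified to the exact-match-on-zero PCP branch
only:

#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define VLAN_PRIO_MASK	0xe000	/* as in <linux/if_vlan.h> */
#define VLAN_VID_MASK	0x0fff

/* Same spirit as the driver's is_exactmatch(): every mask byte must be 0xff. */
static bool exactmatch(const void *mask, int len)
{
	const unsigned char *p = mask;

	while (len--)
		if (*p++ != 0xff)
			return false;
	return true;
}

int main(void)
{
	uint16_t tci = htons(0x0064);	/* PCP 0, DEI 0, VID 100 */
	uint16_t mask = htons(0xefff);	/* everything except the DEI/CFI bit */

	printf("old exact-match check: %s\n",
	       exactmatch(&mask, sizeof(mask)) ? "pass" : "fail");
	printf("new per-field check:   %s\n",
	       ((ntohs(mask) & VLAN_VID_MASK) == VLAN_VID_MASK &&
		(ntohs(mask) & VLAN_PRIO_MASK) == VLAN_PRIO_MASK &&
		(ntohs(tci) & VLAN_PRIO_MASK) == 0) ? "pass" : "fail");
	return 0;
}

With mask 0xefff the old check fails (the cleared 0x1000 bit makes the mask
non-exact), while the new check passes, which is exactly the case this fix is
meant to allow.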

Fixes: e85a9be93cf1 ("bnxt_en: do not allow wildcard matches for L2 flows")
Signed-off-by: Venkat Duvvuru 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c | 30 +---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 795f450..491bd40 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -27,6 +27,15 @@
#define BNXT_FID_INVALID   0xffff
 #define VLAN_TCI(vid, prio)((vid) | ((prio) << VLAN_PRIO_SHIFT))
 
+#define is_vlan_pcp_wildcarded(vlan_tci_mask)  \
+   ((ntohs(vlan_tci_mask) & VLAN_PRIO_MASK) == 0x0000)
+#define is_vlan_pcp_exactmatch(vlan_tci_mask)  \
+   ((ntohs(vlan_tci_mask) & VLAN_PRIO_MASK) == VLAN_PRIO_MASK)
+#define is_vlan_pcp_zero(vlan_tci) \
+   ((ntohs(vlan_tci) & VLAN_PRIO_MASK) == 0x0000)
+#define is_vid_exactmatch(vlan_tci_mask)   \
+   ((ntohs(vlan_tci_mask) & VLAN_VID_MASK) == VLAN_VID_MASK)
+
 /* Return the dst fid of the func for flow forwarding
  * For PFs: src_fid is the fid of the PF
  * For VF-reps: src_fid the fid of the VF
@@ -389,6 +398,21 @@ static bool is_exactmatch(void *mask, int len)
return true;
 }
 
+static bool is_vlan_tci_allowed(__be16  vlan_tci_mask,
+   __be16  vlan_tci)
+{
+   /* VLAN priority must be either exactly zero or fully wildcarded and
+* VLAN id must be exact match.
+*/
+   if (is_vid_exactmatch(vlan_tci_mask) &&
+   ((is_vlan_pcp_exactmatch(vlan_tci_mask) &&
+ is_vlan_pcp_zero(vlan_tci)) ||
+is_vlan_pcp_wildcarded(vlan_tci_mask)))
+   return true;
+
+   return false;
+}
+
 static bool bits_set(void *key, int len)
 {
const u8 *p = key;
@@ -803,9 +827,9 @@ static bool bnxt_tc_can_offload(struct bnxt *bp, struct 
bnxt_tc_flow *flow)
/* Currently VLAN fields cannot be partial wildcard */
if (bits_set(>l2_key.inner_vlan_tci,
 sizeof(flow->l2_key.inner_vlan_tci)) &&
-   !is_exactmatch(>l2_mask.inner_vlan_tci,
-  sizeof(flow->l2_mask.inner_vlan_tci))) {
-   netdev_info(bp->dev, "Wildcard match unsupported for VLAN TCI\n");
+   !is_vlan_tci_allowed(flow->l2_mask.inner_vlan_tci,
+flow->l2_key.inner_vlan_tci)) {
+   netdev_info(bp->dev, "Unsupported VLAN TCI\n");
return false;
}
if (bits_set(>l2_key.inner_vlan_tpid,
-- 
1.8.3.1



[PATCH net 6/6] bnxt_en: Fix for system hang if request_irq fails

2018-07-09 Thread Michael Chan
From: Vikas Gupta 

Fix a bug in the error code path when bnxt_request_irq() returns failure.
bnxt_disable_napi() should not be called in this error path because
NAPI has not been enabled yet.

Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Vikas Gupta 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 11b21ad..4394c11 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6890,7 +6890,7 @@ static int __bnxt_open_nic(struct bnxt *bp, bool 
irq_re_init, bool link_re_init)
rc = bnxt_request_irq(bp);
if (rc) {
netdev_err(bp->dev, "bnxt_request_irq err: %x\n", rc);
-   goto open_err;
+   goto open_err_irq;
}
}
 
@@ -6930,6 +6930,8 @@ static int __bnxt_open_nic(struct bnxt *bp, bool 
irq_re_init, bool link_re_init)
 open_err:
bnxt_debug_dev_exit(bp);
bnxt_disable_napi(bp);
+
+open_err_irq:
bnxt_del_napi(bp);
 
 open_err_free_mem:
-- 
1.8.3.1



[PATCH net 4/6] bnxt_en: Support clearing of the IFF_BROADCAST flag.

2018-07-09 Thread Michael Chan
Currently, the driver assumes IFF_BROADCAST is always set and always sets
the broadcast filter.  Modify the code to set or clear the broadcast
filter according to the IFF_BROADCAST flag.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5a47607..fac1285 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5712,7 +5712,9 @@ static int bnxt_init_chip(struct bnxt *bp, bool 
irq_re_init)
}
vnic->uc_filter_count = 1;
 
-   vnic->rx_mask = CFA_L2_SET_RX_MASK_REQ_MASK_BCAST;
+   vnic->rx_mask = 0;
+   if (bp->dev->flags & IFF_BROADCAST)
+   vnic->rx_mask |= CFA_L2_SET_RX_MASK_REQ_MASK_BCAST;
 
if ((bp->dev->flags & IFF_PROMISC) && bnxt_promisc_ok(bp))
vnic->rx_mask |= CFA_L2_SET_RX_MASK_REQ_MASK_PROMISCUOUS;
@@ -7214,13 +7216,16 @@ static void bnxt_set_rx_mode(struct net_device *dev)
 
mask &= ~(CFA_L2_SET_RX_MASK_REQ_MASK_PROMISCUOUS |
  CFA_L2_SET_RX_MASK_REQ_MASK_MCAST |
- CFA_L2_SET_RX_MASK_REQ_MASK_ALL_MCAST);
+ CFA_L2_SET_RX_MASK_REQ_MASK_ALL_MCAST |
+ CFA_L2_SET_RX_MASK_REQ_MASK_BCAST);
 
if ((dev->flags & IFF_PROMISC) && bnxt_promisc_ok(bp))
mask |= CFA_L2_SET_RX_MASK_REQ_MASK_PROMISCUOUS;
 
uc_update = bnxt_uc_list_updated(bp);
 
+   if (dev->flags & IFF_BROADCAST)
+   mask |= CFA_L2_SET_RX_MASK_REQ_MASK_BCAST;
if (dev->flags & IFF_ALLMULTI) {
mask |= CFA_L2_SET_RX_MASK_REQ_MASK_ALL_MCAST;
vnic->mc_list_count = 0;
-- 
1.8.3.1



[PATCH net 3/6] bnxt_en: Always set output parameters in bnxt_get_max_rings().

2018-07-09 Thread Michael Chan
The current code returns -ENOMEM and does not bother to set the output
parameters to 0 when no rings are available.  Some callers, such as
bnxt_get_channels(), will display garbage ring numbers when that happens.
Fix it by always setting the output parameters.

Fixes: 6e6c5a57fbe1 ("bnxt_en: Modify bnxt_get_max_rings() to support shared or 
non shared rings.")
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5d95d78..5a47607 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8502,11 +8502,11 @@ int bnxt_get_max_rings(struct bnxt *bp, int *max_rx, 
int *max_tx, bool shared)
int rx, tx, cp;
 
_bnxt_get_max_rings(bp, , , );
+   *max_rx = rx;
+   *max_tx = tx;
if (!rx || !tx || !cp)
return -ENOMEM;
 
-   *max_rx = rx;
-   *max_tx = tx;
return bnxt_trim_rings(bp, max_rx, max_tx, cp, shared);
 }
 
-- 
1.8.3.1



[PATCH net 0/6] bnxt_en: Bug fixes.

2018-07-09 Thread Michael Chan
These are bug fixes in error code paths, TC Flower VLAN TCI flow
checking bug fix, proper filtering of Broadcast packets if IFF_BROADCAST
is not set, and a bug fix in bnxt_get_max_rings() to return 0 ring
parameters when the return value is -ENOMEM.

Michael Chan (4):
  bnxt_en: Fix inconsistent BNXT_FLAG_AGG_RINGS logic.
  bnxt_en: Always set output parameters in bnxt_get_max_rings().
  bnxt_en: Support clearing of the IFF_BROADCAST flag.
  bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees
IRQs.

Venkat Duvvuru (1):
  bnxt_en: Fix the vlan_tci exact match check.

Vikas Gupta (1):
  bnxt_en: Fix for system hang if request_irq fails

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 24 ++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c  | 30 ---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c |  2 --
 4 files changed, 44 insertions(+), 13 deletions(-)

-- 
1.8.3.1



[PATCH net 5/6] bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.

2018-07-09 Thread Michael Chan
Calling bnxt_set_max_func_irqs() to modify the max IRQ count requested or
freed by the RDMA driver is flawed.  The max IRQ count is checked when
re-initializing the IRQ vectors and this can happen multiple times
during ifup or ethtool -L.  If the max IRQ is reduced and the RDMA
driver is operational, we may not initialize IRQs correctly.  This
problem shows up on VFs with a very small number of MSI-X vectors.

There is no other logic that relies on the IRQ count excluding the ones
used by RDMA.  So we fix it by just removing the call to subtract or
add the IRQs used by RDMA.

Fixes: a588e4580a7e ("bnxt_en: Add interface to support RDMA driver.")
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 2 --
 3 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index fac1285..11b21ad 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -5919,7 +5919,7 @@ unsigned int bnxt_get_max_func_irqs(struct bnxt *bp)
return min_t(unsigned int, hw_resc->max_irqs, hw_resc->max_cp_rings);
 }
 
-void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max_irqs)
+static void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max_irqs)
 {
bp->hw_resc.max_irqs = max_irqs;
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 9b14eb6..91575ef 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1470,7 +1470,6 @@ int bnxt_hwrm_func_rgtr_async_events(struct bnxt *bp, 
unsigned long *bmap,
 unsigned int bnxt_get_max_func_cp_rings(struct bnxt *bp);
 void bnxt_set_max_func_cp_rings(struct bnxt *bp, unsigned int max);
 unsigned int bnxt_get_max_func_irqs(struct bnxt *bp);
-void bnxt_set_max_func_irqs(struct bnxt *bp, unsigned int max);
 int bnxt_get_avail_msix(struct bnxt *bp, int num);
 int bnxt_reserve_rings(struct bnxt *bp);
 void bnxt_tx_disable(struct bnxt *bp);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index 347e4f9..840f6e5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -169,7 +169,6 @@ static int bnxt_req_msix_vecs(struct bnxt_en_dev *edev, int 
ulp_id,
edev->ulp_tbl[ulp_id].msix_requested = avail_msix;
}
bnxt_fill_msix_vecs(bp, ent);
-   bnxt_set_max_func_irqs(bp, bnxt_get_max_func_irqs(bp) - avail_msix);
bnxt_set_max_func_cp_rings(bp, max_cp_rings - avail_msix);
edev->flags |= BNXT_EN_FLAG_MSIX_REQUESTED;
return avail_msix;
@@ -192,7 +191,6 @@ static int bnxt_free_msix_vecs(struct bnxt_en_dev *edev, 
int ulp_id)
msix_requested = edev->ulp_tbl[ulp_id].msix_requested;
bnxt_set_max_func_cp_rings(bp, max_cp_rings + msix_requested);
edev->ulp_tbl[ulp_id].msix_requested = 0;
-   bnxt_set_max_func_irqs(bp, bnxt_get_max_func_irqs(bp) + msix_requested);
edev->flags &= ~BNXT_EN_FLAG_MSIX_REQUESTED;
if (netif_running(dev)) {
bnxt_close_nic(bp, true, false);
-- 
1.8.3.1



[PATCH net 2/6] bnxt_en: Fix inconsistent BNXT_FLAG_AGG_RINGS logic.

2018-07-09 Thread Michael Chan
If there aren't enough RX rings available, the driver will attempt to
use a single RX ring without the aggregation ring.  If that also
fails, the BNXT_FLAG_AGG_RINGS flag is cleared but the other ring
parameters are not set consistently to reflect that.  If more RX
rings become available at the next open, the RX rings will be in
an inconsistent state and may crash when freeing the RX rings.

Fix it by restoring the BNXT_FLAG_AGG_RINGS if not enough RX rings are
available to run without aggregation rings.

Fixes: bdbd1eb59c56 ("bnxt_en: Handle no aggregation ring gracefully.")
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 176fc9f..5d95d78 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8520,8 +8520,11 @@ static int bnxt_get_dflt_rings(struct bnxt *bp, int 
*max_rx, int *max_tx,
/* Not enough rings, try disabling agg rings. */
bp->flags &= ~BNXT_FLAG_AGG_RINGS;
rc = bnxt_get_max_rings(bp, max_rx, max_tx, shared);
-   if (rc)
+   if (rc) {
+   /* set BNXT_FLAG_AGG_RINGS back for consistency */
+   bp->flags |= BNXT_FLAG_AGG_RINGS;
return rc;
+   }
bp->flags |= BNXT_FLAG_NO_AGG_RINGS;
bp->dev->hw_features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
bp->dev->features &= ~(NETIF_F_LRO | NETIF_F_GRO_HW);
-- 
1.8.3.1



Re: [PATCH net-next 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.

2018-05-30 Thread Michael Chan
On Tue, May 29, 2018 at 11:33 PM, Jakub Kicinski
 wrote:

>
> At some points you (Broadcom) were working whole bunch of devlink
> configuration options for the PCIe side of the ASIC.  The number of
> queues relates to things like number of allocated MSI-X vectors, which
> if memory serves me was in your devlink patch set.  In an ideal world
> we would try to keep all those in one place :)

Yeah, another colleague is now working with Mellanox on something similar.

One difference between those devlink parameters and these queue
parameters is that the former are more permanent and global settings.
For example, the number of VFs or the number of MSI-X vectors per VF are
persistent settings that remain in effect once set, even across a PCIe
reset.  On the other hand,
these queue settings are pure run-time settings and may be unique for
each VF.  These are not stored as there is no room in NVRAM to store
128 sets or more of these parameters.

Anyway, let me discuss this with my colleague to see if there is a
natural fit for these queue parameters in the devlink infrastructure
that they are working on.

>
> For PCIe config there is always the question of what can be configured
> at runtime, and what requires a HW reset.  Therefore that devlink API
> which could configure current as well as persistent device settings was
> quite nice.  I'm not sure if reallocating queues would ever require
> PCIe block reset but maybe...  Certainly it seems the notion of min
> queues would make more sense in PCIe configuration devlink API than
> ethtool channel API to me as well.
>
> Queues are in the grey area between netdev and non-netdev constructs.
> They make sense both from PCIe resource allocation perspective (i.e.
> devlink PCIe settings) and netdev perspective (ethtool) because they
> feed into things like qdisc offloads, maybe per-queue stats etc.
>
> So yes...  IMHO it would be nice to add this to a devlink SR-IOV config
> API and/or switchdev representors.  But neither of those are really an
> option for you today so IDK :)


Re: [PATCH net-next 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.

2018-05-30 Thread Michael Chan
On Tue, May 29, 2018 at 10:56 PM, Jakub Kicinski
 wrote:
> On Tue, 29 May 2018 20:19:54 -0700, Michael Chan wrote:
>> On Tue, May 29, 2018 at 1:46 PM, Samudrala, Sridhar wrote:
>> > Isn't ndo_set_vf_xxx() considered a legacy interface and not planned to be
>> > extended?
>
> +1 it's painful to see this feature being added to the legacy
> API :(  Another duplicated configuration knob.
>
>> I didn't know about that.
>>
>> > Shouldn't we enable this via ethtool on the port representor netdev?
>>
>> We discussed this.  ethtool on the VF representor will only work
>> in switchdev mode and also will not support min/max values.
>
> Ethtool channel API may be overdue a rewrite in devlink anyway, but I
> feel like implementing switchdev mode and rewriting features in devlink
> may be too much to ask.

Totally agreed.  And switchdev mode doesn't seem to be that widely
used at the moment.  Do you have other suggestions besides NDO?


Re: [PATCH net-next 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.

2018-05-29 Thread Michael Chan
On Tue, May 29, 2018 at 1:46 PM, Samudrala, Sridhar
 wrote:

>
> Isn't ndo_set_vf_xxx() considered a legacy interface and not planned to be
> extended?

I didn't know about that.

> Shouldn't we enable this via ethtool on the port representor netdev?
>
>

We discussed this.  ethtool on the VF representor will only work
in switchdev mode and also will not support min/max values.


[PATCH net-next 0/3] net: Add support to configure SR-IOV VF queues.

2018-05-29 Thread Michael Chan
VF Queue resources are always limited and there is currently no
infrastructure to allow the admin. on the host to add or reduce queue
resources for any particular VF.  This series adds the infrastructure
to do that and adds the functionality to the bnxt_en driver.

The "ip link set" command will subsequently be patched to support the new
operation.

v1:
- Changed the meaning of the min parameters to be strictly the minimum
guaranteed value, suggested by Jakub Kicinsky.
- More complete implementation in the bnxt_en driver.

Michael Chan (3):
  net: Add support to configure SR-IOV VF minimum and maximum queues.
  bnxt_en: Store min/max tx/rx rings for individual VFs.
  bnxt_en: Implement .ndo_set_vf_queues().

 drivers/net/ethernet/broadcom/bnxt/bnxt.c   |   1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.h   |   9 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 157 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h |   2 +
 include/linux/if_link.h |   4 +
 include/linux/netdevice.h   |   6 +
 include/uapi/linux/if_link.h|   9 ++
 net/core/rtnetlink.c|  32 -
 8 files changed, 213 insertions(+), 7 deletions(-)

-- 
1.8.3.1



[PATCH net-next 2/3] bnxt_en: Store min/max tx/rx rings for individual VFs.

2018-05-29 Thread Michael Chan
With new infrastructure to configure queues differently for each VF,
we need to store the current min/max rx/tx rings and other resources
for each VF.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h   |  9 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 27 +
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 9b14eb6..531c77d 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -837,6 +837,14 @@ struct bnxt_vf_info {
u32 func_flags; /* func cfg flags */
u32 min_tx_rate;
u32 max_tx_rate;
+   u16 min_tx_rings;
+   u16 max_tx_rings;
+   u16 min_rx_rings;
+   u16 max_rx_rings;
+   u16 min_cp_rings;
+   u16 min_stat_ctxs;
+   u16 min_ring_grps;
+   u16 min_vnics;
void*hwrm_cmd_req_addr;
dma_addr_t  hwrm_cmd_req_dma_addr;
 };
@@ -1351,6 +1359,7 @@ struct bnxt {
 #ifdef CONFIG_BNXT_SRIOV
int nr_vfs;
struct bnxt_vf_info vf;
+   struct hwrm_func_vf_resource_cfg_input vf_resc_cfg_input;
wait_queue_head_t   sriov_cfg_wait;
boolsriov_cfg;
 #define BNXT_SRIOV_CFG_WAIT_TMOmsecs_to_jiffies(1)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index a649108..7a92125 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -171,6 +171,10 @@ int bnxt_get_vf_config(struct net_device *dev, int vf_id,
ivi->linkstate = IFLA_VF_LINK_STATE_ENABLE;
else
ivi->linkstate = IFLA_VF_LINK_STATE_DISABLE;
+   ivi->min_tx_queues = vf->min_tx_rings;
+   ivi->max_tx_queues = vf->max_tx_rings;
+   ivi->min_rx_queues = vf->min_rx_rings;
+   ivi->max_rx_queues = vf->max_rx_rings;
 
return 0;
 }
@@ -498,6 +502,8 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int 
num_vfs)
 
mutex_lock(>hwrm_cmd_lock);
for (i = 0; i < num_vfs; i++) {
+   struct bnxt_vf_info *vf = >vf[i];
+
req.vf_id = cpu_to_le16(pf->first_vf_id + i);
rc = _hwrm_send_message(bp, , sizeof(req),
HWRM_CMD_TIMEOUT);
@@ -506,7 +512,15 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int 
num_vfs)
break;
}
pf->active_vfs = i + 1;
-   pf->vf[i].fw_fid = pf->first_vf_id + i;
+   vf->fw_fid = pf->first_vf_id + i;
+   vf->min_tx_rings = le16_to_cpu(req.min_tx_rings);
+   vf->max_tx_rings = vf_tx_rings;
+   vf->min_rx_rings = le16_to_cpu(req.min_rx_rings);
+   vf->max_rx_rings = vf_rx_rings;
+   vf->min_cp_rings = le16_to_cpu(req.min_cmpl_rings);
+   vf->min_stat_ctxs = le16_to_cpu(req.min_stat_ctx);
+   vf->min_ring_grps = le16_to_cpu(req.min_hw_ring_grps);
+   vf->min_vnics = le16_to_cpu(req.min_vnics);
}
mutex_unlock(>hwrm_cmd_lock);
if (pf->active_vfs) {
@@ -521,6 +535,7 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int 
num_vfs)
hw_resc->max_stat_ctxs -= le16_to_cpu(req.min_stat_ctx) * n;
hw_resc->max_vnics -= le16_to_cpu(req.min_vnics) * n;
 
+   memcpy(>vf_resc_cfg_input, , sizeof(req));
rc = pf->active_vfs;
}
return rc;
@@ -585,6 +600,7 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int num_vfs)
 
mutex_lock(>hwrm_cmd_lock);
for (i = 0; i < num_vfs; i++) {
+   struct bnxt_vf_info *vf = >vf[i];
int vf_tx_rsvd = vf_tx_rings;
 
req.fid = cpu_to_le16(pf->first_vf_id + i);
@@ -593,12 +609,15 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int 
num_vfs)
if (rc)
break;
pf->active_vfs = i + 1;
-   pf->vf[i].fw_fid = le16_to_cpu(req.fid);
-   rc = __bnxt_hwrm_get_tx_rings(bp, pf->vf[i].fw_fid,
- _tx_rsvd);
+   vf->fw_fid = le16_to_cpu(req.fid);
+   rc = __bnxt_hwrm_get_tx_rings(bp, vf->fw_fid, _tx_rsvd);
if (rc)
break;
total_vf_tx_rings += vf_tx_rsvd;
+   vf->min_tx_rings = vf_tx_rsvd;
+   vf->max_tx_rings = vf_tx_rsvd;
+   vf->min_rx_rings = vf_rx_rings;
+   vf->max_rx_rings = vf_rx_rings;
}
mutex_unlock(>hwrm_cmd_lock);
if (rc)
-- 
1.8.3.1



[PATCH net-next 3/3] bnxt_en: Implement .ndo_set_vf_queues().

2018-05-29 Thread Michael Chan
Implement .ndo_set_vf_queues() on the PF driver to configure the queues
parameters for individual VFs.  This allows the admin. on the host to
increase or decrease queues for individual VFs.

Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c   |   1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 130 
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h |   2 +
 3 files changed, 133 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index dfa0839..2ce9779 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8373,6 +8373,7 @@ static int bnxt_swdev_port_attr_get(struct net_device 
*dev,
.ndo_set_vf_link_state  = bnxt_set_vf_link_state,
.ndo_set_vf_spoofchk= bnxt_set_vf_spoofchk,
.ndo_set_vf_trust   = bnxt_set_vf_trust,
+   .ndo_set_vf_queues  = bnxt_set_vf_queues,
 #endif
 #ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller= bnxt_poll_controller,
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index 7a92125..a34a32f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -138,6 +138,136 @@ int bnxt_set_vf_trust(struct net_device *dev, int vf_id, 
bool trusted)
return 0;
 }
 
+static bool bnxt_param_ok(int new, u16 curr, u16 avail)
+{
+   int delta;
+
+   if (new <= curr)
+   return true;
+
+   delta = new - curr;
+   if (delta <= avail)
+   return true;
+   return false;
+}
+
+static void bnxt_adjust_ring_resc(struct bnxt *bp, struct bnxt_vf_info *vf,
+ struct hwrm_func_vf_resource_cfg_input *req)
+{
+   struct bnxt_hw_resc *hw_resc = >hw_resc;
+   u16 avail_cp_rings, avail_stat_ctx;
+   u16 avail_vnics, avail_ring_grps;
+   u16 cp, grp, stat, vnic;
+   u16 min_tx, min_rx;
+
+   min_tx = le16_to_cpu(req->min_tx_rings);
+   min_rx = le16_to_cpu(req->min_rx_rings);
+   avail_cp_rings = hw_resc->max_cp_rings - bp->cp_nr_rings;
+   avail_stat_ctx = hw_resc->max_stat_ctxs - bp->num_stat_ctxs;
+   avail_ring_grps = hw_resc->max_hw_ring_grps - bp->rx_nr_rings;
+   avail_vnics = hw_resc->max_vnics - bp->nr_vnics;
+
+   cp = max_t(u16, 2 * min_tx, min_rx);
+   if (cp > vf->min_cp_rings)
+   cp = min_t(u16, cp, avail_cp_rings + vf->min_cp_rings);
+   grp = min_tx;
+   if (grp > vf->min_ring_grps)
+   grp = min_t(u16, grp, avail_ring_grps + vf->min_ring_grps);
+   stat = min_rx;
+   if (stat > vf->min_stat_ctxs)
+   stat = min_t(u16, stat, avail_stat_ctx + vf->min_stat_ctxs);
+   vnic = min_rx;
+   if (vnic > vf->min_vnics)
+   vnic = min_t(u16, vnic, avail_vnics + vf->min_vnics);
+
+   req->min_cmpl_rings = req->max_cmpl_rings = cpu_to_le16(cp);
+   req->min_hw_ring_grps = req->max_hw_ring_grps = cpu_to_le16(grp);
+   req->min_stat_ctx = req->max_stat_ctx = cpu_to_le16(stat);
+   req->min_vnics = req->max_vnics = cpu_to_le16(vnic);
+}
+
+static void bnxt_record_ring_resc(struct bnxt *bp, struct bnxt_vf_info *vf,
+ struct hwrm_func_vf_resource_cfg_input *req)
+{
+   struct bnxt_hw_resc *hw_resc = &bp->hw_resc;
+
+   hw_resc->max_tx_rings += vf->min_tx_rings;
+   hw_resc->max_rx_rings += vf->min_rx_rings;
+   vf->min_tx_rings = le16_to_cpu(req->min_tx_rings);
+   vf->max_tx_rings = le16_to_cpu(req->max_tx_rings);
+   vf->min_rx_rings = le16_to_cpu(req->min_rx_rings);
+   vf->max_rx_rings = le16_to_cpu(req->max_rx_rings);
+   hw_resc->max_tx_rings -= vf->min_tx_rings;
+   hw_resc->max_rx_rings -= vf->min_rx_rings;
+   if (bp->pf.vf_resv_strategy == BNXT_VF_RESV_STRATEGY_MAXIMAL) {
+   hw_resc->max_cp_rings += vf->min_cp_rings;
+   hw_resc->max_hw_ring_grps += vf->min_ring_grps;
+   hw_resc->max_stat_ctxs += vf->min_stat_ctxs;
+   hw_resc->max_vnics += vf->min_vnics;
+   vf->min_cp_rings = le16_to_cpu(req->min_cmpl_rings);
+   vf->min_ring_grps = le16_to_cpu(req->min_hw_ring_grps);
+   vf->min_stat_ctxs = le16_to_cpu(req->min_stat_ctx);
+   vf->min_vnics = le16_to_cpu(req->min_vnics);
+   hw_resc->max_cp_rings -= vf->min_cp_rings;
+   hw_resc->max_hw_ring_grps -= vf->min_ring_grps;
+   hw_resc->max_stat_ctxs -= vf->min_stat_ctxs;
+   hw_resc->max_vnics -= vf->min_vnics;
+   }
+}
+
+int bnxt_set_vf_queues(struct net_devic

[PATCH net-next 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.

2018-05-29 Thread Michael Chan
VF Queue resources are always limited and there is currently no
infrastructure to allow the admin. on the host to add or reduce queue
resources for any particular VF.  With the ever-increasing number of VFs
being supported, it is desirable to allow the admin. to configure queue
resources differently for the VFs.  Some VFs may require more or fewer
queues due to different bandwidth requirements or different number of
vCPUs in the VM.  This patch adds the infrastructure to do that by
adding IFLA_VF_QUEUES netlink attribute and a new .ndo_set_vf_queues()
to the net_device_ops.

Four parameters are exposed for each VF:

o min_tx_queues - Guaranteed tx queues available to the VF.

o max_tx_queues - Maximum but not necessarily guaranteed tx queues
  available to the VF.

o min_rx_queues - Guaranteed rx queues available to the VF.

o max_rx_queues - Maximum but not necessarily guaranteed rx queues
  available to the VF.

The "ip link set" command will subsequently be patched to support the new
operation to set the above parameters.

After the admin. makes a change to the above parameters, the corresponding
VF will have a new range of channels to set using ethtool -L.  The VF may
have to go through IF down/up before the new queues will take effect.  Up
to the min values are guaranteed.  Up to the max values are possible but not
guaranteed.
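
To make the driver-side contract concrete, here is a minimal sketch of an
.ndo_set_vf_queues handler.  The "foo_*" names and private structures are
hypothetical and not part of this series; only the ndo signature comes from
the patch.  The idea is that min acts as a hard reservation against the
remaining hardware pool, while max is only recorded as a best-effort ceiling.

#include <linux/kernel.h>
#include <linux/netdevice.h>

/* Hypothetical driver-private state, for illustration only. */
struct foo_vf {
	u16 min_tx, max_tx, min_rx, max_rx;
};

struct foo_priv {
	u16 avail_tx;		/* unreserved tx rings left in HW */
	u16 avail_rx;		/* unreserved rx rings left in HW */
	struct foo_vf vf[64];
};

static int foo_set_vf_queues(struct net_device *dev, int vf,
			     int min_txq, int max_txq,
			     int min_rxq, int max_rxq)
{
	struct foo_priv *priv = netdev_priv(dev);
	struct foo_vf *fvf;

	if (vf < 0 || vf >= (int)ARRAY_SIZE(priv->vf))
		return -EINVAL;
	fvf = &priv->vf[vf];

	/* min is a guarantee, max is best effort. */
	if (min_txq > max_txq || min_rxq > max_rxq)
		return -EINVAL;

	/* Only growth of the guaranteed part must fit in the free pool. */
	if (min_txq - fvf->min_tx > priv->avail_tx ||
	    min_rxq - fvf->min_rx > priv->avail_rx)
		return -ENOBUFS;

	priv->avail_tx -= min_txq - fvf->min_tx;
	priv->avail_rx -= min_rxq - fvf->min_rx;
	fvf->min_tx = min_txq;
	fvf->max_tx = max_txq;
	fvf->min_rx = min_rxq;
	fvf->max_rx = max_rxq;
	return 0;
}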

Signed-off-by: Michael Chan 
---
 include/linux/if_link.h  |  4 
 include/linux/netdevice.h|  6 ++
 include/uapi/linux/if_link.h |  9 +
 net/core/rtnetlink.c | 32 +---
 4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 622658d..8e81121 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -29,5 +29,9 @@ struct ifla_vf_info {
__u32 rss_query_en;
__u32 trusted;
__be16 vlan_proto;
+   __u32 min_tx_queues;
+   __u32 max_tx_queues;
+   __u32 min_rx_queues;
+   __u32 max_rx_queues;
 };
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8452f72..17f5892 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1023,6 +1023,8 @@ struct dev_ifalias {
  *  with PF and querying it may introduce a theoretical security risk.
  * int (*ndo_set_vf_rss_query_en)(struct net_device *dev, int vf, bool 
setting);
  * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
+ * int (*ndo_set_vf_queues)(struct net_device *dev, int vf, int min_txq,
+ * int max_txq, int min_rxq, int max_rxq);
  * int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type,
  *void *type_data);
  * Called to setup any 'tc' scheduler, classifier or action on @dev.
@@ -1276,6 +1278,10 @@ struct net_device_ops {
int (*ndo_set_vf_rss_query_en)(
   struct net_device *dev,
   int vf, bool setting);
+   int (*ndo_set_vf_queues)(struct net_device *dev,
+int vf,
+int min_txq, int max_txq,
+int min_rxq, int max_rxq);
int (*ndo_setup_tc)(struct net_device *dev,
enum tc_setup_type type,
void *type_data);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index cf01b68..81bbc4e 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -659,6 +659,7 @@ enum {
IFLA_VF_IB_NODE_GUID,   /* VF Infiniband node GUID */
IFLA_VF_IB_PORT_GUID,   /* VF Infiniband port GUID */
IFLA_VF_VLAN_LIST,  /* nested list of vlans, option for QinQ */
+   IFLA_VF_QUEUES, /* Min and Max TX/RX queues */
__IFLA_VF_MAX,
 };
 
@@ -749,6 +750,14 @@ struct ifla_vf_trust {
__u32 setting;
 };
 
+struct ifla_vf_queues {
+   __u32 vf;
+   __u32 min_tx_queues;/* min guaranteed tx queues */
+   __u32 max_tx_queues;/* max non guaranteed tx queues */
+   __u32 min_rx_queues;/* min guaranteed rx queues */
+   __u32 max_rx_queues;/* max non guaranteed rx queues */
+};
+
 /* VF ports management section
  *
  * Nested layout of set/get msg is:
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 8080254..e21ab8a 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -921,7 +921,8 @@ static inline int rtnl_vfinfo_size(const struct net_device 
*dev,
 nla_total_size_64bit(sizeof(__u64)) +
 /* IFLA_VF_STATS_TX_DROPPED */
 nla_total_size_64bit(sizeof(__u64)) +
-nla_total_s

Re: [PATCH net-next RFC 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.

2018-05-09 Thread Michael Chan
On Wed, May 9, 2018 at 6:10 PM, Jakub Kicinski
<jakub.kicin...@netronome.com> wrote:
> On Wed, 9 May 2018 17:22:50 -0700, Michael Chan wrote:
>> On Wed, May 9, 2018 at 4:15 PM, Jakub Kicinski wrote:
>> > On Wed,  9 May 2018 07:21:41 -0400, Michael Chan wrote:
>> >> VF Queue resources are always limited and there is currently no
>> >> infrastructure to allow the admin. on the host to add or reduce queue
>> >> resources for any particular VF.  With ever increasing number of VFs
>> >> being supported, it is desirable to allow the admin. to configure queue
>> >> resources differently for the VFs.  Some VFs may require more or fewer
>> >> queues due to different bandwidth requirements or different number of
>> >> vCPUs in the VM.  This patch adds the infrastructure to do that by
>> >> adding IFLA_VF_QUEUES netlink attribute and a new .ndo_set_vf_queues()
>> >> to the net_device_ops.
>> >>
>> >> Four parameters are exposed for each VF:
>> >>
>> >> o min_tx_queues - Guaranteed or current tx queues assigned to the VF.
>> >
>> > This muxing of semantics may be a little awkward and unnecessary, would
>> > it make sense for struct ifla_vf_info to have a separate fields for
>> > current number of queues and the admin-set guaranteed min?
>>
>> The loose semantics is mainly to allow some flexibility in
>> implementation.  Sure, we can tighten the definitions or add
>> additional fields.
>
> I would appreciate that, if others don't disagree.  I personally don't
> see the need for flexibility (AKA per-vendor behaviour) here, quite the
> opposite, min/max/current number of queues seems quite self-explanatory.
>
> Or at least don't allow min to mean current?  Otherwise the API gets a
> bit asymmetrical :(

Sure, will do.

>
>> > Is there a real world use case for the min value or are you trying to
>> > make the API feature complete?
>>
>> In this proposal, these parameters are mainly viewed as the bounds for
>> the queues that each VF can potentially allocate.  The actual number
>> of queues chosen by the VF driver or modified by the VF user can be
>> any number within the bounds.
>
> Perhaps you have misspoken here - these are not allowed bounds, right?
> min is the guarantee that queues will be available, not requirement.
> Similar to bandwidth allocation.
>
> IOW if the bounds are set [4, 16] the VF may still choose to use 1
> queue, event thought that's not within bounds.

Yes, you are absolutely right.  The VF can allocate 1 queue.  Up to
min is guaranteed.  Up to max is not guaranteed.


Re: [PATCH net-next RFC 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.

2018-05-09 Thread Michael Chan
On Wed, May 9, 2018 at 4:15 PM, Jakub Kicinski
<jakub.kicin...@netronome.com> wrote:
> On Wed,  9 May 2018 07:21:41 -0400, Michael Chan wrote:
>> VF Queue resources are always limited and there is currently no
>> infrastructure to allow the admin. on the host to add or reduce queue
>> resources for any particular VF.  With ever increasing number of VFs
>> being supported, it is desirable to allow the admin. to configure queue
>> resources differently for the VFs.  Some VFs may require more or fewer
>> queues due to different bandwidth requirements or different number of
>> vCPUs in the VM.  This patch adds the infrastructure to do that by
>> adding IFLA_VF_QUEUES netlink attribute and a new .ndo_set_vf_queues()
>> to the net_device_ops.
>>
>> Four parameters are exposed for each VF:
>>
>> o min_tx_queues - Guaranteed or current tx queues assigned to the VF.
>
> This muxing of semantics may be a little awkward and unnecessary, would
> it make sense for struct ifla_vf_info to have a separate fields for
> current number of queues and the admin-set guaranteed min?

The loose semantics is mainly to allow some flexibility in
implementation.  Sure, we can tighten the definitions or add
additional fields.

>
> Is there a real world use case for the min value or are you trying to
> make the API feature complete?

In this proposal, these parameters are mainly viewed as the bounds for
the queues that each VF can potentially allocate.  The actual number
of queues chosen by the VF driver or modified by the VF user can be
any number within the bounds.

We currently need to have min and max parameters to support the
different modes we use to distribute the queue resources to the VFs.
In one mode, for example, resources are statically divided and each VF
has a small number of guaranteed queues (min = max).  In a different
mode, we allow more flexible resource allocation with each VF having a
small number of guaranteed queues but a higher number of
non-guaranteed queues (min < max).  Some VFs may be able to allocate
queues much higher than min when resources are still available, while
others may only be able to allocate min queues when resources are used
up.

With min and max exposed, the PF user can properly tweak the resources
for each VF described above.
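
As a purely illustrative example of those two modes (the numbers below are
made up and not tied to any firmware reservation strategy):

#include <stdio.h>

/* Made-up numbers: split a pool of 256 rings across 64 VFs under the two
 * reservation modes described above.
 */
int main(void)
{
	unsigned int pool = 256, nvfs = 64;

	/* Static mode: every VF gets the same hard reservation, min == max. */
	unsigned int static_min = pool / nvfs;

	/* Flexible mode: a small guarantee per VF, the remainder shared
	 * first-come-first-served up to a higher max.
	 */
	unsigned int flex_min = 2, flex_max = 16;
	unsigned int shared = pool - flex_min * nvfs;

	printf("static:   min=max=%u rings per VF\n", static_min);
	printf("flexible: min=%u max=%u, %u rings in the shared pool\n",
	       flex_min, flex_max, shared);
	return 0;
}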

>
>> o max_tx_queues - Maximum but not necessarily guaranteed tx queues
>>   available to the VF.
>>
>> o min_rx_queues - Guaranteed or current rx queues assigned to the VF.
>>
>> o max_rx_queues - Maximum but not necessarily guaranteed rx queues
>>   available to the VF.
>>
>> The "ip link set" command will subsequently be patched to support the new
>> operation to set the above parameters.
>>
>> After the admin. makes a change to the above parameters, the corresponding
>> VF will have a new range of channels to set using ethtool -L.
>>
>> Signed-off-by: Michael Chan <michael.c...@broadcom.com>
>
> In switchdev mode we can use number of queues on the representor as a
> proxy for max number of queues allowed for the ASIC port.  This works
> better when representors are muxed in the first place than when they
> have actual queues backing them.  WDYT about such scheme, Or?  A very
> pleasant side-effect is that one can configure qdiscs and get stats
> per-HW queue.

This is an interesting approach.  But it doesn't have the min and max
for each VF, and also only works in switchdev mode.


[PATCH net-next RFC 2/3] bnxt_en: Store min/max tx/rx rings for individual VFs.

2018-05-09 Thread Michael Chan
With new infrastructure to configure queues differently for each VF,
we need to store the current min/max rx/tx rings for each VF.

Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h   |  5 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 23 +++
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 9b14eb6..2f5a23c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -837,6 +837,10 @@ struct bnxt_vf_info {
u32 func_flags; /* func cfg flags */
u32 min_tx_rate;
u32 max_tx_rate;
+   u16 min_tx_rings;
+   u16 max_tx_rings;
+   u16 min_rx_rings;
+   u16 max_rx_rings;
void*hwrm_cmd_req_addr;
dma_addr_t  hwrm_cmd_req_dma_addr;
 };
@@ -1351,6 +1355,7 @@ struct bnxt {
 #ifdef CONFIG_BNXT_SRIOV
int nr_vfs;
struct bnxt_vf_info vf;
+   struct hwrm_func_vf_resource_cfg_input vf_resc_cfg_input;
wait_queue_head_t   sriov_cfg_wait;
boolsriov_cfg;
#define BNXT_SRIOV_CFG_WAIT_TMO msecs_to_jiffies(1)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index a649108..489e534 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -171,6 +171,10 @@ int bnxt_get_vf_config(struct net_device *dev, int vf_id,
ivi->linkstate = IFLA_VF_LINK_STATE_ENABLE;
else
ivi->linkstate = IFLA_VF_LINK_STATE_DISABLE;
+   ivi->min_tx_queues = vf->min_tx_rings;
+   ivi->max_tx_queues = vf->max_tx_rings;
+   ivi->min_rx_queues = vf->min_rx_rings;
+   ivi->max_rx_queues = vf->max_rx_rings;
 
return 0;
 }
@@ -498,6 +502,8 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int 
num_vfs)
 
mutex_lock(&bp->hwrm_cmd_lock);
for (i = 0; i < num_vfs; i++) {
+   struct bnxt_vf_info *vf = &pf->vf[i];
+
req.vf_id = cpu_to_le16(pf->first_vf_id + i);
rc = _hwrm_send_message(bp, &req, sizeof(req),
HWRM_CMD_TIMEOUT);
@@ -506,7 +512,11 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int 
num_vfs)
break;
}
pf->active_vfs = i + 1;
-   pf->vf[i].fw_fid = pf->first_vf_id + i;
+   vf->fw_fid = pf->first_vf_id + i;
+   vf->min_tx_rings = le16_to_cpu(req.min_tx_rings);
+   vf->max_tx_rings = vf_tx_rings;
+   vf->min_rx_rings = le16_to_cpu(req.min_rx_rings);
+   vf->max_rx_rings = vf_rx_rings;
}
mutex_unlock(&bp->hwrm_cmd_lock);
if (pf->active_vfs) {
@@ -521,6 +531,7 @@ static int bnxt_hwrm_func_vf_resc_cfg(struct bnxt *bp, int 
num_vfs)
hw_resc->max_stat_ctxs -= le16_to_cpu(req.min_stat_ctx) * n;
hw_resc->max_vnics -= le16_to_cpu(req.min_vnics) * n;
 
+   memcpy(&bp->vf_resc_cfg_input, &req, sizeof(req));
rc = pf->active_vfs;
}
return rc;
@@ -585,6 +596,7 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int num_vfs)
 
mutex_lock(&bp->hwrm_cmd_lock);
for (i = 0; i < num_vfs; i++) {
+   struct bnxt_vf_info *vf = &pf->vf[i];
int vf_tx_rsvd = vf_tx_rings;
 
req.fid = cpu_to_le16(pf->first_vf_id + i);
@@ -593,12 +605,15 @@ static int bnxt_hwrm_func_cfg(struct bnxt *bp, int 
num_vfs)
if (rc)
break;
pf->active_vfs = i + 1;
-   pf->vf[i].fw_fid = le16_to_cpu(req.fid);
-   rc = __bnxt_hwrm_get_tx_rings(bp, pf->vf[i].fw_fid,
- &vf_tx_rsvd);
+   vf->fw_fid = le16_to_cpu(req.fid);
+   rc = __bnxt_hwrm_get_tx_rings(bp, vf->fw_fid, &vf_tx_rsvd);
if (rc)
break;
total_vf_tx_rings += vf_tx_rsvd;
+   vf->min_tx_rings = vf_tx_rsvd;
+   vf->max_tx_rings = vf_tx_rsvd;
+   vf->min_rx_rings = vf_rx_rings;
+   vf->max_rx_rings = vf_rx_rings;
}
mutex_unlock(&bp->hwrm_cmd_lock);
if (rc)
-- 
1.8.3.1



[PATCH net-next RFC 3/3] bnxt_en: Implement .ndo_set_vf_queues().

2018-05-09 Thread Michael Chan
Implement .ndo_set_vf_queues() on the PF driver to configure the queue
parameters for individual VFs.  This allows the admin. on the host to
increase or decrease queues for individual VFs.

Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c   |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 67 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h |  2 +
 3 files changed, 70 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index dfa0839..2ce9779 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8373,6 +8373,7 @@ static int bnxt_swdev_port_attr_get(struct net_device 
*dev,
.ndo_set_vf_link_state  = bnxt_set_vf_link_state,
.ndo_set_vf_spoofchk= bnxt_set_vf_spoofchk,
.ndo_set_vf_trust   = bnxt_set_vf_trust,
+   .ndo_set_vf_queues  = bnxt_set_vf_queues,
 #endif
 #ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller= bnxt_poll_controller,
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index 489e534..f0d938c 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -138,6 +138,73 @@ int bnxt_set_vf_trust(struct net_device *dev, int vf_id, 
bool trusted)
return 0;
 }
 
+static bool bnxt_param_ok(int new, u16 curr, u16 avail)
+{
+   int delta;
+
+   if (new <= curr)
+   return true;
+
+   delta = new - curr;
+   if (delta <= avail)
+   return true;
+   return false;
+}
+
+int bnxt_set_vf_queues(struct net_device *dev, int vf_id, int min_txq,
+  int max_txq, int min_rxq, int max_rxq)
+{
+   struct hwrm_func_vf_resource_cfg_input req = {0};
+   struct bnxt *bp = netdev_priv(dev);
+   u16 avail_tx_rings, avail_rx_rings;
+   struct bnxt_hw_resc *hw_resc;
+   struct bnxt_vf_info *vf;
+   int rc;
+
+   if (bnxt_vf_ndo_prep(bp, vf_id))
+   return -EINVAL;
+
+   if (!(bp->flags & BNXT_FLAG_NEW_RM))
+   return -EOPNOTSUPP;
+
+   vf = &bp->pf.vf[vf_id];
+   hw_resc = &bp->hw_resc;
+
+   avail_tx_rings = hw_resc->max_tx_rings - bp->tx_nr_rings;
+   if (bp->flags & BNXT_FLAG_AGG_RINGS)
+   avail_rx_rings = hw_resc->max_rx_rings - bp->rx_nr_rings * 2;
+   else
+   avail_rx_rings = hw_resc->max_rx_rings - bp->rx_nr_rings;
+   if (!bnxt_param_ok(min_txq, vf->min_tx_rings, avail_tx_rings))
+   return -ENOBUFS;
+   if (!bnxt_param_ok(min_rxq, vf->min_rx_rings, avail_rx_rings))
+   return -ENOBUFS;
+   if (!bnxt_param_ok(max_txq, vf->max_tx_rings, avail_tx_rings))
+   return -ENOBUFS;
+   if (!bnxt_param_ok(max_rxq, vf->max_rx_rings, avail_rx_rings))
+   return -ENOBUFS;
+
+   bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_FUNC_VF_RESOURCE_CFG, -1, -1);
+   memcpy(&req, &bp->vf_resc_cfg_input, sizeof(req));
+   req.min_tx_rings = cpu_to_le16(min_txq);
+   req.min_rx_rings = cpu_to_le16(min_rxq);
+   req.max_tx_rings = cpu_to_le16(max_txq);
+   req.max_rx_rings = cpu_to_le16(max_rxq);
+   rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   if (rc)
+   return -EIO;
+
+   hw_resc->max_tx_rings += vf->min_tx_rings;
+   hw_resc->max_rx_rings += vf->min_rx_rings;
+   vf->min_tx_rings = min_txq;
+   vf->max_tx_rings = max_txq;
+   vf->min_rx_rings = min_rxq;
+   vf->max_rx_rings = max_rxq;
+   hw_resc->max_tx_rings -= vf->min_tx_rings;
+   hw_resc->max_rx_rings -= vf->min_rx_rings;
+   return 0;
+}
+
 int bnxt_get_vf_config(struct net_device *dev, int vf_id,
   struct ifla_vf_info *ivi)
 {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
index e9b20cd..325b412 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h
@@ -35,6 +35,8 @@
 int bnxt_set_vf_link_state(struct net_device *, int, int);
 int bnxt_set_vf_spoofchk(struct net_device *, int, bool);
 int bnxt_set_vf_trust(struct net_device *dev, int vf_id, bool trust);
+int bnxt_set_vf_queues(struct net_device *dev, int vf_id, int min_txq,
+  int max_txq, int min_rxq, int max_rxq);
 int bnxt_sriov_configure(struct pci_dev *pdev, int num_vfs);
 void bnxt_sriov_disable(struct bnxt *);
 void bnxt_hwrm_exec_fwd_req(struct bnxt *);
-- 
1.8.3.1



[PATCH net-next RFC 0/3] net: Add support to configure SR-IOV VF queues.

2018-05-09 Thread Michael Chan
VF Queue resources are always limited and there is currently no
infrastructure to allow the admin. on the host to add or reduce queue
resources for any particular VF.  This RFC series adds the infrastructure
to do that and adds the functionality to the bnxt_en driver.

The "ip link set" command will subsequently be patched to support the new
operation.

Michael Chan (3):
  net: Add support to configure SR-IOV VF minimum and maximum queues.
  bnxt_en: Store min/max tx/rx rings for individual VFs.
  bnxt_en: Implement .ndo_set_vf_queues().

 drivers/net/ethernet/broadcom/bnxt/bnxt.c   |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt.h   |  5 ++
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 90 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.h |  2 +
 include/linux/if_link.h |  4 ++
 include/linux/netdevice.h   |  6 ++
 include/uapi/linux/if_link.h|  9 +++
 net/core/rtnetlink.c| 28 +++-
 8 files changed, 138 insertions(+), 7 deletions(-)

-- 
1.8.3.1



[PATCH net-next RFC 1/3] net: Add support to configure SR-IOV VF minimum and maximum queues.

2018-05-09 Thread Michael Chan
VF Queue resources are always limited and there is currently no
infrastructure to allow the admin. on the host to add or reduce queue
resources for any particular VF.  With the ever-increasing number of VFs
being supported, it is desirable to allow the admin. to configure queue
resources differently for the VFs.  Some VFs may require more or fewer
queues due to different bandwidth requirements or different number of
vCPUs in the VM.  This patch adds the infrastructure to do that by
adding IFLA_VF_QUEUES netlink attribute and a new .ndo_set_vf_queues()
to the net_device_ops.

Four parameters are exposed for each VF:

o min_tx_queues - Guaranteed or current tx queues assigned to the VF.

o max_tx_queues - Maximum but not necessarily guaranteed tx queues
  available to the VF.

o min_rx_queues - Guaranteed or current rx queues assigned to the VF.

o max_rx_queues - Maximum but not necessarily guaranteed rx queues
  available to the VF.

The "ip link set" command will subsequently be patched to support the new
operation to set the above parameters.

After the admin. makes a change to the above parameters, the corresponding
VF will have a new range of channels to set using ethtool -L.
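
The net/core/rtnetlink.c hunk is truncated further down in this archive, so
as a rough sketch only (modeled on how do_setvfinfo() dispatches IFLA_VF_TRUST
and the other per-VF attributes, not a copy of the actual hunk), the new
attribute would be handed to the driver roughly like this; the helper name is
hypothetical:

#include <linux/if_link.h>
#include <linux/netdevice.h>
#include <net/netlink.h>

/* Assumes the IFLA_VF_QUEUES payload is the struct ifla_vf_queues added in
 * the uapi header below.
 */
static int do_set_vf_queues(struct net_device *dev, struct nlattr *attr)
{
	const struct net_device_ops *ops = dev->netdev_ops;
	struct ifla_vf_queues *ivq = nla_data(attr);

	if (!ops->ndo_set_vf_queues)
		return -EOPNOTSUPP;
	return ops->ndo_set_vf_queues(dev, ivq->vf,
				      ivq->min_tx_queues, ivq->max_tx_queues,
				      ivq->min_rx_queues, ivq->max_rx_queues);
}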

Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 include/linux/if_link.h  |  4 
 include/linux/netdevice.h|  6 ++
 include/uapi/linux/if_link.h |  9 +
 net/core/rtnetlink.c | 28 +---
 4 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 622658d..8e81121 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -29,5 +29,9 @@ struct ifla_vf_info {
__u32 rss_query_en;
__u32 trusted;
__be16 vlan_proto;
+   __u32 min_tx_queues;
+   __u32 max_tx_queues;
+   __u32 min_rx_queues;
+   __u32 max_rx_queues;
 };
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 03ed492..30a3caf 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1023,6 +1023,8 @@ struct dev_ifalias {
  *  with PF and querying it may introduce a theoretical security risk.
  * int (*ndo_set_vf_rss_query_en)(struct net_device *dev, int vf, bool 
setting);
  * int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
+ * int (*ndo_set_vf_queues)(struct net_device *dev, int vf, int min_txq,
+ * int max_txq, int min_rxq, int max_rxq);
  * int (*ndo_setup_tc)(struct net_device *dev, enum tc_setup_type type,
  *void *type_data);
  * Called to setup any 'tc' scheduler, classifier or action on @dev.
@@ -1272,6 +1274,10 @@ struct net_device_ops {
int (*ndo_set_vf_rss_query_en)(
   struct net_device *dev,
   int vf, bool setting);
+   int (*ndo_set_vf_queues)(struct net_device *dev,
+int vf,
+int min_txq, int max_txq,
+int min_rxq, int max_rxq);
int (*ndo_setup_tc)(struct net_device *dev,
enum tc_setup_type type,
void *type_data);
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index b852664..fc56a47 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -658,6 +658,7 @@ enum {
IFLA_VF_IB_NODE_GUID,   /* VF Infiniband node GUID */
IFLA_VF_IB_PORT_GUID,   /* VF Infiniband port GUID */
IFLA_VF_VLAN_LIST,  /* nested list of vlans, option for QinQ */
+   IFLA_VF_QUEUES, /* Min and Max TX/RX queues */
__IFLA_VF_MAX,
 };
 
@@ -748,6 +749,14 @@ struct ifla_vf_trust {
__u32 setting;
 };
 
+struct ifla_vf_queues {
+   __u32 vf;
+   __u32 min_tx_queues;
+   __u32 max_tx_queues;
+   __u32 min_rx_queues;
+   __u32 max_rx_queues;
+};
+
 /* VF ports management section
  *
  * Nested layout of set/get msg is:
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 8080254..7cf3582 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -921,7 +921,8 @@ static inline int rtnl_vfinfo_size(const struct net_device 
*dev,
 nla_total_size_64bit(sizeof(__u64)) +
 /* IFLA_VF_STATS_TX_DROPPED */
 nla_total_size_64bit(sizeof(__u64)) +
-nla_total_size(sizeof(struct ifla_vf_trust)));
+nla_total_size(sizeof(struct ifla_vf_trust)) +
+nla_total_size(sizeof(struct ifla_vf_queues)));
return size;
} else
return 0;
@@ -1181,6 +

[PATCH net-next 0/4] bnxt_en: Fixes for net-next.

2018-05-08 Thread Michael Chan
This series includes a bug fix for a regression in firmware message polling
introduced recently on net-next.  There are 3 additional minor fixes for
unsupported link speed checking, VF MAC address handling, and setting
PHY eeprom length.

Michael Chan (3):
  bnxt_en: Fix firmware message delay loop regression.
  bnxt_en: Check unsupported speeds in bnxt_update_link() on PF only.
  bnxt_en: Always forward VF MAC address to the PF.

Vasundhara Volam (1):
  bnxt_en: Read phy eeprom A2h address only when optical diagnostics is
supported.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 17 -
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 10 --
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 20 
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c   |  3 ++-
 4 files changed, 30 insertions(+), 20 deletions(-)

-- 
1.8.3.1



[PATCH net-next 1/4] bnxt_en: Fix firmware message delay loop regression.

2018-05-08 Thread Michael Chan
A recent change to reduce the delay granularity when waiting for the firmware
response has caused a regression.  With a tighter delay loop,
the driver may see the beginning part of the response faster.
The original 5 usec delay to wait for the rest of the message
is not long enough and some messages are detected as invalid.

Increase the maximum wait time from 5 usec to 20 usec.  Also, fix
the debug message that shows the total delay time for the response
when the message times out.  With the new logic, the delay time
is not fixed per iteration of the loop, so we define a macro to
show the total delay time.
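
For reference, here is a stand-alone sketch of the arithmetic that the new
HWRM_TOTAL_TIMEOUT() macro reports.  The short-poll constants below are
placeholders, since HWRM_SHORT_TIMEOUT_COUNTER and HWRM_SHORT_MIN_TIMEOUT are
not visible in this hunk; only HWRM_MIN_TIMEOUT (25) is.

#include <stdio.h>

/* Placeholder values -- the real HWRM_SHORT_* constants live elsewhere in
 * bnxt.h and are not shown in this patch.
 */
#define SHORT_TIMEOUT_COUNTER	100
#define SHORT_MIN_TIMEOUT	3
#define MIN_TIMEOUT		25	/* same value as HWRM_MIN_TIMEOUT in the hunk */

/* Same shape as HWRM_TOTAL_TIMEOUT(n): up to SHORT_TIMEOUT_COUNTER short
 * polls first, then the remaining iterations at the longer delay.
 */
static unsigned int total_timeout(unsigned int n)
{
	if (n <= SHORT_TIMEOUT_COUNTER)
		return n * SHORT_MIN_TIMEOUT;
	return SHORT_TIMEOUT_COUNTER * SHORT_MIN_TIMEOUT +
	       (n - SHORT_TIMEOUT_COUNTER) * MIN_TIMEOUT;
}

int main(void)
{
	/* 150 loop iterations: 100 short polls plus 50 longer ones. */
	printf("reported delay after 150 iterations: %u\n", total_timeout(150));
	return 0;
}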

Fixes: 9751e8e71487 ("bnxt_en: reduce timeout on initial HWRM calls")
Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 12 
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  7 +++
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index efe5c72..168342a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -3530,6 +3530,8 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void 
*msg, u32 msg_len,
  HWRM_RESP_LEN_SFT;
valid = bp->hwrm_cmd_resp_addr + len - 1;
} else {
+   int j;
+
/* Check if response len is updated */
for (i = 0; i < tmo_count; i++) {
len = (le32_to_cpu(*resp_len) & HWRM_RESP_LEN_MASK) >>
@@ -3547,14 +3549,15 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void 
*msg, u32 msg_len,
 
if (i >= tmo_count) {
netdev_err(bp->dev, "Error (timeout: %d) msg {0x%x 
0x%x} len:%d\n",
-  timeout, le16_to_cpu(req->req_type),
+  HWRM_TOTAL_TIMEOUT(i),
+  le16_to_cpu(req->req_type),
   le16_to_cpu(req->seq_id), len);
return -1;
}
 
/* Last byte of resp contains valid bit */
valid = bp->hwrm_cmd_resp_addr + len - 1;
-   for (i = 0; i < 5; i++) {
+   for (j = 0; j < HWRM_VALID_BIT_DELAY_USEC; j++) {
/* make sure we read from updated DMA memory */
dma_rmb();
if (*valid)
@@ -3562,9 +3565,10 @@ static int bnxt_hwrm_do_send_msg(struct bnxt *bp, void 
*msg, u32 msg_len,
udelay(1);
}
 
-   if (i >= 5) {
+   if (j >= HWRM_VALID_BIT_DELAY_USEC) {
netdev_err(bp->dev, "Error (timeout: %d) msg {0x%x 
0x%x} len:%d v:%d\n",
-  timeout, le16_to_cpu(req->req_type),
+  HWRM_TOTAL_TIMEOUT(i),
+  le16_to_cpu(req->req_type),
   le16_to_cpu(req->seq_id), len, *valid);
return -1;
}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 8df1d8b..a9c210e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -539,6 +539,13 @@ struct rx_tpa_end_cmp_ext {
 #define HWRM_MIN_TIMEOUT   25
 #define HWRM_MAX_TIMEOUT   40
 
+#define HWRM_TOTAL_TIMEOUT(n)  (((n) <= HWRM_SHORT_TIMEOUT_COUNTER) ?  \
+   ((n) * HWRM_SHORT_MIN_TIMEOUT) :\
+   (HWRM_SHORT_TIMEOUT_COUNTER * HWRM_SHORT_MIN_TIMEOUT +  \
+((n) - HWRM_SHORT_TIMEOUT_COUNTER) * HWRM_MIN_TIMEOUT))
+
+#define HWRM_VALID_BIT_DELAY_USEC  20
+
 #define BNXT_RX_EVENT  1
 #define BNXT_AGG_EVENT 2
 #define BNXT_TX_EVENT  4
-- 
1.8.3.1



[PATCH net-next 4/4] bnxt_en: Always forward VF MAC address to the PF.

2018-05-08 Thread Michael Chan
The current code already forwards the VF MAC address to the PF, except
in one case.  If the VF driver gets a valid MAC address from the firmware
during probe time, it will not forward the MAC address to the PF,
incorrectly assuming that the PF already knows the MAC address.  This
causes "ip link show" to show zero VF MAC addresses for this case.

This assumption is not correct.  Newer firmware remembers the VF MAC
address last used by the VF and provides it to the VF driver during
probe.  So we need to always forward the VF MAC address to the PF.

The forwarded MAC address may now be the PF assigned MAC address and so we
need to make sure we approve it for this case.

Signed-off-by: Michael Chan <michael.c...@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c   | 2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index cd3ab78..dfa0839 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8678,8 +8678,8 @@ static int bnxt_init_mac_addr(struct bnxt *bp)
memcpy(bp->dev->dev_addr, vf->mac_addr, ETH_ALEN);
} else {
eth_hw_addr_random(bp->dev);
-   rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
}
+   rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
 #endif
}
return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index cc21d87..a649108 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -923,7 +923,8 @@ static int bnxt_vf_configure_mac(struct bnxt *bp, struct 
bnxt_vf_info *vf)
if (req->enables & cpu_to_le32(FUNC_VF_CFG_REQ_ENABLES_DFLT_MAC_ADDR)) {
if (is_valid_ether_addr(req->dflt_mac_addr) &&
((vf->flags & BNXT_VF_TRUST) ||
-(!is_valid_ether_addr(vf->mac_addr)))) {
+!is_valid_ether_addr(vf->mac_addr) ||
+ether_addr_equal(req->dflt_mac_addr, vf->mac_addr))) {
ether_addr_copy(vf->vf_mac_addr, req->dflt_mac_addr);
return bnxt_hwrm_exec_fwd_resp(bp, vf, msg_size);
}
-- 
1.8.3.1


