I built a Xenial test kernel with all the patches from the following bugs:

bug 1670518
         PCI: hv: Allocate physically contiguous hypercall params buffer
         PCI: hv: Make unnecessarily global IRQ masking functions static
         PCI: hv: Delete the device earlier from hbus->children for hot-remove
         PCI: hv: Fix hv_pci_remove() for hot-remove

bug 1672785 
         net/mlx4_core: Avoid delays during VF driver device shutdown

bug 1667531 
         tools: hv: Enable network manager for bonding scripts on RH
         [net-next] tools: hv: Add clean up function for Ubuntu config
         bcc5a76 tools: hv: Add a script to help bonding synthetic and VF NICs  

bug 1667527
        4a9b0933bdfc PCI: hv: Use device serial number as PCI domain

bug 1667007 
        d3de209 net/mlx4_core: Use cq quota in SRIOV when creating completion 

bug 1650058 
        14c84da90b0d net/mlx4_en: Fix bad WQE issue
        c46100f413ca net/mlx4_core: Fix racy CQ (Completion Queue) free
        f4f73e2e6308 net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions
        3c05ac20fe6e net/mlx4_core: Avoid command timeouts during VF driver device shutdown

The test kernel can be downloaded from:


  [Hyper-V][Mellanox] net/mlx4_core: Avoid delays during VF driver
  device shutdown

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Zesty:
  In Progress

Bug description:
  Mellanox has submitted the following patch upstream that's important
  for SR-IOV in Azure.

  Please integrate it into the Mellanox mlx4 drivers for lts-xenial,
  HWE, Zesty, and Azure custom.


  From: Jack Morgenstein <ja...@dev.mellanox.co.il>

  Some Hypervisors detach VFs from VMs by instantly causing an FLR event
  to be generated for a VF.

  In the mlx4 case, this will cause that VF's comm channel to be disabled
  before the VM has an opportunity to invoke the VF device's "shutdown"
  method.

  For such Hypervisors, there is a race condition between the VF's
  shutdown method and its internal-error detection/reset thread.

  The internal-error detection/reset thread (which runs every 5 seconds) also
  detects a disabled comm channel. If the internal-error detection/reset
  flow wins the race, we still get delays (while that flow tries repeatedly
  to detect comm-channel recovery).

  The cited commit fixed the command timeout problem when the
  internal-error detection/reset flow loses the race.

  This commit avoids the unneeded delays when the internal-error
  detection/reset flow wins.

  Fixes: d585df1c5ccf ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown")
  Signed-off-by: Jack Morgenstein <ja...@dev.mellanox.co.il>
  Reported-by: Simon Xiao <six...@microsoft.com>
  Signed-off-by: Tariq Toukan <tar...@mellanox.com>
   drivers/net/ethernet/mellanox/mlx4/cmd.c  | 11 +++++++++++
   drivers/net/ethernet/mellanox/mlx4/main.c | 11 +++++++++++
   include/linux/mlx4/device.h               |  1 +
   3 files changed, 23 insertions(+)
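
  For readers who want a concrete picture of the delay described in the
  commit message above, here is a minimal single-threaded sketch in C. It
  is not the actual patch and does not use the mlx4 driver's real
  identifiers: every name in it (sketch_dev, SKETCH_STATE_NOWAIT,
  sketch_reset_flow, sketch_shutdown, SKETCH_MAX_POLLS) is hypothetical.
  It assumes the delay can be avoided by having the shutdown path set a
  "do not wait" flag that the reset flow checks before each poll for
  comm-channel recovery. It models only the outcome, not the concurrency.

```c
/* Hypothetical state bits, for illustration only (not from the mlx4 driver). */
enum {
    SKETCH_STATE_COMM_DISABLED = 1 << 0, /* FLR has disabled the comm channel */
    SKETCH_STATE_NOWAIT        = 1 << 1  /* shutdown started: skip recovery waits */
};

/* Assumed upper bound on recovery polls; the real driver's value may differ. */
#define SKETCH_MAX_POLLS 10

struct sketch_dev {
    unsigned int state;
};

/*
 * Stand-in for the internal-error detection/reset flow.
 * Returns how many times it polled for comm-channel recovery; in a real
 * driver each poll would cost wall-clock time, which is the delay to avoid.
 */
int sketch_reset_flow(const struct sketch_dev *dev)
{
    int polls = 0;

    if (!(dev->state & SKETCH_STATE_COMM_DISABLED))
        return 0; /* channel is healthy, nothing to recover */

    while (polls < SKETCH_MAX_POLLS) {
        if (dev->state & SKETCH_STATE_NOWAIT)
            break; /* device is going away: give up immediately */
        polls++;   /* the channel never recovers in this sketch */
    }
    return polls;
}

/* Stand-in for the VF "shutdown" method: mark the device as going away. */
void sketch_shutdown(struct sketch_dev *dev)
{
    dev->state |= SKETCH_STATE_NOWAIT;
}
```

  In this sketch, a device whose comm channel is disabled is polled
  SKETCH_MAX_POLLS times, while the same device is not polled at all once
  sketch_shutdown() has been called on it.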
