** Also affects: linux (Ubuntu Groovy)
   Importance: Undecided
     Assignee: Canonical Kernel Team (canonical-kernel-team)
       Status: In Progress

You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.

  [UBUNTU 20.04] s390x/pci: enumerate pci functions per physical adapter

Status in Ubuntu on IBM z Systems:
  Fix Committed
Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  Fix Released
Status in linux source package in Groovy:
  In Progress

Bug description:
  SRU Justification:

  [Impact]


  * Mellanox CX5 port multi-pathing is broken on s390x due to the
  non-standard topology of PCI IDs (physical and virtual).

  * The Mellanox ConnectX-5 PCI driver (mlx5) implements multi-pathing,
  which can be used to combine multiple networking ports to improve
  performance and reliability.

  * For that purpose, the mlx5 driver combines PCI functions based on
  topology information (the function number) as determined by their PCI
  ID.

  * Currently the Linux on Z PCI bus does not reflect PCI topology
  information in the PCI ID. As a result, the mlx5 multi-path function
  is broken and cannot be activated.

  [Fix]


  * Backport 1: https://launchpadlibrarian.net/479699471/0001-s390-pci-

  * Backport 2: https://launchpadlibrarian.net/479699482/0002-s390-pci-

  * Backport 3: https://launchpadlibrarian.net/479699492/0003-s390-pci-

  * Backport 4: https://launchpadlibrarian.net/479699497/0004-s390-pci-

  * Backport 5: https://launchpadlibrarian.net/479700706/0005-s390-pci-

  * Backport 6: https://launchpadlibrarian.net/479700712/0006-s390-pci-

  * Backport 7: https://launchpadlibrarian.net/479700739/0007-s390-pci-

  * Backport 8: https://launchpadlibrarian.net/479700769/0008-s390-pci-

  * Backport 9: https://launchpadlibrarian.net/479700786/0009-s390-pci-

  * Backport 10: https://launchpadlibrarian.net/479700794/0010-s390-pci-

  * Backport 11: https://launchpadlibrarian.net/479700798/0011-s390-pci-

  * Backport 12: https://launchpadlibrarian.net/479700799/0012-s390-pci-

  [Test Case]

  * Prepare an IBM z13 or LinuxONE III (or newer) system with two or
  more RoCE Express PCI 2(.1) adapters.

  * Assign the adapters (and their virtual functions) to an LPAR.

  * Verify whether the physical and virtual functions are grouped in
  arbitrary order or in consecutive order - physical first (for example
  with lspci -t ...)
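
  As a rough aid for the last step, the check could be scripted along
  the following lines. This is only a sketch under assumptions: it
  relies on the generic Linux sysfs layout (/sys/bus/pci/devices) and on
  the "physfn" symlink that the PCI core creates for virtual functions
  it has linked to a physical function. Whether that link is present in
  a given LPAR setup is not confirmed by this report, and the helper
  names list_functions and physical_first are made up for illustration.

```python
import os


def list_functions(sysfs_devices="/sys/bus/pci/devices"):
    """Return (address, is_vf) pairs sorted by PCI address.

    Assumption: a function is treated as virtual when its sysfs
    directory contains a "physfn" symlink.
    """
    result = []
    for addr in sorted(os.listdir(sysfs_devices)):
        physfn = os.path.join(sysfs_devices, addr, "physfn")
        result.append((addr, os.path.islink(physfn)))
    return result


def physical_first(functions):
    """True if no physical function follows a virtual one on the same
    <root complex>:<bus>."""
    buses_with_vf = set()
    for addr, is_vf in functions:
        bus_id = addr.rsplit(":", 1)[0]  # e.g. "0000:00"
        if is_vf:
            buses_with_vf.add(bus_id)
        elif bus_id in buses_with_vf:
            return False
    return True


if __name__ == "__main__":
    functions = list_functions()
    for addr, is_vf in functions:
        print(addr, "VF" if is_vf else "PF")
    print("physical first:", physical_first(functions))
```

  The output of lspci -t remains the primary reference; the script only
  makes the ordering easier to compare between kernels.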

  [Regression Potential]

  * The regression potential can be considered moderate, since:

  * It is purely s390x-specific code (arch/s390/*,
  drivers/iommu/s390-iommu.c and drivers/pci/hotplug/s390_pci_hpc.c,
  plus some doc adjustments).

  * It largely affects zPCI, the s390x specific PCI code layer.

  * PCI cards available for s390x are optional cards (RoCE and zEDC) and
  not very wide-spread.

  * The situation described above affects the RoCE adapters only
  (Mellanox based).

  * The patches are also accepted upstream and available via linux-next,
  but applying them to the focal 5.4 kernel requires the above backports.

  * However, the code is modified by several patches (12), hence there
  is a chance to break zPCI with them.

  * For upfront testing, a PPA was created with a focal (master-next)
  kernel that includes all of the above patches.


  Today, the enumeration of PCI functions on s390x does not reflect
  which functions belong to which physical adapter.

  Layout of a PCI function address on Linux:
  <root complex>:<bus>:<device>.<function>
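
  To make the layout concrete, here is a minimal Python sketch that
  splits such an address into its four fields. The names PciAddress and
  parse_pci_address are invented for this illustration. The first field
  is what Linux tooling usually calls the PCI domain, and the field
  widths assume the conventional hexadecimal notation with a 5-bit
  device number and a 3-bit function number.

```python
import re
from typing import NamedTuple


class PciAddress(NamedTuple):
    domain: int    # "<root complex>" in the layout above
    bus: int
    device: int
    function: int


# Hex fields; the function number is a single digit 0-7.
_ADDR_RE = re.compile(
    r"^([0-9a-fA-F]{4,}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_pci_address(text: str) -> PciAddress:
    """Parse '<domain>:<bus>:<device>.<function>' into integers."""
    match = _ADDR_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PCI function address: {text!r}")
    domain, bus, device = (int(match.group(i), 16) for i in (1, 2, 3))
    if device > 0x1F:
        raise ValueError(f"device number out of range: {text!r}")
    return PciAddress(domain, bus, device, int(match.group(4)))
```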

  On s390x, each function is presented as an individual root complex today:

  PCHID 0100 VF1 0000:00:00.0
  PCHID 0100 VF23 0001:00:00.0
  PCHID 0200 VF1 0002:00:00.0
  PCHID 0100 VF17 0003:00:00.0

  On other platforms, the addresses correctly reflect the actual HW
  configuration. Some device drivers (mlx5 for Mellanox adapters) group
  functions of one physical adapter by checking which PCI functions have
  identical values for <root complex>:<bus>:<device>. We need to use the
  same enumeration scheme to achieve this functionality on s390x.
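
  The grouping rule described here can be illustrated with a short
  Python sketch. It is not the mlx5 implementation, only a model of
  "functions with identical <root complex>:<bus>:<device> belong
  together"; the helper name group_by_adapter and the two sample lists
  are chosen for this example.

```python
from collections import defaultdict


def group_by_adapter(addresses):
    """Group PCI function addresses by their
    <root complex>:<bus>:<device> prefix."""
    groups = defaultdict(list)
    for addr in addresses:
        # "0000:00:00.1" -> prefix "0000:00:00", function "1"
        prefix, _, _function = addr.rpartition(".")
        groups[prefix].append(addr)
    return dict(groups)


# Current s390x enumeration: every function has its own root complex,
# so every group contains exactly one function.
current = ["0000:00:00.0", "0001:00:00.0", "0002:00:00.0", "0003:00:00.0"]

# Topology-aware enumeration: functions of one adapter share a prefix.
topology_aware = ["0000:00:00.0", "0000:00:00.1", "0000:00:00.2",
                  "0001:00:00.0"]

if __name__ == "__main__":
    print(group_by_adapter(current))
    print(group_by_adapter(topology_aware))
```

  With the current scheme, a driver applying this rule never sees more
  than one function per group, which is why the multi-path function
  cannot be activated.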

  In this case, the two physical functions of a Mellanox adapter need to
  get function number 0 and 1, and all virtual functions need to use the
  same <root complex>:<bus> numbers with function/device numbers
  counting up.

  Required result (example with 4 VFs per PF):

  PCHID 0100 PF 0 0000:00:00.0
  PCHID 0100 PF 1 0000:00:00.1
  PCHID 0100 PF 0 VF 0 0000:00:00.2
  PCHID 0100 PF 0 VF 1 0000:00:00.3
  PCHID 0100 PF 0 VF 2 0000:00:00.4
  PCHID 0100 PF 0 VF 3 0000:00:00.5
  PCHID 0100 PF 1 VF 0 0000:00:00.6
  PCHID 0100 PF 1 VF 1 0000:00:00.7
  PCHID 0100 PF 1 VF 2 0000:00:01.0
  PCHID 0100 PF 1 VF 3 0000:00:01.1

  PCHID 0200 PF 0 0001:00:00.0
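
  The counting scheme can be sketched in Python as follows. The sketch
  assumes the standard PCI devfn split (upper 5 bits device, lower 3
  bits function, as in the kernel's PCI_SLOT and PCI_FUNC macros), so a
  running index above 7 carries over into the device number. The
  function enumerate_adapter is a made-up helper; how the actual
  patches derive the numbers is not described in this report.

```python
def pci_slot(devfn):
    """Device number: upper 5 bits of the 8-bit devfn."""
    return (devfn >> 3) & 0x1F


def pci_func(devfn):
    """Function number: lower 3 bits of the 8-bit devfn."""
    return devfn & 0x07


def enumerate_adapter(domain, num_pfs, vfs_per_pf, bus=0):
    """Return (label, address) pairs: PFs first, then the VFs of each
    PF, with function/device numbers counting up."""
    labels = [f"PF {pf}" for pf in range(num_pfs)]
    labels += [f"PF {pf} VF {vf}"
               for pf in range(num_pfs)
               for vf in range(vfs_per_pf)]
    if len(labels) > 256:
        raise ValueError("more functions than fit on one bus")
    return [
        (label,
         f"{domain:04x}:{bus:02x}:{pci_slot(devfn):02x}.{pci_func(devfn)}")
        for devfn, label in enumerate(labels)
    ]


if __name__ == "__main__":
    for label, address in enumerate_adapter(0, num_pfs=2, vfs_per_pf=4):
        print(label, address)
```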

To manage notifications about this bug go to:

Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
