This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag 'verification-needed-
artful' to 'verification-done-artful'. If the problem still exists,
change the tag 'verification-needed-artful' to 'verification-failed-
artful'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-artful

** Changed in: linux (Ubuntu Zesty)
       Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1732804

Title:
  [Zesty/Artful] On ARM64 PCIE physical function passthrough guest fails
  to boot

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Zesty:
  Won't Fix
Status in linux source package in Artful:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed

Bug description:
  [Impact]
  Passing through a physical function such as the Mellanox PCIe Ethernet
  controller causes the guest to fail to boot, and the host reports a
  Hardware Error.
  == Host ==
  [109920.834703] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 4
  [109920.842142] {1}[Hardware Error]: event severity: recoverable
  [109920.847848] {1}[Hardware Error]:  precise tstamp: 2017-11-16 23:20:05
  [109920.854385] {1}[Hardware Error]:  Error 0, type: recoverable
  [109920.860111] {1}[Hardware Error]:   section_type: PCIe error
  [109920.865718] {1}[Hardware Error]:   port_type: 0, PCIe end point
  [109920.871708] {1}[Hardware Error]:   version: 3.0
  [109920.876343] {1}[Hardware Error]:   command: 0x0006, status: 0x0010
  [109920.882559] {1}[Hardware Error]:   device_id: 0000:01:00.0
  [109920.888113] {1}[Hardware Error]:   slot: 0
  [109920.892285] {1}[Hardware Error]:   secondary_bus: 0x00
  [109920.897489] {1}[Hardware Error]:   vendor_id: 0x15b3, device_id: 0x1013
  [109920.904172] {1}[Hardware Error]:   class_code: 000002
  [109920.909378] vfio-pci 0000:01:00.0: aer_status: 0x00040000, aer_mask: 0x00000000
  [109920.916675] Malformed TLP
  [109920.916678] vfio-pci 0000:01:00.0: aer_layer=Transaction Layer, aer_agent=Receiver ID
  [109920.924573] vfio-pci 0000:01:00.0: aer_uncor_severity: 0x00062010
  [109920.930736] vfio-pci 0000:01:00.0:   TLP Header: 4a008040 00000100 01000000 00000000
  [109920.938548] vfio-pci 0000:01:00.0: broadcast error_detected message
  [109921.965056] pcieport 0000:00:00.0: downstream link has been reset
  [109921.965062] vfio-pci 0000:01:00.0: broadcast mmio_enabled message
  [109921.965066] vfio-pci 0000:01:00.0: broadcast resume message
  [109921.965070] vfio-pci 0000:01:00.0: AER: Device recovery successful
  == Guest ==
  EFI stub: Booting Linux Kernel...
  EFI stub: EFI_RNG_PROTOCOL unavailable, no randomness supplied
  EFI stub: Using DTB from configuration table
  EFI stub: Exiting boot services and installing virtual address map...
  [    1.518252] kvm [1]: HYP mode not available
  [    2.578929] mlx5_core 0000:05:00.0: mlx5_core_set_issi:778:(pid 152): Failed to query ISSI err(-1) status(0) synd(0)
  [    2.582424] mlx5_core 0000:05:00.0: failed to set issi
  [    2.616756] mlx5_core 0000:05:00.0: mlx5_load_one failed with error code -1

  This happens because virtualization of physical functions is broken on
  systems with a Maximum Payload Size (MPS) larger than 128 bytes. QDF2400
  firmware tries to maximize this setting; we have observed an MPS of 512
  on QDF2400 systems.
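
  The size mismatch can be cross-checked against the host log. The sketch
  below is illustrative only: it assumes the first word on the "TLP Header"
  line (4a008040) is header DWORD 0, and uses the field layout from the PCIe
  base specification (Fmt in bits 31:29, Type in bits 28:24, Length in bits
  9:0, MPS in bits 7:5 of Device Control). The function names are made up
  for this example. Under those assumptions the logged TLP decodes to a
  completion with data carrying 256 bytes, which would be rejected by a
  device programmed for a 128-byte MPS but fits within 512 bytes. That the
  guest programmed 128 is an inference, not something the log states.

```python
# Sketch: decode PCIe TLP header DWORD 0 and Device Control size fields.
# Field positions follow the PCIe base specification; names are illustrative.

def mps_bytes_from_devctl(devctl: int) -> int:
    """Max Payload Size: Device Control bits 7:5, encoded as 128 << n.

    Encodings above 5 (4096 bytes) are reserved and are not validated here.
    """
    return 128 << ((devctl >> 5) & 0x7)


def mrrs_bytes_from_devctl(devctl: int) -> int:
    """Max Read Request Size: Device Control bits 14:12, encoded as 128 << n."""
    return 128 << ((devctl >> 12) & 0x7)


def tlp_fmt_type(dw0: int) -> tuple:
    """Return (Fmt, Type) from header DWORD 0 (bits 31:29 and 28:24)."""
    return (dw0 >> 29) & 0x7, (dw0 >> 24) & 0x1F


def tlp_payload_bytes(dw0: int) -> int:
    """Payload length: bits 9:0 of DWORD 0, in DWORDs (0 encodes 1024)."""
    length_dw = dw0 & 0x3FF
    if length_dw == 0:
        length_dw = 1024
    return length_dw * 4


if __name__ == "__main__":
    dw0 = 0x4A008040  # first word of the "TLP Header" line in the host log
    fmt, typ = tlp_fmt_type(dw0)
    # Fmt 0b010 = 3DW header with data, Type 0b01010 = completion
    print(f"fmt={fmt:03b} type={typ:05b} payload={tlp_payload_bytes(dw0)} bytes")
    for mps in (128, 512):
        verdict = "fits" if tlp_payload_bytes(dw0) <= mps else "exceeds MPS"
        print(f"MPS {mps}: {verdict}")
```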

  [Fix]
  Patches are in linux-next:
  523184972b28 vfio/pci: Virtualize Maximum Payload Size
  cf0d53ba4947 vfio/pci: Virtualize Maximum Read Request Size

  [Testing]
  With the above patches applied, the guest is able to boot when a PCIe
  physical function is passed through, and we don't see the errors on the
  host system.
  == On the Guest ==
  ubuntu@ubuntu-pcitest:~$ lspci
  00:00.0 Host bridge: Red Hat, Inc. Device 0008
  00:01.0 PCI bridge: Red Hat, Inc. Device 000c
  00:01.1 PCI bridge: Red Hat, Inc. Device 000c
  00:01.2 PCI bridge: Red Hat, Inc. Device 000c
  00:01.3 PCI bridge: Red Hat, Inc. Device 000c
  00:01.4 PCI bridge: Red Hat, Inc. Device 000c
  00:01.5 PCI bridge: Red Hat, Inc. Device 000c
  01:00.0 Ethernet controller: Red Hat, Inc Virtio network device (rev 01)
  02:00.0 Communication controller: Red Hat, Inc Virtio console (rev 01)
  03:00.0 SCSI storage controller: Red Hat, Inc Virtio block device (rev 01)
  04:00.0 SCSI storage controller: Red Hat, Inc Virtio block device (rev 01)
  05:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]

  ubuntu@ubuntu-pcitest:~$ lsmod | grep mlx
  mlx5_core 471040 0
  devlink 36864 1 mlx5_core
  ptp 28672 1 mlx5_core
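
  Beyond lspci and lsmod, the MPS and Maximum Read Request Size seen by a
  device can be read from its config space. The following is a minimal
  sketch, not part of the original test procedure: it walks the standard
  PCI capability list to the PCI Express capability (ID 0x10) and decodes
  Device Control at offset 0x08 within it. The helper names and the choice
  of the sysfs "config" file are assumptions of this example. Reading past
  the first 64 bytes of config space normally requires root. Going by the
  patch titles, the guest sees virtualized values, so a read inside the
  guest may not match what is programmed in hardware.

```python
import struct

# Standard PCI config space offsets and IDs.
PCI_STATUS = 0x06            # 16-bit status register
PCI_STATUS_CAP_LIST = 0x10   # status bit 4: capability list present
PCI_CAPABILITY_LIST = 0x34   # pointer to first capability
PCI_CAP_ID_EXP = 0x10        # PCI Express capability ID
PCI_EXP_DEVCTL = 0x08        # Device Control offset within the capability


def find_capability(cfg: bytes, cap_id: int):
    """Return the offset of capability cap_id, or None if not found."""
    if len(cfg) < 0x40:
        return None
    status = struct.unpack_from("<H", cfg, PCI_STATUS)[0]
    if not status & PCI_STATUS_CAP_LIST:
        return None
    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC
    hops = 0
    # Bound the walk so a corrupt or looping list cannot hang us.
    while pos >= 0x40 and pos + 2 <= len(cfg) and hops < 48:
        if cfg[pos] == cap_id:
            return pos
        pos = cfg[pos + 1] & 0xFC
        hops += 1
    return None


def read_mps_mrrs(cfg: bytes):
    """Return (MPS, MRRS) in bytes from Device Control, or None."""
    cap = find_capability(cfg, PCI_CAP_ID_EXP)
    if cap is None or cap + PCI_EXP_DEVCTL + 2 > len(cfg):
        return None
    devctl = struct.unpack_from("<H", cfg, cap + PCI_EXP_DEVCTL)[0]
    mps = 128 << ((devctl >> 5) & 0x7)
    mrrs = 128 << ((devctl >> 12) & 0x7)
    return mps, mrrs


def read_config(bdf: str) -> bytes:
    """Read raw config space for a device such as '0000:05:00.0' via sysfs."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read()
```

  As root, read_mps_mrrs(read_config("0000:05:00.0")) would report the
  guest's view of the Mellanox device, and the same call with
  "0000:01:00.0" on the host would report the host's view. A result of
  None means the PCI Express capability was not readable, for example
  when only the first 64 bytes were returned to an unprivileged user.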

  [Regression Potential]
  Two patches to drivers/vfio/pci were cleanly cherry-picked from
  linux-next and applied to Artful/Zesty. They were tested on an ARM64
  QDF2400 system and no regressions were found.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1732804/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
