[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-07-02 Thread Kai-Heng Feng
Please test the latest mainline kernel:
https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.8-rc3/

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1867455

Title:
  Having more than one AMD Radeon RX 580 installed will trigger a PCIe
  bus fatal error during boot.

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I have a Dell R920 with three MSI Radeon RX 580 8G V1 cards. If more
  than one of these cards is installed in the system, a bus fatal error
  is generated when the amdgpu module is loaded and tries to initialize
  the device, causing the system to reset itself.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-modules-5.4.0-14-lowlatency 5.4.0-14.17
  ProcVersionSignature: Ubuntu 5.4.0-14.17-lowlatency 5.4.18
  Uname: Linux 5.4.0-14-lowlatency x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', 
'/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D9p', 
'/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  Date: Sat Mar 14 12:20:47 2020
  Dependencies:
   
  MachineType: Dell Inc. PowerEdge R920
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-14-lowlatency 
root=ZFS=rpool/ROOT/ubuntu ro intel_iommu=on iommu=pt 
root=ZFS=rpool/ROOT/ubuntu intremap=no_x2apic_optout ipv6.disable=1 
mitigations=off
  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No 
PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-14-lowlatency N/A
   linux-backports-modules-5.4.0-14-lowlatency  N/A
   linux-firmware   1.186
  RfKill:
   
  SourcePackage: linux-5.4
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2019
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.9.0
  dmi.board.name: 0V7HD0
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A06
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr1.9.0:bd06/26/2019:svnDellInc.:pnPowerEdgeR920:pvr:rvnDellInc.:rn0V7HD0:rvrA06:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R920
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1867455/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-06-05 Thread Dimitri John Ledkov
** Package changed: linux-5.4 (Ubuntu) => linux (Ubuntu)



[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
It appears the SMU firmware is not loading. Looking through the previous
logs, it has never loaded during any of the earlier testing. The Ubuntu
19.10 VM has the same problem as the Ubuntu 20.04 host.

root@r920-cmwhv52:~# grep smu /var/log/kern.log
Mar 13 22:40:40 r920-cmwhv52 kernel: [   11.394925] amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu
Mar 13 22:40:40 r920-cmwhv52 kernel: [   14.788159] smu firmware loading failed
Mar 13 23:35:05 r920-cmwhv52 kernel: [   11.368465] amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu
Mar 13 23:35:05 r920-cmwhv52 kernel: [   14.750625] smu firmware loading failed
Mar 14 01:23:06 r920-cmwhv52 kernel: [   11.484064] amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu
Mar 14 01:23:06 r920-cmwhv52 kernel: [   14.872698] smu firmware loading failed
Mar 14 11:44:19 r920-cmwhv52 kernel: [   11.487538] amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu
Mar 14 11:44:19 r920-cmwhv52 kernel: [   14.869816] smu firmware loading failed
Mar 14 13:36:40 r920-cmwhv52 kernel: [  815.669131] amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu
Mar 14 13:36:43 r920-cmwhv52 kernel: [  819.095834] smu firmware loading failed
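
As a rough sketch, assuming each kern.log entry sits on a single line (as
it does in the file itself), the init/failure pairs in the grep output
above could be extracted with a few lines of Python. The function name
smu_failures and the sample list are made up for illustration; the sample
entries are copied from the log above.

```python
import re

# Matches the bracketed kernel timestamp and the message that follows it.
LINE_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]\s+(.*)$")


def smu_failures(lines):
    """Return (init_time, fail_time) pairs for each failed SMU firmware load."""
    pairs = []
    init_time = None
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a kernel log entry with a timestamp
        stamp, msg = float(m.group(1)), m.group(2)
        if "hwmgr_sw_init" in msg:
            # powerplay announced which SMU backend it picked
            init_time = stamp
        elif "smu firmware loading failed" in msg and init_time is not None:
            pairs.append((init_time, stamp))
            init_time = None
    return pairs


# Hypothetical sample: four entries copied from the grep output above.
sample = [
    "Mar 13 22:40:40 r920-cmwhv52 kernel: [   11.394925] amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu",
    "Mar 13 22:40:40 r920-cmwhv52 kernel: [   14.788159] smu firmware loading failed",
    "Mar 14 13:36:40 r920-cmwhv52 kernel: [  815.669131] amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu",
    "Mar 14 13:36:43 r920-cmwhv52 kernel: [  819.095834] smu firmware loading failed",
]

for start, end in smu_failures(sample):
    print(f"init at {start:.3f}s, failure at {end:.3f}s, gap {end - start:.3f}s")
```

To run it against a real log, pass an open file object instead of the
sample list, since the function only iterates over lines.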

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1867455

Title:
  Having more than one AMD Radeon RX 580 installed will trigger a PCIe
  bus fatal error during boot.

Status in linux-5.4 package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+subscriptions



[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
I noticed the following in /var/log/kern.log when I passed two RX 580s
through to the same Ubuntu 19.10 VM:

root@over-zfs-test-1:/var/log# grep amd kern.log
Mar 14 19:13:31 over-zfs-test-1 kernel: [0.00] Linux version 5.3.0-40-generic (buildd@lcy01-amd64-026) (gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu2)) #32-Ubuntu SMP Fri Jan 31 20:24:34 UTC 2020 (Ubuntu 5.3.0-40.32-generic 5.3.18)
Mar 14 19:13:31 over-zfs-test-1 kernel: [5.896408] [drm] amdgpu kernel modesetting enabled.
Mar 14 19:13:31 over-zfs-test-1 kernel: [5.899443] amdgpu :06:00.0: remove_conflicting_pci_framebuffers: bar 0: 0x12 -> 0x13
Mar 14 19:13:31 over-zfs-test-1 kernel: [5.899495] amdgpu :06:00.0: remove_conflicting_pci_framebuffers: bar 2: 0x14 -> 0x14001f
Mar 14 19:13:31 over-zfs-test-1 kernel: [5.899560] amdgpu :06:00.0: remove_conflicting_pci_framebuffers: bar 5: 0xfc20 -> 0xfc23
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.324289] amdgpu :06:00.0: GPU pci config reset
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.844695] amdgpu :06:00.0: BAR 2: releasing [mem 0x14-0x14001f 64bit pref]
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.846800] amdgpu :06:00.0: BAR 0: releasing [mem 0x12-0x13 64bit pref]
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.851331] amdgpu :06:00.0: BAR 0: assigned [mem 0x12-0x13 64bit pref]
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.853799] amdgpu :06:00.0: BAR 2: assigned [mem 0x14-0x14001f 64bit pref]
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.055675] amdgpu :06:00.0: VRAM: 8192M 0x00F4 - 0x00F5 (8192M used)
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.057405] amdgpu :06:00.0: GART: 256M 0x00FF - 0x00FF0FFF
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.062079] [drm] amdgpu: 8192M of VRAM memory ready
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.063325] [drm] amdgpu: 8192M of GTT memory ready.
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.086747] amdgpu: [powerplay] hwmgr_sw_init smu backed is polaris10_smu
Mar 14 19:13:31 over-zfs-test-1 kernel: [   10.693155] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   12.530859] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   16.088944] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   17.909031] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   21.309891] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   23.014157] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   26.406895] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   28.115737] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   31.511862] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   33.210902] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.974621] amdgpu: [powerplay] SMU load firmware failed
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.975513] amdgpu: [powerplay] fw load failed
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.976809] amdgpu :06:00.0: amdgpu_device_ip_init failed
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.977443] amdgpu :06:00.0: Fatal error during GPU init
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.978183] [drm] amdgpu: finishing device.
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.992069] WARNING: CPU: 22 PID: 361 at drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:930 amdgpu_bo_unpin.cold+0x0/0x40 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.993699] Modules linked in: hid_generic usbhid hid amdgpu(+) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel amd_iommu_v2 gpu_sched i2c_algo_bit qxl aesni_intel ttm drm_kms_helper aes_x86_64 syscopyarea crypto_simd sysfillrect sysimgblt cryptd ahci fb_sys_fops psmouse virtio_net glue_helper lpc_ich i2c_i801 libahci drm net_failover virtio_blk failover
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.997757] RIP: 0010:amdgpu_bo_unpin.cold+0x0/0x40 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.007489]  amdgpu_bo_free_kernel+0x70/0x120 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.008290]  amdgpu_gfx_rlc_fini+0x4b/0x70 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.009090]  gfx_v8_0_sw_fini+0xac/0x1a0 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.009905]  amdgpu_device_fini+0x26b/0x4ac [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.010697]  amdgpu_driver_unload_kms+0x52/0xa0 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.011517]  amdgpu_driver_load_kms.cold+0x39/0x5c [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.013070]  amdgpu_pci_probe+0xf7/0x160 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.023395]  amdgpu_init+0x83/0x8d [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.037565] amdgpu :06:00.0: a855cc3c unpin not necessary
Mar 14 19:13:31 

[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
With the amdgpu module blacklisted on the host hypervisor, I was able to
successfully pass the GPUs through to separate Ubuntu 19.10 VMs, one GPU
per VM. See the attached screenshot. I would say this demonstrates that
the physical hardware is fine and that this is a software problem.

** Attachment added: "Screenshot of GPU passthrough to VMs"
   
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+attachment/5336977/+files/dual-vms-screenshot.png



[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
After blacklisting the amdgpu module with echo "blacklist amdgpu" >
/etc/modprobe.d/blacklist-amdgpu.conf && update-initramfs -u, the system
boots all the way up with more than one RX 580 installed (currently two
are installed). Running modprobe amdgpu after the system has booted
causes a kernel panic, which I have captured in the screenshot attached
below.
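
For anyone reproducing this, a quick way to confirm which modules a
modprobe.d-style file blacklists is sketched below in Python. It is only
an illustration: blacklisted_modules is a made-up helper, it understands
just "blacklist <name>" lines and full-line "#" comments, and it does not
check whether the initramfs was actually regenerated.

```python
def blacklisted_modules(conf_text):
    """Collect module names from 'blacklist <name>' lines in modprobe.d-style text."""
    names = set()
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            names.add(parts[1])
    return names


# Hypothetical file contents; the first line is what the echo command above writes.
sample_conf = "blacklist amdgpu\n# blacklist radeon\n"

print("amdgpu" in blacklisted_modules(sample_conf))
```

On the real host the text would come from reading
/etc/modprobe.d/blacklist-amdgpu.conf rather than from a string.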

root@r920-cmwhv52:~# lspci | grep AMD
22:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7)
22:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
61:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7)
61:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
root@r920-cmwhv52:~# lspci -vs 22:00
22:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7) (prog-if 00 [VGA controller])
Subsystem: Micro-Star International Co., Ltd. [MSI] Radeon RX 580
Flags: fast devsel, IRQ 15, NUMA node 1
Memory at 3c0 (64-bit, prefetchable) [disabled] [size=256M]
Memory at 3c01000 (64-bit, prefetchable) [disabled] [size=2M]
I/O ports at 4000 [disabled] [size=256]
Memory at afe8 (32-bit, non-prefetchable) [disabled] [size=256K]
Expansion ROM at afec [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
Capabilities: [150] Advanced Error Reporting
Capabilities: [200] Resizable BAR
Capabilities: [270] Secondary PCI Express
Capabilities: [2b0] Address Translation Service (ATS)
Capabilities: [2c0] Page Request Interface (PRI)
Capabilities: [2d0] Process Address Space ID (PASID)
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
Capabilities: [370] L1 PM Substates
lspci: Unable to load libkmod resources: error -12

22:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
Subsystem: Micro-Star International Co., Ltd. [MSI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
Flags: bus master, fast devsel, latency 0, IRQ 379, NUMA node 1
Memory at afefc000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [48] Vendor Specific Information: Len=08
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
Capabilities: [150] Advanced Error Reporting
Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
Kernel driver in use: snd_hda_intel

root@r920-cmwhv52:~# lspci -vs 61:00
61:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7) (prog-if 00 [VGA controller])
Subsystem: Micro-Star International Co., Ltd. [MSI] Radeon RX 580
Flags: fast devsel, IRQ 15, NUMA node 3
Memory at 33ff000 (64-bit, prefetchable) [disabled] [size=256M]
Memory at 33fefe0 (64-bit, prefetchable) [disabled] [size=2M]
I/O ports at 6000 [disabled] [size=256]
Memory at a7e8 (32-bit, non-prefetchable) [disabled] [size=256K]
Expansion ROM at a7ec [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
Capabilities: [150] Advanced Error Reporting
Capabilities: [200] Resizable BAR
Capabilities: [270] Secondary PCI Express
Capabilities: [2b0] Address Translation Service (ATS)
Capabilities: [2c0] Page Request Interface (PRI)
Capabilities: [2d0] Process Address Space ID (PASID)
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
Capabilities: [370] L1 PM Substates
lspci: Unable to load libkmod resources: error -12

61:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]
Subsystem: 

[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
** Attachment added: "Screenshot of console just before system reboot"
   
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+attachment/5336947/+files/rpviewer-28.png



[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
** Attachment added: "Screenshot of iDRAC Lifecycle Log"
   
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+attachment/5336948/+files/idrac-screenshot.png
