[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
It appears the firmware is not loading, looking through the previous
logs it appears the firmware has never loaded during any of the previous
testing. The Ubuntu 19.10 VM is having the same problem as the Ubuntu
20.04 host.

root@r920-cmwhv52:~# grep smu /var/log/kern.log
Mar 13 22:40:40 r920-cmwhv52 kernel: [   11.394925] amdgpu: [powerplay] 
hwmgr_sw_init smu backed is polaris10_smu
Mar 13 22:40:40 r920-cmwhv52 kernel: [   14.788159] smu firmware loading failed
Mar 13 23:35:05 r920-cmwhv52 kernel: [   11.368465] amdgpu: [powerplay] 
hwmgr_sw_init smu backed is polaris10_smu
Mar 13 23:35:05 r920-cmwhv52 kernel: [   14.750625] smu firmware loading failed
Mar 14 01:23:06 r920-cmwhv52 kernel: [   11.484064] amdgpu: [powerplay] 
hwmgr_sw_init smu backed is polaris10_smu
Mar 14 01:23:06 r920-cmwhv52 kernel: [   14.872698] smu firmware loading failed
Mar 14 11:44:19 r920-cmwhv52 kernel: [   11.487538] amdgpu: [powerplay] 
hwmgr_sw_init smu backed is polaris10_smu
Mar 14 11:44:19 r920-cmwhv52 kernel: [   14.869816] smu firmware loading failed
Mar 14 13:36:40 r920-cmwhv52 kernel: [  815.669131] amdgpu: [powerplay] 
hwmgr_sw_init smu backed is polaris10_smu
Mar 14 13:36:43 r920-cmwhv52 kernel: [  819.095834] smu firmware loading failed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1867455

Title:
  Having more than one AMD Radeon RX 580 installed will trigger a PCIe
  bus fatal error during boot.

Status in linux-5.4 package in Ubuntu:
  New

Bug description:
  I have a Dell R920 with three MSI Radeon RX 580 8G V1 cards, if more
  than one of these cards is installed in the system, a bus fatal error
  will be generated causing the system to reset itself when the amdgpu
  module is loaded and it tries to initialize the device.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-modules-5.4.0-14-lowlatency 5.4.0-14.17
  ProcVersionSignature: Ubuntu 5.4.0-14.17-lowlatency 5.4.18
  Uname: Linux 5.4.0-14-lowlatency x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', 
'/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D9p', 
'/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  Date: Sat Mar 14 12:20:47 2020
  Dependencies:
   
  MachineType: Dell Inc. PowerEdge R920
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-14-lowlatency 
root=ZFS=rpool/ROOT/ubuntu ro intel_iommu=on iommu=pt 
root=ZFS=rpool/ROOT/ubuntu intremap=no_x2apic_optout ipv6.disable=1 
mitigations=off
  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No 
PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-14-lowlatency N/A
   linux-backports-modules-5.4.0-14-lowlatency  N/A
   linux-firmware   1.186
  RfKill:
   
  SourcePackage: linux-5.4
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2019
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.9.0
  dmi.board.name: 0V7HD0
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A06
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr1.9.0:bd06/26/2019:svnDellInc.:pnPowerEdgeR920:pvr:rvnDellInc.:rn0V7HD0:rvrA06:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R920
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
I noticed the follow in /var/log/kern.log when I passed through two
RX580 to the same Ubuntu 19.10 VM:

root@over-zfs-test-1:/var/log# grep amd kern.log
Mar 14 19:13:31 over-zfs-test-1 kernel: [0.00] Linux version 
5.3.0-40-generic (buildd@lcy01-amd64-026) (gcc version 9.2.1 20191008 (Ubuntu 
9.2.1-9ubuntu2)) #32-Ubuntu SMP Fri Jan 31 20:24:34 UTC 2020 (Ubuntu 
5.3.0-40.32-generic 5.3.18)
Mar 14 19:13:31 over-zfs-test-1 kernel: [5.896408] [drm] amdgpu kernel 
modesetting enabled.
Mar 14 19:13:31 over-zfs-test-1 kernel: [5.899443] amdgpu :06:00.0: 
remove_conflicting_pci_framebuffers: bar 0: 0x12 -> 0x13
Mar 14 19:13:31 over-zfs-test-1 kernel: [5.899495] amdgpu :06:00.0: 
remove_conflicting_pci_framebuffers: bar 2: 0x14 -> 0x14001f
Mar 14 19:13:31 over-zfs-test-1 kernel: [5.899560] amdgpu :06:00.0: 
remove_conflicting_pci_framebuffers: bar 5: 0xfc20 -> 0xfc23
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.324289] amdgpu :06:00.0: GPU 
pci config reset
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.844695] amdgpu :06:00.0: BAR 
2: releasing [mem 0x14-0x14001f 64bit pref]
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.846800] amdgpu :06:00.0: BAR 
0: releasing [mem 0x12-0x13 64bit pref]
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.851331] amdgpu :06:00.0: BAR 
0: assigned [mem 0x12-0x13 64bit pref]
Mar 14 19:13:31 over-zfs-test-1 kernel: [6.853799] amdgpu :06:00.0: BAR 
2: assigned [mem 0x14-0x14001f 64bit pref]
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.055675] amdgpu :06:00.0: 
VRAM: 8192M 0x00F4 - 0x00F5 (8192M used)
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.057405] amdgpu :06:00.0: 
GART: 256M 0x00FF - 0x00FF0FFF
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.062079] [drm] amdgpu: 8192M of 
VRAM memory ready
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.063325] [drm] amdgpu: 8192M of 
GTT memory ready.
Mar 14 19:13:31 over-zfs-test-1 kernel: [7.086747] amdgpu: [powerplay] 
hwmgr_sw_init smu backed is polaris10_smu
Mar 14 19:13:31 over-zfs-test-1 kernel: [   10.693155] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   12.530859] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   16.088944] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   17.909031] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   21.309891] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   23.014157] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   26.406895] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   28.115737] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   31.511862] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   33.210902] amdgpu: [powerplay]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.974621] amdgpu: [powerplay] SMU 
load firmware failed
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.975513] amdgpu: [powerplay] fw 
load failed
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.976809] amdgpu :06:00.0: 
amdgpu_device_ip_init failed
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.977443] amdgpu :06:00.0: 
Fatal error during GPU init
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.978183] [drm] amdgpu: finishing 
device.
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.992069] WARNING: CPU: 22 PID: 
361 at drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:930 
amdgpu_bo_unpin.cold+0x0/0x40 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.993699] Modules linked in: 
hid_generic usbhid hid amdgpu(+) crct10dif_pclmul crc32_pclmul 
ghash_clmulni_intel amd_iommu_v2 gpu_sched i2c_algo_bit qxl aesni_intel ttm 
drm_kms_helper aes_x86_64 syscopyarea crypto_simd sysfillrect sysimgblt cryptd 
ahci fb_sys_fops psmouse virtio_net glue_helper lpc_ich i2c_i801 libahci drm 
net_failover virtio_blk failover
Mar 14 19:13:31 over-zfs-test-1 kernel: [   34.997757] RIP: 
0010:amdgpu_bo_unpin.cold+0x0/0x40 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.007489]  
amdgpu_bo_free_kernel+0x70/0x120 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.008290]  
amdgpu_gfx_rlc_fini+0x4b/0x70 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.009090]  
gfx_v8_0_sw_fini+0xac/0x1a0 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.009905]  
amdgpu_device_fini+0x26b/0x4ac [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.010697]  
amdgpu_driver_unload_kms+0x52/0xa0 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.011517]  
amdgpu_driver_load_kms.cold+0x39/0x5c [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.013070]  
amdgpu_pci_probe+0xf7/0x160 [amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.023395]  amdgpu_init+0x83/0x8d 
[amdgpu]
Mar 14 19:13:31 over-zfs-test-1 kernel: [   35.037565] amdgpu :06:00.0: 
a855cc3c unpin not necessary
Mar 14 19:13:31 

[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
With the amdgpu module blacklisted on the host hypervisor, I was able to
successfully passthrough the GPU to a separate Ubuntu 19.10 VM, one GPU
per VM. See attached screenshot. So I would say this demonstrates that
the physical hardware is fine and that this is a software problem.

** Attachment added: "Screenshot of GPU passthrough to VMs"
   
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+attachment/5336977/+files/dual-vms-screenshot.png

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1867455

Title:
  Having more than one AMD Radeon RX 580 installed will trigger a PCIe
  bus fatal error during boot.

Status in linux-5.4 package in Ubuntu:
  New

Bug description:
  I have a Dell R920 with three MSI Radeon RX 580 8G V1 cards, if more
  than one of these cards is installed in the system, a bus fatal error
  will be generated causing the system to reset itself when the amdgpu
  module is loaded and it tries to initialize the device.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-modules-5.4.0-14-lowlatency 5.4.0-14.17
  ProcVersionSignature: Ubuntu 5.4.0-14.17-lowlatency 5.4.18
  Uname: Linux 5.4.0-14-lowlatency x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', 
'/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D9p', 
'/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  Date: Sat Mar 14 12:20:47 2020
  Dependencies:
   
  MachineType: Dell Inc. PowerEdge R920
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-14-lowlatency 
root=ZFS=rpool/ROOT/ubuntu ro intel_iommu=on iommu=pt 
root=ZFS=rpool/ROOT/ubuntu intremap=no_x2apic_optout ipv6.disable=1 
mitigations=off
  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No 
PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-14-lowlatency N/A
   linux-backports-modules-5.4.0-14-lowlatency  N/A
   linux-firmware   1.186
  RfKill:
   
  SourcePackage: linux-5.4
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2019
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.9.0
  dmi.board.name: 0V7HD0
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A06
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr1.9.0:bd06/26/2019:svnDellInc.:pnPowerEdgeR920:pvr:rvnDellInc.:rn0V7HD0:rvrA06:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R920
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
After blacklisting the amdgpu module via echo "blacklist amdgpu" >
/etc/modprobe.d/blacklist-amdgpu.conf && update-initramfs -u; the system
will let me boot all the way up with more than one RX580 installed
(currently two are installed). Running modprobe amdgpu after the system
was already booted cause a kernel panic, which I have captured and
attached a screenshot of below.

root@r920-cmwhv52:~# lspci | grep AMD
22:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7)
22:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI 
Audio [Radeon RX 470/480 / 570/580/590]
61:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7)
61:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI 
Audio [Radeon RX 470/480 / 570/580/590]
root@r920-cmwhv52:~# lspci -vs 22:00
22:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7) (prog-if 00 [VGA 
controller])
Subsystem: Micro-Star International Co., Ltd. [MSI] Radeon RX 580
Flags: fast devsel, IRQ 15, NUMA node 1
Memory at 3c0 (64-bit, prefetchable) [disabled] [size=256M]
Memory at 3c01000 (64-bit, prefetchable) [disabled] [size=2M]
I/O ports at 4000 [disabled] [size=256]
Memory at afe8 (32-bit, non-prefetchable) [disabled] [size=256K]
Expansion ROM at afec [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 

Capabilities: [150] Advanced Error Reporting
Capabilities: [200] Resizable BAR 
Capabilities: [270] Secondary PCI Express
Capabilities: [2b0] Address Translation Service (ATS)
Capabilities: [2c0] Page Request Interface (PRI)
Capabilities: [2d0] Process Address Space ID (PASID)
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
Capabilities: [370] L1 PM Substates
lspci: Unable to load libkmod resources: error -12

22:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI 
Audio [Radeon RX 470/480 / 570/580/590]
Subsystem: Micro-Star International Co., Ltd. [MSI] Ellesmere HDMI 
Audio [Radeon RX 470/480 / 570/580/590]
Flags: bus master, fast devsel, latency 0, IRQ 379, NUMA node 1
Memory at afefc000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [48] Vendor Specific Information: Len=08 
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 

Capabilities: [150] Advanced Error Reporting
Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
Kernel driver in use: snd_hda_intel

root@r920-cmwhv52:~# lspci -vs 61:00
61:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Ellesmere [Radeon RX 470/480/570/570X/580/580X/590] (rev e7) (prog-if 00 [VGA 
controller])
Subsystem: Micro-Star International Co., Ltd. [MSI] Radeon RX 580
Flags: fast devsel, IRQ 15, NUMA node 3
Memory at 33ff000 (64-bit, prefetchable) [disabled] [size=256M]
Memory at 33fefe0 (64-bit, prefetchable) [disabled] [size=2M]
I/O ports at 6000 [disabled] [size=256]
Memory at a7e8 (32-bit, non-prefetchable) [disabled] [size=256K]
Expansion ROM at a7ec [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 
Capabilities: [50] Power Management version 3
Capabilities: [58] Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 

Capabilities: [150] Advanced Error Reporting
Capabilities: [200] Resizable BAR 
Capabilities: [270] Secondary PCI Express
Capabilities: [2b0] Address Translation Service (ATS)
Capabilities: [2c0] Page Request Interface (PRI)
Capabilities: [2d0] Process Address Space ID (PASID)
Capabilities: [320] Latency Tolerance Reporting
Capabilities: [328] Alternative Routing-ID Interpretation (ARI)
Capabilities: [370] L1 PM Substates
lspci: Unable to load libkmod resources: error -12

61:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere HDMI 
Audio [Radeon RX 470/480 / 570/580/590]
Subsystem: 

[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
** Attachment added: "Screenshot of console just before system reboot"
   
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+attachment/5336947/+files/rpviewer-28.png

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1867455

Title:
  Having more than one AMD Radeon RX 580 installed will trigger a PCIe
  bus fatal error during boot.

Status in linux-5.4 package in Ubuntu:
  New

Bug description:
  I have a Dell R920 with three MSI Radeon RX 580 8G V1 cards, if more
  than one of these cards is installed in the system, a bus fatal error
  will be generated causing the system to reset itself when the amdgpu
  module is loaded and it tries to initialize the device.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-modules-5.4.0-14-lowlatency 5.4.0-14.17
  ProcVersionSignature: Ubuntu 5.4.0-14.17-lowlatency 5.4.18
  Uname: Linux 5.4.0-14-lowlatency x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', 
'/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D9p', 
'/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  Date: Sat Mar 14 12:20:47 2020
  Dependencies:
   
  MachineType: Dell Inc. PowerEdge R920
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-14-lowlatency 
root=ZFS=rpool/ROOT/ubuntu ro intel_iommu=on iommu=pt 
root=ZFS=rpool/ROOT/ubuntu intremap=no_x2apic_optout ipv6.disable=1 
mitigations=off
  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No 
PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-14-lowlatency N/A
   linux-backports-modules-5.4.0-14-lowlatency  N/A
   linux-firmware   1.186
  RfKill:
   
  SourcePackage: linux-5.4
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2019
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.9.0
  dmi.board.name: 0V7HD0
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A06
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr1.9.0:bd06/26/2019:svnDellInc.:pnPowerEdgeR920:pvr:rvnDellInc.:rn0V7HD0:rvrA06:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R920
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1867455] Re: Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
** Attachment added: "Screenshot of iDRAC Lifecycle Log"
   
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+attachment/5336948/+files/idrac-screenshot.png

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1867455

Title:
  Having more than one AMD Radeon RX 580 installed will trigger a PCIe
  bus fatal error during boot.

Status in linux-5.4 package in Ubuntu:
  New

Bug description:
  I have a Dell R920 with three MSI Radeon RX 580 8G V1 cards, if more
  than one of these cards is installed in the system, a bus fatal error
  will be generated causing the system to reset itself when the amdgpu
  module is loaded and it tries to initialize the device.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-modules-5.4.0-14-lowlatency 5.4.0-14.17
  ProcVersionSignature: Ubuntu 5.4.0-14.17-lowlatency 5.4.18
  Uname: Linux 5.4.0-14-lowlatency x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', 
'/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D9p', 
'/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  Date: Sat Mar 14 12:20:47 2020
  Dependencies:
   
  MachineType: Dell Inc. PowerEdge R920
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-14-lowlatency 
root=ZFS=rpool/ROOT/ubuntu ro intel_iommu=on iommu=pt 
root=ZFS=rpool/ROOT/ubuntu intremap=no_x2apic_optout ipv6.disable=1 
mitigations=off
  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No 
PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-14-lowlatency N/A
   linux-backports-modules-5.4.0-14-lowlatency  N/A
   linux-firmware   1.186
  RfKill:
   
  SourcePackage: linux-5.4
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2019
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.9.0
  dmi.board.name: 0V7HD0
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A06
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr1.9.0:bd06/26/2019:svnDellInc.:pnPowerEdgeR920:pvr:rvnDellInc.:rn0V7HD0:rvrA06:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R920
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1867455] [NEW] Having more than one AMD Radeon RX 580 installed will trigger a PCIe bus fatal error during boot.

2020-03-14 Thread Nikolas Britton
Public bug reported:

I have a Dell R920 with three MSI Radeon RX 580 8G V1 cards, if more
than one of these cards is installed in the system, a bus fatal error
will be generated causing the system to reset itself when the amdgpu
module is loaded and it tries to initialize the device.

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-modules-5.4.0-14-lowlatency 5.4.0-14.17
ProcVersionSignature: Ubuntu 5.4.0-14.17-lowlatency 5.4.18
Uname: Linux 5.4.0-14-lowlatency x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu20
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', 
'/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D9p', 
'/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
Date: Sat Mar 14 12:20:47 2020
Dependencies:
 
MachineType: Dell Inc. PowerEdge R920
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 mgag200drmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-14-lowlatency 
root=ZFS=rpool/ROOT/ubuntu ro intel_iommu=on iommu=pt 
root=ZFS=rpool/ROOT/ubuntu intremap=no_x2apic_optout ipv6.disable=1 
mitigations=off
PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No 
PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-5.4.0-14-lowlatency N/A
 linux-backports-modules-5.4.0-14-lowlatency  N/A
 linux-firmware   1.186
RfKill:
 
SourcePackage: linux-5.4
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 06/26/2019
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.9.0
dmi.board.name: 0V7HD0
dmi.board.vendor: Dell Inc.
dmi.board.version: A06
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: 
dmi:bvnDellInc.:bvr1.9.0:bd06/26/2019:svnDellInc.:pnPowerEdgeR920:pvr:rvnDellInc.:rn0V7HD0:rvrA06:cvnDellInc.:ct23:cvr:
dmi.product.name: PowerEdge R920
dmi.sys.vendor: Dell Inc.

** Affects: linux-5.4 (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: amd64 apport-bug focal

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-5.4 in Ubuntu.
https://bugs.launchpad.net/bugs/1867455

Title:
  Having more than one AMD Radeon RX 580 installed will trigger a PCIe
  bus fatal error during boot.

Status in linux-5.4 package in Ubuntu:
  New

Bug description:
  I have a Dell R920 with three MSI Radeon RX 580 8G V1 cards, if more
  than one of these cards is installed in the system, a bus fatal error
  will be generated causing the system to reset itself when the amdgpu
  module is loaded and it tries to initialize the device.

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: linux-modules-5.4.0-14-lowlatency 5.4.0-14.17
  ProcVersionSignature: Ubuntu 5.4.0-14.17-lowlatency 5.4.18
  Uname: Linux 5.4.0-14-lowlatency x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu20
  Architecture: amd64
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', 
'/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D9p', 
'/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D3p', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  Date: Sat Mar 14 12:20:47 2020
  Dependencies:
   
  MachineType: Dell Inc. PowerEdge R920
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-5.4.0-14-lowlatency 
root=ZFS=rpool/ROOT/ubuntu ro intel_iommu=on iommu=pt 
root=ZFS=rpool/ROOT/ubuntu intremap=no_x2apic_optout ipv6.disable=1 
mitigations=off
  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No 
PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-5.4.0-14-lowlatency N/A
   linux-backports-modules-5.4.0-14-lowlatency  N/A
   linux-firmware   1.186
  RfKill:
   
  SourcePackage: linux-5.4
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 06/26/2019
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 1.9.0
  dmi.board.name: 0V7HD0
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A06
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr1.9.0:bd06/26/2019:svnDellInc.:pnPowerEdgeR920:pvr:rvnDellInc.:rn0V7HD0:rvrA06:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R920
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-5.4/+bug/1867455/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1765232] Re: Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers

2018-04-24 Thread Nikolas Britton
I just did a fresh install of the April 24 daily build with kernel
4.15.0-19 on my Dell R820 and I can confirm this has been fix. Thank
you!

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765232

Title:
  Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  SRU Justification

  Impact: Since 4.15.0-15 some machines have been failing to boot due to
  IO hangs. This is caused by patches applied for LP #1759723, which
  assigned managed interrupt vectors and reply queues for all possible
  CPUs, not just present CPUs. Some drivers were not prepared to cope
  with this and end up selecting reply queues not mapped to an online
  CPU, causing IO hangs during boot.

  Fix: There are driver fixes available upstream, but there are 8-ish
  patches in total and we're extremely close to release, so the safer
  bet it to just revert the patches for LP #1759723. We can consider
  reintroducing them with required fixes at a later time.

  Regression Potential: This is obviously going to reintroduce the
  problem the patches were intended to fix. These are less serious than
  the problems which the patches introduced, and IBM has given their
  okay to revert them as well.

  Test Case: Verified to fix affected hardware on LP #1765232.

  ---

  For Ubuntu 18.04 amd64 server, I updated the kernel from 4.15.0-13 to
  4.15.0-15. The system hangs at boot up now. Rolling back to the old
  kernel everything works as expected. It appears to hang while trying
  to enumerate an SD card device, which isn't even installed. I have
  tried this on a Dell R620 and R820 and both have the exact same
  problem. I even downloaded a new installer iso from the daily builds
  to reinstall the OS and the installer hangs too now.

  Support for Dell PowerEdge 12th gen servers appears to be broken with
  kernel 4.15.0-15.

  https://imgur.com/VT9zd0w
  https://imgur.com/4BQPCXo

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765232/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1765232] Re: Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers

2018-04-21 Thread Nikolas Britton
The kernel from post #12 also works without issue for me too.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765232

Title:
  Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  Confirmed

Bug description:
  For Ubuntu 18.04 amd64 server, I updated the kernel from 4.15.0-13 to
  4.15.0-15. The system hangs at boot up now. Rolling back to the old
  kernel everything works as expected. It appears to hang while trying
  to enumerate an SD card device, which isn't even installed. I have
  tried this on a Dell R620 and R820 and both have the exact same
  problem. I even downloaded a new installer iso from the daily builds
  to reinstall the OS and the installer hangs too now.

  Support for Dell PowerEdge 12th gen servers appears to be broken with
  kernel 4.15.0-15.

  https://imgur.com/VT9zd0w
  https://imgur.com/4BQPCXo

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765232/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1765232] Re: Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers

2018-04-20 Thread Nikolas Britton
I just tried the proposed kernel (4.15.0-18.19) and it still has the
same problem.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765232

Title:
  Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Bionic:
  Confirmed

Bug description:
  For Ubuntu 18.04 amd64 server, I updated the kernel from 4.15.0-13 to
  4.15.0-15. The system hangs at boot up now. Rolling back to the old
  kernel everything works as expected. It appears to hang while trying
  to enumerate an SD card device, which isn't even installed. I have
  tried this on a Dell R620 and R820 and both have the exact same
  problem. I even downloaded a new installer iso from the daily builds
  to reinstall the OS and the installer hangs too now.

  Support for Dell PowerEdge 12th gen servers appears to be broken with
  kernel 4.15.0-15.

  https://imgur.com/VT9zd0w
  https://imgur.com/4BQPCXo

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765232/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1765105] Re: HP Smart Array driver fails to load, kernel panics boot process

2018-04-19 Thread Nikolas Britton
This bug appears to present with similar symptoms as this bug report:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765232

I presume that Dell doesn't use the hpsa driver, but both are hanging at
approximately the same time in the boot up process and the same
subsystems appear to be involved... I would speculate that they have
something in common that is triggering this.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765105

Title:
  HP Smart Array driver fails to load, kernel panics boot process

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hi

  I have an HP ProLiant G6 with a Smart Array controller on Bionic,
  Kernel 4.15.0-15, that fails to load the hpsa driver for the Smart
  Array, (occasionally) drops to the initramfs shell, as it cannot find
  a root device in a reasonable amount of time, and then finally  kernel
  panics with the stack trace leading to the hpsa driver. Occasionally
  it will complain about resetting the logical volume scsi bus.
  Downgrading to the previous kernel 4.15.0-13 works just fine.

  Attached is a screenshot showing the kernel panic, and below is a link
  to a YouTube video of the whole boot process, a video recording from
  ILO. To save you the time of watching the whole thing, 39 seconds is
  the bus reset error, then it hangs around waiting for a root device,
  and the kernel panics at 4:16. The relevant information are separate
  below

  Video: https://youtu.be/45ka2Btgiqk

  
  lsb_release: 
  Description:  Ubuntu Bionic Beaver (development branch)
  Release:  18.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765105/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1765232] Re: Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers

2018-04-19 Thread Nikolas Britton
Re bot, the stated command can't be run because the system will not boot
to a bash prompt.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1765232

Title:
  Kernel 4.15.0-15 breaks Dell PowerEdge 12th Gen servers

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  For Ubuntu 18.04 amd64 server, I updated the kernel from 4.15.0-13 to
  4.15.0-15. The system hangs at boot up now. Rolling back to the old
  kernel everything works as expected. It appears to hang while trying
  to enumerate an SD card device, which isn't even installed. I have
  tried this on a Dell R620 and R820 and both have the exact same
  problem. I even downloaded a new installer iso from the daily builds
  to reinstall the OS and the installer hangs too now.

  Support for Dell PowerEdge 12th gen servers appears to be broken with
  kernel 4.15.0-15.

  https://imgur.com/VT9zd0w
  https://imgur.com/4BQPCXo

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1765232/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp