[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
This bug was fixed in the package linux - 5.19.0-45.46

---
linux (5.19.0-45.46) kinetic; urgency=medium

  * kinetic/linux: 5.19.0-45.46 -proposed tracker (LP: #2023057)

  * Kinetic update: upstream stable patchset 2023-05-23 (LP: #2020599)
    - wifi: cfg80211: Partial revert "wifi: cfg80211: Fix use after free for wext"

linux (5.19.0-44.45) kinetic; urgency=medium

  * kinetic/linux: 5.19.0-44.45 -proposed tracker (LP: #2019827)

  * Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1 (LP: #2018470)
    - drm/amdgpu: Fix for BO move issue

  * CVE-2023-32233
    - netfilter: nf_tables: deactivate anonymous set from preparation phase

  * CVE-2023-2612
    - SAUCE: shiftfs: prevent lock unbalance in shiftfs_create_object()

  * CVE-2023-31436
    - net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg

  * CVE-2023-1380
    - wifi: brcmfmac: slab-out-of-bounds read in brcmf_get_assoc_ies()

  * conntrack mark is not advertised via netlink (LP: #2016269)
    - netfilter: ctnetlink: revert to dumping mark regardless of event type

  * 5.19 not reporting cgroups v1 blkio.throttle.io_serviced (LP: #2016186)
    - SAUCE: blk-throttle: Fix io statistics for cgroup v1

  * [SRU] Backport request for hpwdt from upstream 6.1 to Jammy (LP: #2008751)
    - watchdog/hpwdt: Enable HP_WATCHDOG for ARM64 systems.
    - watchdog/hpwdt: Include nmi.h only if CONFIG_HPWDT_NMI_DECODING
    - [Config] Add arm64 option to CONFIG_HP_WATCHDOG

  * vmwgfx fails to reserve graphics buffer on aarch64 leading to blank display (LP: #2007001)
    - SAUCE: Revert "video/aperture: Disable and unregister sysfb devices via aperture helpers"

  * Ubuntu 22.04 raise abnormal NIC MSI-X requests with larger CPU cores (256) (LP: #2012335)
    - ice: Allow operation with reduced device MSI-X

  * Dell: Enable speaker mute hotkey LED indicator (LP: #2015972)
    - platform/x86: dell-laptop: Register ctl-led for speaker-mute

  * [SRU]With "Performance per Watt (DAPC)" enabled in the BIOS, Bootup time is taking longer than expected (LP: #2008527)
    - cpufreq: ACPI: Defer setting boost MSRs

  * [SRU][Jammy] CONFIG_PCI_MESON is not enabled (LP: #2007745)
    - [Config] arm64: Enable PCI_MESON module

  * Kinetic update: upstream stable patchset 2023-05-08 (LP: #2018948)
    - HID: asus: use spinlock to protect concurrent accesses
    - HID: asus: use spinlock to safely schedule workers
    - powerpc/mm: Rearrange if-else block to avoid clang warning
    - ARM: OMAP2+: Fix memory leak in realtime_counter_init()
    - arm64: dts: qcom: qcs404: use symbol names for PCIe resets
    - arm64: dts: qcom: msm8996-tone: Fix USB taking 6 minutes to wake up
    - arm64: dts: qcom: sm8150-kumano: Panel framebuffer is 2.5k instead of 4k
    - arm64: dts: qcom: sm6125: Reorder HSUSB PHY clocks to match bindings
    - arm64: dts: imx8m: Align SoC unique ID node unit address
    - ARM: zynq: Fix refcount leak in zynq_early_slcr_init
    - arm64: dts: mediatek: mt8183: Fix systimer 13 MHz clock description
    - arm64: dts: qcom: sdm845-db845c: fix audio codec interrupt pin name
    - arm64: dts: qcom: sc7180: correct SPMI bus address cells
    - arm64: dts: qcom: sc7280: correct SPMI bus address cells
    - arm64: dts: meson-gx: Fix Ethernet MAC address unit name
    - arm64: dts: meson-g12a: Fix internal Ethernet PHY unit name
    - arm64: dts: meson-gx: Fix the SCPI DVFS node name and unit address
    - arm64: dts: msm8992-bullhead: add memory hole region
    - arm64: dts: qcom: msm8992-bullhead: Fix cont_splash_mem size
    - arm64: dts: qcom: msm8992-bullhead: Disable dfps_data_mem
    - arm64: dts: qcom: ipq8074: correct USB3 QMP PHY-s clock output names
    - arm64: dts: qcom: ipq8074: fix Gen3 PCIe QMP PHY
    - arm64: dts: qcom: ipq8074: correct Gen2 PCIe ranges
    - arm64: dts: qcom: ipq8074: fix Gen3 PCIe node
    - arm64: dts: qcom: ipq8074: correct PCIe QMP PHY output clock names
    - arm64: dts: meson: remove CPU opps below 1GHz for G12A boards
    - ARM: OMAP1: call platform_device_put() in error case in omap1_dm_timer_init()
    - ARM: bcm2835_defconfig: Enable the framebuffer
    - ARM: s3c: fix s3c64xx_set_timer_source prototype
    - arm64: dts: ti: k3-j7200: Fix wakeup pinmux range
    - ARM: dts: exynos: correct wr-active property in Exynos3250 Rinato
    - ARM: imx: Call ida_simple_remove() for ida_simple_get
    - arm64: dts: amlogic: meson-gx: fix SCPI clock dvfs node name
    - arm64: dts: amlogic: meson-axg: fix SCPI clock dvfs node name
    - arm64: dts: amlogic: meson-gx: add missing SCPI sensors compatible
    - arm64: dts: amlogic: meson-gxl-s905d-sml5442tw: drop invalid clock-names property
    - arm64: dts: amlogic: meson-gx: add missing unit address to rng node name
    - arm64: dts: amlogic: meson-gxl: add missing unit address to eth-phy-mux node name
    - arm64: dts: amlogic: meson-gx-libretech-pc: fix update button name
    - arm64: dts:
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
I tested linux-nvidia 5.19.0-1014.14 from jammy-proposed on my kinetic install. Both GCN1 and GCN2 displays work. I get the ASAN debug message but everything works.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2018470

Title:
  Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Kinetic:
  Fix Committed
Status in linux source package in Mantic:
  Confirmed

Bug description:
  [Impact]
  A regression caused by incomplete stable backports

  [Fix]
  commit 8273b4048664fff356fd10059033f0e2f5a422a1
  Author: Arunpravin Paneer Selvam
  Date: Tue Oct 18 07:08:38 2022 -0700

      drm/amdgpu: Fix for BO move issue

  [Test case]
  Install the update, check that display works again on amdgpu

  --

  The day I updated from Ubuntu 22.04 to 22.10 some months ago, I had to stick with Linux 5.15 because 5.19 was not working with my computer.

  Over the last two days I spent time looking for a way to run Linux 5.19, and found one version working: 5.19.0-23.

  Here are the versions I tested:

  - 5.19.0-23
  - 5.19.0-29
  - 5.19.0-31
  - 5.19.0-42

  In that list, only Linux 5.19.0-23 is working with that computer. There may be other working versions I have not tested, but basically the breakages occurred after 5.19.0-23.

  I face two problems. Let's talk about the first one, the graphic one, which is still present in 5.19.0-42. It starts to occur with 5.19.0-31 (5.19.0-29 is not affected): the graphic breaks at the moment it should switch from low resolution display to high resolution display at the very beginning of startup. The computer is not completely broken, but the graphic is dead. X11 cannot start (it tries to use the framebuffer, meaning the amdgpu driver is not functional).

  The second bug is the one I get with the 5.19.0-29 version. Linux 5.19.0-29 doesn't experience the graphic bug but has another issue that makes the computer unusable: some CPUs get locked, some btrfs process runs at 100% CPU, and syncing never ends, even preventing a reboot. This bug is less important because I don't reproduce it on version 5.19.0-42, so if 5.19.0-42 fixes the graphic all will be fine.

  I have not updated to Ubuntu 23.04 yet because I'm afraid its newer kernels would leave my computer totally unusable. I have run Ubuntu 22.10 with Ubuntu 22.04's 5.15 kernel until today because of that fear.

  It actually took me two work days to test various combinations to boot the computer so I'm sticking on 5.19.0-29 for now, and I have limited time to test other options. I also tried various BIOS options, and also upgraded the BIOS…, and since that ThreadRipper PRO computer has a very slow booting BIOS, trying various configurations or software versions that require a reboot quickly eats up whole hours.

  The attached logs may have traces of dkms modules like amdgpu-pro, but the first time I experienced the bug I had none of them. I reproduced the bug on a 5.19.0-42 kernel free of amdgpu-pro yesterday. I'm simply opening the ticket from my working environment, and I decided not to spend one more hour uninstalling amdgpu-pro and rebooting just to file this ticket.

  Here are some details on the hardware:

  - MOBO: Gigabyte WRX80-SU8-IPMI rev. 1.0 (BIOS version F5, also named WRX80PRO-F1 in dmidecode, dated 08/04/2022)
    https://www.gigabyte.com/Motherboard/WRX80-SU8-IPMI-rev-10
  - RAM: 8× Kingston Server Premier 32GB DDR4 3200 MHz ECC CL22 2Rx8 PC4-25600 KSM32ED8/32ME 16Gbit Micron E
  - CPU: AMD Ryzen Threadripper PRO 3955WX 16-Cores (Castle Peak, Zen 2)
  - GPU: AMD Radeon R9 390X (Hawaii/Grenada, GCN2, amdgpu driver)
  - GPU: AMD Radeon R7 240 (Oland, GCN1, amdgpu driver)
  - GPU: ASPEED graphic Family rev 41

  The ASPEED graphic is a small card integrated in the motherboard and part of the BMC; I cannot remove it. This may contribute to the trouble.

  When the graphic works (Linux 5.19.0-23, Linux 5.19.0-29), the boot is displayed on all AMD and ASPEED graphic outputs, then at the moment the graphic switches from low resolution to high resolution, the ASPEED graphic goes off and the display continues on the AMD cards.

  When the graphic doesn't work (5.19.0-31, 5.19.0-42), the boot is displayed on all AMD and ASPEED graphic outputs, then at the moment the graphic switches from low resolution to high resolution, the AMD cards display garbage but the display continues on the ASPEED card. The ASPEED card is a very basic integrated card without hardware acceleration and featuring only one VGA output, so that's unusable. As additional information, I know X11 never starts on the ASPEED if there are discrete cards plugged in (tested last year).

  So right now that computer is sticking on Linux 5.19.0-23, which doesn't have the graphic and btrfs bugs.

  The last kernel to not feature the graphic bug is Linux 5.19.0-29.
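As a reading aid, the graphics results reported in the bug description above can be put into a small table to narrow down where the amdgpu regression entered. This is only a sketch: the names GRAPHICS_OK, abi_key and regression_window are made up for illustration, the True/False values are transcribed from the report, and it assumes the bug stays present once introduced.

```python
# Graphics results as reported in the bug description:
# True = display works on the AMD cards, False = amdgpu breaks at the mode switch.
# (Hypothetical names; values transcribed from the report.)
GRAPHICS_OK = {
    "5.19.0-23": True,
    "5.19.0-29": True,
    "5.19.0-31": False,
    "5.19.0-42": False,
}


def abi_key(version):
    """Sort key: turn '5.19.0-29' into (5, 19, 0, 29)."""
    base, abi = version.split("-")
    return tuple(int(part) for part in base.split(".")) + (int(abi),)


def regression_window(results):
    """Return (last_good, first_bad) among the tested versions.

    Assumes the bug, once introduced, is present in every later version.
    """
    last_good = None
    for version in sorted(results, key=abi_key):
        if results[version]:
            last_good = version
        else:
            return last_good, version
    return last_good, None


last_good, first_bad = regression_window(GRAPHICS_OK)
print(last_good, first_bad)  # prints "5.19.0-29 5.19.0-31"
```

With only these four data points, the regression would have landed somewhere after 5.19.0-29 and no later than 5.19.0-31; untested versions in between could narrow it further.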
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
** Tags removed: verification-needed-jammy
** Tags added: verification-done-jammy
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
Will the patch be dropped if the Nvidia kernel is not tested on AMD GPUs? Rebooting the computer affected by the bug will cost me 1 or 2 hours of work so that's a very high cost for a patch I already marked as verified.
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
Should we really test that Nvidia kernel for a bug affecting AMD GPUs, otherwise the already verified fix would be dropped?
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
Will this fix be backported to the Ubuntu 22.04 kernel? I can't use the amdgpu driver with my R7 260X; I had to update to kernel 6.3 to use the driver until the bug is fixed.
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
Why should we test an Nvidia kernel for a bug affecting AMD GPUs?
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
This bug is awaiting verification that the linux-nvidia-5.19/5.19.0-1014.14 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results.

If the problem is solved, change the tag 'verification-needed-jammy' to 'verification-done-jammy'. If the problem still exists, change the tag 'verification-needed-jammy' to 'verification-failed-jammy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

** Tags added: kernel-spammed-jammy-linux-nvidia-5.19 verification-needed-jammy
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
> - I still have an error message in dmesg:

In this issue you actually discovered two independent bugs. The first was the regression, the second was the UBSAN issue.

* The first fix is what you tested.
* The second fix wasn't picked up yet.

This is the commit that has now landed upstream for the second one:
https://github.com/torvalds/linux/commit/58d9b9a14b47c2a3da6effcbb01607ad7edc0275
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
** Tags removed: verification-needed-kinetic
** Tags added: verification-done-kinetic
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
I now see the original message was edited, with these words added:

> [Test case]
> Install the update, check that display works again on amdgpu

I confirm display works again on amdgpu
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
I'm on 5.19.0-44.45 right now.

What is 5.19.0-44.45 expected to fix?

- The computer boots properly, and both the R7 240 and the R9 390X display fine, so that error is fixed.
- I still have an error message in dmesg:

```
[7.609329]
[7.610224] UBSAN: invalid-load in /build/linux-le9C0y/linux-5.19.0/drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm.c:1363:37
[7.611125] load of value 232 is not a valid value for type '_Bool'
[7.612025] CPU: 14 PID: 400 Comm: systemd-udevd Not tainted 5.19.0-44-generic #45-Ubuntu
[7.612928] Hardware name: Default string Default string/Default string, BIOS WRX80PRO-F1 08/04/2022
[7.613836] Call Trace:
[7.614736]
[7.615633] show_stack+0x4e/0x61
[7.616531] dump_stack_lvl+0x4a/0x6f
[7.617429] dump_stack+0x10/0x18
[7.618333] ubsan_epilogue+0x9/0x3a
[7.619231] __ubsan_handle_load_invalid_value.cold+0x42/0x47
[7.620124] amdgpu_dpm_is_overdrive_supported.cold+0x12/0x45 [amdgpu]
[7.621402] default_attr_update+0x332/0x500 [amdgpu]
[7.622641] amdgpu_pm_sysfs_init+0x16f/0x1e0 [amdgpu]
[7.623871] amdgpu_device_init.cold+0x3b7/0x80a [amdgpu]
[7.625107] amdgpu_driver_load_kms+0x1c/0x170 [amdgpu]
[7.626277] amdgpu_pci_probe+0x15f/0x3c0 [amdgpu]
[7.627419] local_pci_probe+0x47/0x90
[7.628270] pci_call_probe+0x55/0x190
[7.629107] pci_device_probe+0x84/0x120
[7.629934] really_probe+0x1df/0x3b0
[7.630767] __driver_probe_device+0x12c/0x1b0
[7.631596] driver_probe_device+0x24/0xd0
[7.632426] __driver_attach+0x10b/0x210
[7.633255] ? __device_attach_driver+0x170/0x170
[7.634087] bus_for_each_dev+0x90/0xe0
[7.634917] driver_attach+0x1e/0x30
[7.635740] bus_add_driver+0x187/0x230
[7.636562] driver_register+0x8f/0x100
[7.637379] __pci_register_driver+0x62/0x70
[7.638203] amdgpu_init+0x6a/0x1000 [amdgpu]
[7.639307] ? 0xc05c
[7.640118] do_one_initcall+0x5e/0x240
[7.640929] do_init_module+0x50/0x210
[7.641736] load_module+0xb7d/0xcd0
[7.642532] __do_sys_finit_module+0xc4/0x140
[7.643316] ? __do_sys_finit_module+0xc4/0x140
[7.644100] __x64_sys_finit_module+0x18/0x30
[7.644879] do_syscall_64+0x5b/0x90
[7.645661] ? ksys_mmap_pgoff+0x11d/0x260
[7.646444] ? exit_to_user_mode_prepare+0x30/0xb0
[7.647231] ? syscall_exit_to_user_mode+0x29/0x50
[7.648016] ? do_syscall_64+0x67/0x90
[7.648796] ? do_syscall_64+0x67/0x90
[7.649566] ? syscall_exit_to_user_mode+0x29/0x50
[7.650347] ? do_syscall_64+0x67/0x90
[7.651120] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[7.651888] RIP: 0033:0x7f790b1eec4d
[7.652645] Code: 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 83 f1 0d 00 f7 d8 64 89 01 48
[7.653432] RSP: 002b:7fffa9b1d908 EFLAGS: 0246 ORIG_RAX: 0139
[7.654222] RAX: ffda RBX: 556a6508e430 RCX: 7f790b1eec4d
[7.655015] RDX: RSI: 556a650933b0 RDI: 0015
[7.655815] RBP: 556a650933b0 R08: R09: 7f790b2cec60
[7.656610] R10: 0015 R11: 0246 R12: 0002
[7.657414] R13: 556a650795d0 R14: R15: 556a65079030
[7.658216]
[7.659037]
```
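The UBSAN report above names the kind of violation and the source file, line, and column where it was detected. A minimal Python sketch for pulling those fields out of saved dmesg text, assuming the header line format shown in the log; `UBSAN_RE` and `find_ubsan_reports` are hypothetical names used for illustration only, not part of any kernel or Ubuntu tooling:

```python
import re

# Matches a UBSAN header line as it appears in the log above, e.g.
# "[7.610224] UBSAN: invalid-load in /build/.../amdgpu_dpm.c:1363:37".
# The optional whitespace after "[" tolerates the padded timestamps
# that dmesg normally prints.
UBSAN_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+UBSAN: (?P<kind>[\w-]+) in "
    r"(?P<path>\S+):(?P<line>\d+):(?P<col>\d+)"
)


def find_ubsan_reports(dmesg_text):
    """Return one dict per UBSAN header line found in dmesg_text."""
    reports = []
    for match in UBSAN_RE.finditer(dmesg_text):
        reports.append({
            "time": float(match.group("time")),
            "kind": match.group("kind"),
            # Keep only the file name; the build path varies per package build.
            "file": match.group("path").rsplit("/", 1)[-1],
            "line": int(match.group("line")),
            "column": int(match.group("col")),
        })
    return reports


if __name__ == "__main__":
    sample = (
        "[7.610224] UBSAN: invalid-load in /build/linux-le9C0y/linux-5.19.0/"
        "drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm.c:1363:37\n"
        "[7.611125] load of value 232 is not a valid value for type '_Bool'\n"
    )
    for report in find_ubsan_reports(sample):
        print(report)
```

An empty result after booting a newer kernel would suggest the second issue no longer triggers on that boot, although it says nothing about the display regression itself.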
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
This bug is awaiting verification that the linux/5.19.0-44.45 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results.

If the problem is solved, change the tag 'verification-needed-kinetic' to 'verification-done-kinetic'. If the problem still exists, change the tag 'verification-needed-kinetic' to 'verification-failed-kinetic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

** Tags added: kernel-spammed-kinetic-linux verification-needed-kinetic
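Before checking the display, it helps to confirm that the machine actually booted the kernel under verification rather than an older one still selected in the boot menu. A minimal Python sketch, assuming a generic-flavour kernel whose release string looks like `5.19.0-44-generic`; other flavours such as linux-nvidia-5.19 use different ABI numbers and would need a different threshold. `parse_release` and `has_fix` are hypothetical helper names:

```python
import platform
import re


def parse_release(release):
    """Split a release string such as '5.19.0-44-generic' into (5, 19, 0, 44)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(release, fixed=(5, 19, 0, 44)):
    """True if release is in the same upstream series as `fixed`
    and its ABI number is at least the one that carries the fix."""
    version = parse_release(release)
    return version[:3] == fixed[:3] and version[3] >= fixed[3]


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    if has_fix(running):
        print(f"{running}: at or after 5.19.0-44, display can be verified")
    else:
        print(f"{running}: not a 5.19.0 kernel at ABI 44 or later")
```

The comparison is deliberately limited to the 5.19.0 series: it makes no claim about which other kernel series carry the fix.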
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
** Changed in: linux (Ubuntu Kinetic)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Kinetic)
       Status: Confirmed => Fix Committed

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2018470

Title:
  Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Kinetic:
  Fix Committed
Status in linux source package in Mantic:
  Confirmed

Bug description:

[Impact]
A regression caused by incomplete stable backports.

[Fix]
commit 8273b4048664fff356fd10059033f0e2f5a422a1
Author: Arunpravin Paneer Selvam
Date: Tue Oct 18 07:08:38 2022 -0700

    drm/amdgpu: Fix for BO move issue

[Test case]
Install the update and check that the display works again on amdgpu.

--

When I updated from Ubuntu 22.04 to 22.10 some months ago, I had to stay on Linux 5.15 because 5.19 did not work on my computer. I spent the last two days looking for a way to run Linux 5.19 and found one working version: 5.19.0-23.

Here are the versions I tested:

- 5.19.0-23
- 5.19.0-29
- 5.19.0-31
- 5.19.0-42

In that list, only Linux 5.19.0-23 works on this computer. Other versions that I have not tested may work, but the breakage appeared after 5.19.0-23.

I face two problems. The first is the graphics one, still present in 5.19.0-42. It starts with 5.19.0-31 (5.19.0-29 is not affected): graphics break at the moment the display should switch from low resolution to high resolution, at the very beginning of startup. The computer is not completely broken, but the graphics are dead (X11 cannot start; it tries to use the framebuffer, meaning the amdgpu driver is not functional).

The second bug is the one I get with 5.19.0-29. Linux 5.19.0-29 doesn't show the graphics bug but has another issue that makes the computer unusable: some CPUs get locked up, a btrfs process runs at 100% CPU, and syncing never ends, which even prevents rebooting. This bug is less important because I cannot reproduce it on 5.19.0-42, so if the graphics get fixed on 5.19.0-42, all will be fine.

I have not updated to Ubuntu 23.04 yet because I'm afraid its newer kernels would leave my computer totally unusable; because of that fear I have run Ubuntu 22.10 with Ubuntu 22.04's 5.15 kernel until today.

It took me two work days to test various combinations to boot the computer, so I'm sticking with 5.19.0-23 for now, and I have limited time to test other options. I also tried various BIOS options and upgraded the BIOS…, and since this Threadripper PRO computer has a very slow-booting BIOS, trying configurations or software versions that require a reboot quickly eats up whole hours.

The attached logs may contain traces of DKMS modules like amdgpu-pro, but the first time I experienced the bug I had none of them. Yesterday I reproduced the bug on a 5.19.0-42 kernel free of amdgpu-pro. I'm simply opening this ticket from my working environment, and I decided not to spend one more hour uninstalling amdgpu-pro and rebooting just to file it.

Here are some details on the hardware:

- MOBO: Gigabyte WRX80-SU8-IPMI rev. 1.0 (BIOS version F5, also named WRX80PRO-F1 in dmidecode, dated 08/04/2022) https://www.gigabyte.com/Motherboard/WRX80-SU8-IPMI-rev-10
- RAM: 8× Kingston Server Premier 32GB DDR4 3200 MHz ECC CL22 2Rx8 PC4-25600 KSM32ED8/32ME 16Gbit Micron E
- CPU: AMD Ryzen Threadripper PRO 3955WX 16-Cores (Castle Peak, Zen 2)
- GPU: AMD Radeon R9 390X (Hawaii/Grenada, GCN2, amdgpu driver)
- GPU: AMD Radeon R7 240 (Oland, GCN1, amdgpu driver)
- GPU: ASPEED Graphics Family rev 41

The ASPEED graphics device is a small chip integrated into the motherboard as part of the BMC; I cannot remove it. It may contribute to the trouble.

When graphics work (Linux 5.19.0-23, Linux 5.19.0-29), the boot is displayed on all AMD and ASPEED outputs; then, at the moment the display switches from low resolution to high resolution, the ASPEED output goes off and the display continues on the AMD cards.

When graphics don't work (5.19.0-31, 5.19.0-42), the boot is displayed on all AMD and ASPEED outputs; then, at the moment the display switches from low resolution to high resolution, the AMD cards display garbage but the display continues on the ASPEED card.

The ASPEED card is a very basic integrated card without hardware acceleration and with only one VGA output, so it is unusable. As additional information, I know X11 never starts on the ASPEED if discrete cards are plugged in (tested last year).

So right now that computer is staying on Linux 5.19.0-23, which doesn't have the graphics and btrfs bugs. The last kernel without the graphics bug is Linux 5.19.0-29. Linux 5.19.0-31 is the first one reproducing it (the repository doesn't provide 5.19.0-30 for me to test).

I have also reproduced the graphics bug when using the radeon driver instead of the amdgpu one.

ProblemType: Bug
DistroRelease: Ubuntu 22.10
Package: linux-image-generic 5.19.0.42.38
ProcVersionSignature: Ubuntu 5.19.0-23.24-generic 5.19.7
Uname: Linux 5.19.0-23-generic x86_64
ApportVersion: 2.23.1-0ubuntu3.3
Architecture: amd64
CasperMD5CheckResult: unknown
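The version boundaries reported in this thread (display still fine on 5.19.0-29, first broken on 5.19.0-31, fix shipped from 5.19.0-44.45 according to the changelog) can be turned into a small check. The following is only a sketch: the function names are hypothetical, it covers the display regression only (not the btrfs hang seen on 5.19.0-29), and it assumes the affected builds form one contiguous range, which the thread does not prove.

```python
import re

# Boundaries taken from the bug thread; treating them as a single
# contiguous range is an assumption made for illustration.
LAST_GOOD_ABI = 29    # 5.19.0-29: display still works
FIRST_BAD_ABI = 31    # 5.19.0-31: first build showing the display regression
FIRST_FIXED_ABI = 44  # 5.19.0-44.45 carries "drm/amdgpu: Fix for BO move issue"


def abi_number(release: str) -> int:
    """Extract the ABI number from strings such as '5.19.0-23-generic'."""
    match = re.fullmatch(r"5\.19\.0-(\d+)(?:[.-].*)?", release)
    if match is None:
        raise ValueError(f"not a 5.19.0 Ubuntu kernel release: {release!r}")
    return int(match.group(1))


def classify(release: str) -> str:
    """Classify a 5.19.0 build with respect to the display regression only."""
    abi = abi_number(release)
    if abi <= LAST_GOOD_ABI:
        return "before regression"
    if abi < FIRST_BAD_ABI:
        return "untested"
    if abi < FIRST_FIXED_ABI:
        return "affected"
    return "fix included"
```

For example, `classify("5.19.0-42")` falls in the affected range, while 5.19.0-30 is reported as untested because the reporter could not obtain that build.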
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
I guess this bug should concentrate on the regression.

** Description changed:

+ [Impact]
+ A regression caused by incomplete stable backports
+ 
+ [Fix]
+ 
+ commit 8273b4048664fff356fd10059033f0e2f5a422a1
+ Author: Arunpravin Paneer Selvam
+ Date: Tue Oct 18 07:08:38 2022 -0700
+ 
+     drm/amdgpu: Fix for BO move issue
+ 
+ [Test case]
+ 
+ Install the update, check that display works again on amdgpu
+ 
+ --
+ 
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
OK, here are the fixes for this that were identified upstream:

Mantic (6.3+):
* UBSAN issue fixed by https://gitlab.freedesktop.org/agd5f/linux/-/commit/3a5fb036af0a18436209fbb16e331edd26a07b3d

Kinetic (5.19):
* UBSAN issue fixed by https://gitlab.freedesktop.org/agd5f/linux/-/commit/3a5fb036af0a18436209fbb16e331edd26a07b3d
* The hang issue was caused by Canonical backporting https://github.com/torvalds/linux/commit/312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9 into the 5.19 kernel but not taking the fix https://github.com/torvalds/linux/commit/8273b4048664fff356fd10059033f0e2f5a422a1

Canonical team, please review these.

** Changed in: linux (Ubuntu Kinetic)
       Status: Incomplete => Confirmed
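The hang described in this comment came from a backported commit whose upstream follow-up fix was not picked up. A check for that pattern could look like the sketch below. It assumes that backported commits mention the upstream hash somewhere in their message (for instance in a "cherry picked from commit" line), which is a convention and not a guarantee; the function name and the idea of feeding it `git log` output are illustrative only.

```python
from typing import Dict, List

# Upstream hashes quoted in the comment above.
BACKPORTED = "312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9"
FOLLOW_UP_FIX = "8273b4048664fff356fd10059033f0e2f5a422a1"


def missing_follow_ups(log_text: str, pairs: Dict[str, str]) -> List[str]:
    """Return follow-up fix hashes that are absent from log_text even though
    the commit they fix is referenced there.

    log_text is assumed to be the text of a git log for the distro branch;
    pairs maps an upstream commit hash to the hash of its follow-up fix.
    """
    missing = []
    for backported, fix in pairs.items():
        if backported in log_text and fix not in log_text:
            missing.append(fix)
    return missing
```

Run against the log text of a tree that contains the backport but not the fix, `missing_follow_ups(log_text, {BACKPORTED: FOLLOW_UP_FIX})` would return the hash of "drm/amdgpu: Fix for BO move issue".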
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
** Changed in: linux (Ubuntu Kinetic)
       Status: Invalid => Incomplete
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
> Tainted: G

In the two upstream bugs it was noted that there was an amdgpu dkms package in place. I believe that's where this issue likely was.

Commit 63a9ab264a8c came in 6.3-rc1 and the commit it fixes was also in 6.3-rc1 (b1a9557a7d00). So at least one of the issues is probably invalid in Ubuntu's 5.19, but there are valid upstream bugs, including in 6.3 as there is still another patch to test for one of the problems.

I'll adjust the tasks accordingly, as I think this should still be tracked to fix in Mantic.

** Also affects: linux (Ubuntu Mantic)
   Importance: Undecided
       Status: Confirmed

** Also affects: linux (Ubuntu Kinetic)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Kinetic)
       Status: New => Invalid
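Since this comment reasons from the kernel's taint flags, here is a hedged sketch of how the numeric taint value can be decoded to see whether out-of-tree modules (such as a DKMS build of amdgpu) were loaded. Only three flags are covered; the bit positions follow the kernel's tainted-kernels documentation, and the helper names are hypothetical.

```python
from typing import List

# Subset of taint bits relevant to out-of-tree GPU drivers.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    12: ("O", "externally-built (out-of-tree) module was loaded"),
    13: ("E", "unsigned module was loaded"),
}


def decode_taint(value: int) -> List[str]:
    """Return the flag letters (from the subset above) set in a taint value."""
    return [
        letter
        for bit, (letter, _description) in sorted(TAINT_FLAGS.items())
        if value & (1 << bit)
    ]


def read_taint(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the running kernel's taint value (Linux only)."""
    with open(path, encoding="ascii") as handle:
        return int(handle.read().strip())
```

On an affected machine, `decode_taint(read_taint())` returning a list that contains `"O"` would suggest that an externally built module is loaded, which is worth ruling out before attributing a crash to the distro kernel.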
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
** Tags added: patch
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
This is the patch by Alex Deucher that is believed to fix the NULL pointer dereference. I have not tested it but the issue looks very close. It is needed anyway.

> Guchun Chen
> Regarding the NULL pointer access, it should be duplicated of #2388. And the
> fix is "63a9ab264a8c drm/amd/pm/smu7: move variables to where they are used" .

** Patch added: "0001-drm-amd-pm-smu7-move-variables-to-where-they-are-use.patch"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2018470/+attachment/5671131/+files/0001-drm-amd-pm-smu7-move-variables-to-where-they-are-use.patch
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
I've reported the issues upstream on drm side: - https://gitlab.freedesktop.org/drm/amd/-/issues/2540 Linux 5.19 amdgpu: NULL pointer on GCN2 (R9 390X Hawaii/Grenada) - https://gitlab.freedesktop.org/drm/amd/-/issues/2541 Linux 5.19 amdgpu: invalid load on GCN1 (R7 240 Oland) For the NULL pointer dereference, it looks like there is a patch there: - https://gitlab.freedesktop.org/drm/amd/-/issues/2388 https://gitlab.freedesktop.org/drm/amd/uploads/a004996ac0c868dfb032af3c35f7b2c6/0001-drm-amd-pm-smu7-move-variables-to-where-they-are-use.patch I have not tested the patch myself, but the bug this patch fixes looks very similar. ** Bug watch added: gitlab.freedesktop.org/drm/amd/-/issues #2540 https://gitlab.freedesktop.org/drm/amd/-/issues/2540 ** Bug watch added: gitlab.freedesktop.org/drm/amd/-/issues #2541 https://gitlab.freedesktop.org/drm/amd/-/issues/2541 ** Bug watch added: gitlab.freedesktop.org/drm/amd/-/issues #2388 https://gitlab.freedesktop.org/drm/amd/-/issues/2388 -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2018470 Title: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1 Status in linux package in Ubuntu: Confirmed Bug description: The day I updated from Ubuntu 22.04 to 22.10 some months ago, I had to stick on Linux 5.15 because 5.19 was not working with my computer. The last two days I spent time to find a way to run Linux 5.19, and found one version working: 5.19.0-23. Here are the versions I tested: - 5.19.0-23 - 5.19.0-29 - 5.19.0-31 - 5.19.0-42 In that list, only Linux 5.19.0-23 is working with that computer. There may be other versions that work I have not tested, but basically the breakages occurred after 5.19.0-23. I face two problems, let's talk about the first one, the graphic one still present in 5.19.0-42. 
It starts to occur with 5.19.0-31 (5.19.0-29 is not affected): graphics break at the moment the display should switch from low resolution to high resolution at the very beginning of startup. The computer is not completely broken, but graphics are dead: X11 cannot start (it tries to use the framebuffer, meaning the amdgpu driver is not functional).

The second bug is the one I get with the 5.19.0-29 version. Linux 5.19.0-29 doesn't experience the graphics bug but has another issue that makes the computer unusable: some CPUs get locked up, some btrfs process runs at 100% CPU, and syncing never ends, even preventing a reboot. This bug is less important because I don't reproduce it on version 5.19.0-42, so if the graphics are fixed on 5.19.0-42, all will be fine.

I have not updated to Ubuntu 23.04 yet because I'm afraid that newer kernels from it would leave my computer totally unusable; I have run Ubuntu 22.10 with Ubuntu 22.04's 5.15 kernel until today because of that fear.

It actually took me two work days to test various combinations to boot the computer, so I'm staying on 5.19.0-23 for now, and I have limited time to test other options. I also tried various BIOS options and upgraded the BIOS, and since that Threadripper PRO computer has a very slow-booting BIOS, trying various configurations or software versions that require a reboot quickly eats up whole hours.

The attached logs may have traces of DKMS modules like amdgpu-pro, but the first time I experienced the bug I had none of them. I reproduced the bug on a 5.19.0-42 kernel free of amdgpu-pro yesterday. I'm simply opening the ticket from my working environment, and I decided not to spend one more hour uninstalling amdgpu-pro and rebooting just to file this ticket.

Here are some details on the hardware:

- MOBO: Gigabyte WRX80-SU8-IPMI rev. 1.0 (BIOS version F5, also named WRX80PRO-F1 in dmidecode, dated 08/04/2022)
  https://www.gigabyte.com/Motherboard/WRX80-SU8-IPMI-rev-10
- RAM: 8× Kingston Server Premier 32GB DDR4 3200 MHz ECC CL22 2Rx8 PC4-25600 KSM32ED8/32ME 16Gbit Micron E
- CPU: AMD Ryzen Threadripper PRO 3955WX 16-Cores (Castle Peak, Zen 2)
- GPU: AMD Radeon R9 390X (Hawaii/Grenada, GCN2, amdgpu driver)
- GPU: AMD Radeon R7 240 (Oland, GCN1, amdgpu driver)
- GPU: ASPEED graphic Family rev 41

The ASPEED graphics is a small card integrated into the motherboard and part of the BMC; I cannot remove it. This may contribute to the problem.

When the graphics work (Linux 5.19.0-23, Linux 5.19.0-29), the boot is displayed on all AMD and ASPEED graphics outputs; then, at the moment the display switches from low resolution to high resolution, the ASPEED output goes off and the display continues on the AMD cards. When the graphics don't work (5.19.0-31, 5.19.0-42), the boot is displayed on all AMD and ASPEED graphics outputs; then, at the moment the display switches from low resolution to high resolution, the AMD cards display garbage but the display
[Kernel-packages] [Bug 2018470] Re: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1
** Summary changed:

- Linux 5.19 amdgpu: NULL pointer on GCN1 and invalid load on GCN2
+ Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2018470

Title: Linux 5.19 amdgpu: NULL pointer on GCN2 and invalid load on GCN1

Status in linux package in Ubuntu: Confirmed

Bug description:

The day I updated from Ubuntu 22.04 to 22.10 some months ago, I had to stay on Linux 5.15 because 5.19 was not working with my computer. Over the last two days I spent time finding a way to run Linux 5.19, and found one working version: 5.19.0-23.

Here are the versions I tested:

- 5.19.0-23
- 5.19.0-29
- 5.19.0-31
- 5.19.0-42

In that list, only Linux 5.19.0-23 works with that computer. There may be other working versions that I have not tested, but basically the breakages occurred after 5.19.0-23.

I face two problems. Let's talk about the first one, the graphics one, which is still present in 5.19.0-42. It starts to occur with 5.19.0-31 (5.19.0-29 is not affected): graphics break at the moment the display should switch from low resolution to high resolution at the very beginning of startup. The computer is not completely broken, but graphics are dead: X11 cannot start (it tries to use the framebuffer, meaning the amdgpu driver is not functional).

The second bug is the one I get with the 5.19.0-29 version. Linux 5.19.0-29 doesn't experience the graphics bug but has another issue that makes the computer unusable: some CPUs get locked up, some btrfs process runs at 100% CPU, and syncing never ends, even preventing a reboot. This bug is less important because I don't reproduce it on version 5.19.0-42, so if the graphics are fixed on 5.19.0-42, all will be fine.

I have not updated to Ubuntu 23.04 yet because I'm afraid that newer kernels from it would leave my computer totally unusable; I have run Ubuntu 22.10 with Ubuntu 22.04's 5.15 kernel until today because of that fear.

It actually took me two work days to test various combinations to boot the computer, so I'm staying on 5.19.0-23 for now, and I have limited time to test other options. I also tried various BIOS options and upgraded the BIOS, and since that Threadripper PRO computer has a very slow-booting BIOS, trying various configurations or software versions that require a reboot quickly eats up whole hours.

The attached logs may have traces of DKMS modules like amdgpu-pro, but the first time I experienced the bug I had none of them. I reproduced the bug on a 5.19.0-42 kernel free of amdgpu-pro yesterday. I'm simply opening the ticket from my working environment, and I decided not to spend one more hour uninstalling amdgpu-pro and rebooting just to file this ticket.

Here are some details on the hardware:

- MOBO: Gigabyte WRX80-SU8-IPMI rev. 1.0 (BIOS version F5, also named WRX80PRO-F1 in dmidecode, dated 08/04/2022)
  https://www.gigabyte.com/Motherboard/WRX80-SU8-IPMI-rev-10
- RAM: 8× Kingston Server Premier 32GB DDR4 3200 MHz ECC CL22 2Rx8 PC4-25600 KSM32ED8/32ME 16Gbit Micron E
- CPU: AMD Ryzen Threadripper PRO 3955WX 16-Cores (Castle Peak, Zen 2)
- GPU: AMD Radeon R9 390X (Hawaii/Grenada, GCN2, amdgpu driver)
- GPU: AMD Radeon R7 240 (Oland, GCN1, amdgpu driver)
- GPU: ASPEED graphic Family rev 41

The ASPEED graphics is a small card integrated into the motherboard and part of the BMC; I cannot remove it. This may contribute to the problem.

When the graphics work (Linux 5.19.0-23, Linux 5.19.0-29), the boot is displayed on all AMD and ASPEED graphics outputs; then, at the moment the display switches from low resolution to high resolution, the ASPEED output goes off and the display continues on the AMD cards.
When the graphics don't work (5.19.0-31, 5.19.0-42), the boot is displayed on all AMD and ASPEED graphics outputs; then, at the moment the display switches from low resolution to high resolution, the AMD cards display garbage but the display continues on the ASPEED card.

The ASPEED card is a very basic integrated card without hardware acceleration and with only one VGA output, so that's unusable. As additional information, I know X11 never starts on the ASPEED if there are discrete cards plugged in (tested last year).

So right now that computer is staying on Linux 5.19.0-23, which doesn't have the graphics and btrfs bugs. The last kernel to not feature the graphics bug is Linux 5.19.0-29. Linux 5.19.0-31 is the first one reproducing the graphics bug (the repository doesn't provide 5.19.0-30 for me to test).

I have also reproduced the graphics bug when using the radeon driver instead of the amdgpu one.

ProblemType: Bug
DistroRelease: Ubuntu 22.10
Package: linux-image-generic 5.19.0.42.38
ProcVersionSignature: Ubuntu 5.19.0-23.24-generic 5.19.7
Uname: