Processed: Re: Bug#989705: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64 / 5.10.0-8-amd64
Processing control commands:

> fixed -1 5.19.6-1
Bug #989705 [src:linux] linux-image-5.10.0-7-amd64: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64
Bug #979340 [src:linux] nouveau: Suspend hangs for 5 minutes before completing
Bug #986700 [src:linux] nouveau: Suspend hangs for 5 minutes before completing
Marked as fixed in versions linux/5.19.6-1.
Marked as fixed in versions linux/5.19.6-1.
Marked as fixed in versions linux/5.19.6-1.

-- 
979340: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=979340
986700: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=986700
989705: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=989705
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
Bug#989705: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64 / 5.10.0-8-amd64
Control: fixed -1 5.19.6-1

On Wednesday, 14 September 2022 06:45:18 CEST Computer Enthusiastic wrote:
> An upstream patch has been released [1]
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.0-rc5&id=6b04ce966a738ecdd9294c9593e48513c0dc90aa

That commit is part of v6.0-rc3.

Fixed in 5.19 with commit 8e3ba23a67de984f4156f0663f1f603ff6c15815, which is part of 5.19.6.
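For readers who want to compare a kernel against the fixed release named above, here is a minimal sketch. It assumes an upstream version string such as 5.18.16 or 5.19.6 is supplied by hand (Debian ABI names such as 5.10.0-16-amd64 are not upstream versions), and it only compares against 5.19.6; it knows nothing about backports to older stable series. The names version_ge and FIXED_IN are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper, not part of any Debian or kernel tool.
FIXED_IN="5.19.6"   # first 5.19.y release carrying the fix, per this thread

# version_ge A B: exit status 0 if dotted numeric version A >= B.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        # Compare component by component; missing components count as 0.
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Sample versions taken from this thread.
for v in 5.10.127 5.18.16 5.19.6 6.0; do
    if version_ge "$v" "$FIXED_IN"; then
        echo "$v: at or after $FIXED_IN"
    else
        echo "$v: older than $FIXED_IN"
    fi
done
```

On a Debian system, dpkg --compare-versions is the more natural tool for full package versions such as 5.19.6-1; the awk comparison is used here only to keep the sketch free of distribution-specific dependencies.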
Bug#989705: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64 / 5.10.0-8-amd64
Hello,

An upstream patch has been released [1]

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.0-rc5&id=6b04ce966a738ecdd9294c9593e48513c0dc90aa
Bug#989705: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64 / 5.10.0-8-amd64
On Mon, 15 Aug 2022 19:02:25 +0200 Computer Enthusiastic wrote:
> Hello,
>
> To whom it may interest: I have successfully tested the patch (see
> previous message [0]) with Debian kernel versions 5.10.0-16-amd64 [1]
> from bullseye-security and 5.18.0-0 [2] from bullseye-backports.
>
> [0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=989705#64
> [1] http://security.debian.org/debian-security/pool/main/l/linux/linux-image-5.10.0-16-amd64-unsigned_5.10.127-2_amd64.deb
> [2] https://packages.debian.org/bullseye-backports/linux-image-amd64

Hi,

Thanks to Computer Enthusiastic for reporting this bug. I have been hit by it for quite a long time too, so I had to stick to an old kernel to keep hibernation working.

    $ cat /etc/debian_version
    bookworm/sid
    $ dpkg -l linux-image-* | grep ^ii
    ii  linux-image-5.10.0-13-amd64  5.10.106-1  amd64  Linux 5.10 for 64-bit PCs (signed)
    ii  linux-image-5.10.0-16-amd64  5.10.127-1  amd64  Linux 5.10 for 64-bit PCs (signed)
    ii  linux-image-5.18.0-4-amd64   5.18.16-1   amd64  Linux 5.18 for 64-bit PCs (signed)
    ii  linux-image-5.7.0-2-amd64    5.7.10-1    amd64  Linux 5.7 for 64-bit PCs (signed)
    ii  linux-image-5.7.0-3-amd64    5.7.17-1    amd64  Linux 5.7 for 64-bit PCs (signed)

The 5.7 kernels performed hibernation properly. The later ones did not until I added init_on_alloc=0 to my GRUB kernel parameters.

    $ inxi -G
    Graphics:
      Device-1: NVIDIA G96C [GeForce 9400 GT] driver: nouveau v: kernel
      Display: x11 server: X.Org v: 1.20.11 with: Xwayland driver: X: loaded: modesetting gpu: nouveau resolution: 1920x1080~60Hz
      OpenGL: renderer: NV96 v: 3.3 Mesa 20.3.5
    $ uname -a
    Linux birdynam 5.18.0-4-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1 (2022-08-10) x86_64 GNU/Linux

HTH
Rudu
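A quick way to confirm whether the init_on_alloc=0 workaround mentioned in this thread is in effect is sketched below. It is only an illustration: the function names are invented here, and the path /boot/config-$(uname -r) in the usage note is the usual Debian location for the installed kernel configuration, assumed rather than taken from the messages. The first function reports the init_on_alloc value found on a kernel command line; the second reports whether a kernel config file enables CONFIG_INIT_ON_ALLOC_DEFAULT_ON.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the init_on_alloc setting.

# init_on_alloc_from_cmdline CMDLINE
# Print the value of the last init_on_alloc= parameter in CMDLINE,
# or "unset" if the parameter is absent.
init_on_alloc_from_cmdline() {
    printf '%s\n' "$1" | tr ' ' '\n' | awk -F= '
        $1 == "init_on_alloc" { v = $2 }
        END { print (v == "" ? "unset" : v) }'
}

# init_on_alloc_default CONFIG_FILE
# Print "on" if the config enables zero-on-allocation by default, else "off".
init_on_alloc_default() {
    if grep -q '^CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y' "$1"; then
        echo on
    else
        echo off
    fi
}
```

On a running system these could be called as init_on_alloc_from_cmdline "$(cat /proc/cmdline)" and init_on_alloc_default "/boot/config-$(uname -r)". A result of "unset" together with "on" would mean the kernel zeroes newly allocated memory, which is the configuration reported as failing in this thread.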
Bug#989705: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64 / 5.10.0-8-amd64
Hello,

To whom it may interest: I have successfully tested the patch (see previous message [0]) with Debian kernel versions 5.10.0-16-amd64 [1] from bullseye-security and 5.18.0-0 [2] from bullseye-backports.

[0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=989705#64
[1] http://security.debian.org/debian-security/pool/main/l/linux/linux-image-5.10.0-16-amd64-unsigned_5.10.127-2_amd64.deb
[2] https://packages.debian.org/bullseye-backports/linux-image-amd64
Bug#989705: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64 / 5.10.0-8-amd64
Hello,

I have been successfully using an experimental "work in progress" patch with kernel version 5.10.113 since 25 May (see https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/-/issues/547#note_1438950).

I suspended and hibernated the system (Debian GNU/Linux 11.3 with the aforementioned patch) many times without any issue:

    # journalctl -S 2022-05-25 --no-pager | grep -e "PM: suspend entry\|hibernation: hibernation entry"
    May 25 08:39:48 debian kernel: PM: hibernation: hibernation entry
    May 26 08:05:44 debian kernel: PM: hibernation: hibernation entry
    May 28 12:28:50 debian kernel: PM: suspend entry (deep)
    May 28 12:39:52 debian kernel: PM: suspend entry (deep)
    May 28 13:34:55 debian kernel: PM: hibernation: hibernation entry
    May 28 18:17:35 debian kernel: PM: hibernation: hibernation entry
    May 30 08:56:31 debian kernel: PM: hibernation: hibernation entry
    Jun 01 22:54:45 debian kernel: PM: hibernation: hibernation entry
    Jun 02 17:44:35 debian kernel: PM: suspend entry (deep)
    Jun 02 23:49:38 debian kernel: PM: hibernation: hibernation entry
    Jun 04 10:54:04 debian kernel: PM: hibernation: hibernation entry
    Jun 05 17:34:17 debian kernel: PM: hibernation: hibernation entry
    Jun 06 08:18:21 debian kernel: PM: hibernation: hibernation entry
    Jun 06 18:11:40 debian kernel: PM: suspend entry (deep)
    Jun 06 19:27:39 debian kernel: PM: suspend entry (deep)
    Jun 06 23:02:24 debian kernel: PM: hibernation: hibernation entry
    Jun 08 08:53:02 debian kernel: PM: hibernation: hibernation entry
    Jun 08 08:58:41 debian kernel: PM: suspend entry (deep)
    Jun 08 08:59:32 debian kernel: PM: hibernation: hibernation entry
    Jun 09 01:11:56 debian kernel: PM: hibernation: hibernation entry
    Jun 09 21:08:22 debian kernel: PM: hibernation: hibernation entry
    Jun 11 09:00:25 debian kernel: PM: hibernation: hibernation entry
    Jun 21 23:35:30 debian kernel: PM: hibernation: hibernation entry

I have also successfully tested the patch at least once with upstream kernel versions 5.10.117, 5.16.14 and 5.17.9.

With this patch, the workaround cited in previous messages is no longer required on my hardware.

    $ inxi -G
    Graphics:
      Device-1: NVIDIA G96CM [GeForce 9600M GT] driver: nouveau v: kernel
      Display: x11 server: X.Org 1.20.11 driver: loaded: modesetting unloaded: fbdev,vesa resolution: 1280x800~60Hz
      OpenGL: renderer: NV96 v: 3.3 Mesa 20.3.5

The root cause of this issue is probably also a cause of at least one other, different issue (see https://gitlab.freedesktop.org/drm/nouveau/-/issues/156#note_1383820).

It would probably be useful to test the patch with other affected NVIDIA graphics cards.

Hope that helps.

From 70271cb0aa30e4523d39c3942e84b16fe18338f5 Mon Sep 17 00:00:00 2001
From: Karol Herbst
Date: Mon, 16 May 2022 17:40:20 +0200
Subject: [PATCH] nouveau WIP

---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 05076e530e7d..b6343741eda6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -820,6 +820,7 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict,
 		if (ret == 0) {
 			ret = nouveau_fence_new(chan, false, &fence);
 			if (ret == 0) {
+				nouveau_fence_wait(fence, false, false);
 				ret = ttm_bo_move_accel_cleanup(bo,
 								&fence->base,
 								evict, false,
-- 
2.35.3
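When repeating this kind of soak test, it can help to summarise the journal rather than read it line by line. The sketch below counts suspend and hibernation entries in journal text supplied on standard input. The function name count_pm_entries is made up for this example; the patterns are the two kernel messages shown in the log above.

```shell
#!/bin/sh
# Hypothetical helper: summarise power-management entries from journal text.

# count_pm_entries: read journal lines on stdin and print the number of
# suspend and hibernation entries found.
count_pm_entries() {
    awk '
        /PM: suspend entry/                  { s++ }
        /PM: hibernation: hibernation entry/ { h++ }
        END { printf "suspend=%d hibernation=%d\n", s, h }
    '
}
```

It could be fed from the same command used above, for example journalctl -S 2022-05-25 --no-pager | count_pm_entries. The counts only show that a suspend or hibernation was started; whether each one completed still has to be checked separately, for instance by confirming that the log continues after each entry.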
Re: Bug#989705: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64 / 5.10.0-8-amd64
Hello,

The bug affects kernel version 5.16, too.

To whom it may interest: I tried to analyse the issue in more detail with kernel version 5.10.87. More information is available here [1] [2].

[1] https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/-/issues/547#note_1242815
[2] https://bugzilla.kernel.org/show_bug.cgi?id=213617#c5
Bug#989705: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64 / 5.10.0-8-amd64
Hello,

I have the same suspend issue with newer kernels.
System: Thinkpad T410, Intel i5 M
So I'm reluctant to upgrade my system to Debian 11.

<<... I configured grub to start the kernel with the parameter init_on_alloc=0 >>

This method worked on my system (tried with EndeavourOS). Thanks, Achille L.
Now I have to investigate what the purpose of init_on_alloc=0 is ;-)

Cheers!
Jens

On Tue, 31 Aug 2021 23:13:33 +0200 Achille L <computer.enthusias...@gmail.com> wrote:
> Hello,
>
> I believe I have identified the cause: the issue is related to the
> activation of the config parameter CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
> in Debian kernel 5.10.0-8-amd64 from Debian Bullseye 11.0 (it was
> disabled in the Debian kernel 4.19.0 from Debian Buster 10.11). This
> parameter was enabled in Debian with linux (5.8.3-1~exp1)
> experimental on Mon, 24 Aug 2020 01:23:22 +0100 (see
> https://metadata.ftp-master.debian.org/changelogs//main/l/linux/linux_5.10.46-4_changelog )
>
> I found it by bisecting (by hand) the diff between a working kernel
> config file for Debian kernel 5.10.0-8-amd64 (generated by me from the
> Debian kernel source code with make oldconfig, using the Debian kernel
> config-4.19.0-11-amd64 as template) and the default kernel config file
> from the stock Debian kernel 5.10.0-8-amd64 (see attachment); the
> relevant hunk of the diff was number 151:
>
> --- linux-source-5.10/.config 2021-08-13 17:24:22.386243765 +0200
> +++ /boot/config-5.10.46 2021-08-01 10:27:12.0 +0200
> @@ -9063,7 +9063,7 @@
>  # Memory initialization
>  #
>  CONFIG_INIT_STACK_NONE=y
> -CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
> +# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
>  # CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
>  # end of Memory initialization
>  # end of Kernel hardening options
>
> To verify this finding, I configured grub to start the kernel with the
> parameter init_on_alloc=0:
>
> # If you change this file, run 'update-grub' afterwards to update
> # /boot/grub/grub.cfg.
> # For full documentation of the options in this file, see:
> #   info -f grub -n 'Simple configuration'
>
> GRUB_DEFAULT=0
> GRUB_TIMEOUT=5
> GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
> GRUB_CMDLINE_LINUX_DEFAULT="no_console_suspend nouveau.debug=warn init_on_alloc=0"
> [...missing...]
>
> After that, of course, I updated the GRUB boot configuration with the
> command:
>
> update-grub2
>
> The test with the stock Debian Bullseye (11.0) kernel 5.10.0-8-amd64
> was successful: I am repeatedly able to suspend to RAM and suspend to
> disk with init_on_alloc set to 0, using the same kernel that freezes
> with init_on_alloc set to 1. I have not yet dug into the kernel source
> code, but in theory the kernel feature activated by this parameter [1]
> (zeroing newly allocated memory) could have side effects on the buffer
> handling/eviction of memory from video memory to system memory during
> suspend to RAM or suspend to disk.
>
> You could give it a try, even if your GPU is two years younger than
> mine (but they use the same nv50 kernel drm module).
Bug#989705: Suspend to RAM hangs computer with nouveau driver and kernel 5.10.0-7-amd64 / 5.10.0-8-amd64
Hello,

I believe I have identified the cause: the issue is related to the activation of the config parameter CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y in Debian kernel 5.10.0-8-amd64 from Debian Bullseye 11.0 (it was disabled in the Debian kernel 4.19.0 from Debian Buster 10.11). This parameter was enabled in Debian with linux (5.8.3-1~exp1) experimental on Mon, 24 Aug 2020 01:23:22 +0100 (see https://metadata.ftp-master.debian.org/changelogs//main/l/linux/linux_5.10.46-4_changelog).

I found it by bisecting (by hand) the diff between a working kernel config file for Debian kernel 5.10.0-8-amd64 (generated by me from the Debian kernel source code with make oldconfig, using the Debian kernel config-4.19.0-11-amd64 as template) and the default kernel config file from the stock Debian kernel 5.10.0-8-amd64 (see attachment); the relevant hunk of the diff was number 151:

--- linux-source-5.10/.config 2021-08-13 17:24:22.386243765 +0200
+++ /boot/config-5.10.46 2021-08-01 10:27:12.0 +0200
@@ -9063,7 +9063,7 @@
 # Memory initialization
 #
 CONFIG_INIT_STACK_NONE=y
-CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
+# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
 # CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
 # end of Memory initialization
 # end of Kernel hardening options

To verify this finding, I configured grub to start the kernel with the parameter init_on_alloc=0:

# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="no_console_suspend nouveau.debug=warn init_on_alloc=0"
[...missing...]
After that, of course, I updated the GRUB boot configuration with the command:

update-grub2

The test with the stock Debian Bullseye (11.0) kernel 5.10.0-8-amd64 was successful: I am repeatedly able to suspend to RAM and suspend to disk with init_on_alloc set to 0, using the same kernel that freezes with init_on_alloc set to 1. I have not yet dug into the kernel source code, but in theory the kernel feature activated by this parameter [1] (zeroing newly allocated memory) could have side effects on the buffer handling/eviction of memory from video memory to system memory during suspend to RAM or suspend to disk.

You could give it a try, even if your GPU is two years younger than mine (but they use the same nv50 kernel drm module).

Let me know.

[1] https://patchwork.kernel.org/project/linux-security-module/patch/20190626121943.131390-2-gli...@google.com/

--- linux-source-5.10/.config 2021-08-13 17:24:22.386243765 +0200
+++ /boot/config-5.10.46 2021-08-01 10:27:12.0 +0200
@@ -22,8 +22,8 @@
 CONFIG_INIT_ENV_ARG_LIMIT=32
 # CONFIG_COMPILE_TEST is not set
 CONFIG_LOCALVERSION=""
-# CONFIG_LOCALVERSION_AUTO is not set
-CONFIG_BUILD_SALT="5.10.0-8-amd64"
+CONFIG_LOCALVERSION_AUTO=y
+CONFIG_BUILD_SALT="4.19.0-11-amd64"
 CONFIG_HAVE_KERNEL_GZIP=y
 CONFIG_HAVE_KERNEL_BZIP2=y
 CONFIG_HAVE_KERNEL_LZMA=y
@@ -114,8 +114,7 @@
 CONFIG_TASK_DELAY_ACCT=y
 CONFIG_TASK_XACCT=y
 CONFIG_TASK_IO_ACCOUNTING=y
-CONFIG_PSI=y
-# CONFIG_PSI_DEFAULT_DISABLED is not set
+# CONFIG_PSI is not set
 # end of CPU/Task time and stats accounting
 
 CONFIG_CPU_ISOLATION=y
@@ -168,7 +167,7 @@
 CONFIG_CGROUP_PIDS=y
 CONFIG_CGROUP_RDMA=y
 CONFIG_CGROUP_FREEZER=y
-CONFIG_CGROUP_HUGETLB=y
+# CONFIG_CGROUP_HUGETLB is not set
 CONFIG_CPUSETS=y
 CONFIG_PROC_PID_CPUSET=y
 CONFIG_CGROUP_DEVICE=y
@@ -235,12 +234,11 @@
 CONFIG_KALLSYMS_ALL=y
 CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
 CONFIG_KALLSYMS_BASE_RELATIVE=y
-CONFIG_BPF_LSM=y
+# CONFIG_BPF_LSM is not set
 CONFIG_BPF_SYSCALL=y
 CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
 # CONFIG_BPF_JIT_ALWAYS_ON is not set
 CONFIG_BPF_JIT_DEFAULT_ON=y
-CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
 # CONFIG_BPF_PRELOAD is not set
 CONFIG_USERFAULTFD=y
 CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
@@ -321,7 +319,7 @@
 CONFIG_X86_MPPARSE=y
 # CONFIG_GOLDFISH is not set
 CONFIG_RETPOLINE=y
-CONFIG_X86_CPU_RESCTRL=y
+# CONFIG_X86_CPU_RESCTRL is not set
 # CONFIG_X86_EXTENDED_PLATFORM is not set
 CONFIG_X86_INTEL_LPSS=y
 CONFIG_X86_AMD_PLATFORM_DEVICE=y
@@ -376,11 +374,11 @@
 CONFIG_HPET_EMULATE_RTC=y
 CONFIG_DMI=y
 CONFIG_GART_IOMMU=y
-CONFIG_MAXSMP=y
-CONFIG_NR_CPUS_RANGE_BEGIN=8192
-CONFIG_NR_CPUS_RANGE_END=8192
-CONFIG_NR_CPUS_DEFAULT=8192
-CONFIG_NR_CPUS=8192
+# CONFIG_MAXSMP is not set
+CONFIG_NR_CPUS_RANGE_BEGIN=2
+CONFIG_NR_CPUS_RANGE_END=512
+CONFIG_NR_CPUS_DEFAULT=64
+CONFIG_NR_CPUS=512
 CONFIG_SCHED_SMT=y
 CONFIG_SCHED_MC=y
 CONFIG_SCHED_MC_PRIO=y
@@ -412,7 +410,7 @@
 CONFIG_MICROCODE=y
 CONFIG_MICROCODE_INTEL=y
 CONFIG_MICROCODE_AMD=y
-# CONFIG_MICROCODE_OLD_INTERFACE is not set
+CONFIG_MICROCODE_OLD_INTERFACE=y
 CONFIG_X86_MSR=m
 CONFIG_X86_CPUID=m
 # CONFIG_X86_5LEVEL is not set
@@ -423,7 +421,7 @@
 CONFIG_AMD_NUMA=y
 CONFIG_X86_64_ACPI_NUMA=y
 CONFIG_NUMA_EMU=y