Bug#1054514: linux-image-6.1.0-13-amd64: Debian VM with qxl graphics freezes frequently
Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Gerd, it seems this regression[1] fell through the cracks. Could you
please take a look? Or is there a good reason why this can't be
addressed? Or was it dealt with and I just missed it?

[1] apparently caused by 5a838e5d5825c8 ("drm/qxl: simplify
qxl_fence_wait") [v5.13-rc1] from Gerd; for details see
https://lore.kernel.org/regressions/ztgydqrlk6wx_...@eldamar.lan/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 24.10.23 23:39, Timo Lindfors wrote:
> Hi,
>
> On Tue, 24 Oct 2023, Salvatore Bonaccorso wrote:
>> Thanks for the excellently constructed report! I think it's best to
>> forward this directly to upstream including the people for the
>> bisected commit to get some idea.
>
> Thanks for the quick reply!
>
>> Can you reproduce the issue with 6.5.8-1 in unstable as well?
>
> Unfortunately yes:
>
> ansible@target:~$ uname -r
> 6.5.0-3-amd64
> ansible@target:~$ time sudo ./reproduce.bash
> Wed 25 Oct 2023 12:27:00 AM EEST starting round 1
> Wed 25 Oct 2023 12:27:24 AM EEST starting round 2
> Wed 25 Oct 2023 12:27:48 AM EEST starting round 3
> bug was reproduced after 3 tries
>
> real 0m48.838s
> user 0m1.115s
> sys 0m45.530s
>
> I also tested upstream tag v6.6-rc6:
>
> ...
> + detected_version=6.6.0-rc6
> + '[' 6.6.0-rc6 '!=' 6.6.0-rc6 ']'
> + exec ssh target sudo ./reproduce.bash
> Wed 25 Oct 2023 12:37:16 AM EEST starting round 1
> Wed 25 Oct 2023 12:37:42 AM EEST starting round 2
> Wed 25 Oct 2023 12:38:10 AM EEST starting round 3
> Wed 25 Oct 2023 12:38:36 AM EEST starting round 4
> Wed 25 Oct 2023 12:39:01 AM EEST starting round 5
> Wed 25 Oct 2023 12:39:27 AM EEST starting round 6
> bug was reproduced after 6 tries
>
> For completeness, here is also the grub_set_default_version.bash script
> that I had to write to automate this (maybe these could be in debian
> wiki?):
>
> #!/bin/bash
> set -x
>
> version="$1"
>
> idx=$(expr $(grep "menuentry " /boot/grub/grub.cfg | sed 1d |grep -n "'Debian GNU/Linux, with Linux $version'"|cut -d: -f1) - 1)
> exec sudo grub-set-default "1>$idx"
>
> -Timo
Bug#1054514: linux-image-6.1.0-13-amd64: Debian VM with qxl graphics freezes frequently
On Tue, Oct 24, 2023 at 11:09:10PM +0200, Salvatore Bonaccorso wrote:
> Hi Timo,
>
> On Tue, Oct 24, 2023 at 11:14:32PM +0300, Timo Lindfors wrote:
> > Package: src:linux
> > Version: 6.1.55-1
> > Severity: normal
> >
> > Steps to reproduce:
> > 1) Install Debian 12 as a virtual machine using virt-manager, choose qxl
> >    graphics card. You only need basic installation without wayland or X.
> > 2) Login from the console and save the following to reproduce.bash:
> >
> > #!/bin/bash
> >
> > chvt 3
> > for j in $(seq 80); do
> >     echo "$(date) starting round $j"
> >     if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" ]; then
> >         echo "bug was reproduced after $j tries"
> >         exit 1
> >     fi
> >     for i in $(seq 100); do
> >         dmesg > /dev/tty3
> >     done
> > done
> >
> > echo "bug could not be reproduced"
> > exit 0
> >
> > 3) Run chmod a+x reproduce.bash
> > 4) Run ./reproduce.bash and wait for up to 20 minutes.
> >
> > Expected results:
> > 4) The system prints a steady flow of text without kernel error messages
> >
> > Actual messages:
> > 4) At some point the text stops flowing and the script prints "bug was
> >    reproduced". If you run "journalctl --boot" you see
> >
> > kernel: [TTM] Buffer eviction failed
> > kernel: qxl :00:02.0: object_init failed for (3149824, 0x0001)
> > kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
> >
> > More info:
> > 1) The bug does not occur if I downgrade the kernel to
> >    linux-image-5.10.0-26-amd64_5.10.197-1_amd64.deb from Debian 11.
> > 2) I used the following test_linux.bash to bisect this issue against
> >    upstream source:
> >
> > #!/bin/bash
> > set -x
> >
> > gitversion="$(git describe HEAD|sed 's@^v@@')"
> >
> > git checkout drivers/gpu/drm/ttm/ttm_bo.c include/drm/ttm/ttm_bo_api.h
> > git show bec771b5e0901f4b0bc861bcb58056de5151ae3a | patch -p1
> > # Build
> > cp ~/kernel.config .config
> > # cp /boot/config-$(uname -r) .config
> > # scripts/config --enable LOCALVERSION_AUTO
> > # scripts/config --disable DEBUG_INFO
> > # scripts/config --disable SYSTEM_TRUSTED_KEYRING
> > # scripts/config --set-str SYSTEM_TRUSTED_KEYS ''
> > # scripts/config --disable STACKPROTECTOR_STRONG
> > make olddefconfig
> > # make localmodconfig
> > make -j$(nproc --all) bindeb-pkg
> > rc="$?"
> > if [ "$rc" != "0" ]; then
> >     exit 125
> > fi
> > git checkout drivers/gpu/drm/ttm/ttm_bo.c include/drm/ttm/ttm_bo_api.h
> >
> > package="$(ls --sort=time ../linux-image-*_amd64.deb|head -n1)"
> > version=$(echo $package | cut -d_ -f1|cut -d- -f3-)
> >
> > if [ "$gitversion" != "$version" ]; then
> >     echo "Build produced version $gitversion but got $version, ignoring"
> >     #exit 255
> > fi
> >
> > # Deploy
> > scp $package target:a.deb
> > ssh target sudo apt install ./a.deb
> > ssh target rm -f a.deb
> > ssh target ./grub_set_default_version.bash $version
> > ssh target sudo shutdown -r now
> > sleep 40
> >
> > detected_version=$(ssh target uname -r)
> > if [ "$detected_version" != "$version" ]; then
> >     echo "Booted to $detected_version but expected $version"
> >     exit 255
> > fi
> >
> > # Test
> > exec ssh target sudo ./reproduce.bash
> >
> > Bisect printed the following log:
> >
> > git bisect start
> > # bad: [ed29c2691188cf7ea2a46d40b891836c2bd1a4f5] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.
> > git bisect bad ed29c2691188cf7ea2a46d40b891836c2bd1a4f5
> > # bad: [762949bb1da78941b25e63f7e952af037eee15a9] drm: fix drm_mode_create_blob comment
> > git bisect bad 762949bb1da78941b25e63f7e952af037eee15a9
> > # bad: [e40f97ef12772f8eb04b6a155baa1e0e2e8f3ecc] drm/gma500: Drop DRM_GMA600 config option
> > git bisect bad e40f97ef12772f8eb04b6a155baa1e0e2e8f3ecc
> > # bad: [5a838e5d5825c85556011478abde708251cc0776] drm/qxl: simplify qxl_fence_wait
> > git bisect bad 5a838e5d5825c85556011478abde708251cc0776
> > # bad: [d2b6f8a179194de0ffc4886ffc2c4358d86047b8] Merge tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
> > git bisect bad d2b6f8a179194de0ffc4886ffc2c4358d86047b8
> > # bad: [68a32ba14177d4a21c4a9a941cf1d7aea86d436f] Merge tag 'drm-next-2021-04-28' of git://anongit.freedesktop.org/drm/drm
> > git bisect bad 68a32ba14177d4a21c4a9a941cf1d7aea86d436f
> > # bad: [0698b13403788a646073fcd9b2294f2dce0ce429] drm/amdgpu: skip PP_MP1_STATE_UNLOAD on aldebaran
> > git bisect bad 0698b13403788a646073fcd9b2294f2dce0ce429
> > # bad: [e1a5e6a8c48bf99ea374fb3e535661cfe226bca4] drm/doc: Add RFC section
> > git bisect bad e1a5e6a8c48bf99ea374fb3e535661cfe226bca4
> > # bad: [ed29c2691188cf7ea2a46d40b891836c2bd1a4f5] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.
> > git bisect bad ed29c2691188cf7ea2a46d40b891836c2bd1a4f5
> > # bad: [2c8ab3339e398bbbcb0980933e266b93bedaae52] drm/i915: Pin timeline map after first timeline pin,
Bug#1054514: linux-image-6.1.0-13-amd64: Debian VM with qxl graphics freezes frequently
Hi,

On Tue, 24 Oct 2023, Salvatore Bonaccorso wrote:
> Thanks for the excellently constructed report! I think it's best to
> forward this directly to upstream including the people for the
> bisected commit to get some idea.

Thanks for the quick reply!

> Can you reproduce the issue with 6.5.8-1 in unstable as well?

Unfortunately yes:

ansible@target:~$ uname -r
6.5.0-3-amd64
ansible@target:~$ time sudo ./reproduce.bash
Wed 25 Oct 2023 12:27:00 AM EEST starting round 1
Wed 25 Oct 2023 12:27:24 AM EEST starting round 2
Wed 25 Oct 2023 12:27:48 AM EEST starting round 3
bug was reproduced after 3 tries

real 0m48.838s
user 0m1.115s
sys 0m45.530s

I also tested upstream tag v6.6-rc6:

...
+ detected_version=6.6.0-rc6
+ '[' 6.6.0-rc6 '!=' 6.6.0-rc6 ']'
+ exec ssh target sudo ./reproduce.bash
Wed 25 Oct 2023 12:37:16 AM EEST starting round 1
Wed 25 Oct 2023 12:37:42 AM EEST starting round 2
Wed 25 Oct 2023 12:38:10 AM EEST starting round 3
Wed 25 Oct 2023 12:38:36 AM EEST starting round 4
Wed 25 Oct 2023 12:39:01 AM EEST starting round 5
Wed 25 Oct 2023 12:39:27 AM EEST starting round 6
bug was reproduced after 6 tries

For completeness, here is also the grub_set_default_version.bash script
that I had to write to automate this (maybe these could be in debian
wiki?):

#!/bin/bash
set -x

version="$1"

idx=$(expr $(grep "menuentry " /boot/grub/grub.cfg | sed 1d |grep -n "'Debian GNU/Linux, with Linux $version'"|cut -d: -f1) - 1)
exec sudo grub-set-default "1>$idx"

-Timo
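The index arithmetic in grub_set_default_version.bash can be sketched as a small, testable function. This is only a sketch under stated assumptions, not Timo's script: the function name grub_entry_path is hypothetical, it reads grub.cfg text from stdin instead of /boot/grub/grub.cfg, and it keeps the original's assumption that the versioned entries sit in the second top-level item (the "1>" prefix). Unlike the original it uses grep -F, so that dots in the version string match literally, and it fails if the version is not found.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original report): print the
# "submenu>entry" path for a kernel version. Reads grub.cfg text from stdin.
grub_entry_path() {
    local version="$1" line
    # Keep only the menuentry lines, drop the first (top-level) one, then take
    # the 1-based position of the exact title. The closing quote in the pattern
    # keeps "(recovery mode)" entries from matching.
    line=$(grep "menuentry " | sed 1d \
        | grep -nF "'Debian GNU/Linux, with Linux $version'" \
        | head -n1 | cut -d: -f1)
    # No matching entry: report failure instead of printing a bogus index.
    [ -n "$line" ] || return 1
    # Entries are counted from zero; "1>" is an assumption carried over
    # from the original script.
    echo "1>$((line - 1))"
}
```

It could be called as `grub_entry_path 6.5.0-3-amd64 < /boot/grub/grub.cfg`, with the printed path then passed to grub-set-default as in the original script.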
Bug#1054514: linux-image-6.1.0-13-amd64: Debian VM with qxl graphics freezes frequently
Hi Timo,

On Tue, Oct 24, 2023 at 11:14:32PM +0300, Timo Lindfors wrote:
> Package: src:linux
> Version: 6.1.55-1
> Severity: normal
>
> Steps to reproduce:
> 1) Install Debian 12 as a virtual machine using virt-manager, choose qxl
>    graphics card. You only need basic installation without wayland or X.
> 2) Login from the console and save the following to reproduce.bash:
>
> #!/bin/bash
>
> chvt 3
> for j in $(seq 80); do
>     echo "$(date) starting round $j"
>     if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" ]; then
>         echo "bug was reproduced after $j tries"
>         exit 1
>     fi
>     for i in $(seq 100); do
>         dmesg > /dev/tty3
>     done
> done
>
> echo "bug could not be reproduced"
> exit 0
>
> 3) Run chmod a+x reproduce.bash
> 4) Run ./reproduce.bash and wait for up to 20 minutes.
>
> Expected results:
> 4) The system prints a steady flow of text without kernel error messages
>
> Actual messages:
> 4) At some point the text stops flowing and the script prints "bug was
>    reproduced". If you run "journalctl --boot" you see
>
> kernel: [TTM] Buffer eviction failed
> kernel: qxl :00:02.0: object_init failed for (3149824, 0x0001)
> kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
>
> More info:
> 1) The bug does not occur if I downgrade the kernel to
>    linux-image-5.10.0-26-amd64_5.10.197-1_amd64.deb from Debian 11.
> 2) I used the following test_linux.bash to bisect this issue against
>    upstream source:
>
> #!/bin/bash
> set -x
>
> gitversion="$(git describe HEAD|sed 's@^v@@')"
>
> git checkout drivers/gpu/drm/ttm/ttm_bo.c include/drm/ttm/ttm_bo_api.h
> git show bec771b5e0901f4b0bc861bcb58056de5151ae3a | patch -p1
> # Build
> cp ~/kernel.config .config
> # cp /boot/config-$(uname -r) .config
> # scripts/config --enable LOCALVERSION_AUTO
> # scripts/config --disable DEBUG_INFO
> # scripts/config --disable SYSTEM_TRUSTED_KEYRING
> # scripts/config --set-str SYSTEM_TRUSTED_KEYS ''
> # scripts/config --disable STACKPROTECTOR_STRONG
> make olddefconfig
> # make localmodconfig
> make -j$(nproc --all) bindeb-pkg
> rc="$?"
> if [ "$rc" != "0" ]; then
>     exit 125
> fi
> git checkout drivers/gpu/drm/ttm/ttm_bo.c include/drm/ttm/ttm_bo_api.h
>
> package="$(ls --sort=time ../linux-image-*_amd64.deb|head -n1)"
> version=$(echo $package | cut -d_ -f1|cut -d- -f3-)
>
> if [ "$gitversion" != "$version" ]; then
>     echo "Build produced version $gitversion but got $version, ignoring"
>     #exit 255
> fi
>
> # Deploy
> scp $package target:a.deb
> ssh target sudo apt install ./a.deb
> ssh target rm -f a.deb
> ssh target ./grub_set_default_version.bash $version
> ssh target sudo shutdown -r now
> sleep 40
>
> detected_version=$(ssh target uname -r)
> if [ "$detected_version" != "$version" ]; then
>     echo "Booted to $detected_version but expected $version"
>     exit 255
> fi
>
> # Test
> exec ssh target sudo ./reproduce.bash
>
> Bisect printed the following log:
>
> git bisect start
> # bad: [ed29c2691188cf7ea2a46d40b891836c2bd1a4f5] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.
> git bisect bad ed29c2691188cf7ea2a46d40b891836c2bd1a4f5
> # bad: [762949bb1da78941b25e63f7e952af037eee15a9] drm: fix drm_mode_create_blob comment
> git bisect bad 762949bb1da78941b25e63f7e952af037eee15a9
> # bad: [e40f97ef12772f8eb04b6a155baa1e0e2e8f3ecc] drm/gma500: Drop DRM_GMA600 config option
> git bisect bad e40f97ef12772f8eb04b6a155baa1e0e2e8f3ecc
> # bad: [5a838e5d5825c85556011478abde708251cc0776] drm/qxl: simplify qxl_fence_wait
> git bisect bad 5a838e5d5825c85556011478abde708251cc0776
> # bad: [d2b6f8a179194de0ffc4886ffc2c4358d86047b8] Merge tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
> git bisect bad d2b6f8a179194de0ffc4886ffc2c4358d86047b8
> # bad: [68a32ba14177d4a21c4a9a941cf1d7aea86d436f] Merge tag 'drm-next-2021-04-28' of git://anongit.freedesktop.org/drm/drm
> git bisect bad 68a32ba14177d4a21c4a9a941cf1d7aea86d436f
> # bad: [0698b13403788a646073fcd9b2294f2dce0ce429] drm/amdgpu: skip PP_MP1_STATE_UNLOAD on aldebaran
> git bisect bad 0698b13403788a646073fcd9b2294f2dce0ce429
> # bad: [e1a5e6a8c48bf99ea374fb3e535661cfe226bca4] drm/doc: Add RFC section
> git bisect bad e1a5e6a8c48bf99ea374fb3e535661cfe226bca4
> # bad: [ed29c2691188cf7ea2a46d40b891836c2bd1a4f5] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.
> git bisect bad ed29c2691188cf7ea2a46d40b891836c2bd1a4f5
> # bad: [2c8ab3339e398bbbcb0980933e266b93bedaae52] drm/i915: Pin timeline map after first timeline pin, v4.
> git bisect bad 2c8ab3339e398bbbcb0980933e266b93bedaae52
> # bad: [2eb8e1a69d9f8cc9c0a75e327f854957224ba421] drm/i915/gem: Drop relocation support on all new hardware (v6)
> git bisect bad 2eb8e1a69d9f8cc9c0a75e327f854957224ba421
> # bad: [b5b6f6a610127b17f20c0ca03dd27beee4ddc2b2] drm/i915/gem: Drop legacy execbuffer support
Bug#1054514: linux-image-6.1.0-13-amd64: Debian VM with qxl graphics freezes frequently
Package: src:linux
Version: 6.1.55-1
Severity: normal

Steps to reproduce:
1) Install Debian 12 as a virtual machine using virt-manager, choose qxl
   graphics card. You only need basic installation without wayland or X.
2) Login from the console and save the following to reproduce.bash:

#!/bin/bash

chvt 3
for j in $(seq 80); do
    echo "$(date) starting round $j"
    if [ "$(journalctl --boot | grep "failed to allocate VRAM BO")" != "" ]; then
        echo "bug was reproduced after $j tries"
        exit 1
    fi
    for i in $(seq 100); do
        dmesg > /dev/tty3
    done
done

echo "bug could not be reproduced"
exit 0

3) Run chmod a+x reproduce.bash
4) Run ./reproduce.bash and wait for up to 20 minutes.

Expected results:
4) The system prints a steady flow of text without kernel error messages

Actual messages:
4) At some point the text stops flowing and the script prints "bug was
   reproduced". If you run "journalctl --boot" you see

kernel: [TTM] Buffer eviction failed
kernel: qxl :00:02.0: object_init failed for (3149824, 0x0001)
kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO

More info:
1) The bug does not occur if I downgrade the kernel to
   linux-image-5.10.0-26-amd64_5.10.197-1_amd64.deb from Debian 11.
2) I used the following test_linux.bash to bisect this issue against
   upstream source:

#!/bin/bash
set -x

gitversion="$(git describe HEAD|sed 's@^v@@')"

git checkout drivers/gpu/drm/ttm/ttm_bo.c include/drm/ttm/ttm_bo_api.h
git show bec771b5e0901f4b0bc861bcb58056de5151ae3a | patch -p1
# Build
cp ~/kernel.config .config
# cp /boot/config-$(uname -r) .config
# scripts/config --enable LOCALVERSION_AUTO
# scripts/config --disable DEBUG_INFO
# scripts/config --disable SYSTEM_TRUSTED_KEYRING
# scripts/config --set-str SYSTEM_TRUSTED_KEYS ''
# scripts/config --disable STACKPROTECTOR_STRONG
make olddefconfig
# make localmodconfig
make -j$(nproc --all) bindeb-pkg
rc="$?"
if [ "$rc" != "0" ]; then
    exit 125
fi
git checkout drivers/gpu/drm/ttm/ttm_bo.c include/drm/ttm/ttm_bo_api.h

package="$(ls --sort=time ../linux-image-*_amd64.deb|head -n1)"
version=$(echo $package | cut -d_ -f1|cut -d- -f3-)

if [ "$gitversion" != "$version" ]; then
    echo "Build produced version $gitversion but got $version, ignoring"
    #exit 255
fi

# Deploy
scp $package target:a.deb
ssh target sudo apt install ./a.deb
ssh target rm -f a.deb
ssh target ./grub_set_default_version.bash $version
ssh target sudo shutdown -r now
sleep 40

detected_version=$(ssh target uname -r)
if [ "$detected_version" != "$version" ]; then
    echo "Booted to $detected_version but expected $version"
    exit 255
fi

# Test
exec ssh target sudo ./reproduce.bash

Bisect printed the following log:

git bisect start
# bad: [ed29c2691188cf7ea2a46d40b891836c2bd1a4f5] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.
git bisect bad ed29c2691188cf7ea2a46d40b891836c2bd1a4f5
# bad: [762949bb1da78941b25e63f7e952af037eee15a9] drm: fix drm_mode_create_blob comment
git bisect bad 762949bb1da78941b25e63f7e952af037eee15a9
# bad: [e40f97ef12772f8eb04b6a155baa1e0e2e8f3ecc] drm/gma500: Drop DRM_GMA600 config option
git bisect bad e40f97ef12772f8eb04b6a155baa1e0e2e8f3ecc
# bad: [5a838e5d5825c85556011478abde708251cc0776] drm/qxl: simplify qxl_fence_wait
git bisect bad 5a838e5d5825c85556011478abde708251cc0776
# bad: [d2b6f8a179194de0ffc4886ffc2c4358d86047b8] Merge tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
git bisect bad d2b6f8a179194de0ffc4886ffc2c4358d86047b8
# bad: [68a32ba14177d4a21c4a9a941cf1d7aea86d436f] Merge tag 'drm-next-2021-04-28' of git://anongit.freedesktop.org/drm/drm
git bisect bad 68a32ba14177d4a21c4a9a941cf1d7aea86d436f
# bad: [0698b13403788a646073fcd9b2294f2dce0ce429] drm/amdgpu: skip PP_MP1_STATE_UNLOAD on aldebaran
git bisect bad 0698b13403788a646073fcd9b2294f2dce0ce429
# bad: [e1a5e6a8c48bf99ea374fb3e535661cfe226bca4] drm/doc: Add RFC section
git bisect bad e1a5e6a8c48bf99ea374fb3e535661cfe226bca4
# bad: [ed29c2691188cf7ea2a46d40b891836c2bd1a4f5] drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.
git bisect bad ed29c2691188cf7ea2a46d40b891836c2bd1a4f5
# bad: [2c8ab3339e398bbbcb0980933e266b93bedaae52] drm/i915: Pin timeline map after first timeline pin, v4.
git bisect bad 2c8ab3339e398bbbcb0980933e266b93bedaae52
# bad: [2eb8e1a69d9f8cc9c0a75e327f854957224ba421] drm/i915/gem: Drop relocation support on all new hardware (v6)
git bisect bad 2eb8e1a69d9f8cc9c0a75e327f854957224ba421
# bad: [b5b6f6a610127b17f20c0ca03dd27beee4ddc2b2] drm/i915/gem: Drop legacy execbuffer support (v2)
git bisect bad b5b6f6a610127b17f20c0ca03dd27beee4ddc2b2
# bad: [06debd6e1b28029e6e77c41e59a162868f377897] Merge tag 'drm-intel-next-2021-03-16' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
git bisect bad 06debd6e1b28029e6e77c41e59a162868f377897
# good: [e19eede54240d64b4baf9b0df4dfb8191f7ae48b] Merge branch