Bug#1058991: firmware-misc-nonfree: Possible missing firmware for module nouveau, kernel crash in nouveau

2023-12-19 Thread Diederik de Haas
Control: tag -1 +patch

On Tuesday, 19 December 2023 17:14:22 CET you wrote:
> > If you manually create those links from the above "+Link:" lines,
> > would that fix the issues?
> 
> I've added only the ga107 symlinks (since this is what is needed
> for my machine) with
> ...
> and this fixes all the issues

Thanks, submitted the following MR to get the Debian package updated:
https://salsa.debian.org/kernel-team/firmware-nonfree/-/merge_requests/80



2023-12-19 Thread Vincent Lefevre
On 2023-12-19 14:32:31 +0100, Diederik de Haas wrote:
> If you manually create those links from the above "+Link:" lines,
> would that fix the issues?

I've added only the ga107 symlinks (since this is what is needed
for my machine) with

for i in acr nvdec sec2
do
  mkdir /usr/lib/firmware/nvidia/ga107/$i &&
  cd /usr/lib/firmware/nvidia/ga107/$i &&
  ln -s ../../ga102/$i/* .
done

and this fixes all the issues: the "Possible missing firmware" messages
(for ga107), the kernel crashes that were mentioned in the journalctl
output, and the black screen issue.
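
One way to double-check manually created links like these is to look for dangling symlinks. This is a minimal sketch, assuming the /usr/lib/firmware/nvidia layout used above; `broken_links` is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# broken_links: print every symlink below directory $1 whose target
# does not exist. (Hypothetical helper, for illustration only.)
broken_links() {
    # With -L, find follows symlinks, so "-type l" can only match links
    # that could not be followed, i.e. dangling ones.
    find -L "$1" -type l
}
```

Running `broken_links /usr/lib/firmware/nvidia/ga107` should print nothing if every link resolves to a file under ga102.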

Thank you,

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



2023-12-19 Thread Diederik de Haas
Control: severity -1 important
Control: tag -1 moreinfo

On Tuesday, 19 December 2023 13:57:51 CET Vincent Lefevre wrote:
> Another piece of information: this is a regression.
> 
> With the 6.1.0-16-amd64 kernel from stable, "journalctl -b -g ga107"
> gives
> 
> Dec 19 04:57:07 qaa kernel: nouveau :01:00.0: NVIDIA GA107 (b77000a1)
> Dec 19 04:57:07 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/nvdec/scrubber.bin (-2)
> Dec 19 04:57:07 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/nvdec/scrubber.bin (-2)
> 
> and I don't have any issue with the machine.
> 
> With the 6.5.0-5-amd64 kernel, "journalctl -b -1 -g ga107" gives

Upstream kernel commit 4b569ded09fdadb0c14f797c8dae4e8bc4bbad9f added lines to
load the firmware files and was merged into kernel 6.2, so it is expected that
the additional load failures do not show up with a 6.1 kernel.

Upstream firmware commit 2c2be4215fe29870dcd9a059ff8778e73269ddc1 added the
files, but it seems the Link lines weren't added to the Debian package in commit
9714742762ab2b278fd0961652a4dd54ff82ea8b:

```
$ git show 2c2be4215fe29870dcd9a059ff8778e73269ddc1 | grep Link
@@ -5182,6 +5182,71 @@ Link: nvidia/tu117/nvdec/scrubber.bin -> ../../tu116/nvdec/scrubber.bin
 Link: nvidia/tu117/sec2/desc.bin -> ../../tu116/sec2/desc.bin
 Link: nvidia/tu117/sec2/image.bin -> ../../tu116/sec2/image.bin
 Link: nvidia/tu117/sec2/sig.bin -> ../../tu116/sec2/sig.bin
+Link: nvidia/ga103/acr/ucode_ahesasc.bin -> ../../ga102/acr/ucode_ahesasc.bin
+Link: nvidia/ga103/acr/ucode_asb.bin -> ../../ga102/acr/ucode_asb.bin
+Link: nvidia/ga103/acr/ucode_unload.bin -> ../../ga102/acr/ucode_unload.bin
+Link: nvidia/ga103/nvdec/scrubber.bin -> ../../ga102/nvdec/scrubber.bin
+Link: nvidia/ga103/sec2/desc.bin -> ../../ga102/sec2/desc.bin
+Link: nvidia/ga103/sec2/hs_bl_sig.bin -> ../../ga102/sec2/hs_bl_sig.bin
+Link: nvidia/ga103/sec2/image.bin -> ../../ga102/sec2/image.bin
+Link: nvidia/ga103/sec2/sig.bin -> ../../ga102/sec2/sig.bin
+Link: nvidia/ga104/acr/ucode_ahesasc.bin -> ../../ga102/acr/ucode_ahesasc.bin
+Link: nvidia/ga104/acr/ucode_asb.bin -> ../../ga102/acr/ucode_asb.bin
+Link: nvidia/ga104/acr/ucode_unload.bin -> ../../ga102/acr/ucode_unload.bin
+Link: nvidia/ga104/nvdec/scrubber.bin -> ../../ga102/nvdec/scrubber.bin
+Link: nvidia/ga104/sec2/desc.bin -> ../../ga102/sec2/desc.bin
+Link: nvidia/ga104/sec2/hs_bl_sig.bin -> ../../ga102/sec2/hs_bl_sig.bin
+Link: nvidia/ga104/sec2/image.bin -> ../../ga102/sec2/image.bin
+Link: nvidia/ga104/sec2/sig.bin -> ../../ga102/sec2/sig.bin
+Link: nvidia/ga106/acr/ucode_ahesasc.bin -> ../../ga102/acr/ucode_ahesasc.bin
+Link: nvidia/ga106/acr/ucode_asb.bin -> ../../ga102/acr/ucode_asb.bin
+Link: nvidia/ga106/acr/ucode_unload.bin -> ../../ga102/acr/ucode_unload.bin
+Link: nvidia/ga106/nvdec/scrubber.bin -> ../../ga102/nvdec/scrubber.bin
+Link: nvidia/ga106/sec2/desc.bin -> ../../ga102/sec2/desc.bin
+Link: nvidia/ga106/sec2/hs_bl_sig.bin -> ../../ga102/sec2/hs_bl_sig.bin
+Link: nvidia/ga106/sec2/image.bin -> ../../ga102/sec2/image.bin
+Link: nvidia/ga106/sec2/sig.bin -> ../../ga102/sec2/sig.bin
+Link: nvidia/ga107/acr/ucode_ahesasc.bin -> ../../ga102/acr/ucode_ahesasc.bin
+Link: nvidia/ga107/acr/ucode_asb.bin -> ../../ga102/acr/ucode_asb.bin
+Link: nvidia/ga107/acr/ucode_unload.bin -> ../../ga102/acr/ucode_unload.bin
+Link: nvidia/ga107/nvdec/scrubber.bin -> ../../ga102/nvdec/scrubber.bin
+Link: nvidia/ga107/sec2/desc.bin -> ../../ga102/sec2/desc.bin
+Link: nvidia/ga107/sec2/hs_bl_sig.bin -> ../../ga102/sec2/hs_bl_sig.bin
+Link: nvidia/ga107/sec2/image.bin -> ../../ga102/sec2/image.bin
+Link: nvidia/ga107/sec2/sig.bin -> ../../ga102/sec2/sig.bin
```

If you manually create those links from the above "+Link:" lines, would that 
fix the issues?
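
As a rough sketch of what creating those links by hand could look like, the following POSIX shell function reads `Link:` lines on stdin (optionally prefixed with `+`, as in the diff output above) and creates the corresponding relative symlinks. The name `make_links` and its firmware-root argument are assumptions for illustration and not part of the Debian packaging; it also assumes that paths contain no " -> " sequence.

```shell
#!/bin/sh
# make_links: create the symlinks described by "Link: <path> -> <target>"
# lines read from stdin, below the firmware root directory given as $1.
# (Hypothetical helper, for illustration only.)
make_links() {
    root=$1
    while IFS= read -r line; do
        line=${line#+}                  # drop a leading "+" from diff output
        case $line in
            "Link: "*" -> "*) ;;        # keep only Link lines
            *) continue ;;              # skip context lines, hunk headers, etc.
        esac
        entry=${line#"Link: "}
        path=${entry%%" -> "*}          # link location, relative to the root
        target=${entry#*" -> "}         # link target, relative to the link
        mkdir -p "$root/$(dirname "$path")" || return 1
        ln -sf "$target" "$root/$path" || return 1
    done
}
```

For instance, piping the `git show` output above into `make_links /usr/lib/firmware` (as root) should create the ga103, ga104, ga106 and ga107 links; trying it on a scratch directory first is probably wise.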

On Tuesday, 19 December 2023 13:36:17 CET Vincent Lefevre wrote:
> for the above firmware, there's no "acr" directory in nvidia/ga107:

The directory is not physically present, but it ought to consist of symlinks
to the ga102 directory, which does have an `acr` directory.



2023-12-19 Thread Vincent Lefevre
Another piece of information: this is a regression.

With the 6.1.0-16-amd64 kernel from stable, "journalctl -b -g ga107"
gives

Dec 19 04:57:07 qaa kernel: nouveau :01:00.0: NVIDIA GA107 (b77000a1)
Dec 19 04:57:07 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/nvdec/scrubber.bin (-2)
Dec 19 04:57:07 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/nvdec/scrubber.bin (-2)

and I don't have any issue with the machine.

With the 6.5.0-5-amd64 kernel, "journalctl -b -1 -g ga107" gives

Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: NVIDIA GA107 (b77000a1)
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/acr/ucode_ahesasc.bin (-2)
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/acr/ucode_ahesasc.bin (-2)
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: direct-loading firmware nvidia/ga107/gr/NET_img.bin
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: direct-loading firmware nvidia/ga107/gr/fecs_bl.bin
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: direct-loading firmware nvidia/ga107/gr/fecs_sig.bin
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: direct-loading firmware nvidia/ga107/gr/gpccs_bl.bin
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: direct-loading firmware nvidia/ga107/gr/gpccs_sig.bin
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/sec2/sig.bin (-2)
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/sec2/sig.bin (-2)
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/nvdec/scrubber.bin (-2)
Dec 19 04:21:16 qaa kernel: nouveau :01:00.0: firmware: failed to load nvidia/ga107/nvdec/scrubber.bin (-2)

and I get the crashes, and as soon as I log out or after
"xset dpms force off" is run, the screen remains black.




2023-12-19 Thread Vincent Lefevre
On 2023-12-19 04:32:02 +0100, Vincent Lefevre wrote:
> firmware-misc-nonfree triggers the following warnings:
> 
> update-initramfs: Generating /boot/initrd.img-6.5.0-5-amd64
[...]
> W: Possible missing firmware /lib/firmware/nvidia/ga107/acr/ucode_ahesasc.bin for module nouveau
[...]
> while https://forums.debian.net/viewtopic.php?t=155793 says that
> the solution is to install firmware-misc-nonfree!

I would add that even the section "Firmware missing from Debian" in
https://wiki.debian.org/Firmware does not apply because, for instance
for the above firmware, there's no "acr" directory in nvidia/ga107:

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/nvidia/ga107
contains only a "gr" directory.




2023-12-18 Thread Vincent Lefevre
Package: firmware-misc-nonfree
Version: 20230625-1
Severity: grave
Justification: renders package unusable

firmware-misc-nonfree triggers the following warnings:

update-initramfs: Generating /boot/initrd.img-6.5.0-5-amd64
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8156b-2.fw for module r8152
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8156a-2.fw for module r8152
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8153c-1.fw for module r8152
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8153b-2.fw for module r8152
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8153a-4.fw for module r8152
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8153a-3.fw for module r8152
W: Possible missing firmware /lib/firmware/rtl_nic/rtl8153a-2.fw for module r8152
W: Possible missing firmware /lib/firmware/i915/mtl_huc_gsc.bin for module i915
W: Possible missing firmware /lib/firmware/i915/mtl_guc_70.bin for module i915
W: Possible missing firmware /lib/firmware/nvidia/ga107/acr/ucode_ahesasc.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga106/acr/ucode_ahesasc.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga104/acr/ucode_ahesasc.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga103/acr/ucode_ahesasc.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga107/acr/ucode_asb.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga106/acr/ucode_asb.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga104/acr/ucode_asb.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga103/acr/ucode_asb.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga107/acr/ucode_unload.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga106/acr/ucode_unload.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga104/acr/ucode_unload.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga103/acr/ucode_unload.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga107/nvdec/scrubber.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga106/nvdec/scrubber.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga104/nvdec/scrubber.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga103/nvdec/scrubber.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga107/sec2/hs_bl_sig.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga107/sec2/sig.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga107/sec2/image.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga107/sec2/desc.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga106/sec2/hs_bl_sig.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga106/sec2/sig.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga106/sec2/image.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga106/sec2/desc.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga104/sec2/hs_bl_sig.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga104/sec2/sig.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga104/sec2/image.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga104/sec2/desc.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga103/sec2/hs_bl_sig.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga103/sec2/sig.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga103/sec2/image.bin for module nouveau
W: Possible missing firmware /lib/firmware/nvidia/ga103/sec2/desc.bin for module nouveau

while https://forums.debian.net/viewtopic.php?t=155793 says that
the solution is to install firmware-misc-nonfree!
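
To see which of these warnings still apply after a change, the paths can be extracted from the update-initramfs output and tested directly. This is a small sketch; `missing_firmware` is a hypothetical helper, and it assumes each warning arrives on a single unwrapped line, as update-initramfs prints it:

```shell
#!/bin/sh
# missing_firmware: read update-initramfs output on stdin and print each
# path from a "Possible missing firmware" warning that still does not exist.
# (Hypothetical helper, for illustration only.)
missing_firmware() {
    while IFS= read -r line; do
        case $line in
            "W: Possible missing firmware "*" for module "*) ;;
            *) continue ;;              # ignore all other output
        esac
        path=${line#"W: Possible missing firmware "}
        path=${path%%" for module "*}
        # -e follows symlinks, so a dangling link also counts as missing
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}
```

Something like `update-initramfs -u 2>&1 | missing_firmware` would then list only the files that are really absent.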

And I get in the journalctl output:

Dec 19 04:02:54 qaa kernel: ------------[ cut here ]------------
Dec 19 04:02:54 qaa kernel: nouveau :01:00.0: timeout
Dec 19 04:02:54 qaa kernel: WARNING: CPU: 7 PID: 1347 at drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.c:1792 gf100_gr_init_ctxctl_ext+0x555/0x570 [nouveau]
Dec 19 04:02:54 qaa kernel: Modules linked in: cmac algif_hash algif_skcipher 
af_alg snd_hda_codec_hdmi qrtr bnep binfmt_misc snd_sof_pci_intel_tgl 
snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation 
snd_sof_intel_hda_mlink soundwire_cadence snd_sof_intel_hda snd_sof_pci 
snd_sof_xtensa_dsp intel_uncore_frequency snd_sof intel_uncore_frequency_common 
snd_sof_utils snd_soc_hdac_hda x86_pkg_temp_thermal snd_hda_ext_core 
intel_powerclamp snd_soc_acpi_intel_match btusb snd_soc_acpi coretemp 
snd_ctl_led btrtl snd_soc_core iwlmvm snd_hda_codec_realtek btbcm snd_compress 
btintel