Your message dated Fri, 18 Aug 2023 21:29:25 +0100 with message-id <2dadb28ca368809acbb9900196ab200e626ae565.ca...@adam-barratt.org.uk> and subject line Re: Bug#1044518: linux: "RIP: 0010:get_xsave_addr+0x9b/0xb0" stacktrace in early boot with -24 bullseye kernel has caused the Debian Bug report #1044518, regarding linux: "RIP: 0010:get_xsave_addr+0x9b/0xb0" stacktrace in early boot with -24 bullseye kernel to be marked as done.
This means that you claim that the problem has been dealt with. If this is not the case it is now your responsibility to reopen the Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this message is talking about, this may indicate a serious mail system misconfiguration somewhere. Please contact [email protected] immediately.)

-- 
1044518: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1044518
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Source: linux
Version: 5.10.179-5
User: [email protected]
Usertags: needed-by-DSA-Team
X-Debbugs-Cc: [email protected], [email protected]

Hi,

Since the kernels on both the host and guests were upgraded to 5.10.179-5 (from 5.10.179-3), the guests on one of our Ganeti clusters have been reporting as tainted. Looking at dmesg shows the following trace early in boot:

[ 0.201347] RIP: 0010:get_xsave_addr+0x9b/0xb0
[ 0.201351] Code: 48 83 c4 08 5b e9 15 80 bc 00 80 3d 8d 7c 80 01 00 75 a8 48 c7 c7 97 de 6b b2 89 74 24 04 c6 05 79 7c 80 01 01 e8 f5 96 88 00 <0f> 0b 8b 74 24 04 eb 89 31 c0 e9 e6 7f bc 00 66 0f 1f 44 00 00 89
[ 0.201353] RSP: 0000:ffffffffb2c03ec8 EFLAGS: 00010282
[ 0.201356] RAX: 0000000000000000 RBX: ffffffffb2e6a600 RCX: ffffffffb2cb3768
[ 0.201358] RDX: c0000000ffffefff RSI: 00000000ffffefff RDI: 0000000000000247
[ 0.201359] RBP: ffffffffb2e6a4a0 R08: 0000000000000000 R09: ffffffffb2c03ce8
[ 0.201361] R10: ffffffffb2c03ce0 R11: ffffffffb2ccb7a8 R12: 0000000000000246
[ 0.201362] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 0.201365] FS: 0000000000000000(0000) GS:ffff9588fbc00000(0000) knlGS:0000000000000000
[ 0.201367] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.201368] CR2: ffff9588fffff000 CR3: 000000008260a001 CR4: 00000000007308b0
[ 0.201373] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.201374] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.201376] Call Trace:
[ 0.201383] identify_cpu+0x51f/0x540
[ 0.201389] identify_boot_cpu+0xc/0x94
[ 0.201392] arch_cpu_finalize_init+0x5/0x47
[ 0.201395] start_kernel+0x4ec/0x599
[ 0.201401] secondary_startup_64_no_verify+0xb0/0xbb
[ 0.201406] ---[ end trace d7d9074a88473cb2 ]---

The systems seem to be running OK, but the stacktrace presumably points to an issue somewhere.

A sample kvm invocation for an affected guest is

ganeti04 18354 30.1 0.5 6015620 1114084 ? Sl Aug11 832:22 /usr/bin/kvm -name geo1.debian.org -m 1024 -smp 2 -pidfile /var/run/ganeti/kvm-hypervisor/pid/geo1.debian.org -device virtio-balloon -daemonize -D /var/log/ganeti/kvm/geo1.debian.org.log -machine pc-i440fx-5.2 -monitor unix:/var/run/ganeti/kvm-hypervisor/ctrl/geo1.debian.org.monitor,server,nowait -serial unix:/var/run/ganeti/kvm-hypervisor/ctrl/geo1.debian.org.serial,server,nowait -usb -display none -cpu host -uuid 36cf5fbc-1414-4b27-874e-ea3153150aa9 -device virtio-rng-pci,bus=pci.0,addr=0x1e,max-bytes=1024,period=1000 -global isa-fdc.fdtypeA=none -netdev type=tap,id=nic-6e9afdf8-ccaf-42e8,fd=10 -device virtio-net-pci,id=nic-6e9afdf8-ccaf-42e8,bus=pci.0,addr=0xd,netdev=nic-6e9afdf8-ccaf-42e8,mac=aa:00:00:46:8f:08 -incoming tcp:172.29.182.13:8102 -qmp unix:/var/run/ganeti/kvm-hypervisor/ctrl/geo1.debian.org.qmp,server,nowait -qmp unix:/var/run/ganeti/kvm-hypervisor/ctrl/geo1.debian.org.kvmd,server,nowait -boot c -device virtio-blk-pci,id=disk-8a45befd-be45-4b75,bus=pci.0,addr=0xc,drive=disk-8a45befd-be45-4b75 -drive file=/var/run/ganeti/instance-disks/geo1.debian.org:0,format=raw,if=none,aio=threads,cache=none,discard=unmap,id=disk-8a45befd-be45-4b75,auto-read-only=off -runas ganeti04

It seems that buster guests on the same host are unaffected, with similar-looking command lines.

The host's CPUs are Intel Xeon Silver 4110. Our other x86-64 clusters either use AMD CPUs (also with "-cpu host") or Xeon E5-2699 v3 CPUs, with "-cpu Haswell-noTSX".

Regards,

Adam
--- End Message ---
--- Begin Message ---
Version: 5.10.191-1

Hi,

On Tue, 2023-08-15 at 23:08 +0200, Salvatore Bonaccorso wrote:
> Hi Adam,
>
> On Tue, Aug 15, 2023 at 10:48:35PM +0200, Salvatore Bonaccorso wrote:
> > Control: tags -1 + upstream
> >
> > Hi Adam,
> >
> > On Tue, Aug 15, 2023 at 10:06:16PM +0200, Salvatore Bonaccorso wrote:
> > > Hi Adam,
> > >
> > > On Tue, Aug 15, 2023 at 09:37:36PM +0200, Salvatore Bonaccorso wrote:
> > > > Control: tags -1 + confirmed
> > > >
> > > > Hi Adam,
> > > >
> > > > On Tue, Aug 15, 2023 at 06:26:59PM +0100, Adam D. Barratt wrote:
> > > > > On Sun, 2023-08-13 at 18:21 +0100, Adam D. Barratt wrote:
> > > > > > Since the kernels on both the host and guests were upgraded to
> > > > > > 5.10.179-5 (from 5.10.179-3), the guests on one of our Ganeti
> > > > > > clusters have been reporting as tainted. Looking at dmesg shows
> > > > > > the following trace early in boot:
> > > > > > [...]
> > Quick summary: v5.10.190 upstream exhibits the same problem, so not a
> > backporting problem, and v5.10.191-rc1 for the upcoming 5.10.191 seems
> > to fix the issue.
>
> This should be fixed by b3607269ff57 ("x86/pkeys: Revert a5eff7259790
> ("x86/pkeys: Add PKRU value to init_fpstate")")[1] upstream, which is
> going to be applied in 5.10.191.
>
> [1] https://git.kernel.org/linus/b3607269ff57fd3c9690cb25962c5e4b91a0fd3b
>

I'm happy to confirm that the 5.10.191-1 kernel fixes this issue for us; closing appropriately.

Regards,

Adam
--- End Message ---
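
The report describes the guests as "reporting as tainted" after the trace appeared. As a rough illustration of how that state could be inspected, here is a minimal sketch that decodes the bitmask the kernel exposes in /proc/sys/kernel/tainted. It assumes the taint came from the warning behind the trace, which would set the W flag (bit 9, value 512); the thread itself does not state which flag was set. The helper names `decode_taint` and `read_taint` are hypothetical, and the table is only a subset of the flags in the kernel's tainted-kernels documentation.

```python
# Subset of the taint bits from the kernel's tainted-kernels documentation.
# Bits not listed here are reported as "?" rather than guessed.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    7: ("D", "kernel died recently (OOPS or BUG)"),
    9: ("W", "kernel issued a warning"),
    12: ("O", "externally-built (out-of-tree) module was loaded"),
    13: ("E", "unsigned module was loaded"),
}


def decode_taint(value):
    """Return (letter, meaning) pairs for every bit set in a taint bitmask."""
    flags = []
    bit = 0
    while value >> bit:
        if (value >> bit) & 1:
            unknown = ("?", "bit %d (not in this table)" % bit)
            flags.append(TAINT_FLAGS.get(bit, unknown))
        bit += 1
    return flags


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint bitmask; the file holds one decimal integer."""
    with open(path) as f:
        return int(f.read().strip())
```

On an affected guest, `decode_taint(read_taint())` would be expected to include the W entry if the assumption above holds; an untainted kernel reports 0, which decodes to an empty list.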

