He Zhe <zhe...@windriver.com> writes:
(Cc: Paolo)

> Hi All,
>
> We are experiencing a general protection fault with qemu-system-i386 as
> follow.
> This can be reproduced with kernel v5.15 and latest v6.2-rc3 as we found so
> far.
>
> It would work well if we reverted the commit
> 2f8a21d8ff3af484a37edc8ea61d127ec1529ab5 ("target/i386: Enable AVX cpuid bits
> when using TCG")
> introduced since qemu 7.2.
>
> We also tried setting cpu to Broadwell and Icelake-Server and got the same
> error.
>
> ./qemu-system-i386 -object rng-random,filename=/dev/urandom,id=rng0
> -device virtio-rng-pci,rng=rng0 -drive
> file=/tmp/rootfs.ext4,if=virtio,format=raw -usb -device usb-tablet
> -usb -device usb-kbd -cpu Haswell -machine q35,i8042=off -smp 4 -m
> 8192 -m 8192 -smp cpus=8 -serial mon:stdio -serial null -nographic
> -kernel /tmp/bzImage -append 'root=/dev/vda rw ip=dhcp console=ttyS0
> console=ttyS1 oprofile.timer=1 tsc=reliable no_timer_check
> rcupdate.rcu_expedited=1 '
>
> [ OK ] Started System Logging Service.
> [ 204.194033] traps: named[280] general protection fault ip:b7ef8545
> sp:bf8d5a1c error:0
> [ 204.198913] audit: type=1701 audit(1673507379.204:2):
> auid=4294967295 uid=997 gid=996 ses=4294967295 subj=kernel pid=280
> comm="named" ex1
> [ 204.219923] ------------[ cut here ]------------
> [ 204.220455] Bad FPU state detected at
> restore_fpregs_from_fpstate+0x3a/0x78, reinitializing FPU
> registers.
> [ 204.221442] WARNING: CPU: 4 PID: 274 at ../arch/x86/mm/extable.c:127
> fixup_exception+0x3f0/0x41c
> [ 204.223147] Modules linked in:
> [ 204.223945] CPU: 4 PID: 274 Comm: rs:main Q:Reg Not tainted 6.2.0-rc3 #1
> [ 204.224769] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
> [ 204.226061] EIP: fixup_exception+0x3f0/0x41c
> [ 204.226533] Code: ff ff 8d 74 26 00 0f 0b ba 4c c9 dc d1 e9 10 fd
> ff ff b1 01 89 44 24 04 c7 04 24 e0 44 98 d1 88 0d 69 87 cc d1 e8 8c
> bf
> [ 204.228038] EAX: 0000005e EBX: d1aee764 ECX: 00000027 EDX: 00000001
> [ 204.228498] ESI: c18efee4 EDI: 0000000d EBP: c18efe58 ESP: c18efddc
> [ 204.229102] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000086
> [ 204.229662] CR0: 80050033 CR2: bf8d5d54 CR3: 02aaf000 CR4: 001506d0
> [ 204.230408] Call Trace:
> [ 204.232101] ? restore_fpregs_from_fpstate+0x3a/0x78
> [ 204.232733] ? __switch_to_asm+0x1c/0xe4
> [ 204.233028] ? __schedule+0x28c/0x844
> [ 204.233362] ? _raw_spin_lock+0x10/0x34
> [ 204.233829] exc_general_protection+0x81/0x340
> [ 204.234403] ? futex_wait+0xb4/0x190
> [ 204.234818] ? exc_bounds+0xa4/0xa4
> [ 204.235054] handle_exception+0x133/0x133
> [ 204.235629] EIP: restore_fpregs_from_fpstate+0x3a/0x78

It looks like this is failing on:

  /*
   * Use XRSTORS to restore context if it is enabled. XRSTORS supports compact
   * XSAVE area format.
   */
  #define XSTATE_XRESTORE(st, lmask, hmask)                             \
          asm volatile(ALTERNATIVE(XRSTOR,                              \
                                   XRSTORS, X86_FEATURE_XSAVES)         \
                       "\n"                                             \
                       "3:\n"                                           \
                       _ASM_EXTABLE_TYPE(661b, 3b, EX_TYPE_FPU_RESTORE) \
                       :                                                \
                       : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)  \
                       : "memory")

possibly triggering an exception when doing XRSTORS (but it's hard to
follow the alternative code). The xrstors instruction is tested by
check-tcg, but maybe there is a kernel-mode subtlety that is missed.
Hopefully Paolo can see better than me.
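As a quick sanity check on the register dump, here is a minimal sketch
that decodes the CR4 value from the oops (001506d0) into flag names. The
bit positions are the architectural ones from the Intel SDM; the helper
names cr4_has() and cr4_format() are made up for this illustration and
are not kernel or QEMU functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Architectural CR4 bit positions (Intel SDM vol. 3, "Control Registers"). */
struct cr4_flag {
    unsigned bit;
    const char *name;
};

static const struct cr4_flag cr4_flags[] = {
    { 0, "VME" },      { 1, "PVI" },    { 2, "TSD" },         { 3, "DE" },
    { 4, "PSE" },      { 5, "PAE" },    { 6, "MCE" },         { 7, "PGE" },
    { 8, "PCE" },      { 9, "OSFXSR" }, { 10, "OSXMMEXCPT" }, { 11, "UMIP" },
    { 12, "LA57" },    { 13, "VMXE" },  { 14, "SMXE" },
    { 16, "FSGSBASE" }, { 17, "PCIDE" }, { 18, "OSXSAVE" },
    { 20, "SMEP" },    { 21, "SMAP" },  { 22, "PKE" },
};

/* Hypothetical helper: returns 1 if the given bit is set in cr4. */
static int cr4_has(uint32_t cr4, unsigned bit)
{
    return (int)((cr4 >> bit) & 1u);
}

/*
 * Hypothetical helper: writes the space-separated names of the set flags
 * into buf (always NUL-terminated when len > 0) and returns the number of
 * characters written. Output is cut short if buf is too small.
 */
static size_t cr4_format(uint32_t cr4, char *buf, size_t len)
{
    size_t used = 0;

    assert(buf != NULL);
    if (len == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(cr4_flags) / sizeof(cr4_flags[0]); i++) {
        if (!cr4_has(cr4, cr4_flags[i].bit))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", cr4_flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* out of space: keep what fits */
        used += (size_t)n;
    }
    return used;
}
```

Feeding it 0x001506d0 gives "PSE MCE PGE OSFXSR OSXMMEXCPT FSGSBASE
OSXSAVE SMEP". So the guest kernel has set CR4.OSXSAVE, meaning the
XSAVE family is enabled. That alone does not say whether the
ALTERNATIVE was patched to XRSTORS or left as XRSTOR, which depends on
X86_FEATURE_XSAVES.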
> [ 205.769853] EIP: entry_SYSENTER_32+0xe0/0xf1
> [ 205.769887] Code: 8b 54 24 30 8b 4c 24 3c 8e 64 24 24 5b 83 c4 08
> 5e 5f 5d 89 c4 eb 0b 0f 20 d8 0d 00 10 00 00 0f 22 d8 0f ba 34 24 09
> 96
> [ 205.769913] EAX: 00000000 EBX: 012b373c ECX: b69feff0 EDX: b7f59549
> [ 205.769933] ESI: 00000000 EDI: 00000000 EBP: ffffffff ESP: ff8b0000
> [ 205.769952] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 EFLAGS: 00000282
> [ 205.769975] CR0: 80050033 CR2: bf602e00 CR3: 02aaf000 CR4: 001506d0
> [ 205.799858] systemd (1) used greatest stack depth: 5568 bytes left
> [ 205.799994] Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0000000b
> [ 205.805801] Kernel Offset: disabled
> [ 205.806723] ---[ end Kernel panic - not syncing: Attempted to kill init!
> exitcode=0x0000000b ]---
>
> System hangs...
>
>
> Regards,
> Zhe

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro