* Daniel P. Berrangé:

> On Mon, Jul 03, 2023 at 06:03:08PM +0200, Pierrick Bouvier wrote:
>> Hi everyone,
>>
>> Recently (in d135f781 [1], between v7.0.0 and v8.0.0), qemu-user default cpu
>> was updated to "max" instead of qemu32/qemu64.
>>
>> This change "broke" qemu self emulation if this new default cpu is used.
>>
>> $ ./qemu-x86_64 ./qemu-x86_64 --version
>> qemu-x86_64: ../util/cacheflush.c:212: init_cache_info: Assertion `(isize &
>> (isize - 1)) == 0' failed.
>> qemu: uncaught target signal 6 (Aborted) - core dumped
>> Aborted
>>
>> By setting cpu back to qemu64, it works again.
>> $ ./qemu-x86_64 -cpu qemu64 ./qemu-x86_64 --version
>> qemu-x86_64 version 8.0.50 (v8.0.0-2317-ge125b08ed6)
>> Copyright (c) 2003-2023 Fabrice Bellard and the QEMU Project developers
>>
>> Commenting assert does not work, as qemu aligned malloc fail shortly after.
>>
>> I'm willing to fix it, but I'm not sure what is the issue with "max" cpu
>> exactly. Is it missing CPU cache line, or something else?
>
> I've observed GLibC is issuing CPUID leaf 0x8000_001d
>
> QEMU 'max' CPU model doesn't define xlevel, so QEMU makes it default
> to the same as min_xlevel, which is calculated to be 0x8000_000a.
>
> cpu_x86_cpuid() in QEMU sees CPUID leaf 0x8000_001d is above 0x8000_000a,
> and so considers it an invalid CPUID and thus forces it to report
> 0x0000_000d which is supposedly what an invalid CPUID leaf should do.
>
>
> Net result: glibc is asking for 0x8000_001d, but getting back data
> for 0x0000_000d.
>
> This doesn't end happily for obvious reasons, getting garbage for
> the dcache sizes.
>
>
> The 'qemu64' CPU model also gets CPUID leaf 0x8000_001d capped back
> to 0x0000_000d, but crucially qemu64 lacks the 'xsave' feature bit,
> so QEMU returns all-zeroes for CPUID leaf 0x0000_000d. Still not
> good, but this makes glibc report 0 for DCACHE_*, which in turn
> avoids tripping up the nested qemu which queries DCACHE sysconf.
>
> So the problem is thus more widespread than just 'max' CPU model.
>
> Any QEMU CPU model with vendor=AuthenticAMD and the xsave feature,
> and the xlevel unset, will cause glibc to report garbage for the
> L1D cache info
>
> Any QEMU CPU model with vendor=AuthenticAMD and without the xsave
> feature, and the xlevel unset, will cause glibc to report zeroes
> for L1D cache info
>
> Neither is good, but the latter at least doesn't trip up the
> nested QEMU when it queries L1D cache info.
>
> I'm unsure if QEMU's behaviour is correct with calculating the
> default 'xlevel' values for 'max', but I'm assuming the xlevel
> was correct for Opteron_G4/5 since those are explicitly set
> in the code for a long time.

We are tracking this as:

  New AMD cache size computation logic does not work for some CPUs,
  hypervisors
  <https://sourceware.org/bugzilla/show_bug.cgi?id=30428>

I filed it after we resolved the earlier crashes because the data is
clearly not accurate.  I was also able to confirm that it impacts more
than just hypervisors.

Sajan posted a first patch:

  [PATCH] x86: Fix for cache computation on AMD legacy cpus.
  <https://sourceware.org/pipermail/libc-alpha/2023-June/148763.html>

However, it changes the reported cache sizes on some older CPUs compared
to what we had before (although the values are no longer zero at least).

Thanks,
Florian