On 7/3/2025 12:14 PM, Mi, Dapeng wrote:
> On 6/20/2025 5:27 PM, Zhao Liu wrote:
>> For a long time, the default cache models used in CPUID 0x2 and
>> 0x4 were inconsistent and had a FIXME note from Eduardo at commit
>> 5e891bf8fd50 ("target-i386: Use #defines instead of magic numbers for
>> CPUID cache info"):
>>
>> "/*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */".
>>
>> This difference is wrong, in principle, both 0x2 and 0x4 are used for
>> Intel's cache description. 0x2 leaf is used for ancient machines while
>> 0x4 leaf is a subsequent addition, and both should be based on the same
>> cache model. Furthermore, on real hardware, 0x4 leaf should be used in
>> preference to 0x2 when it is available.
>>
>> Revisiting the git history, that difference occurred much earlier.
>>
>> Current legacy_l2_cache_cpuid2 (hardcode: "0x2c307d"), which is used for
>> CPUID 0x2 leaf, is introduced in commit d8134d91d9b7 ("Intel cache info,
>> by Filip Navara."). Its commit message didn't said anything, but its
>> patch [1] mentioned the cache model chosen is "closest to the ones
>> reported in the AMD registers". Now it is not possible to check which
>> AMD generation this cache model is based on (unfortunately, AMD does not
>> use 0x2 leaf), but at least it is close to the Pentium 4.
>>
>> In fact, the patch description of commit d8134d91d9b7 is also a bit
>> wrong, the original cache model in leaf 2 is from Pentium Pro, and its
>> cache descriptor had specified the cache line size ad 32 byte by default,
>> while the updated cache model in commit d8134d91d9b7 has 64 byte line
>> size. But after so many years, such judgments are no longer meaningful.
>>
>> On the other hand, for legacy_l2_cache, which is used in CPUID 0x4 leaf,
>> is based on Intel Core Duo (patch [2]) and Core2 Duo (commit e737b32a3688
>> ("Core 2 Duo specification (Alexander Graf).")
>>
>> The patches of Core Duo and Core 2 Duo add the cache model for CPUID
>> 0x4, but did not update CPUID 0x2 encoding. This is the reason that
>> Intel Guests use two cache models in 0x2 and 0x4 all the time.
>>
>> Of course, while no Core Duo or Core 2 Duo machines have been found for
>> double checking, this still makes no sense to encode different cache
>> models on a single machine.
>>
>> Referring to the SDM and the real hardware available, 0x2 leaf can be
>> directly encoded 0xFF to instruct software to go to 0x4 leaf to get the
>> cache information, when 0x4 is available.
>>
>> Therefore, it's time to clean up Intel's default cache models. As the
>> first step, add "x-consistent-cache" compat option to allow newer
>> machines (v10.1 and newer) to have the consistent cache model in CPUID
>> 0x2 and 0x4 leaves.
>>
>> This doesn't affect the CPU models with CPUID level < 4 ("486",
>> "pentium", "pentium2" and "pentium3"), because they have already had the
>> special default cache model - legacy_intel_cpuid2_cache_info.
>>
>> [1]: 
>> https://lore.kernel.org/qemu-devel/5b31733c0709081227w3e5f1036odbc649edfdc8c...@mail.gmail.com/
>> [2]: https://lore.kernel.org/qemu-devel/478b65c8.2080...@csgraf.de/
>>
>> Cc: Alexander Graf <ag...@csgraf.de>
>> Signed-off-by: Zhao Liu <zhao1....@intel.com>
>> ---
>>  hw/i386/pc.c      | 4 +++-
>>  target/i386/cpu.c | 7 ++++++-
>>  target/i386/cpu.h | 7 +++++++
>>  3 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index b2116335752d..ad2d6495ebde 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -81,7 +81,9 @@
>>      { "qemu64-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },\
>>      { "athlon-" TYPE_X86_CPU, "model-id", "QEMU Virtual CPU version " v, },
>>  
>> -GlobalProperty pc_compat_10_0[] = {};
>> +GlobalProperty pc_compat_10_0[] = {
>> +    { TYPE_X86_CPU, "x-consistent-cache", "false" },
>> +};
>>  const size_t pc_compat_10_0_len = G_N_ELEMENTS(pc_compat_10_0);
>>  
>>  GlobalProperty pc_compat_9_2[] = {};
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 0a2c32214cc3..2f895bf13523 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -8931,7 +8931,11 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
>> **errp)
>>          /* Build legacy cache information */
>>          env->cache_info_cpuid2.l1d_cache = &legacy_l1d_cache;
>>          env->cache_info_cpuid2.l1i_cache = &legacy_l1i_cache;
>> -        env->cache_info_cpuid2.l2_cache = &legacy_l2_cache_cpuid2;
>> +        if (!cpu->consistent_cache) {
>> +            env->cache_info_cpuid2.l2_cache = &legacy_l2_cache_cpuid2;
>> +        } else {
>> +            env->cache_info_cpuid2.l2_cache = &legacy_l2_cache;
>> +        }
> This would encode the valid L1 and L3 cache descriptors and "0xff"
> descriptor into CPUID leaf 0x2 when there is CPUID leaf 0x4. It seems a
> little bit of ambiguous to mix "0xff" descriptor with other valid
> descriptors and it isn't identical with real HW. Do we consider to make it
> identical with real HW? Thanks.

Just found the subsequent patch 05/16 has addressed this concern. Please
ignore this comment.


>
>
>>          env->cache_info_cpuid2.l3_cache = &legacy_l3_cache;
>>  
>>          env->cache_info_cpuid4.l1d_cache = &legacy_l1d_cache;
>> @@ -9457,6 +9461,7 @@ static const Property x86_cpu_properties[] = {
>>       * own cache information (see x86_cpu_load_def()).
>>       */
>>      DEFINE_PROP_BOOL("legacy-cache", X86CPU, legacy_cache, true),
>> +    DEFINE_PROP_BOOL("x-consistent-cache", X86CPU, consistent_cache, true),
>>      DEFINE_PROP_BOOL("legacy-multi-node", X86CPU, legacy_multi_node, false),
>>      DEFINE_PROP_BOOL("xen-vapic", X86CPU, xen_vapic, false),
>>  
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index 5910dcf74d42..3c7e59ffb12a 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -2259,6 +2259,13 @@ struct ArchCPU {
>>       */
>>      bool legacy_cache;
>>  
>> +    /*
>> +     * Compatibility bits for old machine types.
>> +     * If true, use the same cache model in CPUID leaf 0x2
>> +     * and 0x4.
>> +     */
>> +    bool consistent_cache;
>> +
>>      /* Compatibility bits for old machine types.
>>       * If true decode the CPUID Function 0x8000001E_ECX to support multiple
>>       * nodes per processor

Reply via email to