On Wed, Jul 09, 2025 at 11:03:19 +0200, Hector CAO wrote:
> From: Hector Cao <hector....@canonical.com>
>
> Add documentation on the way libvirt displays the Host CPU
> model and capabilities (features). There is an implicit
> expection from users to get the CPU model name matching the

s/expection/expectation/

> CPU model they are running on, however, this does not happen
> most of the time. As a consequence, having a documentation
> is useful both for users to align their expectation and for
> us to point to a place where the situation is clearly explained.
>
> Signed-off-by: Hector Cao <hector....@canonical.com>
> ---
> docs/formatcaps.rst | 79 +++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 79 insertions(+)
>
> diff --git a/docs/formatcaps.rst b/docs/formatcaps.rst
> index 7e525487e7..43098ee2cf 100644
> --- a/docs/formatcaps.rst
> +++ b/docs/formatcaps.rst
> @@ -371,3 +371,82 @@ capabilities enabled in the chip and BIOS you will see:
> </guest>
>
> </capabilities>
> +
> +Host CPU model and features
> +~~~~~~~~~~~~~~~~~

Title underline too short. And I think this section should be a
subsection of Host capabilities. So please move it there and change ~ to
something else (^ for example) to indicate a nested heading.

> +As described in the (`Host capabilities`_) section, libvirt exposes to
> +users the list of Host CPU features. Libvirt has a special way to
> +expose this list:

Then there's no need to reference the Host capabilities section anymore.

> Instead of providing the full - and thereby often
> +very long - set of features, libvirt specifies a CPU model name as

s/name // (the CPU model is baseline + features, while the CPU model
*name* would mean just the baseline itself)

Typographic rules would dictate either using n-dash with spaces around
or m-dash with no space. But it seems getting those from rst may be
tricky so I suggest using "(and thereby often very long)" instead.

> +baseline and additional features on top of it.
> +
> +Example:
> +
> +::
> +
> +  <capabilities>
> +
> +    <host>
> +      <uuid>7b55704c-29f4-11b2-a85c-9dc6ff50623f</uuid>
> +      <cpu>
> +        <arch>x86_64</arch>
> +        <model>Skylake-Client-noTSX-IBRS</model>
> +        <vendor>Intel</vendor>
> +        ...
> +        <feature name='ds'/>
> +        <feature name='acpi'/>
> +        <feature name='ss'/>
> +        <feature name='ht'/>
> +        <feature name='tm'/>
> +        ...

I don't think it's necessary to copy a part of the capabilities XML
here, we can just refer to the full example below.

> +
> +The ideal case would be that the baseline CPU model definition matches
> +exactly the CPU present in the system and no additional feature is
> +needed to express the capabilities of the CPU. For example, if you are
> +running on a Server CPU you bought as ``Icelake`` type, the returned
> +CPU model name could be ``Icelake-Server``. However, this ideal
> +situation rarely happens, for two main reasons:
> +
> +- Manufacturers do not ship one (=1) type of CPU under a given name,

I would remove the "(=1)" part, or maybe even rewrite this as
"... ship a single type ..."

> +  there are various different SKUs with different features under the
> +  same name. Yet it is not practical to have a database listing all of
> +  those variants. Instead Libvirt only has a list of a baseline CPU
> +  model names - roughly one per generation - in
> +  ``/usr/share/libvirt/cpu_map/``.

I think mentioning cpu_map path is going too much into internals mainly
because those files are not supposed to be consumed by users as their
format is not guaranteed to be backward compatible. We could just drop
the last sentence.

> +
> +- Some features might be in the hardware, but unavailable for various
> +  reasons (BIOS and kernel configuration, disabled for security,
> +  ...). One typical example where this situation happens is related to
> +  the TSX mitigation [1]. As a mitigation to the TAA side channel

Use "`TSX mitigation <https://docs.kernel.org/arch/x86/tsx_async_abort.html>`_"
instead of "TSX mitigation [1]".

> +  attack, the Linux kernel disables by default TSX and its 2 features,
> +  ``rtm`` and ``hle``. Since many Linux distributions keep this safer
> +  default behavior these 2 features appear as disabled.

"disabled" might be confusing as host capabilities do not show any
features as disabled. Maybe "missing" would be a better word here?

> +It chooses the named baseline model that shares the greatest number of
> +features (CPUID bits and MSR features) with the actual CPU present in
> +the machine and then lists the remaining named features as differences
> +to that known name. As a consequence, the list of detected features
> +is rarely a perfect match to a baseline model name. Sometimes that
> +just means that you'll get the right name, but still a long list of
> +features enabled or disabled on top of it. At other times it might
> +even lead to a different named baseline model, usually an older CPU
> +generation, being closer to the features libvirt finds in the CPU
> +present in the system. In that cases it is closer to express the
> +capabilities via an older name e.g. ``Broadwell`` plus some features
> +than calling it ``Icelake`` with many more features disabled. Due to
> +all that Libvirt might sometimes display the an unexpected CPU model
> +name, but that is fine - the purpose is not to confirm what
> +generation-branding the chip was sold by, but instead the shortest set
> +of named baseline model +/- features to express its capabilities.
> +
> +Some effort has been done to address these situations (like ``-noTSX``
> +variants are added to cover the missing TSX features mentioned above)
> +and offer users the ability to more often see the CPU model name they
> +expect, but this can never be fully complete. Therefore users *should
> +not* expect to have the reported CPU model name to have any
> +implications other than that of a named baseline to build the complete
> +available feature set of the Host CPU.
> +
> +[1] https://docs.kernel.org/arch/x86/tsx_async_abort.html

I would just drop these paragraphs completely. Host capabilities cannot
express disabled features.
Only the CPU model in domain capabilities can describe a CPU as a base
model with some features disabled and other features added on top. But
in that case the baseline selection relies mostly on CPU signatures to
select the right CPU model name regardless of what features are actually
available. We use heuristics based on the list of extra enabled/disabled
features only when the host's CPU signature does not match anything in
our database. And even in that case the CPU model selection process is a
bit more complicated.

Thus it's better not to go into the details, especially when we just
want users to ignore the CPU model name in host capabilities. I suggest
something along the following lines instead:

    Because for backward compatibility reasons host capabilities cannot
    list features that would need to be removed from the baseline model
    to describe the host CPU, libvirt often has to use a rather old CPU
    model, for example, ``Broadwell`` rather than ``Icelake``. Therefore
    users *should not* expect the reported CPU model name to have any
    implications other than that of a named baseline to build the
    complete available feature set of the Host CPU.
    `Domain capabilities <formatdomaincaps.html#cpu-configuration>`_ do
    not have such a limitation and the ``host-model`` CPU definition
    would show the correct CPU model in almost all cases.

Jirka
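
For illustration only, here is a minimal sketch of how a client could read the host CPU description as "baseline model plus features on top" rather than as a product name. The XML fragment is a trimmed, hypothetical sample built from the example quoted in the patch (real output would come from `virsh capabilities`), and `host_cpu_summary` is a helper name made up for this sketch, not part of libvirt.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed host capabilities fragment based on the example
# quoted in the patch; real output would come from `virsh capabilities`.
CAPS_XML = """
<capabilities>
  <host>
    <cpu>
      <arch>x86_64</arch>
      <model>Skylake-Client-noTSX-IBRS</model>
      <vendor>Intel</vendor>
      <feature name='ds'/>
      <feature name='acpi'/>
      <feature name='ss'/>
      <feature name='ht'/>
      <feature name='tm'/>
    </cpu>
  </host>
</capabilities>
"""


def host_cpu_summary(caps_xml):
    """Return (baseline model, extra feature names) from host capabilities XML."""
    cpu = ET.fromstring(caps_xml).find("./host/cpu")
    if cpu is None:
        raise ValueError("no <host><cpu> element found")
    model = cpu.findtext("model")
    # Host capabilities only list features added on top of the baseline;
    # they cannot express features removed from it.
    features = [f.get("name") for f in cpu.findall("feature")]
    return model, features


model, features = host_cpu_summary(CAPS_XML)
print(f"baseline: {model}")
print(f"features on top: {', '.join(features)}")
```

The point of the sketch is that the `<model>` value is only the starting point for the feature set, so it should not be compared against the marketing name of the chip.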