On Mon, 11 May 2026 17:56:15 +0100,
Paolo Bonzini <[email protected]> wrote:
> 
> On 5/11/26 18:38, David Woodhouse wrote:
> > Not *everything* is in CPUID; one recent exception that comes to mind
> > is the SUPPRESS_EOI_BROADCAST quirk. But on x86 we preserve the
> > existing behaviour of older kernels — even when that behaviour doesn't
> > make much sense, as with SUPPRESS_EOI_BROADCAST where older KVM would
> > *advertise* the feature, but not actually *implement* it. Nevertheless,
> > that remains the default behaviour of future kernels unless userspace
> > explicitly opts in to fully enable (or disable) the feature.
> > 
> > But this documentation update isn't even asking for that compatible-by-
> > default behaviour, even though that is the right thing to do. It's only
> > asking that it be *possible* to reinstate the old behaviour, for
> > userspace that *knows* about the change and explicitly wants to go back
> > to the old way to remain compatible.
> 
> Yep, these are the "quirks"---if it's too early for Arm to commit to
> that, I guess it's fine.

Compatible by default means nothing, because userspace needs to
discover the combined capabilities of the host and KVM. This is not a
"CPU model" architecture.

If userspace is not a total joke, it will read all the ID registers,
and configure what it wants to see, assuming it is a feature that can
be configured (not everything can, because the architecture itself is
not fully backward compatible).
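
To make the "read, then configure" step concrete, here is a rough
sketch of the bit manipulation involved in lowering a single ID
register field. The helper names (id_field_get, id_field_set,
id_field_limit) and the writable_mask parameter are made up for
illustration and are not an existing API. It assumes a 4-bit unsigned
field where a larger value means more functionality, which does not
hold for every field (some are signed). Real userspace would obtain
the register value with KVM_GET_ONE_REG and write it back with
KVM_SET_ONE_REG.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract the 4-bit field starting at bit `shift`. */
static uint64_t id_field_get(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);
	return (reg >> shift) & 0xf;
}

/* Hypothetical helper: return `reg` with the 4-bit field replaced by `val`. */
static uint64_t id_field_set(uint64_t reg, unsigned int shift, uint64_t val)
{
	assert(shift <= 60);
	reg &= ~(UINT64_C(0xf) << shift);
	return reg | ((val & 0xf) << shift);
}

/*
 * Hypothetical helper: lower a field to `wanted` if that is below what
 * the host reports. `writable_mask` is assumed to have all four bits of
 * the field set when the field can be changed.
 *
 * Returns 0 on success (including "nothing to do"), and -1 when the
 * field would need to change but is not configurable.
 */
static int id_field_limit(uint64_t *reg, uint64_t writable_mask,
			  unsigned int shift, uint64_t wanted)
{
	uint64_t host = id_field_get(*reg, shift);

	/* Never advertise more than the host has; leave the value alone. */
	if (wanted >= host)
		return 0;

	/* Not everything can be configured. */
	if (((writable_mask >> shift) & 0xf) != 0xf)
		return -1;

	*reg = id_field_set(*reg, shift, wanted);
	return 0;
}
```

A VMM would run something like this over every field it cares about
before writing the register back, and refuse to start (or migrate) the
guest when it gets -1 for a field it cannot do without.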

Yes, this is buggy at times, because the combinatorial explosion of
CPU capabilities and supported features makes it pretty hard to test
(and really nobody actually does). But overall, it works, and QEMU is
growing an infrastructure to manage it in a "user friendly" way.

But really, this isn't what David is asking for. He's demanding "bug
for bug" compatibility. For that, we have two possible cases:

- this is a behaviour that, while undesirable, is allowed by the
  architecture: fine, we preserve the behaviour and add another way to
  expose the one we really want. It is ugly, but we manage.

- this is a behaviour that is not allowed by the architecture: we fix
  it for good. We do that on every release. Some fixes are minor, some
  much more visible. And there is no way we will add this sort of
  "bring the bugs back" behaviour, especially when it is really obvious
  that no SW can make any reasonable use of the defect. We allow
  userspace to keep behaving as before, but the guest will not see a
  non-compliant behaviour.

That being said, there is a way out of that: convince the people in
charge of the architecture that the non-compliant KVM behaviour is
actually valuable, and deserves to be tolerated. This has happened
before (VHE-only and NV2-only, just to name two recent changes).

Other terrible hacks (such as GICv3's GICD_TYPER.num_LPIs, which KVM
doesn't support) were added at the request of cloud vendors that David
might be familiar with, so it isn't as if this is a brand new process.

And once it is in the architecture, it becomes a behaviour that is
allowed to be exposed to a guest, for better or worse.

These are the rules we have followed since we started KVM/arm, and I
intend to stick to them.

Thanks,

        M.

-- 
Without deviation from the norm, progress is not possible.
