On 10/13/25 10:22, Zhao Liu wrote:
On Fri, Oct 10, 2025 at 08:40:56PM +0300, Michael Tokarev wrote:
I found the previous 2 fixes were merged into stable 10.0:

24778b1c7ee7aca9721ed4757b0e0df0c16390f7
3d26cb65c27190e57637644ecf6c96b8c3d246a3

Should stable 10.0 revert these 2 fixes, to ensure migration
compatibility?

Sorry for the late reply... just returned from vacation.

I returned from vacation today too :)

Now that I think about it.

There were at least 2 point releases of 10.0.x (10.0.4 & 10.0.5)
with these 2 patches already.

EMM, it seems 10.0.x (x < 4) can't migrate to 10.0.y (4 <= y <= 5),
right? If so, could we treat this behavior as a regression?

It is a regression in 10.0.4 indeed.  But it already lasted for
2 stable releases (10.0.4 & 10.0.5).  So by reverting the above
mentioned two changes in 10.0.6, we'll make yet another regression,
now when migrating from 10.0.[45] to 10.0.6. This is why I thought
it might be an idea to keep just one regression in 10.0.x, so to
say.  Especially since these changes already fix issues with
existing guests, so by reverting them, we'll bring them back to
10.0.x.

It is an either-or combination.  It is not bad either way, I'm just
thinking what is best currently.

And with my limited understanding of the migration issue in this context
(for which I asked for clarification some 5 or 6 times already), it
feels to me like we could "pretend" the 2 patches mentioned above have
always been part of 10.0.x: declare that migration won't work from
10.0.[1-3] (or [1-5]?) to subsequent versions, and be done with it.

And modify the 2 properties introduced by:

6529f31e0d target/i386: add compatibility property for pdcm feature
e9efa4a771 target/i386: add compatibility property for arch_capabilities

to be part of pc_compat_9_2, not pc_compat_10_0.
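
To illustrate what moving a property between compat lists would change,
here is a minimal sketch in Python (not QEMU code).  It assumes the usual
versioned-machine chaining, where an older machine type inherits the compat
lists of all newer ones.  The property name and its polarity are
hypothetical placeholders, not the names used by the two commits above.

```python
# Simplified model of the versioned-machine compat chain.  Assumption:
# each older machine type also applies every newer compat list.
# "x-hypothetical-pdcm-compat" is a made-up property name.
VERSIONS = ["9.2", "10.0", "10.1"]  # oldest -> newest


def effective_props(machine, compat, default):
    """Return the property values seen by machine type `machine`.

    compat[v] holds the overrides attached to version v; they apply to
    machine type v and to every older machine type.
    """
    props = dict(default)
    idx = VERSIONS.index(machine)
    # Walk from the newest list down to the machine's own list, so that
    # older lists override newer ones.
    for v in reversed(VERSIONS[idx:]):
        props.update(compat.get(v, {}))
    return props


DEFAULT = {"x-hypothetical-pdcm-compat": "false"}  # new behaviour

# Override attached to the 10.0 compat list (as in the series discussed).
in_10_0 = {"10.0": {"x-hypothetical-pdcm-compat": "true"}}
# Override attached to the 9.2 compat list (the variant proposed here).
in_9_2 = {"9.2": {"x-hypothetical-pdcm-compat": "true"}}

for name, compat in (("compat in 10.0", in_10_0), ("compat in 9.2", in_9_2)):
    for m in VERSIONS:
        print(name, m, effective_props(m, compat, DEFAULT))
```

In this model, the only machine type whose value differs between the two
variants is 10.0: it keeps the old behaviour when the override sits in the
10.0 list, and gets the new behaviour when the override sits in the 9.2
list.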

Hopefully it's understandable what I mean.

Reverting them in 10.0 will make
10.0 non-migratable with itself (10.0.5 can't be migrated
to 10.0.6 if we release 10.0.6 with these 2 patches reverted).

Also, as far as I can see (and I asked about this some 5 times
already, with no one answering - is it that difficult?) - we
should pick this series (pdcm, arch-capabilities) to 10.1.x stable
series too, since we can't migrate from previous versions to 10.1
which has the two changes mentioned above.

I think so. In this series, Paolo added compat options in pc_compat_10_0
so it should be picked to stable v10.1.

Again, I asked about this some 5 times already, with no single
answer.

It looks to me that, since the breakage is already done and both 10.0
and 10.1 are broken, we should declare the current situation the
status quo, and do the following:

1. keep the above mentioned 24778b1c7ee7a and 3d26cb65c27190e5 in
    10.0.x (instead of reverting them);

2. pick up these 2 patches (fix cross migration issue with missing
    pdcm, arch-capabilities) to 10.1.x (it should be done either way,
    I think);

IIUC, if we picked the current compat options to stable v10.1, then stable
v10.1 requires that the previous v10.0 sets the pdcm & arch-cap bits (i.e.,
does not apply the fixes, or reverts the previous fixes).

Ugh.  Confusion++ :)  As you wrote yourself right above, "Paolo added
compat options in pc_compat_10_0, so it should be picked up to stable
10.1".  This point "2" is exactly this case I'm talking about.  Two
commits:

6529f31e0d target/i386: add compatibility property for pdcm feature
e9efa4a771 target/i386: add compatibility property for arch_capabilities

should be picked up for 10.1.x.

This "2" point is not (yet) about 10.0.x.


So it seems the reverts are unavoidable on v10.0?

(Let's see what Paolo and the other maintainers think.)

For 10.0, there are 2 either-or options: either we revert, or we
pretend these have always been in 10.0.x and compensate, as I described
in my previous email in this thread (to which you're replying) and
am re-describing now.

3. on top of these 2 "missing features: pdcm, arch-capabilities",
    make the crossing line before 10.0, not before the 10.1 series,
    i.e., consider that 10.0 *also* has these properties, but 9.2 and
    earlier do not.

This issue is indeed quite tricky. Sometimes people (including myself)
assume that backporting fixes to the stable branch can avoid adding a
compat option. Now it seems the compat option is the better choice, as
users need to ensure migration rather than downtime before upgrading to
the stable version :-(.

It's a good (hopefully) lesson for me: I blindly picked up
a change which felt innocent (I even mentioned in the commit
that it's a "cleanup patch"), just so a subsequent change in this area
applies cleanly.  But it wasn't a cleanup, and it wasn't trivial at
all.  So I must be much more careful next time.  I'm talking about
3d26cb65c2 "Move adjustment of CPUID_EXT_PDCM..".

Speaking of the other change - it fixed a real bug which I hit myself,
and I had no idea it's tricky - actually no one had this idea until
e9efa4a771 "property for arch_capabilities".  So yes, this is a "sh*t
happens" case :)


Ok.

So, back to the situation and the plan (two of them).


1. It looks like we agree we should pick

6529f31e0d target/i386: add compatibility property for pdcm feature
e9efa4a771 target/i386: add compatibility property for arch_capabilities

to 10.1.x, to make migration from older versions to 10.1.x work.


2.  For 10.0.x, we've two options:

 2.a.  Revert
    24778b1c7e "do not expose ARCH_CAPABILITIES"
    3d26cb65c2 "Move adjustment of CPUID_EXT_PDCM"
  as you initially suggested and already reviewed.

  This will make 10.0.[45] "bad" wrt migration, and will re-create the
  issues these 2 commits fixed, but will make next 10.0.x as good as
  initial 10.0.0 wrt migration.

 2.b.  Instead of reverting these two, which are already in 10.0.[45],
  pretend 10.0 always had these 2 commits, and adjust subsequent
  qemu versions just like we did with the 2 "add compatibility property"
  changes, but make it a 9.2-compat property, not a 10.0-compat
  property.

  This, as far as I can see, will make 10.0.[0-3] "bad" wrt
  migration, but not subsequent 10.0.x ones.  And it will keep the bugs
  fixed in 10.0.x too.

But again, I don't understand the migration logic well, so don't know
if it even makes sense.  2.b, if deemed to be good, will be the first
in history (I think) to introduce compat properties for past machine
types.
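
To compare the two options side by side, here is a toy model in Python.
It is not QEMU's real migration check.  It only assumes that a 10.0 machine
type migrates cleanly between two 10.0.z releases when both expose the same
guest-visible CPUID behaviour, and that 10.0.4 and 10.0.5 are the releases
carrying the 2 fixes.  The function names are made up for illustration.

```python
# Toy model: "old" = behaviour before the 2 fixes, "new" = with them.
RELEASED = {0: "old", 1: "old", 2: "old", 3: "old", 4: "new", 5: "new"}


def behaviour(z, plan):
    """CPUID behaviour of release 10.0.z under plan '2.a' or '2.b'."""
    if z in RELEASED:
        return RELEASED[z]
    # Future releases (10.0.6 and later): 2.a reverts the fixes,
    # 2.b keeps them.
    return "old" if plan == "2.a" else "new"


def bad_sources(plan, target=6):
    """Released 10.0.z versions that would not migrate to 10.0.<target>."""
    want = behaviour(target, plan)
    return [z for z in sorted(RELEASED) if behaviour(z, plan) != want]


print("2.a:", bad_sources("2.a"))
print("2.b:", bad_sources("2.b"))
```

Under these assumptions the model lists 10.0.4 and 10.0.5 as the "bad"
sources for 2.a, and 10.0.0 through 10.0.3 for 2.b, which matches the
description above.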

Please excuse me for so much text :)

Thank you!

/mjt
