Re: [RFC v1 0/9] kho: granular compatibility and header decoupling

Pratyush Yadav Tue, 09 Jun 2026 07:48:06 -0700

On Mon, Jun 08 2026, Pasha Tatashin wrote:

> On 06-08 13:26, Mike Rapoport wrote:
>> On 2026-06-07 13:43:09+00:00, Pasha Tatashin wrote:
>> > On 06-07 14:58, Mike Rapoport wrote:
>> > 
>> > > On Fri, 05 Jun 2026 03:32:26 +0000, Pasha Tatashin 
>> > > <[email protected]> wrote:
[...]
>> > External users only need to include the headers they actually use. For
>> > example, LUO shouldn't have to pull vmalloc or radix tree KHO
>> > declarations, and memfd does not need block.
>> > 
>> > From a maintenance point of view, it is much easier to catch ABI
>> > changes when the file with the appropriate version has been changed,
>> > and most likely the version of that file should be updated. If a single
>> > header contains compatibility versions for several different data
>> > structures, it is easier to miss the correct version update.
>> 
>> No matter in what files the definition lives, someone can forget to
>> update version and we may miss it during review.


Perhaps we should have some tests (maybe with kunit?) that can catch
this? If you change the format, the test fails. So you'd have to go and
update the test, and at that point it should be more obvious that the
ABI version needs bumping.
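
To make the idea concrete, here is a rough userspace sketch of such a
layout check. Everything in it is made up for illustration: struct
demo_ser, DEMO_SER_VERSION and the recorded sizes/offsets are
hypothetical and are not actual KHO definitions. A real test would
live in kunit and check the real serialized structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical serialized format and its version (not real KHO ABI). */
#define DEMO_SER_VERSION	1

struct demo_ser {
	uint64_t phys;
	uint32_t order;
	uint32_t flags;
	uint64_t next;
};

/* Layout recorded when DEMO_SER_VERSION was last bumped. */
#define DEMO_SER_V1_SIZE	24
#define DEMO_SER_V1_OFF_ORDER	8
#define DEMO_SER_V1_OFF_FLAGS	12
#define DEMO_SER_V1_OFF_NEXT	16

/* Build-time check: changing the struct breaks the build right here. */
static_assert(sizeof(struct demo_ser) == DEMO_SER_V1_SIZE,
	      "demo_ser layout changed: bump DEMO_SER_VERSION");

/* Run-time variant, roughly what a kunit test case would assert. */
static bool demo_ser_layout_is_v1(void)
{
	return sizeof(struct demo_ser) == DEMO_SER_V1_SIZE &&
	       offsetof(struct demo_ser, phys) == 0 &&
	       offsetof(struct demo_ser, order) == DEMO_SER_V1_OFF_ORDER &&
	       offsetof(struct demo_ser, flags) == DEMO_SER_V1_OFF_FLAGS &&
	       offsetof(struct demo_ser, next) == DEMO_SER_V1_OFF_NEXT;
}
```

Whoever changes the struct then has to touch the recorded constants
sitting right next to the version number, which is the nudge we want.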

[...]
>> 
>> Sorry I wasn't clear. I agree that kho_vmalloc, block and radix tree
>> should have their own versioning rather than rely on global KHO version.
>> 
>> What I don't like in your proposal is mixing versioning of a component
>> with its dependencies.
>> 
>> I think that versioning should be completely local to each component.
>> LUO should not care about kho_block "on wire" layout. This should be
>> encapsulated in kho_block.
>
> That is a fair point.
>
> As I mentioned in my previous reply, we can definitely look into making 
> the version checking more modular. For example, each component could 
> implement a standard compatibility-checking interface.
>
> These checks could run early in boot to determine whether each component 
> is capable of accepting the incoming preserved data format.
>
> Whenever the component is later used by LUO, memfd, etc., we can query 
> that cached status. This achieves four key benefits:
>
> 1. It avoids delaying the compatibility check to the actual time of data 
> retrieval, which is too late to safely abort.
>
> 2. It prevents a local incompatibility from triggering a global kernel 
> panic, allowing us to handle failures gracefully for just that specific 
> component or session.

I think the right time to do the compatibility check is _before_ kexec.
That is the only point where you can safely abort. Once you boot into
the new kernel and discover you can't understand the passed data, you
are in a bad spot already and should reboot. I don't think you really
can handle these failures gracefully.

For example, say you fail to understand the incoming PCI data. So you
have no idea which devices are participating in live update and cannot
correctly probe any of them. Which effectively means you cannot resume
any of your guests since you have no idea how to restore their device
state. The only path you are left with is to reboot. I haven't read the
IOMMU series, but I imagine the same story applies there.

For a more benign example, let's assume one of your memfds that back VM
memory fails to restore. In this case, you can safely leak that memory
and run the other guests, but at that point the host is in an impaired
state. You don't want to keep running it in this state. You likely
either do a reboot or, if you feel more adventurous, do another live
update.

In either case, there is no "safely abort" after the kexec happens.

So I think our energy is better spent solving the versioning story
_before_ kexec. After kexec I think it is perfectly fine to error out
and panic or expect a reboot. You can't salvage much at that point
anyway.

And I think the versioning format should also follow from the design of
this pre-kexec check, not the other way round.
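
As a very rough sketch of what I mean by a pre-kexec check: the
running kernel knows which version of each format it is about to
serialize, and somehow learns which versions the next kernel accepts.
How the next kernel advertises that is an open question and is simply
assumed here. All names and tables below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* What the running kernel is about to serialize (hypothetical). */
struct demo_out_fmt {
	const char *name;
	unsigned int version;
};

/* What the next kernel claims it can parse (hypothetical). */
struct demo_in_fmt {
	const char *name;
	unsigned int min_version;
	unsigned int max_version;
};

static bool demo_accepts(const struct demo_in_fmt *in, size_t n_in,
			 const struct demo_out_fmt *out)
{
	for (size_t i = 0; i < n_in; i++) {
		if (strcmp(in[i].name, out->name) != 0)
			continue;
		return out->version >= in[i].min_version &&
		       out->version <= in[i].max_version;
	}
	return false;	/* next kernel does not know this component */
}

/*
 * Returns the name of the first component the next kernel cannot
 * parse, or NULL if it is safe to go ahead with kexec.
 */
static const char *demo_first_incompatible(const struct demo_out_fmt *out,
					   size_t n_out,
					   const struct demo_in_fmt *in,
					   size_t n_in)
{
	assert(out && in);

	for (size_t i = 0; i < n_out; i++)
		if (!demo_accepts(in, n_in, &out[i]))
			return out[i].name;
	return NULL;
}

/* Example tables: one compatible next kernel, one that is too old. */
static const struct demo_out_fmt demo_outgoing[] = {
	{ "vmalloc", 1 }, { "radix", 1 }, { "block", 2 },
};
static const struct demo_in_fmt demo_next_ok[] = {
	{ "vmalloc", 1, 1 }, { "radix", 1, 2 }, { "block", 2, 3 },
};
static const struct demo_in_fmt demo_next_old[] = {
	{ "vmalloc", 1, 1 }, { "radix", 1, 1 }, { "block", 1, 1 },
};
```

With demo_next_ok the check returns NULL and kexec can proceed. With
demo_next_old it returns "block", and we can refuse the live update
while the old kernel is still running and nothing has been lost.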

>
> 3. It keeps the local version local, as you suggested, so it is checked 
> only by the consumers of that specific component.
>
> 4. It provides a clean path for backward compatibility, as components 
> can individually decide whether they understand the incoming data 
> format.
>
[...]
>> 
>> Actually FDT "compatible" handles versioning nicer than composite strings
>> You can have
>> 
>>      compatible="kho-v4", "vmalloc-v1", "radix-v1", "block-v2";
>> 
>> and check fdt_node_check_compatible("vmalloc-v1") for vmalloc and
>> fdt_node_check_compatible("block-v2") for block.

I agree. Even if we don't use FDT, something more structured than
composite strings would be nice to have.

>
> That is actually very similar to what I am proposing—individual version 
> tokens (which in my current series are concatenated into a composite 
> compatibility string separated by ';').
>
> But let's not get too fixated on the composite string formatting. I 
> actually really like what you are proposing: using integers for versions 
> and having each registered component carry its own "NAME" and version 
> number in the KHO FDT.

There is another nice thing about numbers that Logan (+cc) recently
pointed out. You can tell which one is bigger.

At some point I think we will support multiple versions of a data
structure to allow for upgrades. At that point, it will help to know
which one is "newer". So if both kernels support versions 3 and 4, you
can use 4 to serialize.

This of course is harder to do with strings.
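
With integers the selection is a couple of comparisons. A minimal
sketch, assuming each kernel supports a contiguous range of versions
and that versions start at 1 (both are my assumptions, and the names
are made up):

```c
#include <assert.h>

/* Contiguous range of format versions one kernel can handle. */
struct demo_ver_range {
	unsigned int min;
	unsigned int max;
};

/*
 * Returns the newest version both kernels can handle, or 0 if the
 * two ranges do not overlap.
 */
static unsigned int demo_pick_version(struct demo_ver_range cur,
				      struct demo_ver_range next)
{
	unsigned int lo = cur.min > next.min ? cur.min : next.min;
	unsigned int hi = cur.max < next.max ? cur.max : next.max;

	assert(cur.min >= 1 && cur.min <= cur.max);
	assert(next.min >= 1 && next.min <= next.max);

	return lo <= hi ? hi : 0;
}
```

If both sides support {3, 4} this picks 4. If the ranges are {1, 3}
and {4, 5} it returns 0, which is the "cannot live update" case.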

>
>> And we wouldn't need to reimplement string parsing ;-)
>> 
>> But yeah, I do see value of making components versioning and KHO global
>> versioning independent. I just don't like composite strings and I don't
>> like mixing versioning with dependencies.
>> 
>> Since we are moving from FDT for the most things, version should become
>> a number rather than a string and version compatibility should be
[...]

-- 
Regards,
Pratyush Yadav
