On Thu, Jun 11, 2026 at 8:58 PM Petr Mladek <[email protected]> wrote:
>
> On Tue 2026-06-09 18:00:55, Petr Mladek wrote:
> > On Sun 2026-06-07 21:16:55, Yafang Shao wrote:
> > I would write something like:
> >
> > <proposal>
> > The practice shows that the current semantic of the patch.replace flag is
> > not ideal.
> >
> > The atomic replace is disabled by default. And the no-replace mode allows
> > wild installation of many livepatches in parallel. The author and
> > administrator are fully responsible for preventing problems caused
> > by producing and installing incompatible livepatches.
> >
> > The most safe atomic replace mode must be explicitly enabled by
> > setting "patch.replace = true". It is all or nothing. The livepatch
> > with enabled .replace will always replace all already installed
> > livepatches. It makes it very safe but it might be too harsh.
> >
> > Improve the situation by switching "bool .replace" flag to
> > "u32 .replace_set" and updating its semantic.
> >
> > Any .replace_set value might be associated with a set of livepatched
> > symbols, callbacks, shadow variables and state IDs.
> >
> > A livepatch with a particular .replace_set number will atomically
> > replace any already installed livepatch with the same .replace_set
> > number. By definition, there can only ever be one active livepatch
> > for a given replace_set number.
> >
> > On the contrary, livepatches with a different .replace_set number
> > must not modify the same function, or use the state with the same
> > ID [*]. Any attempt to load an incompatible livepatch will be
> > rejected.
> >
> > Summary:
> >
> > The most safe mode when any livepatch replaces any other livepatch
> > will be the default. Note that all livepatches must keep
> > .replace_set = 0.
> >
> > It will be possible to install more livepatches in parallel by
> > using different .replace_set numbers.
> > The livepatches might be updated independently using the atomic
> > replace feature as long as the new version does not break
> > compatibility. The kernel will reject a livepatch from a different
> > replace set when it would want to modify the same function or
> > livepatch state from another replace set.
> >
> > [*] The compatibility check of callbacks and shadow variables will
> >     be improved later by reworking their semantic. There is a work
> >     in progress, see [0]
> > </proposal>
> >
> > > Link: https://github.com/pmladek/linux/tree/klp-state-transfer-v1-iter12
> > > [0]
> >
> > I have realized that I actually sent "v1-iter12" to the public
> > mailing list as the official v1. So we could use:
> >
> > Link: https://lore.kernel.org/all/[email protected]/
> > [0]
> >
> >
> > New idea:
> >
> > I have briefly discussed the new semantic with Miroslav when I met
> > him in person. And he was a bit concerned. We as an OS distributor
> > might want to be sure that our livepatches can be installed the most
> > safe way. So, we still might want to preserve the "replace all"
> > semantic to make sure that our livepatches will not break anything.
>
> I thought more about it and we would need some solution to preserve
> the replace_all functionality.
>
> A few serious 0-day vulnerabilities were reported recently.
> We discussed a possibility to ship a quick fix with a livepatch.
> Or that customers might want to fix it themselves by a livepatch.
> Such a livepatch would need to be installed in parallel to
> the official livepatch fixing older bugs. But the next official
> cumulative livepatch would need to replace it.
>
> The above scenario will no longer work with the current
> "replace_set" handling. The hotfix would need to use another
> "replace_set" so that it can be installed in parallel.
> But the next cumulative livepatch won't be able to replace
> it because it would need to modify the same function.
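
To make the failing scenario concrete, here is a rough userspace sketch of
the compatibility rule from the proposal. The struct, the helper name and
the list of patched function names are made up for illustration. This is
not the real struct klp_patch and not existing kernel code, and it only
models the "same function" part of the check (states are left out).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for a livepatch; NOT the real struct klp_patch. */
struct compat_sketch {
	unsigned int replace_set;
	const char *const *funcs;	/* NULL-terminated list of patched function names */
};

/*
 * Hypothetical check: return true when @incoming would have to be rejected
 * because of the already installed livepatch @installed.
 */
static bool sketch_conflicts(const struct compat_sketch *incoming,
			     const struct compat_sketch *installed)
{
	const char *const *a;
	const char *const *b;

	/* Same replace set: the installed livepatch is atomically replaced. */
	if (incoming->replace_set == installed->replace_set)
		return false;

	/* Different replace sets must not touch the same function. */
	for (a = incoming->funcs; *a; a++) {
		for (b = installed->funcs; *b; b++) {
			if (strcmp(*a, *b) == 0)
				return true;
		}
	}

	return false;
}
```

With a hotfix in one replace set and the next cumulative livepatch in
another, both patching the same function, sketch_conflicts() returns true.
So the cumulative livepatch would be rejected instead of replacing the
hotfix, which is exactly the problem described above.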
>
> I consulted this with AI (claude-sonnet-4.6) and it gave the following
> feedback/ideas ;-)
>
> > I thought about 4 approaches:
> >
> > 1. Make .replace_set=0 special so that it will always replace
> >    everything. Similar to the current .replace=true mode.
> >
> >    Customers will still be able to install custom livepatches
> >    later with .replace_set != 0. But the "0" livepatch will
> >    always wipe them out.
>
> This is not ideal because it is asymmetric. Why is "0" special?
>
> >
> > 2. Use two flags in the livepatch, for example
> >
> >    a. Rename .replace to .replace_all. The livepatch with this
> >       flag set will always wipe all other livepatches.
> >
> >    b. Add .replace_set which will allow to install more livepatches
> >       in parallel, replace the livepatches with the same .replace_set
> >       atomically, and check the compatibility. As described above.
> >
> >    It is a bit more complicated. But it is more compatible with
> >    the current state. And it removes the special meaning of
> >    .replace_set == 0.
>
> This looks more straightforward. But the fact that "replace_all"
> replaces everything brings back the problem with the original
> "replace" flag. So, it makes this whole exercise more or less
> pointless.
>
> I had another idea with storing a list of fixed bugs/CVEs in each
> livepatch. Independent bugs might be fixed by independent
> livepatches. Then a cumulative livepatch would replace only
> the livepatches which fixed the same bugs before.
>
> And (claude-sonnet-4.6) came with an interesting simplification.
>
> We could add:
>
> struct klp_patch {
>         [...]
>         unsigned int replace_set;
>         const unsigned int *supersedes; /* Zero terminated array of
>                                            replace_set IDs */
>         [...]
> };
>
> So that the cumulative livepatch might optionally define
> other "replace_set"s which would be replaced.
>
> This would work well when both cumulative livepatches and the hotfix
> are provided by the same vendor or group.
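
Just to check that I read the supersedes idea correctly, below is a minimal
userspace sketch of the replace decision. Only the two field names come
from the struct quoted above. The struct name and the helper are
hypothetical, and nothing here is taken from an actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the two proposed fields; NOT the real struct klp_patch. */
struct supersede_sketch {
	unsigned int replace_set;
	const unsigned int *supersedes;	/* zero-terminated array, may be NULL */
};

/*
 * Hypothetical helper: would @incoming atomically replace the already
 * installed livepatch @installed?
 */
static bool sketch_replaces(const struct supersede_sketch *incoming,
			    const struct supersede_sketch *installed)
{
	const unsigned int *id;

	/* A livepatch always replaces the one from its own replace set. */
	if (incoming->replace_set == installed->replace_set)
		return true;

	if (!incoming->supersedes)
		return false;

	/* Otherwise only the explicitly listed replace sets are replaced. */
	for (id = incoming->supersedes; *id; id++) {
		if (*id == installed->replace_set)
			return true;
	}

	return false;
}
```

One side effect of the zero-terminated array, at least in this sketch, is
that replace_set 0 can never be listed in supersedes, because 0 ends the
list.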
>
> We could also allow to change it dynamically by adding a module
> option to the cumulative livepatch, e.g. supersedes=id[,id]*
> We could add some support into the kernel for handling the module
> parameter in a standard way.
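
For the supersedes=id[,id]* module option quoted above, a userspace sketch
of the parsing could look like the following. The helper name, the limit of
16 IDs and the decision to reject ID 0 (it would collide with the zero
terminator) are my assumptions. In the kernel, module_param_array() already
parses comma-separated integer lists, so a real implementation might not
need a hand-written parser at all.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define SKETCH_MAX_IDS 16	/* arbitrary limit, for the sketch only */

/*
 * Hypothetical helper: parse "id[,id]*" into @out as a zero-terminated array.
 * Returns the number of parsed IDs, or -1 on malformed input, on ID 0,
 * or when more than SKETCH_MAX_IDS IDs are given.
 */
static int sketch_parse_supersedes(const char *arg,
				   unsigned int out[SKETCH_MAX_IDS + 1])
{
	int n = 0;

	while (*arg) {
		char *end;
		unsigned long val;

		/* strtoul() would accept whitespace and signs; require a digit. */
		if (!isdigit((unsigned char)*arg))
			return -1;

		errno = 0;
		val = strtoul(arg, &end, 10);
		if (errno || val == 0 || val > UINT_MAX)
			return -1;
		if (n == SKETCH_MAX_IDS)
			return -1;
		out[n++] = (unsigned int)val;

		if (*end == ',') {
			if (end[1] == '\0')
				return -1;	/* trailing comma */
			arg = end + 1;
		} else if (*end == '\0') {
			arg = end;
		} else {
			return -1;	/* unexpected character */
		}
	}

	out[n] = 0;	/* zero terminator, as in the proposed supersedes array */
	return n;
}
```

The parsed array has the same zero-terminated layout as the static
supersedes field, so the rest of the code would not need to care where the
list came from.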

I prefer this option because it allows us to set the supersedes list
dynamically at runtime, avoiding hardcoded values at build time. This
flexibility is essential for handling complex production environments.

>
> It is not trivial. But it is also not horribly complex.
> It looks like a good compromise between the requirements and
> code complexity.
>
> We really need input from others here.

Happy to hear what others think as well.

--
Regards
Yafang
