On Fri, Sep 19, 2025 at 6:14 AM Pasha Tatashin <[email protected]> wrote:
>
> On Thu, Sep 18, 2025 at 6:26 PM Cong Wang <[email protected]> wrote:
> >
> > This patch series introduces multikernel architecture support, enabling
> > multiple independent kernel instances to coexist and communicate on a
> > single physical machine. Each kernel instance can run on dedicated CPU
> > cores while sharing the underlying hardware resources.
> >
> > The multikernel architecture provides several key benefits:
> > - Improved fault isolation between different workloads
> > - Enhanced security through kernel-level separation
> > - Better resource utilization than traditional VM (KVM, Xen etc.)
> > - Potential zero-down kernel update with KHO (Kernel Hand Over)
>
> Hi Cong,
>
> Thank you for submitting this; it is an exciting series.

Thanks for your feedback, Pasha.

>
> I experimented with this approach about five years ago for a Live
> Update scenario. It required surprisingly little work to get two OSes
> to boot simultaneously on the same x86 hardware. The procedure I

Yes, I totally agree.

> followed looked like this:
>
> 1. Create an immutable kernel image bundle: kernel + initramfs.
> 2. The first kernel is booted with memmap parameters, setting aside
> the first 1G for its own operation, the second 1G for the next kernel
> (reserved), and the rest as PMEM for the VMs.
> 3. In the first kernel, we offline one CPU and kexec the second kernel
> with parameters that specify to use only the offlined CPU as the boot
> CPU and to keep the other CPUs offline (i.e., smp_init does not start
> other CPUs). The memmap specify the first 1G reserved, and the 2nd 1G
> for its own operations, and the rest is PMEM.
> 4. Passing the VMs worked by suspending them in the old kernel.
> 5. The other CPUs are onlined in the new kernel (thus killing the old kernel).
> 6. The VMs are resumed in the new kernel.

Exactly.

>
> While this approach was easy to get to the experimental PoC, it has
> some fundamental problems that I am not sure can be solved in the long
> run, such as handling global machine states like interrupts. I think
> the Orphaned VM approach (i.e., keeping VCPUs running through the Live
> Update procedure) is more reliable and likely to succeed for
> zero-downtime kernel updates.

Indeed, migrating hardware resources gracefully is challenging for both
VMs and multikernel, especially without interrupting the applications.
I imagine that KHO could establish a kind of protocol between the two
kernels to migrate resources. The device-tree-inspired abstraction looks
neat to me; it is pretty much like protobuf, but in kernel space.

Although I believe multikernel helps, there are still tons of details
to consider.
Therefore, I hope my proposal inspires people to think deeper and discuss together, and hopefully come up with better ideas. Thanks for sharing your thoughts.
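
For anyone following the memory layout in steps 2 and 3 quoted above, here
is a minimal sketch of how the two kernels' memmap= arguments could be
derived. It is not code from this series or from Pasha's PoC. The helper
name memmap_args, the two-slot layout, and the 8G machine size in the usage
note are assumptions for illustration only. It relies on the documented
kernel parameters memmap=nn$ss (mark a range as reserved) and memmap=nn!ss
(mark a range as persistent memory).

```python
GIB = 1 << 30  # bytes in one gibibyte


def memmap_args(total_bytes, own_slot, slot_bytes=GIB, slots=2):
    """Return memmap= arguments for the kernel that owns slot `own_slot`.

    Hypothetical helper: RAM is split into `slots` fixed-size regions at
    the bottom of memory (one per kernel), and everything above them is
    exposed as PMEM for the VMs. The kernel's own slot gets no memmap=
    entry, so it stays ordinary usable RAM.
    """
    if not 0 <= own_slot < slots:
        raise ValueError("own_slot out of range")
    pmem_start = slots * slot_bytes
    if total_bytes <= pmem_start:
        raise ValueError("not enough memory left over for PMEM")

    args = []
    for slot in range(slots):
        if slot != own_slot:
            # '$' marks the other kernel's slot as reserved.
            args.append(f"memmap={slot_bytes:#x}${slot * slot_bytes:#x}")
    # '!' marks the remainder as persistent memory shared by both kernels.
    args.append(f"memmap={total_bytes - pmem_start:#x}!{pmem_start:#x}")
    return args


if __name__ == "__main__":
    total = 8 * GIB  # assumed machine size, for illustration only
    print("first kernel :", " ".join(memmap_args(total, own_slot=0)))
    print("second kernel:", " ".join(memmap_args(total, own_slot=1)))
```

Both kernels end up with the same PMEM range, which is what lets the
suspended VMs be picked up after the switch. Note that some bootloaders
need the '$' escaped when these arguments are placed in their configuration.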
