On Mon, Sep 22, 2025 at 03:41:18PM -0700, Cong Wang wrote:
> On Mon, Sep 22, 2025 at 7:28 AM Stefan Hajnoczi <[email protected]> wrote:
> >
> > On Sat, Sep 20, 2025 at 02:40:18PM -0700, Cong Wang wrote:
> > > On Fri, Sep 19, 2025 at 2:27 PM Stefan Hajnoczi <[email protected]> wrote:
> > > >
> > > > On Thu, Sep 18, 2025 at 03:25:59PM -0700, Cong Wang wrote:
> > > > > This patch series introduces multikernel architecture support, enabling
> > > > > multiple independent kernel instances to coexist and communicate on a
> > > > > single physical machine. Each kernel instance can run on dedicated CPU
> > > > > cores while sharing the underlying hardware resources.
> > > > >
> > > > > The multikernel architecture provides several key benefits:
> > > > > - Improved fault isolation between different workloads
> > > > > - Enhanced security through kernel-level separation
> > > >
> > > > What level of isolation does this patch series provide? What stops
> > > > kernel A from accessing kernel B's memory pages, sending interrupts to
> > > > its CPUs, etc?
> > >
> > > It is kernel-enforced isolation, therefore, the trust model here is still
> > > based on kernel. Hence, a malicious kernel would be able to disrupt,
> > > as you described. With memory encryption and IPI filtering, I think
> > > that is solvable.
> >
> > I think solving this is key to the architecture, at least if fault
> > isolation and security are goals. A cooperative architecture where
> > nothing prevents kernels from interfering with each other simply doesn't
> > offer fault isolation or security.
>
> Kernel and kernel modules can be signed today, kexec also supports
> kernel signing via kexec_file_load(). It migrates at least untrusted
> kernels, although kernels can be still exploited via 0-day.

Kernel signing also doesn't protect against bugs in one kernel
interfering with another kernel.

>
> >
> > On CPU architectures that offer additional privilege modes it may be
> > possible to run a supervisor on every CPU to restrict access to
> > resources in the spawned kernel. Kernels would need to be modified to
> > call into the supervisor instead of accessing certain resources
> > directly.
> >
> > IOMMU and interrupt remapping control would need to be performed by the
> > supervisor to prevent spawned kernels from affecting each other.
>
> That's right, security vs performance. A lot of times we have to balance
> between these two. This is why Kata Container today runs a container
> inside a VM.
>
> This largely depends on what users could compromise, there is no single
> right answer here.
>
> For example, in a fully-controlled private cloud, security exploits are
> probably not even a concern. Sacrificing performance for a non-concern
> is not reasonable.
>
> >
> > This seems to be the price of fault isolation and security. It ends up
> > looking similar to a hypervisor, but maybe it wouldn't need to use
> > virtualization extensions, depending on the capabilities of the CPU
> > architecture.
>
> Two more points:
>
> 1) Security lockdown. Security lockdown transforms multikernel from
> "0-day means total compromise" to "0-day means single workload
> compromise with rapid recovery." This is still a significant improvement
> over containers where a single kernel 0-day compromises everything
> simultaneously.

I don't follow. My understanding is that multikernel currently does not
prevent spawned kernels from affecting each other, so a kernel 0-day in
multikernel still compromises everything?

>
> 2) Rapid kernel updates: A more practical way to eliminate 0-day
> exploits is to update kernel more frequently, today the major blocker
> is the downtime required by kernel reboot, which is what multikernel
> aims to resolve.

If kernel upgrades are the main use case for multikernel, then I guess
isolation is not necessary. Two kernels would only run side-by-side for
a limited period of time and they would have access to the same
workloads.

Stefan

>
> I hope this helps.
>
> Regards,
> Cong Wang
>
