On Wed, Oct 17, 2018 at 02:04:46PM -0700, Mike Larkin wrote:
> On Fri, Oct 05, 2018 at 01:50:10PM +0200, Sergio Lopez wrote:
> > Right now, vmd already features an excellent privsep model to ensure
> > the process servicing the VM requests to the outside world is running
> > with the lowest possible privileges.
> > 
> > I was wondering if we could take a step further, servicing each virtio
> > device from a different process. This design would simplify the
> > implementation and maintenance of those devices, improve the privsep
> > model and increase the resilience of VMs (the crash of a process
> > servicing a device won't bring down the whole VM, and a mechanism to
> > recover from this scenario could be explored).
> 
> Our model is generally to not try to recover from crashes like that. Indeed,
> you *want* to crash so the underlying bug can be found and fixed.

With separate processes you'll still have a crash with a core dump from
the process servicing the device, with the additional advantage of being
able to, optionally, try to recover the device or debug it independently.

> >  - An in-kernel IRQ chip. This one is the most controversial, as it
> > means emulating a device from a privileged domain, but I'm pretty sure
> > a lapic implementation with enough functionality to serve *BSD/Linux
> > Guests can be small and simple enough to be easily auditable.
> 
> This needs to be done, for a number of reasons (device emulation being just
> one). pd@ and I are working on how to implement this using features in
> recent CPUs, since much of the LAPIC emulation can now be handled by the
> CPU itself. We're thinking skylake and later will be the line in the sand
> for this. Otherwise the software emulation is more complex and more prone
> to bugs. I've resisted the urge to put this stuff in the kernel for exactly
> that reason, but with later model CPUs we may be in better shape. We may
> also decide to focus solely on x2APIC. If you're interested in helping in
> this area, I'll keep you in the loop.

Sure, I'd like to help. I'm quite familiar with KVM's in-kernel irqchip
implementation.

> > Do you think it's worth exploring this model? What are your feelings
> > regarding the in-kernel IRQ chip?
> > 
> > Sergio (slp).
> > 
> 
> All things considered, I'm less sold on the idea of splitting out devices
> into their own processes. I don't see any compelling reason. But we do
> need an IOAPIC and LAPIC implementation at some point, as you point out.

Well, vmd is still small enough (thankfully) to be able to debug it easily.
But, eventually, it'll grow in both size and complexity (especially with
SMP support, I/O optimizations, additional storage backends...) and
having a high degree of modularity really helps here.

In fact, the QEMU/KVM community is starting to consider going this route
(but QEMU is *huge*). [1]

Anyways, this is not a "now-or-never" thing. As you suggest, we can work
now on kickfd and IOAPIC/LAPIC, which are useful by themselves. And when
those are in place, I'll be able to write a PoC so we can evaluate its
benefits and drawbacks.

Sergio (slp).


[1] https://www.linux-kvm.org/images/f/fc/KVM_FORUM_multi-process.pdf
