On Tue, 25 Nov 2025 at 19:15, Christian König <[email protected]> wrote:
>
> On 11/25/25 10:08, Dave Airlie wrote:
> > On Tue, 25 Nov 2025 at 18:11, Christian König <[email protected]> wrote:
> >>
> >> On 11/25/25 08:59, John Hubbard wrote:
> >>> On 11/24/25 11:54 PM, Christian König wrote:
> >>>> On 11/25/25 08:49, Dave Airlie wrote:
> >>>>> On Tue, 25 Nov 2025 at 17:45, Christian König <[email protected]> wrote:
> >>> ...
> >>>> My question is why exactly is nova separated into nova-core and
> >>>> nova-drm? That doesn't seem to be necessary in the first place.
> >>>>
> >>> The idea is that nova-core allows building up a separate software
> >>> stack for VFIO, without pulling in any DRM-specific code that a
> >>> hypervisor (for example) wouldn't need. That makes for a smaller,
> >>> more security-auditable set of code for that case.
> >>
> >> Well that is the same argument used by some AMD team to maintain a
> >> separate out of tree hypervisor for nearly a decade.
> >>
> >> Additional to that the same argument has also been used to justify
> >> the KFD node as alternative API to DRM for compute.
> >>
> >> Both cases have proven to be extremely bad ideas.
> >>
> >> Background is that except for all the legacy stuff the DRM API is
> >> actually very well thought through and it is actually quite hard to
> >> come up with something similarly well.
> >>
> >
> > Well you just answered your own question, why is AMD maintaining GIM
> > instead of solving this upstream with a split model? the
> > nova-core/drm split would be perfect for GIM.
>
> No, it won't.
>
> We have the requirement to work with GEM objects and DMA-buf file
> descriptors in the hypervisor as well.
>
> And my suspicion is that you end up with the same requirements in nova
> as well in which case you end up interchanging handles with DRM as
> well.
>
> We have seen the same for KFD and it turned out to be an absolutely
> horrible interaction.
>
> > kfd was a terrible idea, and we don't intend to offer userspace
> > multiple APIs with nova, nova-drm will be the primary userspace API
> > provider. nova-core will not provide userspace API, it will provide an
> > API to nova-drm and an API to the vgpu driver which will provide it's
> > own userspace API without graphics or compute, just enough to setup
> > VFs.
>
> Ok, then why do you need nova-core in the first place? E.g. where
> should be the vgpu driver and what interface does it provide?

The ask is for a driver that cloud providers can run on their
hypervisors, one that does just enough to manage the VFs through VFIO
without having a complete drm driver or any drm infrastructure loaded.

The nice pictures are here:
https://lore.kernel.org/all/[email protected]/

On supported systems at least, you will only be loading one of nova-drm
or the vfio driver, depending on the GPU configuration. Whether we allow
users to do otherwise isn't well decided.

So far I haven't heard anything about needing dma-buf interactions at
that level; maybe Zhi has more insight into the future there.

Dave.
