On Tue, 25 Nov 2025 at 19:15, Christian König <[email protected]> wrote:
>
> On 11/25/25 10:08, Dave Airlie wrote:
> > On Tue, 25 Nov 2025 at 18:11, Christian König <[email protected]> wrote:
> >>
> >> On 11/25/25 08:59, John Hubbard wrote:
> >>> On 11/24/25 11:54 PM, Christian König wrote:
> >>>> On 11/25/25 08:49, Dave Airlie wrote:
> >>>>> On Tue, 25 Nov 2025 at 17:45, Christian König <[email protected]> wrote:
> >>> ...
> >>>> My question is why exactly is nova separated into nova-core and 
> >>>> nova-drm? That doesn't seem to be necessary in the first place.
> >>>>
> >>> The idea is that nova-core allows building up a separate software stack for
> >>> VFIO, without pulling in any DRM-specific code that a hypervisor (for example)
> >>> wouldn't need. That makes for a smaller, more security-auditable set of code
> >>> for that case.
> >>
> >> Well that is the same argument used by some AMD team to maintain a 
> >> separate out of tree hypervisor for nearly a decade.
> >>
> >> Additional to that the same argument has also been used to justify the KFD 
> >> node as alternative API to DRM for compute.
> >>
> >> Both cases have proven to be extremely bad ideas.
> >>
> >> Background is that except for all the legacy stuff the DRM API is actually 
> >> very well thought through and it is actually quite hard to come up with 
> >> something similarly well.
> >>
> >
> > Well you just answered your own question, why is AMD maintaining GIM
> > instead of solving this upstream with a split model? the nova-core/drm
> > split would be perfect for GIM.
>
> No, it won't.
>
> We have the requirement to work with GEM objects and DMA-buf file descriptors 
> in the hypervisor as well.
>
> And my suspicion is that you end up with the same requirements in nova as 
> well in which case you end up interchanging handles with DRM as well.
>
> We have seen the same for KFD and it turned out to be an absolutely horrible 
> interaction.
>
> > kfd was a terrible idea, and we don't intend to offer userspace
> > multiple APIs with nova, nova-drm will be the primary userspace API
> > provider. nova-core will not provide userspace API, it will provide an
> > API to nova-drm and an API to the vgpu driver which will provide it's
> > own userspace API without graphics or compute, just enough to setup
> > VFs.
>
> Ok, then why do you need nova-core in the first place? E.g. where should be 
> the vgpu driver and what interface does it provide?

The ask is for a driver for cloud providers to run on their
hypervisors that does just enough to manage the VFs through VFIO
without having a complete drm driver or any drm infrastructure loaded.

The nice pictures are here
https://lore.kernel.org/all/[email protected]/

At least on supported systems, you will only be loading one of nova-drm
or the vfio driver, depending on the GPU configuration. Whether we
allow users to do anything other than that hasn't really been decided.

So far I haven't heard anything about needing dma-buf interactions at
that level, though maybe Zhi has more insight into the future there.

Dave.
