On Thu, May 15, 2025 at 2:00 AM Danilo Krummrich <d...@kernel.org> wrote: > > On Wed, May 14, 2025 at 10:53:16AM -0700, Rob Clark wrote: > > From: Rob Clark <robdcl...@chromium.org> > > > > Eases migration for drivers where VAs don't hold hard references to > > their associated BO, avoiding reference loops. > > > > In particular, msm uses soft references to optimistically keep around > > mappings until the BO is distroyed. Which obviously won't work if the > > VA (the mapping) is holding a reference to the BO. > > Ick! This is all complicated enough. Allow drivers to bypass the proper > reference counting for GEM objects in the context of VM_BO structures seems > like > an insane footgun. > > I don't understand why MSM would need weak references here. Why does msm need > that, but nouveau, Xe, panthor, PowerVR do not?
Most of those drivers were designed (and had their UABI designed) with gpuvm, or at least sparse, in mind from the get go. I'm not sure about nouveau, but I guess it just got lucky that it's UABI semantics fit having the VMA hold a reference to the BO. Unfortunately, msm pre-dates sparse.. and in the beginning there was only a single global VM, multiple VMs was something retrofitted ~6yrs (?) back. For existing msm, the VMA(s) are implicitly torn down when the GEM obj is freed. This won't work with the VMA(s) holding hard references to the BO. When userspace opts-in to "VM_BIND" mode, which it has to do before the VM is created, then we don't set this flag, the VMA holds a hard reference to the BO as it does with other drivers. But consider this use-case, which is perfectly valid for old (existing) userspace: 1) Userspace creates a BO 2) Submits rendering referencing the BO 3) Immediately closes the BO handle, without waiting for the submit to complete In this case, the submit holds a reference to the BO which holds a reference to the VMA. Everything is torn down gracefully when the submit completes. But if the VMA held a hard reference to the BO then you'd have a reference loop. So there really is no other way to use gpuvm _and_ maintain backwards compatibility with the semantics of the pre-VM_BIND UAPI without this flag. Fortunately DRM_GPUVM_VA_WEAK_REF is minimally intrusive. Otherwise I probably would have had to fork my own copy of gpuvm. BR, -R