On Fri, 2025-11-28 at 11:06 +0100, Christian König wrote:
> On 11/27/25 12:10, Philipp Stanner wrote:
> > On Thu, 2025-11-13 at 15:51 +0100, Christian König wrote:
> > > This should allow amdkfd_fences to outlive the amdgpu module.
> > >
> > > > v2: implement Felix's suggestion to lock the fence while signaling it.
> > >
> > > Signed-off-by: Christian König <[email protected]>
> > > ---
> > >
> > >
[…]
> > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > > index a085faac9fe1..8fac70b839ed 100644
> > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> > > @@ -1173,7 +1173,7 @@ static void kfd_process_wq_release(struct
> > > work_struct *work)
> > > >  	synchronize_rcu();
> > > >  	ef = rcu_access_pointer(p->ef);
> > > >  	if (ef)
> > > > -		dma_fence_signal(ef);
> > > > +		amdkfd_fence_signal(ef);
> > > >
> > > >  	kfd_process_remove_sysfs(p);
> > > >  	kfd_debugfs_remove_process(p);
> > > @@ -1990,7 +1990,6 @@ kfd_process_gpuid_from_node(struct kfd_process *p,
> > > struct kfd_node *node,
> > > >  static int signal_eviction_fence(struct kfd_process *p)
> > > >  {
> > > >  	struct dma_fence *ef;
> > > > -	int ret;
> > > >
> > > >  	rcu_read_lock();
> > > >  	ef = dma_fence_get_rcu_safe(&p->ef);
> > > @@ -1998,10 +1997,10 @@ static int signal_eviction_fence(struct
> > > kfd_process *p)
> > > >  	if (!ef)
> > > >  		return -EINVAL;
> > > >
> > > > -	ret = dma_fence_signal(ef);
> > > > +	amdkfd_fence_signal(ef);
> > > >  	dma_fence_put(ef);
> > > >
> > > > -	return ret;
> > > > +	return 0;
> >
> > Oh wait, that's the code I'm also touching in my return code series!
> >
> > https://lore.kernel.org/dri-devel/[email protected]/
> >
> >
> > Does this series then solve the problem Felix pointed out in
> > evict_process_worker()?
>
> No, it doesn't. I wasn't aware that the higher-level code actually needs the
> status. After all, Felix is the maintainer of this part.
>
> This patch here needs to be rebased on top of yours and changed accordingly
> to still return the fence status correctly.
>
> But thanks for pointing that out.

Alright, so my (repaired, v2) status-code-removal series shall enter
drm-misc-next first, and then your series here. ACK?

P.