On Wed, 6 May 2026 06:21:42 -0700 Rob Clark <[email protected]> wrote:
> On Wed, May 6, 2026 at 5:16 AM Boris Brezillon > <[email protected]> wrote: > > > > The following race can currently happen: > > > > | Thread 0 in `drm_gem_lru_scan` | Thread 1 in > > `drm_gem_object_release` | > > | - | - > > | > > | move obj1 with refcount==0 to `still_in_lru` | > > | > > | move obj2 with refcount!=0 to `still_in_lru` | > > | > > | mutex_unlock | > > | > > | shrink obj2 | > > | > > | | lru = obj1->lru; // > > `still_in_lru` | > > | mutex_lock | > > | > > | move obj1 back to the original lru | > > | > > | mutex_unlock | > > | > > | return | > > | > > | | dereference `still_in_lru` > > | > > > > Move the drm_gem_lru_move_tail_locked() after the > > kref_get_unless_zero() check so that we don't end up with a > > vanishing LRU when we hit drm_gem_object_release(). We also need to > > remove the skipped object from its LRU, otherwise we'll keep hitting > > it on subsequent loop iterations until it's actually removed from the > > list in the drm_gem_release(). > > > > Fixes: e7c2af13f811 ("drm/gem: Add LRU/shrinker helper") > > Reported-by: Chia-I Wu <[email protected]> > > Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/86 > > Signed-off-by: Boris Brezillon <[email protected]> > > Reviewed-by: Chia-I Wu <[email protected]> > > --- > > drivers/gpu/drm/drm_gem.c | 14 +++++++++----- > > 1 file changed, 9 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c > > index fca42949eb2b..97cf63de0112 100644 > > --- a/drivers/gpu/drm/drm_gem.c > > +++ b/drivers/gpu/drm/drm_gem.c > > @@ -1660,15 +1660,19 @@ drm_gem_lru_scan(struct drm_gem_lru *lru, > > if (!obj) > > break; > > > > - drm_gem_lru_move_tail_locked(&still_in_lru, obj); > > - > > /* > > * If it's in the process of being freed, gem_object->free() > > - * may be blocked on lock waiting to remove it. So just > > - * skip it. > > + * may be blocked on lock waiting to remove it. So just > > remove > > + * it from its current LRU and skip it. > > */ > > - if (!kref_get_unless_zero(&obj->refcount)) > > + if (!kref_get_unless_zero(&obj->refcount)) { > > + if (obj->lru) > > + drm_gem_lru_remove_locked(obj); > > if we are iterating a LRU.. and lru->lock is held, shouldn't obj->lru > always be non-null? Right, the != NULL check is not needed here.
