On Tue, 2026-03-31 at 16:55 -0700, Jacob Keller wrote:
> On 3/31/2026 5:20 AM, Cosmin Ratiu wrote:
> > On Mon, 2026-03-30 at 19:08 -0700, Jakub Kicinski wrote:
> > > This is an AI-generated review of your patch. The human sending
> > > this email has considered the AI review valid, or at least
> > > plausible.
> > > ---
> > > devlink: Add helpers to lock nested-in instances
> > > 
> > > Upcoming code will need to obtain a reference to locked nested-in
> > > devlink instances. Add helpers to lock, obtain an already locked
> > > reference and unlock/unref the nested-in instance.
> > > 
> > > > diff --git a/net/devlink/core.c b/net/devlink/core.c
> > > > index eeb6a71f5f56e..db11248df7120 100644
> > > > --- a/net/devlink/core.c
> > > > +++ b/net/devlink/core.c
> > > 
> > > [ ... ]
> > > 
> > > > +/* Returns the nested in devlink object and validates its lock is held. */
> > > > +struct devlink *devlink_nested_in_get_locked(struct devlink_rel *rel)
> > > > +{
> > > > +       struct devlink *devlink;
> > > > +       unsigned long index;
> > > > +
> > > > +       if (!rel)
> > > > +               return NULL;
> > > > +       index = rel->nested_in.devlink_index;
> > > > +       devlink = xa_find(&devlinks, &index, index, DEVLINK_REGISTERED);
> > > > +       if (devlink)
> > > > +               devl_assert_locked(devlink);
> > > 
> > > Can this dereference a freed pointer if the lock is not actually
> > > held?
> > > 
> > > xa_find() acquires and releases the RCU read lock internally.
> > > When it returns, the devlink pointer is no longer protected by
> > > RCU or a reference count.
> > > 
> > > If a caller invokes this helper without holding the devlink
> > > lock (which is the exact scenario the assertion is designed to
> > > detect), another thread could concurrently unregister and free
> > > the devlink.
> > > 
> > > If the RCU grace period expires before the assertion executes,
> > > devl_assert_locked() would dereference freed memory. Should the
> > > lookup and assertion be enclosed within an explicit
> > > rcu_read_lock() block to safely assert the lock?
> > 
> > This function is meant to be invoked by callers already holding
> > all locks from a given instance up the nesting tree. Calling it
> > outside this context could indeed lead to the race described,
> > where another entity unregisters the devlink about to be asserted
> > on.
> > 
> 
> Hmm. I'm struggling to follow this. If you already expect the
> parent to hold the nested devlink's lock, it must have a pointer to
> this devlink instance. In that case, why would you even need
> devlink_nested_in_get_locked in the first place?

After some more intense staring, I realized that intermediate instances
don't actually need to be locked; only the ancestor does. With that in
mind, the code gets simplified:
- devlink_nested_in_get_locked and devlink_nested_in_put_unlock can be
removed.
- recursive unlocking in devl_rate_unlock is gone.

> 
> > All current callers hold the locks, but to make it more robust, I
> > will add the rcu_read_lock/unlock section around the find+assert.
> > 
> 
> You should at least do that. It is the correct pattern when dealing
> with reference-counted data structures in the xarray. That's
> already how devlinks_xa_get() is implemented.
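
Agreed. For the archives, the version I had in mind looked roughly
like this (untested sketch, and moot now that the helper is being
dropped, but it keeps the pointer protected across the assertion):

```c
/* Sketch: hold the RCU read lock across both the lookup and the
 * assertion, so a concurrent unregister cannot free the devlink
 * between xa_find() and devl_assert_locked().
 */
struct devlink *devlink_nested_in_get_locked(struct devlink_rel *rel)
{
	struct devlink *devlink;
	unsigned long index;

	if (!rel)
		return NULL;
	index = rel->nested_in.devlink_index;
	rcu_read_lock();
	devlink = xa_find(&devlinks, &index, index, DEVLINK_REGISTERED);
	if (devlink)
		devl_assert_locked(devlink);
	rcu_read_unlock();
	return devlink;
}
```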
> 
> > > 
> > > > +       return devlink;
> > > > +}
> > > > +
> > > > +void devlink_nested_in_put_unlock(struct devlink_rel *rel)
> > > > +{
> > > > +       struct devlink *devlink = devlink_nested_in_get_locked(rel);
> > > 
> > > Could this corrupt the mutex state and cause a refcount
> > > underflow?
> > > 
> > > This helper takes a devlink_rel pointer instead of the devlink
> > > pointer acquired by devlink_nested_in_get_lock(), and performs a
> > > secondary global lookup to find the devlink.
> > > 
> > > If a caller mistakenly calls this in an error cleanup path where
> > > they did not actually acquire the lock, the global xa_find()
> > > will still locate the registered devlink. This would execute
> > > devl_unlock() and devlink_put() on a devlink the current thread
> > > does not own.
> > > 
> 
> If the caller already held the lock, why is
> devlink_nested_in_put_unlock calling devl_unlock instead of the
> caller doing it? That seems confusing. Wouldn't the normal pattern
> be to call devlink_nested_in_get_lock()? Oh, that is a separate
> function. Ok, I see.
> 
> > > Would it be safer for the unlock/put helpers to take the exact
> > > pointer returned by the lock/get helper to ensure safe resource
> > > cleanup?
> > 
> > Two issues here:
> > 1) Mistakenly calling this without having acquired the lock. This
> > is akin to saying mutex_unlock() is dangerous if the lock isn't
> > held. Technically true, but moot.
> > 2) The rel argument: it is intentional, so that all three
> > functions are symmetrical.
> > 
> 
> IMO it would make more sense for the put version to be a put on the
> returned devlink pointer. I guess it's not symmetrical, but it
> removes the need to perform the second lookup and makes it easier
> to reason about the pointer you're releasing being the same one.
> 
> Having put take different arguments from get is the usual pattern
> for such behavior.
> 
> Also, devlink_nested_in_get_locked() doesn't increase the refcount,
> so it is sort of "relying" on the caller already having a reference
> to it, which makes me think it's not very useful. The only way to
> safely call this function as it exists now is to already hold a
> reference to the object, which in turn requires you to have a valid
> pointer, making me wonder why you'd ever need to call it in the
> first place.
> 
> The only example you have is making devlink_nested_in_put_unlock()
> take a devlink_rel pointer as its argument instead of just calling
> it on the pointer returned by devlink_nested_in_get_lock().
> 
> This implementation seems confusing and likely to lead to errors.

I hope the next version will be more suitable.
Thank you for the comments and suggestions.

> 
> Thanks,
> Jake
