On 3/31/2026 5:20 AM, Cosmin Ratiu wrote:
> On Mon, 2026-03-30 at 19:08 -0700, Jakub Kicinski wrote:
>> This is an AI-generated review of your patch. The human sending this
>> email has considered the AI review valid, or at least plausible.
>> ---
>> devlink: Add helpers to lock nested-in instances
>>
>> Upcoming code will need to obtain a reference to locked nested-in
>> devlink instances. Add helpers to lock, obtain an already locked
>> reference and unlock/unref the nested-in instance.
>>
>>> diff --git a/net/devlink/core.c b/net/devlink/core.c
>>> index eeb6a71f5f56e..db11248df7120 100644
>>> --- a/net/devlink/core.c
>>> +++ b/net/devlink/core.c
>>
>> [ ... ]
>>
>>> +/* Returns the nested in devlink object and validates its lock is
>>> held. */
>>> +struct devlink *devlink_nested_in_get_locked(struct devlink_rel
>>> *rel)
>>> +{
>>> +   struct devlink *devlink;
>>> +   unsigned long index;
>>> +
>>> +   if (!rel)
>>> +           return NULL;
>>> +   index = rel->nested_in.devlink_index;
>>> +   devlink = xa_find(&devlinks, &index, index,
>>> DEVLINK_REGISTERED);
>>> +   if (devlink)
>>> +           devl_assert_locked(devlink);
>>
>> Can this dereference a freed pointer if the lock is not actually
>> held?
>>
>> xa_find() acquires and releases the RCU read lock internally. When it
>> returns,
>> the devlink pointer is no longer protected by RCU or a reference
>> count.
>>
>> If a caller invokes this helper without holding the devlink lock
>> (which is the
>> exact scenario the assertion is designed to detect), another thread
>> could
>> concurrently unregister and free the devlink.
>>
>> If the RCU grace period expires before the assertion executes,
>> devl_assert_locked() would dereference freed memory. Should the
>> lookup and
>> assertion be enclosed within an explicit rcu_read_lock() block to
>> safely
>> assert the lock?
> 
> This function is meant to be invoked by callers already holding all
> locks from a given instance up the nesting tree. Calling it outside
> this context could indeed lead to a race as described, where another
> entity unregisters a devlink about-to-be-asserted on.
> 

Hmm. I'm struggling to follow this. If you already expect the parent to
hold the nested devlink's lock, it must have a pointer to this devlink
instance. In that case, why would you even need
devlink_nested_in_get_locked in the first place?

> All current callers hold the locks, but to make it more robust, I will
> add the rcu_read_lock/unlock section around the find+assert.
> 

You should at least do that. It is the correct pattern when dealing with
reference counting data structures from the xarray. That's already how
devlinks_xa_get() is implemented.

>>
>>> +   return devlink;
>>> +}
>>> +
>>> +void devlink_nested_in_put_unlock(struct devlink_rel *rel)
>>> +{
>>> +   struct devlink *devlink =
>>> devlink_nested_in_get_locked(rel);
>>
>> Could this corrupt the mutex state and cause a refcount underflow?
>>
>> This helper takes a devlink_rel pointer instead of the devlink
>> pointer
>> acquired by devlink_nested_in_get_lock(), and performs a secondary
>> global
>> lookup to find the devlink.
>>
>> If a caller mistakenly calls this in an error cleanup path where they
>> did not
>> actually acquire the lock, the global xa_find() will still locate the
>> registered devlink. This would execute devl_unlock() and
>> devlink_put() on a
>> devlink the current thread does not own.
>>

If the caller already held the lock, why is devlink_nested_in_put_unlock
calling the devl_unlock instead of the caller anyways? That seems
confusing. Wouldn't the normal pattern be to
devlink_nested_in_get_lock()? Oh, that is a separate function. Ok I see.

>> Would it be safer for unlock/put helpers to take the exact pointer
>> returned by
>> the lock/get helper to ensure safe resource cleanup?
> 
> 2 issues here:
> 1) Mistakenly calling this without having acquired the lock. This is
> akin to saying mutex_unlock is dangerous if the lock isn't held.
> Technically true, but moot.
> 2) The rel argument: It is intentional, so that all 3 functions are
> symmetrical.
> 

IMO it would make more sense for the put version to be a put on the
returned devlink pointer. I guess its not symmetrical, but it removes
the need to perform the second lookup and it makes it easier to reason
about the pointer you're releasing being the same one.

Having put take different arguments from get is the usual pattern for
such a behavior.

Also devlink_nested_in_get_locked() doesn't increase the ref count so it
is sort of "relying" on the caller already having a reference to it,
which makes me think its not very useful. The only valid way to call
this function as it exists now safely is to already hold a reference to
the object, which also already requires you to have a valid pointer
making me wonder why you'd ever need to call it in the first place.

The only example you have is to make devlink_nested_in_put_unlock() take
a devlink_rel pointer as its argument instead of just calling it on the
pointer returned by devlink_nested_in_get_lock().

This implementation seems confusing and likely to lead to errors.

Thanks,
Jake

Reply via email to