On Thu, 25 Feb 2021 17:24:48 +0200
Tariq Toukan <[email protected]> wrote:
> > Hi,
> >
> > Issue still reproduces. Even in GA kernel.
> > It is always preceded by some other lockdep warning.
> >
> > So to get the reproduction:
> > - First, have any lockdep issue.
> > - Then, open bond interface.
> >
> > Any idea what could it be?
> >
> > We'll share any new info as soon as we have it.
Looks like you are triggering:
int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
{
	struct bond_up_slave *usable_slaves = NULL, *all_slaves = NULL;
	struct slave *slave;
	struct list_head *iter;
	int agg_id = 0;
	int ret = 0;

#ifdef CONFIG_LOCKDEP
	WARN_ON(lockdep_is_held(&bond->mode_lock));
#endif
And the commit you bisected to (quoted below) made lockdep_is_held()
always return true once lockdep has been disabled. That is, if you had a
lockdep splat earlier, lockdep_is_held() will report the lock as held,
and this WARN_ON() will always trigger.
Peter,
Perhaps we should not have this part of your patch:
@@ -5056,13 +5081,13 @@ noinstr int lock_is_held_type(const struct lockdep_map *lock, int read)
 	unsigned long flags;
 	int ret = 0;

-	if (unlikely(current->lockdep_recursion))
+	if (unlikely(!lockdep_enabled()))
 		return 1; /* avoid false negative lockdep_assert_held() */

 	raw_local_irq_save(flags);
 	check_flags(flags);
Because that changes how lock_is_held_type() behaves, and it will return
true if there's been an earlier lockdep splat, and any code that has
something like the above is going to fail.
Although, checking that a lock is *not* held seems rather strange. If
anything, the above should be changed to WARN_ON_ONCE() so that it doesn't
constantly trigger after lockdep has already tripped elsewhere.
-- Steve
> >
> > Regards,
> > Tariq
>
>
> Bisect shows this is the offending commit:
>
> commit 4d004099a668c41522242aa146a38cc4eb59cb1e
> Author: Peter Zijlstra <[email protected]>
> Date: Fri Oct 2 11:04:21 2020 +0200
>
> lockdep: Fix lockdep recursion
>
> Steve reported that lockdep_assert*irq*(), when nested inside lockdep
> itself, will trigger a false-positive.
>
> One example is the stack-trace code, as called from inside lockdep,
> triggering tracing, which in turn calls RCU, which then uses
> lockdep_assert_irqs_disabled().
>
> > Fixes: a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")
> Reported-by: Steven Rostedt <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>