Re: [PATCH RT v3 5/5] rcutorture: Avoid problematic critical section nesting on RT

2019-09-23 Thread Sebastian Andrzej Siewior
On 2019-09-17 11:32:11 [-0500], Scott Wood wrote:
> Nice!  Are the "false positives" real issues from components that are
> currently blacklisted on RT, or something different?

So first a little bit of infrastructure is required, like commit
d5096aa65acd0 ("sched: Mark hrtimers to expire in hard interrupt
context"), so that lockdep can see it all properly without RT enabled.
Then we need patches to avoid lockdep complaining about things that are
not a problem on RT, because there the lock is converted to a
raw_spinlock_t or some other change is applied so the warning does not
trigger.
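
For illustration: that series added the HRTIMER_MODE_*_HARD expiry modes,
which force a timer to expire in hard interrupt context even on RT, so
lockdep can model the callback context the same way on both kernels. A
minimal sketch of that kind of annotation -- the my_timer/my_timer_fn
names are purely illustrative, not from the commit -- would be:

  #include <linux/hrtimer.h>
  #include <linux/ktime.h>

  static struct hrtimer my_timer;

  /* Runs in hard interrupt context, even on PREEMPT_RT. */
  static enum hrtimer_restart my_timer_fn(struct hrtimer *t)
  {
          /* Only raw_spinlock_t / IRQ-safe operations are legal here. */
          return HRTIMER_NORESTART;
  }

  static void my_timer_setup(void)
  {
          hrtimer_init(&my_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
          my_timer.function = my_timer_fn;
          hrtimer_start(&my_timer, ms_to_ktime(10), HRTIMER_MODE_REL_HARD);
  }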

> -Scott

Sebastian


Re: [PATCH RT v3 5/5] rcutorture: Avoid problematic critical section nesting on RT

2019-09-17 Thread Scott Wood
On Tue, 2019-09-17 at 16:50 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-17 09:36:22 [-0500], Scott Wood wrote:
> > > On non-RT you can (but should not) use the counterpart of each
> > > function in random order, like:
> > >   local_bh_disable();
> > >   local_irq_disable();
> > >   local_bh_enable();
> > >   local_irq_enable();
> > 
> > Actually even non-RT will assert if you do local_bh_enable() with IRQs
> > disabled -- but the other combinations do work, and are used in some
> > places via spinlocks.  If they are used via direct calls to
> > preempt_disable() or local_irq_disable() (or via raw spinlocks), then
> > that will not go away on RT and we'll have a problem.
> 
> lockdep_assert_irqs_enabled() is a nop with CONFIG_PROVE_LOCKING=N and
> RT breaks either way. 

Right, I meant a non-RT kernel with debug checks enabled.

> > > Since you _can_ use them in random order, Paul wants to test that the
> > > random use of those functions does not break RCU in any way. Since they
> > > cannot be used on RT in random order, it has been agreed that we keep
> > > the test for !RT but disable it on RT.
> > 
> > For now, yes.  Long term it would be good to keep track of when
> > preemption/irqs would be disabled on RT, even when running a non-RT
> > debug kernel, and assert when bad things are done with it (assuming an
> > RT-capable arch).  Besides detecting these fairly unusual patterns, it
> > could also detect earlier the much more common problem of nesting a
> > non-raw spinlock inside a raw spinlock or other RT-atomic context.
> 
> You will be surprised, but we have patches for that. We first need to
> get rid of other "false positives" before plugging this in.

Nice!  Are the "false positives" real issues from components that are
currently blacklisted on RT, or something different?

-Scott




Re: [PATCH RT v3 5/5] rcutorture: Avoid problematic critical section nesting on RT

2019-09-17 Thread Sebastian Andrzej Siewior
On 2019-09-17 09:36:22 [-0500], Scott Wood wrote:
> > On non-RT you can (but should not) use the counterpart of each function
> > in random order, like:
> > local_bh_disable();
> > local_irq_disable();
> > local_bh_enable();
> > local_irq_enable();
> 
> Actually even non-RT will assert if you do local_bh_enable() with IRQs
> disabled -- but the other combinations do work, and are used in some places
> via spinlocks.  If they are used via direct calls to preempt_disable() or
> local_irq_disable() (or via raw spinlocks), then that will not go away on RT
> and we'll have a problem.

lockdep_assert_irqs_enabled() is a nop with CONFIG_PROVE_LOCKING=N and
RT breaks either way. 

> > Since you _can_ use them in random order, Paul wants to test that the
> > random use of those functions does not break RCU in any way. Since they
> > cannot be used on RT in random order, it has been agreed that we keep
> > the test for !RT but disable it on RT.
> 
> For now, yes.  Long term it would be good to keep track of when
> preemption/irqs would be disabled on RT, even when running a non-RT debug
> kernel, and assert when bad things are done with it (assuming an RT-capable
> arch).  Besides detecting these fairly unusual patterns, it could also
> detect earlier the much more common problem of nesting a non-raw spinlock
> inside a raw spinlock or other RT-atomic context.

You will be surprised, but we have patches for that. We first need to
get rid of other "false positives" before plugging this in.

> -Scott

Sebastian


Re: [PATCH RT v3 5/5] rcutorture: Avoid problematic critical section nesting on RT

2019-09-17 Thread Scott Wood
On Tue, 2019-09-17 at 12:07 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-16 11:55:57 [-0500], Scott Wood wrote:
> > On Thu, 2019-09-12 at 18:17 -0400, Joel Fernandes wrote:
> > > On Wed, Sep 11, 2019 at 05:57:29PM +0100, Scott Wood wrote:
> > > > rcutorture was generating some nesting scenarios that are not
> > > > reasonable.  Constrain the state selection to avoid them.
> > > > 
> > > > Example #1:
> > > > 
> > > > 1. preempt_disable()
> > > > 2. local_bh_disable()
> > > > 3. preempt_enable()
> > > > 4. local_bh_enable()
> > > > 
> > > > On PREEMPT_RT, BH disabling takes a local lock only when called in
> > > > non-atomic context.  Thus, atomic context must be retained until after
> > > > BH is re-enabled.  Likewise, if BH is initially disabled in non-atomic
> > > > context, it cannot be re-enabled in atomic context.
> > > > 
> > > > Example #2:
> > > > 
> > > > 1. rcu_read_lock()
> > > > 2. local_irq_disable()
> > > > 3. rcu_read_unlock()
> > > > 4. local_irq_enable()
> > > 
> > > If I understand correctly, these examples are not unrealistic in the
> > > real world unless RCU is used in the scheduler.
> > 
> > I hope you mean "not realistic", at least when it comes to explicit
> > preempt/irq disabling rather than spinlock variants that don't disable
> > preempt/irqs on PREEMPT_RT.
> 
> We have:
> - local_irq_disable() (+save)
> - spin_lock()
> - local_bh_disable()
> - preempt_disable()
> 
> On non-RT you can (but should not) use the counterpart of each function
> in random order, like:
>   local_bh_disable();
>   local_irq_disable();
>   local_bh_enable();
>   local_irq_enable();

Actually even non-RT will assert if you do local_bh_enable() with IRQs
disabled -- but the other combinations do work, and are used in some places
via spinlocks.  If they are used via direct calls to preempt_disable() or
local_irq_disable() (or via raw spinlocks), then that will not go away on RT
and we'll have a problem.
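
For reference, a minimal sketch of the sequence that trips that check on a
debug (non-RT) kernel -- process context and CONFIG_PROVE_LOCKING=y assumed;
the warning presumably comes from the lockdep_assert_irqs_enabled() check
mentioned elsewhere in this thread:

  local_bh_disable();
  local_irq_disable();
  local_bh_enable();      /* warns: enabling BH with IRQs disabled */
  local_irq_enable();     /* execution continues after the warning */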

> The non-RT will survive this. On RT the counterpart functions have to be
> used in reverse order:
>   local_bh_disable();
>   local_irq_disable();
>   local_irq_enable();
>   local_bh_enable();
> 
> or the kernel will fall apart.
> 
> Since you _can_ use them in random order, Paul wants to test that the
> random use of those functions does not break RCU in any way. Since they
> cannot be used on RT in random order, it has been agreed that we keep
> the test for !RT but disable it on RT.

For now, yes.  Long term it would be good to keep track of when
preemption/irqs would be disabled on RT, even when running a non-RT debug
kernel, and assert when bad things are done with it (assuming an RT-capable
arch).  Besides detecting these fairly unusual patterns, it could also
detect earlier the much more common problem of nesting a non-raw spinlock
inside a raw spinlock or other RT-atomic context.
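
To make that concrete, a purely hypothetical sketch (none of these rt_*
helpers exist upstream, and the patches Sebastian says exist may look
nothing like this) could keep a per-CPU "would be atomic on RT" depth and
warn when an operation that becomes a sleeping lock on RT is used inside
such a context:

  /* Hypothetical illustration only. */
  static DEFINE_PER_CPU(int, rt_would_be_atomic);

  /* Called from raw_spin_lock(), preempt_disable(), local_irq_disable(), ... */
  static inline void rt_atomic_enter(void)
  {
          this_cpu_inc(rt_would_be_atomic);
  }

  /* Called from the matching unlock/enable. */
  static inline void rt_atomic_exit(void)
  {
          this_cpu_dec(rt_would_be_atomic);
  }

  /* Called from spin_lock(), local_bh_enable(), and other operations that
   * may sleep on RT.  A nonzero depth implies preemption is disabled, so
   * the per-CPU read is stable. */
  static inline void rt_sleeping_lock_check(void)
  {
          WARN_ON_ONCE(this_cpu_read(rt_would_be_atomic));
  }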

-Scott




Re: [PATCH RT v3 5/5] rcutorture: Avoid problematic critical section nesting on RT

2019-09-17 Thread Sebastian Andrzej Siewior
On 2019-09-16 11:55:57 [-0500], Scott Wood wrote:
> On Thu, 2019-09-12 at 18:17 -0400, Joel Fernandes wrote:
> > On Wed, Sep 11, 2019 at 05:57:29PM +0100, Scott Wood wrote:
> > > rcutorture was generating some nesting scenarios that are not
> > > reasonable.  Constrain the state selection to avoid them.
> > > 
> > > Example #1:
> > > 
> > > 1. preempt_disable()
> > > 2. local_bh_disable()
> > > 3. preempt_enable()
> > > 4. local_bh_enable()
> > > 
> > > On PREEMPT_RT, BH disabling takes a local lock only when called in
> > > non-atomic context.  Thus, atomic context must be retained until after
> > > BH is re-enabled.  Likewise, if BH is initially disabled in non-atomic
> > > context, it cannot be re-enabled in atomic context.
> > > 
> > > Example #2:
> > > 
> > > 1. rcu_read_lock()
> > > 2. local_irq_disable()
> > > 3. rcu_read_unlock()
> > > 4. local_irq_enable()
> > 
> > If I understand correctly, these examples are not unrealistic in the real
> > world unless RCU is used in the scheduler.
> 
> I hope you mean "not realistic", at least when it comes to explicit
> preempt/irq disabling rather than spinlock variants that don't disable
> preempt/irqs on PREEMPT_RT.

We have:
- local_irq_disable() (+save)
- spin_lock()
- local_bh_disable()
- preempt_disable()

On non-RT you can (but should not) use the counterpart of each function
in random order, like:
local_bh_disable();
local_irq_disable();
local_bh_enable();
local_irq_enable();

The non-RT will survive this. On RT the counterpart functions have to be
used in reverse order:
local_bh_disable();
local_irq_disable();
local_irq_enable();
local_bh_enable();

or the kernel will fall apart.

Since you _can_ use them in random order, Paul wants to test that the
random use of those functions does not break RCU in any way. Since they
cannot be used on RT in random order, it has been agreed that we keep
the test for !RT but disable it on RT.

> -Scott

Sebastian


Re: [PATCH RT v3 5/5] rcutorture: Avoid problematic critical section nesting on RT

2019-09-16 Thread Scott Wood
On Thu, 2019-09-12 at 18:17 -0400, Joel Fernandes wrote:
> On Wed, Sep 11, 2019 at 05:57:29PM +0100, Scott Wood wrote:
> > rcutorture was generating some nesting scenarios that are not
> > reasonable.  Constrain the state selection to avoid them.
> > 
> > Example #1:
> > 
> > 1. preempt_disable()
> > 2. local_bh_disable()
> > 3. preempt_enable()
> > 4. local_bh_enable()
> > 
> > On PREEMPT_RT, BH disabling takes a local lock only when called in
> > non-atomic context.  Thus, atomic context must be retained until after
> > BH is re-enabled.  Likewise, if BH is initially disabled in non-atomic
> > context, it cannot be re-enabled in atomic context.
> > 
> > Example #2:
> > 
> > 1. rcu_read_lock()
> > 2. local_irq_disable()
> > 3. rcu_read_unlock()
> > 4. local_irq_enable()
> 
> If I understand correctly, these examples are not unrealistic in the real
> world unless RCU is used in the scheduler.

I hope you mean "not realistic", at least when it comes to explicit
preempt/irq disabling rather than spinlock variants that don't disable
preempt/irqs on PREEMPT_RT.

> > If the thread is preempted between steps 1 and 2,
> > rcu_read_unlock_special.b.blocked will be set, but it won't be
> > acted on in step 3 because IRQs are disabled.  Thus, reporting of the
> > quiescent state will be delayed beyond the local_irq_enable().
> 
> Yes, with consolidated RCU this can happen, but AFAIK it has not been seen
> to be a problem since deferred QS reporting will take care of it, which can
> also happen from a subsequent rcu_read_unlock_special().

The defer_qs_iw_pending stuff isn't in 5.2-rt.  Still, given patch 4/5 (and
special.b.deferred_qs on mainline) this shouldn't present a deadlock concern
(letting the test run a bit now to double check) so this patch could
probably be limited to the "example #1" sequence.

> > For now, these scenarios will continue to be tested on non-PREEMPT_RT
> > kernels, until debug checks are added to ensure that they are not
> > happening elsewhere.
> 
> Are you seeing real issues that need this patch? It would be good to not
> complicate rcutorture if not needed.

rcutorture crashes on RT without this patch (in particular due to the
local_bh_disable misordering).

-Scott




Re: [PATCH RT v3 5/5] rcutorture: Avoid problematic critical section nesting on RT

2019-09-12 Thread Joel Fernandes
On Wed, Sep 11, 2019 at 05:57:29PM +0100, Scott Wood wrote:
> rcutorture was generating some nesting scenarios that are not
> reasonable.  Constrain the state selection to avoid them.
> 
> Example #1:
> 
> 1. preempt_disable()
> 2. local_bh_disable()
> 3. preempt_enable()
> 4. local_bh_enable()
> 
> On PREEMPT_RT, BH disabling takes a local lock only when called in
> non-atomic context.  Thus, atomic context must be retained until after BH
> is re-enabled.  Likewise, if BH is initially disabled in non-atomic
> context, it cannot be re-enabled in atomic context.
> 
> Example #2:
> 
> 1. rcu_read_lock()
> 2. local_irq_disable()
> 3. rcu_read_unlock()
> 4. local_irq_enable()

If I understand correctly, these examples are not unrealistic in the real
world unless RCU is used in the scheduler.

> 
> If the thread is preempted between steps 1 and 2,
> rcu_read_unlock_special.b.blocked will be set, but it won't be
> acted on in step 3 because IRQs are disabled.  Thus, reporting of the
> quiescent state will be delayed beyond the local_irq_enable().

Yes, with consolidated RCU this can happen, but AFAIK it has not been seen to
be a problem since deferred QS reporting will take care of it, which can
also happen from a subsequent rcu_read_unlock_special().

> For now, these scenarios will continue to be tested on non-PREEMPT_RT
> kernels, until debug checks are added to ensure that they are not
> happening elsewhere.

Are you seeing real issues that need this patch? It would be good to not
complicate rcutorture if not needed.

thanks,

 - Joel


> 
> Signed-off-by: Scott Wood 
> ---
> v3: Limit to RT kernels, and remove one constraint that, while it
> is bad on both RT and non-RT (missing a schedule), does not oops or
> otherwise prevent using rcutorture.  It would be added once debug checks
> are implemented.
> 
>  kernel/rcu/rcutorture.c | 96 ++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 82 insertions(+), 14 deletions(-)
> 
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index efaa5b3f4d3f..ecb82cc432af 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -60,10 +60,13 @@
>  #define RCUTORTURE_RDR_RBH       0x08   /*  ... rcu_read_lock_bh(). */
>  #define RCUTORTURE_RDR_SCHED     0x10   /*  ... rcu_read_lock_sched(). */
>  #define RCUTORTURE_RDR_RCU       0x20   /*  ... entering another RCU reader. */
> -#define RCUTORTURE_RDR_NBITS     6      /* Number of bits defined above. */
> +#define RCUTORTURE_RDR_ATOM_BH   0x40   /*  ... disabling bh while atomic */
> +#define RCUTORTURE_RDR_ATOM_RBH  0x80   /*  ... RBH while atomic */
> +#define RCUTORTURE_RDR_NBITS     8      /* Number of bits defined above. */
>  #define RCUTORTURE_MAX_EXTEND \
>          (RCUTORTURE_RDR_BH | RCUTORTURE_RDR_IRQ | RCUTORTURE_RDR_PREEMPT | \
> -         RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED)
> +         RCUTORTURE_RDR_RBH | RCUTORTURE_RDR_SCHED | \
> +         RCUTORTURE_RDR_ATOM_BH | RCUTORTURE_RDR_ATOM_RBH)
>  #define RCUTORTURE_RDR_MAX_LOOPS 0x7    /* Maximum reader extensions. */
>                                          /* Must be power of two minus one. */
>  #define RCUTORTURE_RDR_MAX_SEGS (RCUTORTURE_RDR_MAX_LOOPS + 3)
> @@ -1092,31 +1095,52 @@ static void rcutorture_one_extend(int *readstate, int newstate,
>          WARN_ON_ONCE((idxold >> RCUTORTURE_RDR_SHIFT) > 1);
>          rtrsp->rt_readstate = newstate;
>  
> -        /* First, put new protection in place to avoid critical-section gap. */
> +        /*
> +         * First, put new protection in place to avoid critical-section gap.
> +         * Disable preemption around the ATOM disables to ensure that
> +         * in_atomic() is true.
> +         */
>          if (statesnew & RCUTORTURE_RDR_BH)
>                  local_bh_disable();
> +        if (statesnew & RCUTORTURE_RDR_RBH)
> +                rcu_read_lock_bh();
>          if (statesnew & RCUTORTURE_RDR_IRQ)
>                  local_irq_disable();
>          if (statesnew & RCUTORTURE_RDR_PREEMPT)
>                  preempt_disable();
> -        if (statesnew & RCUTORTURE_RDR_RBH)
> -                rcu_read_lock_bh();
>          if (statesnew & RCUTORTURE_RDR_SCHED)
>                  rcu_read_lock_sched();
> +        preempt_disable();
> +        if (statesnew & RCUTORTURE_RDR_ATOM_BH)
> +                local_bh_disable();
> +        if (statesnew & RCUTORTURE_RDR_ATOM_RBH)
> +                rcu_read_lock_bh();
> +        preempt_enable();
>          if (statesnew & RCUTORTURE_RDR_RCU)
>                  idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;
>  
> -        /* Next, remove old protection, irq first due to bh conflict. */
> +        /*
> +         * Next, remove old protection, in decreasing order of strength
> +         * to avoid unlock paths that aren't safe in the stronger
> +         * context.  Disable preemption around the ATOM enables in
> +         * case the context was only atomic due to IRQ disabling.
> +         */
> +        preempt_disable();
>          if (statesold & RCUTORTURE_RDR_IRQ)
>                  local_irq_enable();
> -