On Tue, 21 Nov 2017 09:14:25 -0600 Clark Williams <[email protected]> wrote:
> From 8ea8311b75a40bdea03e7f8228a0578b6367e9d1 Mon Sep 17 00:00:00 2001
> From: Clark Williams <[email protected]>
> Date: Mon, 20 Nov 2017 14:26:12 -0600
> Subject: [PATCH] [rt] sched/rt: fix panic in double_lock_balance with
>  simplified IPI RT balancing
>
> I was testing 4.14-rt1 on a large system (cores == 96) and saw that
> we were getting into an rt balancing storm, so I tried applying Steven's
> patch (not upstream yet):
>
>     sched/rt: Simplify the IPI rt balancing logic
>
> Booting the resulting kernel yielded a panic in
> double_lock_balance() due to irqs not being disabled.
>
> This patch changes the calls to raw_spin_{lock,unlock} in the
> function rto_push_irq_work_function, to be raw_spin_{lock,unlock}_irq.
> Not sure if that's too heavy a hammer, but the resulting kernel boots
> and runs and survives 12h runs of rteval. Once Steven's patch goes in
> upstream, we'll need something like this in RT.
>
> Signed-off-by: Clark Williams <[email protected]>
> ---
>  kernel/sched/rt.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 57fb251dd8ce..a5cd0cea2f0f 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2008,9 +2008,9 @@ void rto_push_irq_work_func(struct irq_work *work)
>  	 * When it gets updated, a check is made if a push is possible.
>  	 */
>  	if (has_pushable_tasks(rq)) {
> -		raw_spin_lock(&rq->lock);
> +		raw_spin_lock_irq(&rq->lock);
>  		push_rt_tasks(rq);
> -		raw_spin_unlock(&rq->lock);
> +		raw_spin_unlock_irq(&rq->lock);

This looks buggy to me. You know you just indiscriminately enabled
interrupts here.

>  	}
>
>  	raw_spin_lock(&rq->rd->rto_lock);

Why is this patch necessary? Is it because you have the irq_work
running in non hard irq context?

I think you need something like this instead (if you haven't already
added it):

-- Steve

Index: linux-rt.git/kernel/sched/topology.c
===================================================================
--- linux-rt.git.orig/kernel/sched/topology.c
+++ linux-rt.git/kernel/sched/topology.c
@@ -257,6 +257,7 @@ static int init_rootdomain(struct root_d
 	rd->rto_cpu = -1;
 	raw_spin_lock_init(&rd->rto_lock);
 	init_irq_work(&rd->rto_push_work, rto_push_irq_work_func);
+	rd->rto_push_work.flags |= IRQ_WORK_HARD_IRQ;
 #endif
 
 	init_dl_bw(&rd->dl_bw);
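
To illustrate the concern about raw_spin_unlock_irq() indiscriminately
enabling interrupts, here is a toy userspace sketch. It is not kernel
code: a single bool stands in for the CPU's interrupt-enable state, and
every model_* and after_* name is made up for this illustration. It
only assumes the usual semantics, where the _irq unlock variant enables
interrupts unconditionally and the _irqsave/_irqrestore pair puts back
whatever state the caller had.

```c
#include <stdbool.h>

/* Toy model: one flag stands in for the local CPU's interrupt state. */
static bool irqs_enabled = true;

/* Stand-ins for the *_lock_irq / *_unlock_irq variants (hypothetical names). */
static void model_lock_irq(void)
{
	irqs_enabled = false;
}

static void model_unlock_irq(void)
{
	/* Unconditional enable: ignores what the caller's state was. */
	irqs_enabled = true;
}

/* Stand-ins for the *_lock_irqsave / *_unlock_irqrestore variants. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;	/* remember the caller's state */

	irqs_enabled = false;
	return flags;
}

static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;		/* put the caller's state back */
}

/* Caller already has interrupts off, as in hard irq context. */
static bool after_irq_variant(void)
{
	irqs_enabled = false;
	model_lock_irq();
	model_unlock_irq();
	return irqs_enabled;		/* interrupts are now on */
}

static bool after_irqsave_variant(void)
{
	bool flags;

	irqs_enabled = false;
	flags = model_lock_irqsave();
	model_unlock_irqrestore(flags);
	return irqs_enabled;		/* interrupts are still off */
}
```

In this model after_irq_variant() returns true, meaning a caller that
entered with interrupts off gets them back on, while
after_irqsave_variant() returns false. This only sketches why the _irq
variants are suspect in a function that may run with interrupts already
disabled; the change proposed above is the IRQ_WORK_HARD_IRQ flag, not
a switch to irqsave.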

