Re: RT Scheduler - BUG_ON (idx >= MAX_RT_PRIO)

2015-09-07 Thread Sujit K M
On Mon, Sep 7, 2015 at 11:00 AM, Chinmay V S wrote: > Hello everyone, > > TL;DR: In Linux RT scheduler, how can rt_nr_running be non-zero AND > active-bitmap NOT have any valid bit set? > > Details: > Recently i encountered the following BUG() within the realtime > scheduler (sched_rt.c) on

Re: RT Scheduler - BUG_ON (idx >= MAX_RT_PRIO)

2015-09-07 Thread Sujit K M
On Mon, Sep 7, 2015 at 12:28 PM, Chinmay V S wrote: > Thanks for your quick response Mike. > >> Try without the proprietary modules. You may also want to audit futex >> fixes if you can't use a maintained stable tree. 3.2 has a bunch that >> 3.1 does not. > > I see that futex.c has 17 patches in

Re: RT Scheduler - BUG_ON (idx >= MAX_RT_PRIO)

2015-09-07 Thread Mike Galbraith
On Mon, 2015-09-07 at 12:28 +0530, Chinmay V S wrote: > To catch the "culprit" in the middle of busting the scheduler's > internal data structures, what would be the recommended debug > mechanisms (or config options) that i can try? I'd configure kdump, let it explode, and examine runqueues in

Re: RT Scheduler - BUG_ON (idx >= MAX_RT_PRIO)

2015-09-07 Thread Chinmay V S
Thanks for your quick response Mike. > Try without the proprietary modules. You may also want to audit futex > fixes if you can't use a maintained stable tree. 3.2 has a bunch that > 3.1 does not. I see that futex.c has 17 patches in 3.2.y that are missing in my tree.

Re: RT Scheduler - BUG_ON (idx >= MAX_RT_PRIO)

2015-09-07 Thread Mike Galbraith
On Mon, 2015-09-07 at 11:00 +0530, Chinmay V S wrote: > So how could rt_nr_running be non-zero AND active-bitmap NOT have any > valid bit set? It can't without being busted. > Also including the kernel OOPS below. > Do you see any tell-tale signs in the register-dump/backtrace that can > point

RT Scheduler - BUG_ON (idx >= MAX_RT_PRIO)

2015-09-06 Thread Chinmay V S
Hello everyone, TL;DR: In Linux RT scheduler, how can rt_nr_running be non-zero AND active-bitmap NOT have any valid bit set? Details: Recently I encountered the following BUG() within the realtime scheduler (sched_rt.c) on a 3.1.10 kernel. [101640.492840] kernel BUG at kernel/sched_rt.c:1126!
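For readers following the question, here is a rough user-space sketch of the invariant being asked about. It is not the kernel code: `struct rt_model` and every `model_*` helper are hypothetical names made up for illustration, and the layout (one bit per priority plus a delimiter bit at index MAX_RT_PRIO) is a simplified assumption about how the RT priority array behaves. The point is only that if every enqueue and dequeue updates the counter and the bitmap together, a non-zero rt_nr_running with no valid bit set should be unreachable, so seeing it suggests something else wrote to the structure.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define MAX_RT_PRIO   100                      /* RT priorities 0..99 */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NBITS         (MAX_RT_PRIO + 1)        /* +1 for the delimiter bit */
#define NWORDS        ((NBITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical, simplified stand-in for the RT runqueue bookkeeping. */
struct rt_model {
    unsigned long bitmap[NWORDS];      /* bit p set => priority p has tasks */
    unsigned int  queued[MAX_RT_PRIO]; /* tasks queued at each priority */
    unsigned int  rt_nr_running;       /* total queued RT tasks */
};

static void model_set_bit(struct rt_model *m, unsigned int bit)
{
    m->bitmap[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

static void model_clear_bit(struct rt_model *m, unsigned int bit)
{
    m->bitmap[bit / BITS_PER_WORD] &= ~(1UL << (bit % BITS_PER_WORD));
}

static bool model_test_bit(const struct rt_model *m, unsigned int bit)
{
    return (m->bitmap[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/* Empty queues; only the delimiter bit is set, so a search always stops. */
static void model_init(struct rt_model *m)
{
    memset(m, 0, sizeof(*m));
    model_set_bit(m, MAX_RT_PRIO);
}

/* Index of the lowest set bit, or NBITS if even the delimiter is gone. */
static unsigned int model_first_bit(const struct rt_model *m)
{
    for (unsigned int bit = 0; bit < NBITS; bit++)
        if (model_test_bit(m, bit))
            return bit;
    return NBITS;
}

/* Counter and bitmap are always updated together. */
static void model_enqueue(struct rt_model *m, unsigned int prio)
{
    assert(prio < MAX_RT_PRIO);
    m->queued[prio]++;
    model_set_bit(m, prio);
    m->rt_nr_running++;
}

static void model_dequeue(struct rt_model *m, unsigned int prio)
{
    assert(prio < MAX_RT_PRIO && m->queued[prio] > 0);
    if (--m->queued[prio] == 0)
        model_clear_bit(m, prio);      /* last task at this priority left */
    m->rt_nr_running--;
}

/* The state from the report: tasks counted, but no valid priority bit. */
static bool model_would_hit_bug(const struct rt_model *m)
{
    return m->rt_nr_running != 0 && model_first_bit(m) >= MAX_RT_PRIO;
}
```

In this model, no sequence of model_enqueue() and model_dequeue() calls makes model_would_hit_bug() return true. It only becomes true if the bitmap or the counter is modified behind the helpers' backs, for example by calling model_clear_bit() directly while a task is still queued.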
