Re: 2.6.23-rc7-mm1: panic in scheduler

2007-09-25 Thread Lee Schermerhorn
On Tue, 2007-09-25 at 13:32 +0530, Kamalesh Babulal wrote:
> Balbir Singh wrote:
> > On 9/25/07, Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
> >> Exactly the same call trace is produced on an IA64 Madison (up to 9M cache)
> >> system with 8 CPUs.
> >> --
> > 
> > Hi, Kamalesh,
> > 
> > Could you please reproduce the problem or share the steps to reproduce
> > the problem?
> > 
> > Thanks,
> > Balbir
> > -
> 
> Hi Balbir,
> 
> Yes, I am able to reproduce the problem. It can be reproduced
> using ltprunall.
> 

I see the problem just trying to boot.  I have yet to successfully boot
23-rc7-mm1 on my platform.  [But, I'll try Ingo's dev tree real soon
now...]

Lee

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.23-rc7-mm1: panic in scheduler

2007-09-25 Thread Kamalesh Babulal
Balbir Singh wrote:
> On 9/25/07, Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
>> Exactly the same call trace is produced on an IA64 Madison (up to 9M cache)
>> system with 8 CPUs.
>> --
> 
> Hi, Kamalesh,
> 
> Could you please reproduce the problem or share the steps to reproduce
> the problem?
> 
> Thanks,
> Balbir
> -

Hi Balbir,

Yes, I am able to reproduce the problem. It can be reproduced
using ltprunall.

-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.


Re: 2.6.23-rc7-mm1: panic in scheduler

2007-09-25 Thread Balbir Singh
On 9/25/07, Kamalesh Babulal <[EMAIL PROTECTED]> wrote:
> Exactly the same call trace is produced on an IA64 Madison (up to 9M cache)
> system with 8 CPUs.
> --

Hi, Kamalesh,

Could you please reproduce the problem or share the steps to reproduce
the problem?

Thanks,
Balbir


Re: 2.6.23-rc7-mm1: panic in scheduler

2007-09-24 Thread Ingo Molnar

* Lee Schermerhorn <[EMAIL PROTECTED]> wrote:

> Taking a quick look at [__]{en|de}queue_entity() and the functions 
> they call, I see something suspicious in set_leftmost() in 
> sched_fair.c:
> 
> static inline void
> set_leftmost(struct cfs_rq *cfs_rq, struct rb_node *leftmost)
> {
> 	struct sched_entity *se;
> 
> 	cfs_rq->rb_leftmost = leftmost;
> 	if (leftmost)
> 		se = rb_entry(leftmost, struct sched_entity, run_node);
> }
> 
> Missing code?  corrupt patch?

could you pull this git tree on top of a -rc7 (or later) upstream tree:

  git-pull git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel.git

does that solve the crash?

the above set_leftmost() code used to be larger and now indeed those 
bits are mostly dead code. I've queued up a clean-up patch for that - 
see the patch below. It should not impact correctness though, so if you 
can still trigger the crash with the latest sched-devel.git tree we'd 
like to know about it.

Ingo

--->
Subject: sched: remove set_leftmost()
From: Ingo Molnar <[EMAIL PROTECTED]>

Lee Schermerhorn noticed that set_leftmost() contains dead code,
remove this.

Reported-by: Lee Schermerhorn <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/sched_fair.c |   14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

Index: linux/kernel/sched_fair.c
===
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -124,16 +124,6 @@ max_vruntime(u64 min_vruntime, u64 vrunt
 	return min_vruntime;
 }
 
-static inline void
-set_leftmost(struct cfs_rq *cfs_rq, struct rb_node *leftmost)
-{
-	struct sched_entity *se;
-
-	cfs_rq->rb_leftmost = leftmost;
-	if (leftmost)
-		se = rb_entry(leftmost, struct sched_entity, run_node);
-}
-
 static inline s64
 entity_key(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
@@ -175,7 +165,7 @@ __enqueue_entity(struct cfs_rq *cfs_rq, 
 	 * used):
 	 */
 	if (leftmost)
-		set_leftmost(cfs_rq, &se->run_node);
+		cfs_rq->rb_leftmost = &se->run_node;
 
 	rb_link_node(&se->run_node, parent, link);
 	rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
@@ -185,7 +175,7 @@ static void
 __dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	if (cfs_rq->rb_leftmost == &se->run_node)
-		set_leftmost(cfs_rq, rb_next(&se->run_node));
+		cfs_rq->rb_leftmost = rb_next(&se->run_node);
 
 	rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
 }
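
For readers who have not looked at sched_fair.c: the two call sites touched
by the patch above keep a cached pointer to the smallest-key entity up to
date on enqueue and dequeue. What follows is a minimal user-space sketch of
that bookkeeping, not kernel code. A sorted doubly linked list stands in for
the rbtree, and struct node, struct queue, node_next(), enqueue() and
dequeue() are hypothetical names chosen for illustration. It makes no claim
about the cause of the reported panic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a sorted doubly linked list plays the role of
 * the kernel rbtree, and node_next() plays the role of rb_next(). */
struct node {
	struct node *prev, *next;
	long key;
};

struct queue {
	struct node *head;     /* stand-in for tasks_timeline */
	struct node *leftmost; /* cached smallest-key node, like rb_leftmost */
};

static struct node *node_next(struct node *n)
{
	return n->next;
}

/* Insert in key order. The cache is updated only if the walk never
 * stepped past an existing node, i.e. the new node is the smallest. */
static void enqueue(struct queue *q, struct node *n)
{
	struct node **link;
	struct node *parent = NULL;
	int leftmost = 1;

	assert(q != NULL && n != NULL);
	link = &q->head;
	while (*link && (*link)->key <= n->key) {
		parent = *link;
		link = &parent->next;
		leftmost = 0;
	}
	if (leftmost)
		q->leftmost = n;

	n->prev = parent;
	n->next = *link;
	if (*link)
		(*link)->prev = n;
	*link = n;
}

/* Remove a node that is currently on the queue. If it was the cached
 * leftmost node, its successor becomes the new cache value first. */
static void dequeue(struct queue *q, struct node *n)
{
	assert(q != NULL && n != NULL);
	if (q->leftmost == n)
		q->leftmost = node_next(n);

	if (n->prev)
		n->prev->next = n->next;
	else
		q->head = n->next;
	if (n->next)
		n->next->prev = n->prev;
	n->prev = n->next = NULL;
}
```

With keys 30, 10 and 20 enqueued in that order, the cache points at the
10 node; dequeuing that node moves the cache to the 20 node, and the cache
becomes NULL once the queue is empty. As in the patch, the successor is
looked up before the node is unlinked.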


Re: 2.6.23-rc7-mm1: panic in scheduler

2007-09-24 Thread Kamalesh Babulal
Lee Schermerhorn wrote:
> I looked around on the MLs for mention of this, but didn't find anything
> that appeared to match.
> 
> Platform:  HP rx8620 - 16-cpu/32GB/4-node ia64 [Madison]
> 
> 2.6.23-rc7-mm1 broken out -- panic occurs when git-sched.patch pushed:
> 
> Unable to handle kernel NULL pointer dereference (address )
> swapper[0]: Oops 8813272891392 [1]
> Modules linked in: scsi_wait_scan ehci_hcd ohci_hcd uhci_hcd usbcore
> 
> Pid: 0, CPU 14, comm:  swapper
> psr : 101008522030 ifs : 8002 ip  : [<a001003014e0>]
> Not tainted
> ip is at rb_next+0x0/0x140
> unat:  pfs : 0308 rsc : 0003
> rnat: 8012 bsps: 0001003e pr  : 6609a840599519a5
> ldrs:  ccv : 0002 fpsr: 0009804c8a70433f
> csd :  ssd : 
> b0  : a00100078dc0 b6  : a00100074a40 b7  : a00100078e00
> f6  : 1003e f7  : 1003e0040
> f8  : 1003e2aab f9  : 1003e000d43798a2b
> f10 : 1003e35e9970b967dd8b9 f11 : 1003e0002
> r1  : a00100bc0920 r2  : e76577f0 r3  : e7657f10
> r8  : fff0 r9  : 0002 r10 : e7657780
> r11 :  r12 : e7004160fe10 r13 : e70041608000
> r14 :  r15 : 000e r16 : 0007f6c30a22
> r17 : e70041608040 r18 : a001008383a8 r19 : a00100078e00
> r20 : e7655bb8 r21 : e7655bb0 r22 : e7657ed0
> r23 : 000f4240 r24 : a001009e0440 r25 : e70041608bb4
> r26 :  r27 :  r28 : e7657f80
> r29 : 02e7 r30 :  r31 : e7657780
> 
> Call Trace:
>  [<a00100014f60>] show_stack+0x80/0xa0
>                                 sp=e7004160f9e0 bsp=e70041609008
>  [<a00100015bf0>] show_regs+0x870/0x8a0
>                                 sp=e7004160fbb0 bsp=e70041608fa8
>  [<a0010003d170>] die+0x190/0x300
>                                 sp=e7004160fbb0 bsp=e70041608f60
>  [<a00100071bc0>] ia64_do_page_fault+0x780/0xa80
>                                 sp=e7004160fbb0 bsp=e70041608f08
>  [<a001b5c0>] ia64_leave_kernel+0x0/0x270
>                                 sp=e7004160fc40 bsp=e70041608f08
>  [<a001003014e0>] rb_next+0x0/0x140
>                                 sp=e7004160fe10 bsp=e70041608ef8
>  [<a00100078dc0>] __dequeue_entity+0x80/0xc0
>                                 sp=e7004160fe10 bsp=e70041608ec8
>  [<a00100078e60>] pick_next_task_fair+0x60/0x180
>                                 sp=e7004160fe10 bsp=e70041608e98
>  [<a001006a5880>] schedule+0x340/0x19c0
>                                 sp=e7004160fe10 bsp=e70041608cc0
>  [<a00100014cb0>] cpu_idle+0x290/0x3e0
>                                 sp=e7004160fe30 bsp=e70041608c50
>  [<a00100066020>] start_secondary+0x380/0x5a0
>                                 sp=e7004160fe30 bsp=e70041608c00
>  [<a001006abca0>] __kprobes_text_end+0x6c0/0x6f0
>                                 sp=e7004160fe30 bsp=e70041608c00
> 
> 
> Taking a quick look at [__]{en|de}queue_entity() and the functions they call,
> I see something suspicious in set_leftmost() in sched_fair.c:
> 
> static inline void
> set_leftmost(struct cfs_rq *cfs_rq, struct rb_node *leftmost)
> {
> 	struct sched_entity *se;
> 
> 	cfs_rq->rb_leftmost = leftmost;
> 	if (leftmost)
> 		se = rb_entry(leftmost, struct sched_entity, run_node);
> }
> 
> Missing code?  corrupt patch?
> 
> config available on request, but there doesn't seem to be much in the way
> of scheduler config option.  A few that might apply:
> 
> SCHED_SMT is not set
> SCHED_DEBUG=y
> SCHEDSTATS=y
> 
> 
> Regards,
> Lee Schermerhorn
> 

Exactly the same call trace is produced on an IA64 Madison (up to 9M cache)
system with 8 CPUs.
-- 
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.


2.6.23-rc7-mm1: panic in scheduler

2007-09-24 Thread Lee Schermerhorn
I looked around on the MLs for mention of this, but didn't find anything
that appeared to match.

Platform:  HP rx8620 - 16-cpu/32GB/4-node ia64 [Madison]

2.6.23-rc7-mm1 broken out -- panic occurs when git-sched.patch pushed:

Unable to handle kernel NULL pointer dereference (address )
swapper[0]: Oops 8813272891392 [1]
Modules linked in: scsi_wait_scan ehci_hcd ohci_hcd uhci_hcd usbcore

Pid: 0, CPU 14, comm:  swapper
psr : 101008522030 ifs : 8002 ip  : [<a001003014e0>]    Not tainted
ip is at rb_next+0x0/0x140
unat:  pfs : 0308 rsc : 0003
rnat: 8012 bsps: 0001003e pr  : 6609a840599519a5
ldrs:  ccv : 0002 fpsr: 0009804c8a70433f
csd :  ssd : 
b0  : a00100078dc0 b6  : a00100074a40 b7  : a00100078e00
f6  : 1003e f7  : 1003e0040
f8  : 1003e2aab f9  : 1003e000d43798a2b
f10 : 1003e35e9970b967dd8b9 f11 : 1003e0002
r1  : a00100bc0920 r2  : e76577f0 r3  : e7657f10
r8  : fff0 r9  : 0002 r10 : e7657780
r11 :  r12 : e7004160fe10 r13 : e70041608000
r14 :  r15 : 000e r16 : 0007f6c30a22
r17 : e70041608040 r18 : a001008383a8 r19 : a00100078e00
r20 : e7655bb8 r21 : e7655bb0 r22 : e7657ed0
r23 : 000f4240 r24 : a001009e0440 r25 : e70041608bb4
r26 :  r27 :  r28 : e7657f80
r29 : 02e7 r30 :  r31 : e7657780

Call Trace:
 [<a00100014f60>] show_stack+0x80/0xa0
                                sp=e7004160f9e0 bsp=e70041609008
 [<a00100015bf0>] show_regs+0x870/0x8a0
                                sp=e7004160fbb0 bsp=e70041608fa8
 [<a0010003d170>] die+0x190/0x300
                                sp=e7004160fbb0 bsp=e70041608f60
 [<a00100071bc0>] ia64_do_page_fault+0x780/0xa80
                                sp=e7004160fbb0 bsp=e70041608f08
 [<a001b5c0>] ia64_leave_kernel+0x0/0x270
                                sp=e7004160fc40 bsp=e70041608f08
 [<a001003014e0>] rb_next+0x0/0x140
                                sp=e7004160fe10 bsp=e70041608ef8
 [<a00100078dc0>] __dequeue_entity+0x80/0xc0
                                sp=e7004160fe10 bsp=e70041608ec8
 [<a00100078e60>] pick_next_task_fair+0x60/0x180
                                sp=e7004160fe10 bsp=e70041608e98
 [<a001006a5880>] schedule+0x340/0x19c0
                                sp=e7004160fe10 bsp=e70041608cc0
 [<a00100014cb0>] cpu_idle+0x290/0x3e0
                                sp=e7004160fe30 bsp=e70041608c50
 [<a00100066020>] start_secondary+0x380/0x5a0
                                sp=e7004160fe30 bsp=e70041608c00
 [<a001006abca0>] __kprobes_text_end+0x6c0/0x6f0
                                sp=e7004160fe30 bsp=e70041608c00


Taking a quick look at [__]{en|de}queue_entity() and the functions they call,
I see something suspicious in set_leftmost() in sched_fair.c:

static inline void
set_leftmost(struct cfs_rq *cfs_rq, struct rb_node *leftmost)
{
	struct sched_entity *se;

	cfs_rq->rb_leftmost = leftmost;
	if (leftmost)
		se = rb_entry(leftmost, struct sched_entity, run_node);
}

Missing code?  corrupt patch?

config available on request, but there doesn't seem to be much in the way
of scheduler config option.  A few that might apply:

SCHED_SMT is not set
SCHED_DEBUG=y
SCHEDSTATS=y


Regards,
Lee Schermerhorn
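
The set_leftmost() quoted in this report computes se and never uses it,
which is why Ingo Molnar's reply in this thread calls those bits dead code.
The user-space sketch below illustrates the point under stated assumptions:
the structures are hypothetical, trimmed-down stand-ins for the kernel ones
(the weight field is only a placeholder), container_of() and rb_entry() are
re-created here with offsetof(), and set_leftmost_plain() and entity_of()
are names invented for the example. It only shows that the rb_entry() call
has no side effect; it does not explain the panic.

```c
#include <assert.h>
#include <stddef.h>

/* User-space rendition of the idea behind container_of()/rb_entry():
 * step back from a member pointer to the enclosing structure. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define rb_entry(ptr, type, member) container_of(ptr, type, member)

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct rb_node {
	struct rb_node *left, *right;
};

struct sched_entity {
	unsigned long weight;    /* placeholder so run_node is not at offset 0 */
	struct rb_node run_node;
};

struct cfs_rq {
	struct rb_node *rb_leftmost;
};

/* The function as quoted in the report: 'se' is computed and dropped. */
static void set_leftmost(struct cfs_rq *cfs_rq, struct rb_node *leftmost)
{
	struct sched_entity *se;

	cfs_rq->rb_leftmost = leftmost;
	if (leftmost) {
		se = rb_entry(leftmost, struct sched_entity, run_node);
		(void)se; /* never used again, so this branch has no effect */
	}
}

/* What is left once the unused computation is dropped. */
static void set_leftmost_plain(struct cfs_rq *cfs_rq, struct rb_node *leftmost)
{
	cfs_rq->rb_leftmost = leftmost;
}

/* Recover the entity that embeds a given run_node. */
static struct sched_entity *entity_of(struct rb_node *node)
{
	assert(node != NULL);
	return rb_entry(node, struct sched_entity, run_node);
}
```

Calling set_leftmost() and set_leftmost_plain() with the same arguments
leaves rb_leftmost in the same state, for both a valid node and NULL, and
entity_of() on the cached node returns the entity that embeds it.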



