Re: [CFQ/OOPS] rb_erase with April 9 next tree

2009-04-09 Thread Sachin Sant

Jens Axboe wrote:


Can you see if this fixes it for you?

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index e01b103..64de5c0 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1654,6 +1654,7 @@ retry:
    }
 
    RB_CLEAR_NODE(&cfqq->rb_node);
+   RB_CLEAR_NODE(&cfqq->p_node);
    INIT_LIST_HEAD(&cfqq->fifo);
 
    atomic_set(&cfqq->ref, 0);
  

Yes. The above patch fixed this oops. Thanks

Regards
-Sachin

-- 
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [CFQ/OOPS] rb_erase with April 9 next tree

2009-04-09 Thread Jens Axboe
On Thu, Apr 09 2009, Jens Axboe wrote:
> On Thu, Apr 09 2009, Sachin Sant wrote:
> > I had Next 09 booted on a powerpc box and was compiling a kernel.
> > That's when I ran into this oops.
> >
> > Unable to handle kernel paging request for data at address 0x0010.
> > Faulting instruction address: 0xc02ee1b0...
> > 0:mon> e
> > cpu 0x0: Vector: 300 (Data Access) at [c000d6cf63c0]
> >    pc: c02ee1b0: .rb_erase+0x16c/0x3b4
> >    lr: c02e14d0: .cfq_prio_tree_add+0x58/0x120
> >    sp: c000d6cf6640
> >   msr: 80009032
> >   dar: 10
> > dsisr: 4000
> >  current = 0xc000fbdf5880
> >  paca    = 0xc0a92300
> >   pid   = 1867, comm = ld
> > 0:mon> t
> > [c000d6cf66d0] c02e14d0 .cfq_prio_tree_add+0x58/0x120
> > [c000d6cf6770] c02e16c8 .__cfq_slice_expired+0xc8/0x11c
> > [c000d6cf6800] c02e3920 .cfq_insert_request+0x374/0x3f4
> > [c000d6cf68a0] c02cf448 .elv_insert+0x234/0x348
> > [c000d6cf6940] c02d3348 .__make_request+0x514/0x5b0
> > [c000d6cf6a00] c02d1348 .generic_make_request+0x430/0x4c8
> > [c000d6cf6b30] c02d14dc .submit_bio+0xfc/0x124
> > [c000d6cf6bf0] c0156998 .submit_bh+0x14c/0x198
> > [c000d6cf6c80] c015ba78 .block_read_full_page+0x394/0x40c
> > [c000d6cf7180] c0163080 .do_mpage_readpage+0x680/0x688
> > [c000d6cf7690] c0163200 .mpage_readpages+0x104/0x190
> > [c000d6cf77f0] c01e2aac .ext3_readpages+0x28/0x40
> > [c000d6cf7870] c00ebd20 .__do_page_cache_readahead+0x180/0x278
> > [c000d6cf7960] c00ec16c .ondemand_readahead+0x1ac/0x1d8
> > [c000d6cf7a00] c00e1f28 .generic_file_aio_read+0x260/0x6b0
> > [c000d6cf7b40] c0129f74 .do_sync_read+0xcc/0x130
> > [c000d6cf7ce0] c012af44 .vfs_read+0xd0/0x1bc
> > [c000d6cf7d80] c012b138 .SyS_read+0x58/0xa0
> > [c000d6cf7e30] c00084ac syscall_exit+0x0/0x40
> 
> Just ran into this myself, too. I'll pull that bad patch from -next
> asap. I won't be able to fix this before next week.

Can you see if this fixes it for you?

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index e01b103..64de5c0 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1654,6 +1654,7 @@ retry:
    }
 
    RB_CLEAR_NODE(&cfqq->rb_node);
+   RB_CLEAR_NODE(&cfqq->p_node);
    INIT_LIST_HEAD(&cfqq->fifo);
 
    atomic_set(&cfqq->ref, 0);
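
One plausible reading of why this one-line change helps: the backtrace shows cfq_prio_tree_add calling rb_erase, and the kernel rbtree code of this period marks a node as "not in any tree" by pointing its parent field at the node itself. A zero-filled node fails that test, so it would look linked and could be erased from a tree it was never inserted into. The user-space C sketch below only illustrates that convention. The names `node_sketch`, `node_clear`, `node_is_empty` and `would_call_erase` are hypothetical stand-ins rather than the kernel's definitions, and the assumption that the queue structure starts out zero-filled is not stated in this thread.

```c
#include <stddef.h>

/* Simplified stand-in for an rbtree node; NOT the kernel's struct rb_node. */
struct node_sketch {
	struct node_sketch *parent;
	struct node_sketch *right;
	struct node_sketch *left;
};

/* Assumed convention: a node that is in no tree is its own parent. */
static inline void node_clear(struct node_sketch *n)
{
	n->parent = n;
}

/* True only for a node that has been cleared as above. */
static inline int node_is_empty(const struct node_sketch *n)
{
	return n->parent == n;
}

/*
 * Hypothetical re-insert path: if the node does not look empty, the
 * caller would erase it from its current tree before adding it again.
 */
static inline int would_call_erase(const struct node_sketch *n)
{
	return !node_is_empty(n);
}
```

With a zero-filled node, `would_call_erase` reports that an erase would be attempted. After `node_clear`, it does not, which matches the effect the added RB_CLEAR_NODE line appears to have.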

-- 
Jens Axboe



Re: [CFQ/OOPS] rb_erase with April 9 next tree

2009-04-09 Thread Jens Axboe
On Thu, Apr 09 2009, Sachin Sant wrote:
> I had Next 09 booted on a powerpc box and was compiling a kernel.
> That's when I ran into this oops.
>
> Unable to handle kernel paging request for data at address 0x0010.
> Faulting instruction address: 0xc02ee1b0...
> 0:mon> e
> cpu 0x0: Vector: 300 (Data Access) at [c000d6cf63c0]
>    pc: c02ee1b0: .rb_erase+0x16c/0x3b4
>    lr: c02e14d0: .cfq_prio_tree_add+0x58/0x120
>    sp: c000d6cf6640
>   msr: 80009032
>   dar: 10
> dsisr: 4000
>  current = 0xc000fbdf5880
>  paca    = 0xc0a92300
>   pid   = 1867, comm = ld
> 0:mon> t
> [c000d6cf66d0] c02e14d0 .cfq_prio_tree_add+0x58/0x120
> [c000d6cf6770] c02e16c8 .__cfq_slice_expired+0xc8/0x11c
> [c000d6cf6800] c02e3920 .cfq_insert_request+0x374/0x3f4
> [c000d6cf68a0] c02cf448 .elv_insert+0x234/0x348
> [c000d6cf6940] c02d3348 .__make_request+0x514/0x5b0
> [c000d6cf6a00] c02d1348 .generic_make_request+0x430/0x4c8
> [c000d6cf6b30] c02d14dc .submit_bio+0xfc/0x124
> [c000d6cf6bf0] c0156998 .submit_bh+0x14c/0x198
> [c000d6cf6c80] c015ba78 .block_read_full_page+0x394/0x40c
> [c000d6cf7180] c0163080 .do_mpage_readpage+0x680/0x688
> [c000d6cf7690] c0163200 .mpage_readpages+0x104/0x190
> [c000d6cf77f0] c01e2aac .ext3_readpages+0x28/0x40
> [c000d6cf7870] c00ebd20 .__do_page_cache_readahead+0x180/0x278
> [c000d6cf7960] c00ec16c .ondemand_readahead+0x1ac/0x1d8
> [c000d6cf7a00] c00e1f28 .generic_file_aio_read+0x260/0x6b0
> [c000d6cf7b40] c0129f74 .do_sync_read+0xcc/0x130
> [c000d6cf7ce0] c012af44 .vfs_read+0xd0/0x1bc
> [c000d6cf7d80] c012b138 .SyS_read+0x58/0xa0
> [c000d6cf7e30] c00084ac syscall_exit+0x0/0x40

Just ran into this myself, too. I'll pull that bad patch from -next
asap. I won't be able to fix this before next week.

-- 
Jens Axboe



[CFQ/OOPS] rb_erase with April 9 next tree

2009-04-09 Thread Sachin Sant

I had Next 09 booted on a powerpc box and was compiling a kernel.
That's when I ran into this oops.

Unable to handle kernel paging request for data at address 0x0010.
Faulting instruction address: 0xc02ee1b0...
0:mon> e
cpu 0x0: Vector: 300 (Data Access) at [c000d6cf63c0]
   pc: c02ee1b0: .rb_erase+0x16c/0x3b4
   lr: c02e14d0: .cfq_prio_tree_add+0x58/0x120
   sp: c000d6cf6640
  msr: 80009032
  dar: 10
dsisr: 4000
 current = 0xc000fbdf5880
 paca    = 0xc0a92300
  pid   = 1867, comm = ld
0:mon> t
[c000d6cf66d0] c02e14d0 .cfq_prio_tree_add+0x58/0x120
[c000d6cf6770] c02e16c8 .__cfq_slice_expired+0xc8/0x11c
[c000d6cf6800] c02e3920 .cfq_insert_request+0x374/0x3f4
[c000d6cf68a0] c02cf448 .elv_insert+0x234/0x348
[c000d6cf6940] c02d3348 .__make_request+0x514/0x5b0
[c000d6cf6a00] c02d1348 .generic_make_request+0x430/0x4c8
[c000d6cf6b30] c02d14dc .submit_bio+0xfc/0x124
[c000d6cf6bf0] c0156998 .submit_bh+0x14c/0x198
[c000d6cf6c80] c015ba78 .block_read_full_page+0x394/0x40c
[c000d6cf7180] c0163080 .do_mpage_readpage+0x680/0x688
[c000d6cf7690] c0163200 .mpage_readpages+0x104/0x190
[c000d6cf77f0] c01e2aac .ext3_readpages+0x28/0x40
[c000d6cf7870] c00ebd20 .__do_page_cache_readahead+0x180/0x278
[c000d6cf7960] c00ec16c .ondemand_readahead+0x1ac/0x1d8
[c000d6cf7a00] c00e1f28 .generic_file_aio_read+0x260/0x6b0
[c000d6cf7b40] c0129f74 .do_sync_read+0xcc/0x130
[c000d6cf7ce0] c012af44 .vfs_read+0xd0/0x1bc
[c000d6cf7d80] c012b138 .SyS_read+0x58/0xa0
[c000d6cf7e30] c00084ac syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 0450a854
SP (fffd455e850) is in userspace
0:mon> di %pc
c02ee1b0  e95f0010  ld      r10,16(r31)
c02ee1b4  7faa4000  cmpd    cr7,r10,r8
c02ee1b8  409e00ec  bne     cr7,c02ee2a4 # .rb_erase+0x260/0x3b4
c02ee1bc  e95f0008  ld      r10,8(r31)
c02ee1c0  e80a  ld      r0,0(r10)
c02ee1c4  780907e1  clrldi. r9,r0,63
c02ee1c8  40820028  bne     c02ee1f0 # .rb_erase+0x1ac/0x3b4
c02ee1cc  6001  ori     r0,r0,1
c02ee1d0  7fe3fb78  mr      r3,r31
c02ee1d4  7fa4eb78  mr      r4,r29
c02ee1d8  f80a  std     r0,0(r10)
c02ee1dc  e81f  ld      r0,0(r31)
c02ee1e0  780007a4  rldicr  r0,r0,0,62
c02ee1e4  f81f  std     r0,0(r31)
c02ee1e8  4bfffbfd  bl      c02edde4 # .__rb_rotate_left+0x0/0x7c
c02ee1ec  e95f0008  ld      r10,8(r31)
0:mon> di %ld
invalid register name '%ld'
c02ee1f0  e96a0010  ld      r11,16(r10)
c02ee1f4  2fab  cmpdi   cr7,r11,0
c02ee1f8  419e0010  beq     cr7,c02ee208 # .rb_erase+0x1c4/0x3b4
c02ee1fc  e80b  ld      r0,0(r11)
c02ee200  780907e1  clrldi. r9,r0,63
c02ee204  4182001c  beq     c02ee220 # .rb_erase+0x1dc/0x3b4
c02ee208  e92a0008  ld      r9,8(r10)
c02ee20c  2fa9  cmpdi   cr7,r9,0
c02ee210  419e00f4  beq     cr7,c02ee304 # .rb_erase+0x2c0/0x3b4
c02ee214  e809  ld      r0,0(r9)
c02ee218  780907e1  clrldi. r9,r0,63
c02ee21c  408200e8  bne     c02ee304 # .rb_erase+0x2c0/0x3b4
c02ee220  e92a0008  ld      r9,8(r10)
c02ee224  2fa9  cmpdi   cr7,r9,0
c02ee228  419e0010  beq     cr7,c02ee238 # .rb_erase+0x1f4/0x3b4
c02ee22c  e809  ld      r0,0(r9)
0:mon>
R00 = c000fbc07330   R16 = c06d2c92
R01 = c000d6cf6640   R17 = 
R02 = c09986e8   R18 = 0004
R03 = c000f93620b0   R19 = c000d6cf6a90
R04 = c000fb8af038   R20 = c000d6cf6a70
R05 = fff0   R21 = 0080
R06 = 0001   R22 = 04334ff2
R07 = c000f936a210   R23 = 0085
R08 = c000f936a210   R24 = c000fbaf
R09 = 0001   R25 = 
R10 = c000fbc09130   R26 = c000fbb0e490
R11 =    R27 = c000fb8af000
R12 = c000dd7e3800   R28 = c000fb8af038
R13 = c0a92300   R29 = c000fb8af038
R14 = 0001   R30 = c0923360
R15 = 0001   R31 = 
pc  = c02ee1b0 .rb_erase+0x16c/0x3b4
lr  = c02e14d0 .cfq_prio_tree_add+0x58/0x120
msr = 80009032   cr  = 44004448
ctr = c02e35ac   xer = 0001   trap =  300
dar = 0010   dsisr = 4000
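
A note on reading the dump above: the faulting instruction is `ld r10,16(r31)` and the reported fault address (dar) is 0x10, which is consistent with r31 holding a NULL pointer and the load targeting a field 16 bytes into a structure. Assuming the three-word rbtree node layout of that era (a parent/color word, then the right child, then the left child) on a 64-bit build, offset 16 would be the left-child pointer. The C sketch below checks that arithmetic with `offsetof`. The struct `rb_node_sketch` and the helper names are hypothetical stand-ins for illustration, not the kernel's definitions.

```c
#include <stddef.h>

/*
 * Stand-in mirroring the assumed field order of an rbtree node:
 * parent/color word, right child, left child. NOT the kernel's struct.
 */
struct rb_node_sketch {
	unsigned long parent_color;
	struct rb_node_sketch *right;
	struct rb_node_sketch *left;
};

/* Byte offset of the right-child pointer (8 on a 64-bit build). */
static inline size_t right_offset(void)
{
	return offsetof(struct rb_node_sketch, right);
}

/* Byte offset of the left-child pointer (16 on a 64-bit build). */
static inline size_t left_offset(void)
{
	return offsetof(struct rb_node_sketch, left);
}
```

Under these assumptions, the `ld r10,16(r31)` and `ld r10,8(r31)` instructions in the disassembly would correspond to reading the left and right child of the node in r31.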

On subsequent reboots, I observed a similar oops during bootup.
I have attached the oops message here.

Let me know if i can prov