Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-09 Thread divya

On Friday 02 July 2010 12:16 PM, divya wrote:

On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:

On Wednesday, 30 June 2010 at 13:22:27 divya wrote:

While running fs_racer test from LTP on a POWER6 box against latest
git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following
warning followed by multiple oops.


I created a Bugzilla entry at
https://bugzilla.kernel.org/show_bug.cgi?id=16324
for your bug report, please add your address to the CC list in there, thanks!



Here I find a cleaner back trace while running fs_racer test from LTP on a
POWER6 box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)

Badness at kernel/mutex-debug.c:64
BUG: key (null) not in .data!
NIP: c00be9e8 LR: c00be9cc CTR: 
REGS: c0010bb176f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git5-autotest)
BUG: key 01d8 not in .data!
BUG: key 01e0 not in .data!
BUG: key 01e8 not in .data!
MSR: 80029032
Unable to handle kernel paging request for data at address 0x0028
Faulting instruction address: 0xc03ad0ec
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
Page fault in user mode with in_atomic() = 1 mm = c0010943e600
Modules linked in:
NIP = fff9e98fc40  MSR = 80004001d032
 ipv6 fuse loop
Unable to handle kernel paging request for unknown fault
 dm_mod
Faulting instruction address: 0xc008d0f4
 sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0
REGS: c00109b4f610 TRAP: 0300   Not tainted  (2.6.35-rc3-git5-autotest)
MSR: 80009032EE,ME,IR,DR   CR: 88004484  XER: 0001
DAR: 0028, DSISR: 4001
TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19
GPR00: 8013 c00109b4f890 c0d3d798 0028
GPR04:    0001
GPR08:  0028 c0189f2c c00109a98600
GPR12: 24004424 cf602f80 41ff 0001
GPR16: 0002 c0010d8304c0 c00109b4fb44 
GPR20: c0010df77908 f000 0001 41ff
GPR24: c0010df77758 c00109fa1800 c0010df77908 c000ff236600
GPR28: 0028 0040 c0ca7b38 c0189f2c
NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
Call Trace:
[c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)
[c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
[c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
[c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
[c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
[c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
[c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
[c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
Instruction dump:
eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
3800 7c691b78 980d0214 800d00087d601829  2c0b 40c20010 7c00192d
Oops: Weird page fault, sig: 11 [#2]

Pls let me know if this back trace would help in analyzing further.
Meanwhile I shall do a git bisect and send the inputs.

Thanks
Divya




Hi All,

From the git bisect, it seems that commit
57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above issue.

Thanks
Divya


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-09 Thread Nick Piggin
On Fri, Jul 09, 2010 at 09:34:16AM +0200, Jens Axboe wrote:
 On 2010-07-09 08:57, divya wrote:
  On Friday 02 July 2010 12:16 PM, divya wrote:
  On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:
  On Wednesday, 30 June 2010 at 13:22:27 divya wrote:
  While running fs_racer test from LTP on a POWER6 box against latest
  git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the 
  following
  warning followed by multiple oops.
 
  I created a Bugzilla entry at
  https://bugzilla.kernel.org/show_bug.cgi?id=16324
  for your bug report, please add your address to the CC list in there, 
  thanks!
 
 
  Here I find a cleaner back trace while running fs_racer test from LTP 
  on a POWER6
  box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)
 
  Badness at kernel/mutex-debug.c:64
  BUG: key (null) not in .data!
  NIP: c00be9e8 LR: c00be9cc CTR: 
  REGS: c0010bb176f0 TRAP: 0700   Not tainted  
  (2.6.35-rc3-git5-autotest)
  BUG: key 01d8 not in .data!
  BUG: key 01e0 not in .data!
  BUG: key 01e8 not in .data!
  MSR: 80029032
  Unable to handle kernel paging request for data at address 0x0028
  Faulting instruction address: 0xc03ad0ec
  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=1024 NUMA pSeries
  last sysfs file: 
  /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
  Page fault in user mode with in_atomic() = 1 mm = c0010943e600
  Modules linked in:
  NIP = fff9e98fc40  MSR = 80004001d032
   ipv6 fuse loop
  Unable to handle kernel paging request for unknown fault
   dm_mod
  Faulting instruction address: 0xc008d0f4
   sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic 
  scsi_transport_srp scsi_tgt scsi_mod
  NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0
  REGS: c00109b4f610 TRAP: 0300   Not tainted  
  (2.6.35-rc3-git5-autotest)
  MSR: 80009032EE,ME,IR,DR   CR: 88004484  XER: 0001
  DAR: 0028, DSISR: 4001
  TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19
  GPR00: 8013 c00109b4f890 c0d3d798 
  0028
  GPR04:    
  0001
  GPR08:  0028 c0189f2c 
  c00109a98600
  GPR12: 24004424 cf602f80 41ff 
  0001
  GPR16: 0002 c0010d8304c0 c00109b4fb44 
  
  GPR20: c0010df77908 f000 0001 
  41ff
  GPR24: c0010df77758 c00109fa1800 c0010df77908 
  c000ff236600
  GPR28: 0028 0040 c0ca7b38 
  c0189f2c
  NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
  LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
  Call Trace:
  [c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 
  (unreliable)
  [c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
  [c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
  [c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
  [c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
  [c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
  [c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
  [c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
  Instruction dump:
  eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
  3800 7c691b78 980d0214 800d00087d601829  2c0b 40c20010 7c00192d
  Oops: Weird page fault, sig: 11 [#2]
 
  Pls let me know if this back trace would help in analyzing further.
  Meanwhile I shall do a git bisect and send the inputs.

The call stack for Badness at kernel/mutex-debug.c:64 (or whatever
explodes first) would be handy.  This one seems jumbled still. What
spinlock is in the trace? inode_lock?  That would indicate some random
corruption or breakage in the lock debugging.
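
When output from several CPUs is interleaved like this, it can help to pull out only the symbolized call-trace frames. Below is a minimal sketch in Python, assuming the `[stack] [address] symbol+0xoff/0xsize` layout seen in the traces in this thread. The pattern and the helper name `parse_frames` are hypothetical and are not part of any kernel tool.

```python
import re

# Matches frames such as "[c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4".
# Hypothetical pattern, based only on the trace layout seen in this thread.
FRAME_RE = re.compile(
    r"\[(?P<sp>[0-9a-f]+)\]\s+\[(?P<addr>[0-9a-f]+)\]\s+"
    r"(?P<sym>\.?\w+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(oops_text):
    """Return (symbol, offset, size) for every symbolized call-trace frame."""
    frames = []
    for line in oops_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            # Drop the leading dot that ppc64 function entry symbols carry.
            frames.append(
                (m["sym"].lstrip("."), int(m["off"], 16), int(m["size"], 16))
            )
    return frames


sample = """\
[c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)
[c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
[c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
"""
for sym, off, size in parse_frames(sample):
    print(f"{sym} +{off:#x} of {size:#x}")
```

Frames without a symbol, such as `0xc0010be8fa00 (unreliable)`, are skipped, so only the frames that name a function remain.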

 
  Thanks
  Divya
 
 
 
  Hi All,
  
   From the git bisect, it seems that commit
   57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above
   issue.

Call me blind but I can't see the problem. Are you sure this commit
breaks it?


Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-09 Thread Jens Axboe
On 2010-07-09 08:57, divya wrote:
 On Friday 02 July 2010 12:16 PM, divya wrote:
 On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:
 On Wednesday, 30 June 2010 at 13:22:27 divya wrote:
 While running fs_racer test from LTP on a POWER6 box against latest
 git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the 
 following
 warning followed by multiple oops.

 I created a Bugzilla entry at
 https://bugzilla.kernel.org/show_bug.cgi?id=16324
 for your bug report, please add your address to the CC list in there, 
 thanks!


 Here I find a cleaner back trace while running fs_racer test from LTP 
 on a POWER6
 box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)

 Badness at kernel/mutex-debug.c:64
 BUG: key (null) not in .data!
 NIP: c00be9e8 LR: c00be9cc CTR: 
 REGS: c0010bb176f0 TRAP: 0700   Not tainted  
 (2.6.35-rc3-git5-autotest)
 BUG: key 01d8 not in .data!
 BUG: key 01e0 not in .data!
 BUG: key 01e8 not in .data!
 MSR: 80029032
 Unable to handle kernel paging request for data at address 0x0028
 Faulting instruction address: 0xc03ad0ec
 Oops: Kernel access of bad area, sig: 11 [#1]
 SMP NR_CPUS=1024 NUMA pSeries
 last sysfs file: 
 /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
 Page fault in user mode with in_atomic() = 1 mm = c0010943e600
 Modules linked in:
 NIP = fff9e98fc40  MSR = 80004001d032
  ipv6 fuse loop
 Unable to handle kernel paging request for unknown fault
  dm_mod
 Faulting instruction address: 0xc008d0f4
  sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic 
 scsi_transport_srp scsi_tgt scsi_mod
 NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0
 REGS: c00109b4f610 TRAP: 0300   Not tainted  
 (2.6.35-rc3-git5-autotest)
 MSR: 80009032EE,ME,IR,DR   CR: 88004484  XER: 0001
 DAR: 0028, DSISR: 4001
 TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19
 GPR00: 8013 c00109b4f890 c0d3d798 
 0028
 GPR04:    
 0001
 GPR08:  0028 c0189f2c 
 c00109a98600
 GPR12: 24004424 cf602f80 41ff 
 0001
 GPR16: 0002 c0010d8304c0 c00109b4fb44 
 
 GPR20: c0010df77908 f000 0001 
 41ff
 GPR24: c0010df77758 c00109fa1800 c0010df77908 
 c000ff236600
 GPR28: 0028 0040 c0ca7b38 
 c0189f2c
 NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
 LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
 Call Trace:
 [c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 
 (unreliable)
 [c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
 [c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
 [c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
 [c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
 [c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
 [c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
 [c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
 Instruction dump:
 eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
 3800 7c691b78 980d0214 800d00087d601829  2c0b 40c20010 7c00192d
 Oops: Weird page fault, sig: 11 [#2]

 Pls let me know if this back trace would help in analyzing further.
 Meanwhile I shall do a git bisect and send the inputs.

 Thanks
 Divya



 Hi All,
 
  From the git bisect, it seems that commit
  57439f878afafefad8836ebf5c49da2a0a746105 is the culprit for the above
  issue.

CC'ing Nick and Al.

-- 
Jens Axboe


Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-02 Thread divya

On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:

On Wednesday, 30 June 2010 at 13:22:27 divya wrote:
   

While running fs_racer test from LTP on a POWER6 box against latest
git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following
warning followed by multiple oops.

 

I created a Bugzilla entry at
https://bugzilla.kernel.org/show_bug.cgi?id=16324
for your bug report, please add your address to the CC list in there, thanks!


   

Here I find a cleaner back trace while running fs_racer test from LTP on a
POWER6 box against the latest git(2.6.35-rc3-git5 - commitid 980019d74e4b242)

Badness at kernel/mutex-debug.c:64
BUG: key (null) not in .data!
NIP: c00be9e8 LR: c00be9cc CTR: 
REGS: c0010bb176f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git5-autotest)
BUG: key 01d8 not in .data!
BUG: key 01e0 not in .data!
BUG: key 01e8 not in .data!
MSR: 80029032
Unable to handle kernel paging request for data at address 0x0028
Faulting instruction address: 0xc03ad0ec
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
Page fault in user mode with in_atomic() = 1 mm = c0010943e600
Modules linked in:
NIP = fff9e98fc40  MSR = 80004001d032
 ipv6 fuse loop
Unable to handle kernel paging request for unknown fault
 dm_mod
Faulting instruction address: 0xc008d0f4
 sr_mod ibmveth cdrom sg sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
NIP: c03ad0ec LR: c064c3b0 CTR: c03a6eb0
REGS: c00109b4f610 TRAP: 0300   Not tainted  (2.6.35-rc3-git5-autotest)
MSR: 80009032EE,ME,IR,DR   CR: 88004484  XER: 0001
DAR: 0028, DSISR: 4001
TASK = c00109a98600[7403] 'mkdir' THREAD: c00109b4c000 CPU: 19
GPR00: 8013 c00109b4f890 c0d3d798 0028
GPR04:    0001
GPR08:  0028 c0189f2c c00109a98600
GPR12: 24004424 cf602f80 41ff 0001
GPR16: 0002 c0010d8304c0 c00109b4fb44 
GPR20: c0010df77908 f000 0001 41ff
GPR24: c0010df77758 c00109fa1800 c0010df77908 c000ff236600
GPR28: 0028 0040 c0ca7b38 c0189f2c
NIP [c03ad0ec] .do_raw_spin_trylock+0x10/0x48
LR [c064c3b0] ._raw_spin_lock+0x50/0xa4
Call Trace:
[c00109b4f890] [c064c3a4] ._raw_spin_lock+0x44/0xa4 (unreliable)
[c00109b4f920] [c0189f2c] .new_inode+0x4c/0xe4
[c00109b4f9b0] [c02257fc] .ext3_new_inode+0x84/0xb70
[c00109b4fad0] [c022f1ec] .ext3_mkdir+0x130/0x438
[c00109b4fbe0] [c017adb4] .vfs_mkdir+0xb8/0x160
[c00109b4fc80] [c017e52c] .SyS_mkdirat+0xb0/0x114
[c00109b4fdc0] [c017a730] .SyS_mkdir+0x1c/0x30
[c00109b4fe30] [c00085b4] syscall_exit+0x0/0x40
Instruction dump:
eb41ffd0 7c0803a6 eb61ffd8 eb81ffe0 eba1ffe8 ebc1fff0 ebe1fff8 4e800020
3800 7c691b78 980d0214 800d00087d601829  2c0b 40c20010 7c00192d
Oops: Weird page fault, sig: 11 [#2]

Pls let me know if this back trace would help in analyzing further.
Meanwhile I shall do a git bisect and send the inputs.

Thanks
Divya




Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-01 Thread Nick Piggin
On Thu, Jul 01, 2010 at 03:04:54PM +1000, Michael Neuling wrote:
  While running fs_racer test from LTP on a POWER6 box against latest
  git(2.6.35-rc3-git4 - commitid 984bc9601f64fd)
  came across the following warning followed by multiple oops.
  
  [ cut here ]
  
  Badness at kernel/mutex-debug.c:64
  NIP: c00be9e8 LR: c00be9cc CTR: 
  REGS: c0010be8f6f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git4-autotest)
  MSR: 80029032EE,ME,CE,IR,DRCR: 24224422  XER: 0012
  TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2
  GPR00:  c0010be8f970 c0d3d798 0001
  GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 
  GPR08: c43042f0 c16534e8 017a c0c29a1c
  GPR12: 28228424 cf600500 c0010be8fc40 2000
  GPR16: f000 c00109c73000 c0010be8fc30 00010442
  GPR20:   01b6 c0010dd12250
  GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210
  GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70
  NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130
  LR [c00be9cc] .mutex_remove_waiter+0x88/0x130
  Call Trace:
  [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable)
  [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430
  Instruction dump:
  e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018
  e93e8008 8009 2f80 409e00080fe0   e93e8000 8009 2f80
  Unable to handle kernel paging request for unknown fault
  Faulting instruction address: 0xc008d0f4
  Oops: Kernel access of bad area, sig: 7 [#1]
  SMP NR_CPUS=1024 NUMA
  Unrecoverable FP Unavailable Exception 800 at c0648ed4
  pSeries
  last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
  Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg
  sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
  NIP: c008d0f4 LR: c008d0d0 CTR: 
  REGS: c0010978f900 TRAP: 0600   Tainted: GW (2.6.35-rc3-git4-autotest)
  MSR: 80009032
  Unrecoverable FP Unavailable Exception 800 at c0648ed4
  Unrecoverable FP Unavailable Exception 800 at c0648ed4
  Unrecoverable FP Unavailable Exception 800 at c0648ed4
  Unrecoverable FP Unavailable Exception 800 at c0648ed4
  Unrecoverable FP Unavailable Exception 800 at c0648ed4
  EE,ME,IR,DRCR: 24022442  XER: 0012
  DAR: c0648f54, DSISR: 4001
  TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
  GPR00: 4000 c0010978fb80 c0d3d798 0001
  GPR04: c083539e c1610228  c54c6880
  GPR08: 06a5 c0648f54 0007 049b
  GPR12:  cf601900  
  GPR16: 4b7dc520   c0010978fea0
  GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
  GPR24:  01200011 c0e1c0a8 c0648ed4
  GPR28:  c001096e4900 c0ca0458 c0010725d400
  NIP [c008d0f4] .copy_process+0x310/0xf40
  LR [c008d0d0] .copy_process+0x2ec/0xf40
  Call Trace:
  [c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
  [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
  [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
  [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
  Instruction dump:
  419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
  78004800 6042 901f0014 380040007d6048a8   7d6b0078 7d6049ad 40c2fff4
  
  Kernel version 2.6.34-rc3-git3 works fine.
 
 Should this read 2.6.35-rc3-git3?
 
 If so, there's only about 20 commits in:
 5904b3b81d2516..984bc9601f64fd
 
 The likely fs related candidates are from Christoph and Nick Piggin
 (added to CC)
 
 No commits relating to POWER6 or PPC.

Not sure what's happening here. The first warning looks like some mutex
corruption, but it doesn't have a stack trace (these are 2 separate
dumps, right? i.e. the copy_process stack doesn't relate to the mutex
warning?) So I don't have much idea.

If it is reproducible, can you try getting a better stack trace, or
better yet, even bisecting if there is just a small window?
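
To give a rough sense of why bisecting a small window is cheap, here is a minimal sketch in Python. It assumes the roughly 20 commits quoted above form a linear history in which every commit before the first bad one tests good. The function `find_first_bad` and the placeholder commit names are hypothetical. The code only models the search and does not drive git.

```python
def find_first_bad(commits, is_bad):
    """Binary-search a linear commit list for the first bad commit.

    Assumes the parent of commits[0] is good and commits[-1] is bad.
    Returns (first_bad_commit, number_of_tests_run).
    """
    lo, hi = -1, len(commits) - 1  # lo: last known-good index, hi: known-bad index
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1  # one build-and-test round per probe
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tests


# Placeholder names standing in for the ~20 commits in the window.
commits = [f"commit-{i:02d}" for i in range(20)]

# Try every possible position of the first bad commit and keep the worst case.
worst = max(
    find_first_bad(commits, lambda c, k=k: commits.index(c) >= k)[1]
    for k in range(len(commits))
)
print("worst-case test rounds:", worst)
```

Under these assumptions a 20-commit window needs at most 5 build-and-test rounds. In practice `git bisect start <bad> <good>`, followed by `git bisect good` or `git bisect bad` after each run, performs the same kind of search.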

Thanks,
Nick



Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-01 Thread Maciej Rutecki
On Wednesday, 30 June 2010 at 13:22:27 divya wrote:
 While running fs_racer test from LTP on a POWER6 box against latest
 git(2.6.35-rc3-git4 - commitid 984bc9601f64fd) came across the following
 warning followed by multiple oops.
 

I created a Bugzilla entry at 
https://bugzilla.kernel.org/show_bug.cgi?id=16324
for your bug report, please add your address to the CC list in there, thanks!


-- 
Maciej Rutecki
http://www.maciek.unixy.pl

Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-07-01 Thread Michael Neuling
In message 20100701105907.gk22...@laptop you wrote:
 On Thu, Jul 01, 2010 at 03:04:54PM +1000, Michael Neuling wrote:
   While running fs_racer test from LTP on a POWER6 box against latest
   git(2.6.35-rc3-git4 - commitid 984bc9601f64fd)
   came across the following warning followed by multiple oops.
   
   [ cut here ]
   
   Badness at kernel/mutex-debug.c:64
   NIP: c00be9e8 LR: c00be9cc CTR: 
   REGS: c0010be8f6f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git4-autotest)
   MSR: 80029032EE,ME,CE,IR,DRCR: 24224422  XER: 0012
   TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2
   GPR00:  c0010be8f970 c0d3d798 0001
   GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 0000
   GPR08: c43042f0 c16534e8 017a c0c29a1c
   GPR12: 28228424 cf600500 c0010be8fc40 2000
   GPR16: f000 c00109c73000 c0010be8fc30 00010442
   GPR20:   01b6 c0010dd12250
   GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210
   GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70
   NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130
   LR [c00be9cc] .mutex_remove_waiter+0x88/0x130
   Call Trace:
   [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable)
   [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430
   Instruction dump:
   e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018
   e93e8008 8009 2f80 409e00080fe0   e93e8000 8009 2f80
   Unable to handle kernel paging request for unknown fault
   Faulting instruction address: 0xc008d0f4
   Oops: Kernel access of bad area, sig: 7 [#1]
   SMP NR_CPUS=1024 NUMA
   Unrecoverable FP Unavailable Exception 800 at c0648ed4
   pSeries
   last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
   Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg
   sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
   NIP: c008d0f4 LR: c008d0d0 CTR: 
   REGS: c0010978f900 TRAP: 0600   Tainted: GW (2.6.35-rc3-git4-autotest)
   MSR: 80009032
   Unrecoverable FP Unavailable Exception 800 at c0648ed4
   Unrecoverable FP Unavailable Exception 800 at c0648ed4
   Unrecoverable FP Unavailable Exception 800 at c0648ed4
   Unrecoverable FP Unavailable Exception 800 at c0648ed4
   Unrecoverable FP Unavailable Exception 800 at c0648ed4
   EE,ME,IR,DRCR: 24022442  XER: 0012
   DAR: c0648f54, DSISR: 4001
   TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
   GPR00: 4000 c0010978fb80 c0d3d798 0001
   GPR04: c083539e c1610228  c54c6880
   GPR08: 06a5 c0648f54 0007 049b0000
   GPR12:  cf601900  ffff
   GPR16: 4b7dc520   c0010978fea0
   GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
   GPR24:  01200011 c0e1c0a8 c0648ed4
   GPR28:  c001096e4900 c0ca0458 c0010725d400
   NIP [c008d0f4] .copy_process+0x310/0xf40
   LR [c008d0d0] .copy_process+0x2ec/0xf40
   Call Trace:
   [c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
   [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
   [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
   [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
   Instruction dump:
   419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
   78004800 6042 901f0014 380040007d6048a8   7d6b0078 7d6049ad 40c2fff4
   
   Kernel version 2.6.34-rc3-git3 works fine.
  
  Should this read 2.6.35-rc3-git3?
  
  If so, there's only about 20 commits in:
  5904b3b81d2516..984bc9601f64fd
  
  The likely fs related candidates are from Christoph and Nick Piggin
  (added to CC)
  
  No commits relating to POWER6 or PPC.
 
 Not sure what's happening here. The first warning looks like some mutex
 corruption, but it doesn't have a stack trace (these are 2 separate
 dumps, right? i.e. the copy_process stack doesn't relate to the mutex
 warning?) So I don't have much idea.
 
 If it is reproducible, can you try getting a better stack trace, or
 better yet, even bisecting if there is just a small window?

I can't reproduce the bug here on POWER6 or POWER7.

Divya, can you bisect this?

Mikey

Oops while running fs_racer test on a POWER6 box against latest git

2010-06-30 Thread divya

While running fs_racer test from LTP on a POWER6 box against latest 
git(2.6.35-rc3-git4 - commitid 984bc9601f64fd)
came across the following warning followed by multiple oops.

[ cut here ]

Badness at kernel/mutex-debug.c:64
NIP: c00be9e8 LR: c00be9cc CTR: 
REGS: c0010be8f6f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git4-autotest)
MSR: 80029032EE,ME,CE,IR,DRCR: 24224422  XER: 0012
TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2
GPR00:  c0010be8f970 c0d3d798 0001
GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 
GPR08: c43042f0 c16534e8 017a c0c29a1c
GPR12: 28228424 cf600500 c0010be8fc40 2000
GPR16: f000 c00109c73000 c0010be8fc30 00010442
GPR20:   01b6 c0010dd12250
GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210
GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70
NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130
LR [c00be9cc] .mutex_remove_waiter+0x88/0x130
Call Trace:
[c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable)
[c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430
Instruction dump:
e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018
e93e8008 8009 2f80 409e00080fe0   e93e8000 8009 2f80
Unable to handle kernel paging request for unknown fault
Faulting instruction address: 0xc008d0f4
Oops: Kernel access of bad area, sig: 7 [#1]
SMP NR_CPUS=1024 NUMA
Unrecoverable FP Unavailable Exception 800 at c0648ed4
pSeries
last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg
sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
NIP: c008d0f4 LR: c008d0d0 CTR: 
REGS: c0010978f900 TRAP: 0600   Tainted: GW (2.6.35-rc3-git4-autotest)
MSR: 80009032
Unrecoverable FP Unavailable Exception 800 at c0648ed4
Unrecoverable FP Unavailable Exception 800 at c0648ed4
Unrecoverable FP Unavailable Exception 800 at c0648ed4
Unrecoverable FP Unavailable Exception 800 at c0648ed4
Unrecoverable FP Unavailable Exception 800 at c0648ed4
EE,ME,IR,DRCR: 24022442  XER: 0012
DAR: c0648f54, DSISR: 4001
TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
GPR00: 4000 c0010978fb80 c0d3d798 0001
GPR04: c083539e c1610228  c54c6880
GPR08: 06a5 c0648f54 0007 049b
GPR12:  cf601900  
GPR16: 4b7dc520   c0010978fea0
GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
GPR24:  01200011 c0e1c0a8 c0648ed4
GPR28:  c001096e4900 c0ca0458 c0010725d400
NIP [c008d0f4] .copy_process+0x310/0xf40
LR [c008d0d0] .copy_process+0x2ec/0xf40
Call Trace:
[c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
[c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
[c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
[c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
Instruction dump:
419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
78004800 6042 901f0014 380040007d6048a8   7d6b0078 7d6049ad 40c2fff4

Kernel version 2.6.34-rc3-git3 works fine.

Thanks
Divya


Using 007dfade bytes for initrd buffer
Please wait, loading kernel...
Allocated 0180 bytes for kernel @ 01e0
   Elf64 kernel loaded...
Loading ramdisk...
ramdisk loaded 007dfade @ 0360
OF stdout device is: /vdevice/v...@3000
Preparing to boot Linux version 2.6.35-rc3-git4-autotest (r...@p55alp2) (gcc version 4.3.2 [gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SMP Wed Jun 30 08:47:11 IST 2010
Max number of cores passed to firmware: 0x0200
Calling ibm,client-architecture-support... not implemented
command line: root=/dev/sda5 IDENT=1277868480
memory layout at init:
  memory_limit :  (16 MB aligned)
  alloc_bottom : 03de
  alloc_top: 1000
  alloc_top_hi : 0001f000
  rmo_top  : 1000
  ram_top  : 0001f000
instantiating rtas at 0x0f6a... done
boot cpu hw idx 
starting cpu hw idx 0002... done
starting cpu hw idx 0004... done
starting cpu hw idx 0006... done
starting cpu hw idx 0008... done
starting cpu hw idx 000a... done
starting cpu 

Re: Oops while running fs_racer test on a POWER6 box against latest git

2010-06-30 Thread Michael Neuling
 While running fs_racer test from LTP on a POWER6 box against latest
 git(2.6.35-rc3-git4 - commitid 984bc9601f64fd)
 came across the following warning followed by multiple oops.
 
 [ cut here ]
 
 Badness at kernel/mutex-debug.c:64
 NIP: c00be9e8 LR: c00be9cc CTR: 
 REGS: c0010be8f6f0 TRAP: 0700   Not tainted  (2.6.35-rc3-git4-autotest)
 MSR: 80029032EE,ME,CE,IR,DRCR: 24224422  XER: 0012
 TASK = c0010727cf00[8211] 'fs_racer_file_c' THREAD: c0010be8bb50 CPU: 2
 GPR00:  c0010be8f970 c0d3d798 0001
 GPR04: c0010be8fa70 c0010be8c000 c0010727d9f8 
 GPR08: c43042f0 c16534e8 017a c0c29a1c
 GPR12: 28228424 cf600500 c0010be8fc40 2000
 GPR16: f000 c00109c73000 c0010be8fc30 00010442
 GPR20:   01b6 c0010dd12250
 GPR24: c017c08c c0010727cf00 c0010dd12278 c0010dd12210
 GPR28: 0001 c0010be8c000 c0ca2008 c0010be8fa70
 NIP [c00be9e8] .mutex_remove_waiter+0xa4/0x130
 LR [c00be9cc] .mutex_remove_waiter+0x88/0x130
 Call Trace:
 [c0010be8f970] [c0010be8fa00] 0xc0010be8fa00 (unreliable)
 [c0010be8fa00] [c064a9f0] .mutex_lock_nested+0x384/0x430
 Instruction dump:
 e81f0010 e93d 7fa04800 41fe0028 482e96e5 6000 2fa3 419e0018
 e93e8008 8009 2f80 409e00080fe0   e93e8000 8009 2f80
 Unable to handle kernel paging request for unknown fault
 Faulting instruction address: 0xc008d0f4
 Oops: Kernel access of bad area, sig: 7 [#1]
 SMP NR_CPUS=1024 NUMA
 Unrecoverable FP Unavailable Exception 800 at c0648ed4
 pSeries
 last sysfs file: /sys/devices/system/cpu/cpu19/cache/index1/shared_cpu_map
 Modules linked in: ipv6 fuse loop dm_mod sr_mod cdrom ibmveth sg
 sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt scsi_mod
 NIP: c008d0f4 LR: c008d0d0 CTR: 
 REGS: c0010978f900 TRAP: 0600   Tainted: GW (2.6.35-rc3-git4-autotest)
 MSR: 80009032
 Unrecoverable FP Unavailable Exception 800 at c0648ed4
 Unrecoverable FP Unavailable Exception 800 at c0648ed4
 Unrecoverable FP Unavailable Exception 800 at c0648ed4
 Unrecoverable FP Unavailable Exception 800 at c0648ed4
 Unrecoverable FP Unavailable Exception 800 at c0648ed4
 EE,ME,IR,DRCR: 24022442  XER: 0012
 DAR: c0648f54, DSISR: 4001
 TASK = c001096e4900[7353] 'fs_racer_file_s' THREAD: c0010978c000 CPU: 10
 GPR00: 4000 c0010978fb80 c0d3d798 0001
 GPR04: c083539e c1610228  c54c6880
 GPR08: 06a5 c0648f54 0007 049b
 GPR12:  cf601900  
 GPR16: 4b7dc520   c0010978fea0
 GPR20: 0fffcca7e7a0 0fffcca7e7a0 0fffabf7dfd0 0fffabf7dfd0
 GPR24:  01200011 c0e1c0a8 c0648ed4
 GPR28:  c001096e4900 c0ca0458 c0010725d400
 NIP [c008d0f4] .copy_process+0x310/0xf40
 LR [c008d0d0] .copy_process+0x2ec/0xf40
 Call Trace:
 [c0010978fb80] [c008d0d0] .copy_process+0x2ec/0xf40 (unreliable)
 [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
 [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
 [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
 Instruction dump:
 419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 39290080
 78004800 6042 901f0014 380040007d6048a8   7d6b0078 7d6049ad 40c2fff4
 
 Kernel version 2.6.34-rc3-git3 works fine.

Should this read 2.6.35-rc3-git3?

If so, there's only about 20 commits in:
5904b3b81d2516..984bc9601f64fd

The likely fs related candidates are from Christoph and Nick Piggin
(added to CC)

No commits relating to POWER6 or PPC.

Mikey