On 8/1/07, Murali Vilayannur <murali.vilayannur at gmail.com> wrote:
> Jan,
> Hmm.. kernel panics are not good. :(
> Can you provide the kernel back trace or the complete oops trace if possible?
> I don't know what has changed in the recent kernels to cause crashes
> in the kmod..
> thanks,
> Murali
>
Murali,
Here's a more complete kernel dump from yesterday.
Regards,
Jan Lindheim
Aug 1 11:08:38 shc kernel: Unable to handle kernel NULL pointer dereference at
0000000000000000 RIP:
Aug 1 11:08:38 shc kernel: <ffffffff883c62a3>{:pvfs2:wait_for_a_slot+126}
Aug 1 11:08:38 shc kernel: PGD 204bb067 PUD d98b8067 PMD 0
Aug 1 11:08:38 shc kernel: Oops: 0000 [1] SMP
Aug 1 11:08:38 shc kernel: CPU 0
Aug 1 11:08:38 shc kernel: Modules linked in: pvfs2 usbserial nfsd parport_pc
lp parport floppy raw nfs lockd nfs_acl sunrpc ib_ipoib ib_sa ib_uverbs ib_umad
ib_mthca ib_mad ib_core ipv6 edd joydev sg st sr_mod ide_cd cdrom ohci_hcd
i2c_amd756 i2c_core e1000 usbcore tg3 ipt_MASQUERADE iptable_nat ip_nat
ip_conntrack nfnetlink ip_tables x_tables capability commoncap raid0 dm_mod
amd74xx ide_core mptsas mptscsih mptbase scsi_transport_sas sata_nv libata xfs
exportfs cciss sd_mod scsi_mod
Aug 1 11:08:38 shc kernel: Pid: 5940, comm: scp Not tainted 2.6.16.52-smp #2
Aug 1 11:08:38 shc kernel: RIP: 0010:[<ffffffff883c62a3>]
<ffffffff883c62a3>{:pvfs2:wait_for_a_slot+126}
Aug 1 11:08:38 shc kernel: RSP: 0018:ffff810064177cf8 EFLAGS: 00010293
Aug 1 11:08:38 shc kernel: RAX: 0000000000000000 RBX: ffff810064177d78 RCX:
0000000000000005
Aug 1 11:08:38 shc kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffffffff883d33a8
Aug 1 11:08:38 shc kernel: RBP: 00000000ffffffff R08: 0000000000000000 R09:
ffff810064177e98
Aug 1 11:08:38 shc kernel: R10: 0000000000000246 R11: ffff810064177f50 R12:
ffff810064177e1c
Aug 1 11:08:38 shc kernel: R13: 0000000000400000 R14: ffff810064177e88 R15:
ffff810064177e98
Aug 1 11:08:38 shc kernel: FS: 00002ab85b30c100(0000)
GS:ffffffff8041f000(0000) knlGS:00000000557a00e0
Aug 1 11:08:38 shc kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 1 11:08:38 shc kernel: CR2: 0000000000000000 CR3: 000000006115f000 CR4:
00000000000006e0
Aug 1 11:08:38 shc kernel: Process scp (pid: 5940, threadinfo
ffff810064176000, task ffff81001a802080)
Aug 1 11:08:38 shc kernel: Stack: 0000000000000000 ffff81001a802080
ffffffff801299c8 0000000000000000
Aug 1 11:08:38 shc kernel: 0000000000000000 ffff810064177e08
0000000000000001 ffff81001a802080
Aug 1 11:08:38 shc kernel: ffffffff801299c8 ffffffff883d3398
Aug 1 11:08:38 shc kernel: Call Trace:
<ffffffff801299c8>{default_wake_function+0}
Aug 1 11:08:38 shc kernel: <ffffffff801299c8>{default_wake_function+0}
<ffffffff883c63d4>{:pvfs2:pvfs_bufmap_get+57}
Aug 1 11:08:38 shc kernel:
<ffffffff883c185a>{:pvfs2:do_direct_readv_writev+1767}
Aug 1 11:08:38 shc kernel:
<ffffffff883c1e56>{:pvfs2:pvfs2_file_write+149}
<ffffffff8017beb3>{vfs_write+212}
Aug 1 11:08:38 shc kernel: <ffffffff8017c010>{sys_write+69}
<ffffffff8010a872>{system_call+126}
Aug 1 11:08:38 shc kernel:
Aug 1 11:08:38 shc kernel: Code: 83 3c 90 00 74 67 ff c6 39 ce 7c ed 48 8b 43
10 c7 00 01 00
Aug 1 11:08:38 shc kernel: RIP <ffffffff883c62a3>{:pvfs2:wait_for_a_slot+126}
RSP <ffff810064177cf8>
Aug 1 11:08:38 shc kernel: CR2: 0000000000000000
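
As a reading aid for the dump above (not part of the original report), here is a minimal sketch that decodes the first instruction on the "Code:" line and computes the address it touches. It assumes the "Code:" bytes begin at the faulting RIP, which is how x86_64 kernels of this vintage are believed to print them, and it handles only the single opcode/ModRM/SIB form seen here. The function name and tables are illustrative, not taken from any PVFS or kernel source.

```python
# Sketch: decode the faulting instruction from the oops "Code:" line.
# Assumption: the bytes start at RIP (no <..> marker survives in the log).

# First bytes of the "Code:" line, copied from the dump.
CODE = bytes.fromhex("833c90007467ffc639ce7ced")

# Register values copied from the dump (only those printed before R8).
REGS = {
    "rax": 0x0, "rcx": 0x5, "rdx": 0x0, "rbx": 0xffff810064177d78,
    "rsp": 0xffff810064177cf8, "rbp": 0xffffffff, "rsi": 0x0,
    "rdi": 0xffffffff883d33a8,
}

# x86-64 register numbering without a REX prefix.
REG_NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
# Opcode 0x83 selects one of these via the ModRM reg field.
GROUP1 = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]


def decode_group1_sib(code):
    """Decode '83 /r SIB imm8' with mod=00 (no displacement) only."""
    opcode, modrm, sib, imm8 = code[:4]
    if opcode != 0x83:
        raise ValueError("only opcode 0x83 is handled in this sketch")
    mod, op, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm != 4:
        raise ValueError("only the mod=00, SIB-follows form is handled")
    scale, index, base = 1 << (sib >> 6), (sib >> 3) & 7, sib & 7
    if index == 4 or base == 5:
        raise ValueError("no-index and disp32 forms are not handled")
    return GROUP1[op], REG_NAMES[base], REG_NAMES[index], scale, imm8


mnemonic, base, index, scale, imm = decode_group1_sib(CODE)
address = REGS[base] + REGS[index] * scale
print(f"{mnemonic}l ${imm:#x},(%{base},%{index},{scale}) -> address {address:#x}")
```

Under those assumptions the instruction compares a 32-bit array element at rax + rdx*4 against zero, with both RAX and RDX zero, which is consistent with the CR2 value of 0000000000000000 reported in the dump. That would point at a NULL array base pointer being indexed inside wait_for_a_slot, though confirming it requires disassembling the actual pvfs2 module.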
> On 8/1/07, Jan Lindheim <lindheim at cacr.caltech.edu> wrote:
>> Over the last few months we have experienced many PVFS client stability
>> problems with both pvfs-2.6.2 and pvfs-2.6.3. The syslog typically
>> shows the following message just before a crash:
>>
>> Jul 30 20:42:06 shc kernel: Unable to handle kernel NULL pointer dereference
>> at 0000000000000000 RIP:
>> Jul 30 20:42:06 shc kernel: <ffffffff8839c2a3>{:pvfs2:wait_for_a_slot+126}
>> Jul 30 20:42:06 shc kernel: PGD 1c3f3a067 PUD 1f2a1e067 PMD 0
>> Jul 30 20:42:06 shc kernel: Oops: 0000 [2] SMP
>> Jul 30 20:42:06 shc kernel: CPU 3
>>
>> We have been using kernels 2.6.16.48 and 2.6.16.52 on the x86_64 architecture.
>>
>> Typically users have been busy copying files into the PVFS file system,
>> using cp or rsync, just before the crash.
>>
>> Any insight or suggestions on how to debug this would be much appreciated.
>>
>> Regards,
>> Jan Lindheim
>> _______________________________________________
>> Pvfs2-users mailing list
>> Pvfs2-users at beowulf-underground.org
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>>