Jan,
I don't have an x86-64 machine to reproduce this.
I am quite lost as to how or why it crashed there, even though I have read
through this code several times.
Sigh.
I will keep digging.
Thanks,
Murali

On 8/2/07, Jan Lindheim <[EMAIL PROTECTED]> wrote:
> On 8/1/07, Murali Vilayannur <murali.vilayannur at gmail.com> wrote:
> > Jan,
> > Hmm, kernel panics are not good. :(
> > Can you provide the kernel backtrace or the complete oops trace if
> > possible?
> > I don't know what has changed in the recent kernels to cause crashes
> > in the kmod.
> > Thanks,
> > Murali
> >
>
> Murali,
> Here's a more complete kernel dump from yesterday.
>
> Regards,
> Jan Lindheim
>
>
> Aug  1 11:08:38 shc kernel: Unable to handle kernel NULL pointer dereference 
> at 0000000000000000 RIP:
> Aug  1 11:08:38 shc kernel: <ffffffff883c62a3>{:pvfs2:wait_for_a_slot+126}
> Aug  1 11:08:38 shc kernel: PGD 204bb067 PUD d98b8067 PMD 0
> Aug  1 11:08:38 shc kernel: Oops: 0000 [1] SMP
> Aug  1 11:08:38 shc kernel: CPU 0
> Aug  1 11:08:38 shc kernel: Modules linked in: pvfs2 usbserial nfsd 
> parport_pc lp parport floppy raw nfs lockd nfs_acl sunrpc ib_ipoib ib_sa 
> ib_uverbs ib_umad ib_mthca ib_mad ib_core ipv6 edd joydev sg st sr_mod ide_cd 
> cdrom ohci_hcd i2c_amd756 i2c_core e1000 usbcore tg3 ipt_MASQUERADE 
> iptable_nat ip_nat ip_conntrack nfnetlink ip_tables x_tables capability 
> commoncap raid0 dm_mod amd74xx ide_core mptsas mptscsih mptbase 
> scsi_transport_sas sata_nv libata xfs exportfs cciss sd_mod scsi_mod
> Aug  1 11:08:38 shc kernel: Pid: 5940, comm: scp Not tainted 2.6.16.52-smp #2
> Aug  1 11:08:38 shc kernel: RIP: 0010:[<ffffffff883c62a3>] 
> <ffffffff883c62a3>{:pvfs2:wait_for_a_slot+126}
> Aug  1 11:08:38 shc kernel: RSP: 0018:ffff810064177cf8  EFLAGS: 00010293
> Aug  1 11:08:38 shc kernel: RAX: 0000000000000000 RBX: ffff810064177d78 RCX: 
> 0000000000000005
> Aug  1 11:08:38 shc kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
> ffffffff883d33a8
> Aug  1 11:08:38 shc kernel: RBP: 00000000ffffffff R08: 0000000000000000 R09: 
> ffff810064177e98
> Aug  1 11:08:38 shc kernel: R10: 0000000000000246 R11: ffff810064177f50 R12: 
> ffff810064177e1c
> Aug  1 11:08:38 shc kernel: R13: 0000000000400000 R14: ffff810064177e88 R15: 
> ffff810064177e98
> Aug  1 11:08:38 shc kernel: FS:  00002ab85b30c100(0000) 
> GS:ffffffff8041f000(0000) knlGS:00000000557a00e0
> Aug  1 11:08:38 shc kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Aug  1 11:08:38 shc kernel: CR2: 0000000000000000 CR3: 000000006115f000 CR4: 
> 00000000000006e0
> Aug  1 11:08:38 shc kernel: Process scp (pid: 5940, threadinfo 
> ffff810064176000, task ffff81001a802080)
> Aug  1 11:08:38 shc kernel: Stack: 0000000000000000 ffff81001a802080 
> ffffffff801299c8 0000000000000000
> Aug  1 11:08:38 shc kernel:        0000000000000000 ffff810064177e08 
> 0000000000000001 ffff81001a802080
> Aug  1 11:08:38 shc kernel:        ffffffff801299c8 ffffffff883d3398
> Aug  1 11:08:38 shc kernel: Call Trace: 
> <ffffffff801299c8>{default_wake_function+0}
> Aug  1 11:08:38 shc kernel:        
> <ffffffff801299c8>{default_wake_function+0} 
> <ffffffff883c63d4>{:pvfs2:pvfs_bufmap_get+57}
> Aug  1 11:08:38 shc kernel:        
> <ffffffff883c185a>{:pvfs2:do_direct_readv_writev+1767}
> Aug  1 11:08:38 shc kernel:        
> <ffffffff883c1e56>{:pvfs2:pvfs2_file_write+149} 
> <ffffffff8017beb3>{vfs_write+212}
> Aug  1 11:08:38 shc kernel:        <ffffffff8017c010>{sys_write+69} 
> <ffffffff8010a872>{system_call+126}
> Aug  1 11:08:38 shc kernel:
> Aug  1 11:08:38 shc kernel: Code: 83 3c 90 00 74 67 ff c6 39 ce 7c ed 48 8b 
> 43 10 c7 00 01 00
> Aug  1 11:08:38 shc kernel: RIP 
> <ffffffff883c62a3>{:pvfs2:wait_for_a_slot+126} RSP <ffff810064177cf8>
> Aug  1 11:08:38 shc kernel: CR2: 0000000000000000
>
>
> > On 8/1/07, Jan Lindheim <lindheim at cacr.caltech.edu> wrote:
> >> We have over the last few months experienced a lot of PVFS client stability
> >> problems with both pvfs-2.6.2 and pvfs-2.6.3.  The syslog would typically
> >> show the following message just before a crash:
> >>
> >> Jul 30 20:42:06 shc kernel: Unable to handle kernel NULL pointer 
> >> dereference
> >> at 0000000000000000 RIP:
> >> Jul 30 20:42:06 shc kernel: <ffffffff8839c2a3>{:pvfs2:wait_for_a_slot+126}
> >> Jul 30 20:42:06 shc kernel: PGD 1c3f3a067 PUD 1f2a1e067 PMD 0
> >> Jul 30 20:42:06 shc kernel: Oops: 0000 [2] SMP
> >> Jul 30 20:42:06 shc kernel: CPU 3
> >>
> >> We have been using kernel 2.6.16.48 and 2.6.16.52 on x86_64 architecture.
> >>
> >> Typically users have been busy copying files into the PVFS file system,
> >> using cp or rsync, just before the crash.
> >>
> >> Any insight or suggestions on how to debug this would be much appreciated.
> >>
> >> Regards,
> >> Jan Lindheim
> >> _______________________________________________
> >> Pvfs2-users mailing list
> >> Pvfs2-users at beowulf-underground.org
> >> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
> >>