Hi there,
We have a moderately loaded NFS server running Debian Sarge with the
stock kernel 2.6.8-10-amd64-k8 (the current Debian binary package). It
serves around 10 NFS client connections from a Solaris 8 box. For
what it's worth, the exported filesystems are also shared
simultaneously via Samba to Windows clients (although I doubt this is
the problem).
We're getting a kernel general protection fault (GPF) roughly once every
3 weeks, and it seems to happen in the nfsd process. That was the case
at least on the two occasions where a trace was left in the logs. See
below.
I checked my RAM with memtest86+, and I even replaced the Ethernet card
with a more stable e1000, but the problem must be somewhere else. Any
ideas?
===============
Apr 8 12:09:38 anakin kernel: general protection fault: 0000 [1]
Apr 8 12:09:38 anakin kernel: CPU 0
Apr 8 12:09:38 anakin kernel: Modules linked in: ipv6 nfsd exportfs lockd
sunrpc evdev ehci_hcd ohci_hcd ide_cd cdrom forcedeth rtc raid1 md ext2 ext3
jbd mbcache ide_generic ide_disk amd74xx ide_core unix font vesafb cfbcopyarea
cfbimgblt cfbfillrect
Apr 8 12:09:38 anakin kernel: Pid: 7980, comm: nfsd Not tainted
2.6.8-10-amd64-k8
Apr 8 12:09:38 anakin kernel: RIP: 0010:[<ffffffff80152aeb>]
<ffffffff80152aeb>{cache_alloc_refill+283}
Apr 8 12:09:38 anakin kernel: RSP: 0000:0000010037a55788 EFLAGS: 00010086
Apr 8 12:09:38 anakin kernel: RAX: 8e00000018b8dd89 RBX: 000001003ec92000 RCX:
0000010000100000
Apr 8 12:09:38 anakin kernel: RDX: 000001003f552210 RSI: 000000000000000e RDI:
0000010016f1d028
Apr 8 12:09:38 anakin kernel: RBP: 000001003f552200 R08: 000001003ec92010 R09:
000001003f552220
Apr 8 12:09:38 anakin kernel: R10: 000001003f552230 R11: 0000010007640018 R12:
000001003f552210
Apr 8 12:09:38 anakin kernel: R13: 000001003f92a220 R14: 0000000000000050 R15:
000001003d48ad80
Apr 8 12:09:38 anakin kernel: FS: 0000000000000000(0000)
GS:ffffffff803b1180(0000) knlGS:00000000557cb080
Apr 8 12:09:38 anakin kernel: CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
Apr 8 12:09:38 anakin kernel: CR2: 0000000055be4000 CR3: 0000000000101000 CR4:
00000000000006e0
Apr 8 12:09:38 anakin kernel: Process nfsd (pid: 7980, threadinfo
0000010037a54000, task 000001003e9dea40)
Apr 8 12:09:38 anakin kernel: Stack: 000001001b7a3360 0000000000000000
000001003f4c0800 00000000001a8831
Apr 8 12:09:38 anakin kernel: 000001003f4c0800 0000010002051f40
000001003d48ad80 ffffffff8015294b
Apr 8 12:09:38 anakin kernel: 0000000000000212 ffffffffa006ddd5
Apr 8 12:09:38 anakin kernel: Call
Trace:<ffffffff8015294b>{kmem_cache_alloc+43}
<ffffffffa006ddd5>{:ext3:ext3_alloc_inode+21}
Apr 8 12:09:38 anakin kernel: <ffffffff8017e105>{alloc_inode+21}
<ffffffff8017f378>{iget_locked+168}
Apr 8 12:09:38 anakin kernel: <ffffffffa006ae74>{:ext3:ext3_lookup+100}
<ffffffff80174a02>{__lookup_hash+258}
Apr 8 12:09:38 anakin kernel: <ffffffff80174abc>{lookup_one_len+108}
<ffffffffa01298fd>{:nfsd:compose_entry_fh+205}
Apr 8 12:09:38 anakin kernel:
<ffffffffa0129b25>{:nfsd:encode_entry+437} <ffffffff8011d942>{pci_map_sg+642}
Apr 8 12:09:38 anakin kernel:
<ffffffffa0067da4>{:ext3:ext3_get_block_handle+228}
Apr 8 12:09:38 anakin kernel:
<ffffffffa0023a28>{:ide_core:__ide_dma_begin+40}
<ffffffffa008a849>{:ide_disk:__ide_do_rw_disk+809}
Apr 8 12:09:38 anakin kernel:
<ffffffffa0129e60>{:nfsd:nfs3svc_encode_entry_plus+16}
Apr 8 12:09:38 anakin kernel:
<ffffffffa0064995>{:ext3:ext3_readdir+1157}
<ffffffffa0129e50>{:nfsd:nfs3svc_encode_entry_plus+0}
Apr 8 12:09:38 anakin kernel: <ffffffffa011deb5>{:nfsd:fh_verify+1333}
<ffffffffa00e6e11>{:sunrpc:svc_sock_enqueue+561}
Apr 8 12:09:38 anakin kernel:
<ffffffffa0129e50>{:nfsd:nfs3svc_encode_entry_plus+0}
Apr 8 12:09:38 anakin kernel: <ffffffff80178d5d>{vfs_readdir+157}
<ffffffffa0129e50>{:nfsd:nfs3svc_encode_entry_plus+0}
Apr 8 12:09:38 anakin kernel:
<ffffffffa0120205>{:nfsd:nfsd_readdir+149}
<ffffffffa0126bd1>{:nfsd:nfsd3_proc_readdirplus+241}
Apr 8 12:09:38 anakin kernel:
<ffffffffa011b5f0>{:nfsd:nfsd_dispatch+240}
<ffffffffa00e6922>{:sunrpc:svc_process+914}
Apr 8 12:09:38 anakin kernel: <ffffffffa011b1e0>{:nfsd:nfsd+0}
<ffffffffa011b39a>{:nfsd:nfsd+442}
Apr 8 12:09:38 anakin kernel: <ffffffff8012edae>{schedule_tail+14}
<ffffffff80110c67>{child_rip+8}
Apr 8 12:09:38 anakin kernel: <ffffffffa011b1e0>{:nfsd:nfsd+0}
<ffffffffa011b1e0>{:nfsd:nfsd+0}
Apr 8 12:09:38 anakin kernel: <ffffffff80110c5f>{child_rip+0}
Apr 8 12:09:38 anakin kernel:
Apr 8 12:09:38 anakin kernel: Code: 48 89 50 08 48 89 02 66 83 79 24 ff 48 c7
01 00 01 10 00 48
Apr 8 12:09:38 anakin kernel: RIP <ffffffff80152aeb>{cache_alloc_refill+283}
RSP <0000010037a55788>
===============
===============
May 5 15:51:15 anakin kernel: general protection fault: 0000 [1]
May 5 15:51:15 anakin kernel: CPU 0
May 5 15:51:15 anakin kernel: Modules linked in: nfsd exportfs lockd sunrpc
ipv6 evdev forcedeth ehci_hcd ohci_hcd ide_cd cdrom e1000 rtc
raid1 md ext2 ext3 jbd mbcache ide_generic ide_disk amd74xx ide_core unix font
vesafb cfbcopyarea cfbimgblt cfbfillrect
May 5 15:51:15 anakin kernel: Pid: 1609, comm: nfsd Not tainted
2.6.8-10-amd64-k8
May 5 15:51:15 anakin kernel: RIP: 0010:[<ffffffff80152aeb>]
<ffffffff80152aeb>{cache_alloc_refill+283}
May 5 15:51:15 anakin kernel: RSP: 0000:000001003de09788 EFLAGS: 00010086
May 5 15:51:15 anakin kernel: RAX: 8e00000018b8dd89 RBX: 000001003ecb4000 RCX:
0000010000100000
May 5 15:51:15 anakin kernel: RDX: 000001003f556210 RSI: 0000000000000009 RDI:
000001000eb6f028
May 5 15:51:15 anakin kernel: RBP: 000001003f556200 R08: 000001003ecb4010 R09:
000001003f556220
May 5 15:51:15 anakin kernel: R10: 000001003f556230 R11: 000001002582f0c0 R12:
000001003f556210
May 5 15:51:15 anakin kernel: R13: 000001003f92e220 R14: 0000000000000050 R15:
000001003f2d3380
May 5 15:51:15 anakin kernel: FS: 0000000000000000(0000)
GS:ffffffff803b1180(0000) knlGS:00000000557cb080
May 5 15:51:15 anakin kernel: CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
May 5 15:51:15 anakin kernel: CR2: 000000000816fd78 CR3: 0000000000101000 CR4:
00000000000006e0
May 5 15:51:15 anakin kernel: Process nfsd (pid: 1609, threadinfo
000001003de08000, task 000001003e524430)
May 5 15:51:15 anakin kernel: Stack: 000001003de097f8 0000000000000000
000001003f4c2800 00000000004451fc
May 5 15:51:15 anakin kernel: 000001003f4c2800 00000100020de130
000001003f2d3380 ffffffff8015294b
May 5 15:51:15 anakin kernel: 0000000000000212 ffffffffa006ddd5
May 5 15:51:15 anakin kernel: Call
Trace:<ffffffff8015294b>{kmem_cache_alloc+43}
<ffffffffa006ddd5>{:ext3:ext3_alloc_inode+21}
May 5 15:51:15 anakin kernel: <ffffffff8017e105>{alloc_inode+21}
<ffffffff8017f378>{iget_locked+168}
May 5 15:51:15 anakin kernel: <ffffffffa006ae74>{:ext3:ext3_lookup+100}
<ffffffff80174a02>{__lookup_hash+258}
May 5 15:51:15 anakin kernel: <ffffffff80174abc>{lookup_one_len+108}
<ffffffffa01818fd>{:nfsd:compose_entry_fh+205}
May 5 15:51:15 anakin kernel:
<ffffffffa0181b25>{:nfsd:encode_entry+437} <ffffffff8011d942>{pci_map_sg+642}
May 5 15:51:15 anakin kernel:
<ffffffffa0067da4>{:ext3:ext3_get_block_handle+228}
May 5 15:51:15 anakin kernel:
<ffffffffa0023a28>{:ide_core:__ide_dma_begin+40}
<ffffffffa008a849>{:ide_disk:__ide_do_rw_disk+809}
May 5 15:51:15 anakin kernel:
<ffffffffa0181e60>{:nfsd:nfs3svc_encode_entry_plus+16}
May 5 15:51:15 anakin kernel:
<ffffffffa0064995>{:ext3:ext3_readdir+1157}
<ffffffffa0181e50>{:nfsd:nfs3svc_encode_entry_plus+0}
May 5 15:51:15 anakin kernel: <ffffffffa0175eb5>{:nfsd:fh_verify+1333}
<ffffffffa013ee11>{:sunrpc:svc_sock_enqueue+561}
May 5 15:51:15 anakin kernel:
<ffffffffa0181e50>{:nfsd:nfs3svc_encode_entry_plus+0}
May 5 15:51:15 anakin kernel: <ffffffff80178d5d>{vfs_readdir+157}
<ffffffffa0181e50>{:nfsd:nfs3svc_encode_entry_plus+0}
May 5 15:51:15 anakin kernel:
<ffffffffa0178205>{:nfsd:nfsd_readdir+149}
<ffffffffa017ebd1>{:nfsd:nfsd3_proc_readdirplus+241}
May 5 15:51:15 anakin kernel:
<ffffffffa01735f0>{:nfsd:nfsd_dispatch+240}
<ffffffffa013e922>{:sunrpc:svc_process+914}
May 5 15:51:15 anakin kernel: <ffffffffa01731e0>{:nfsd:nfsd+0}
<ffffffffa017339a>{:nfsd:nfsd+442}
May 5 15:51:15 anakin kernel: <ffffffff8012edae>{schedule_tail+14}
<ffffffff80110c67>{child_rip+8}
May 5 15:51:15 anakin kernel: <ffffffffa01731e0>{:nfsd:nfsd+0}
<ffffffffa01731e0>{:nfsd:nfsd+0}
May 5 15:51:15 anakin kernel: <ffffffff80110c5f>{child_rip+0}
May 5 15:51:15 anakin kernel:
May 5 15:51:15 anakin kernel: Code: 48 89 50 08 48 89 02 66 83 79 24 ff 48 c7
01 00 01 10 00 48
May 5 15:51:15 anakin kernel: RIP <ffffffff80152aeb>{cache_alloc_refill+283}
RSP <000001003de09788>
===============
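Since the module load addresses differ between the two boots, the traces are easier to compare by symbol and offset than by raw address. A minimal sketch of one way to do that, assuming the oops text is available as a string; the names `trace_symbols` and `same_fault` are made up for this illustration and are not part of any kernel tooling:

```python
import re

# Matches entries such as {kmem_cache_alloc+43} or {:ext3:ext3_lookup+100}.
SYMBOL_RE = re.compile(r"\{(?::(\w+):)?(\w+)\+(\d+)\}")


def trace_symbols(oops_text):
    """Return (module, symbol, offset) tuples in order of appearance.

    module is None for symbols that carry no ":module:" prefix.
    """
    return [(m.group(1), m.group(2), int(m.group(3)))
            for m in SYMBOL_RE.finditer(oops_text)]


def same_fault(oops_a, oops_b):
    """True if both texts list the same symbols at the same offsets."""
    return trace_symbols(oops_a) == trace_symbols(oops_b)


if __name__ == "__main__":
    # Short excerpts from the two traces above: the nfsd address differs,
    # but the symbol and offset are the same.
    first = ("<ffffffff80152aeb>{cache_alloc_refill+283} "
             "<ffffffffa01298fd>{:nfsd:compose_entry_fh+205}")
    second = ("<ffffffff80152aeb>{cache_alloc_refill+283} "
              "<ffffffffa01818fd>{:nfsd:compose_entry_fh+205}")
    print(trace_symbols(first))
    print(same_fault(first, second))
```

Feeding each complete trace to `trace_symbols` gives two lists that can be compared entry by entry, which makes it easier to see whether both faults were reached through the same call path.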
--
JL