Not sure if it helps in your situation, but 1.6.1 is out. Try using that.

T/Christof

On 20.04.2012 03:55, Ken Elkabany wrote:
We have 2 OpenAFS servers running 1.4.14. We have many clients that we
just switched over to 1.6.1pre1. Starting earlier today, we started
getting NULL pointer dereferences, which have been completely hosing the
clients. The client machines hang on any call that deals with AFS,
whether it's "ls /", "ls /afs", "klist", etc. A "vos changeaddr" was
done earlier today, whereby a large collection (4000) of volumes was
mistakenly assigned to another server. These were corrected with "vos
syncvldb" followed by "vos syncserv". I mention it here, as it's the
only thing we've done to the AFS cluster today.

Here's what we found in the syslog:

Apr 20 01:30:43 SERVER kernel: [12861236.027818] BUG: unable to handle
kernel NULL pointer dereference at 0000000000000028
Apr 20 01:30:43 SERVER kernel: [12861236.027836] IP:
[<ffffffffa0048087>] afs_Conn+0x1e7/0x260 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.027868] PGD 0
Apr 20 01:30:43 SERVER kernel: [12861236.027874] Oops: 0000 [#1] SMP
Apr 20 01:30:43 SERVER kernel: [12861236.027882] CPU 6
Apr 20 01:30:43 SERVER kernel: [12861236.027885] Modules linked in:
openafs(P) isofs acpiphp
Apr 20 01:30:43 SERVER kernel: [12861236.027897]
Apr 20 01:30:43 SERVER kernel: [12861236.027902] Pid: 1568, comm:
apache2 Tainted: P           O 3.2.0-23-virtual #36-Ubuntu
Apr 20 01:30:43 SERVER kernel: [12861236.027912] RIP:
e030:[<ffffffffa0048087>]  [<ffffffffa0048087>] afs_Conn+0x1e7/0x260
[openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.027936] RSP:
e02b:ffff88017f417808  EFLAGS: 00010282
Apr 20 01:30:43 SERVER kernel: [12861236.027942] RAX: ffffc9000188dbe0
RBX: 0000000000000000 RCX: 000000000000581b
Apr 20 01:30:43 SERVER kernel: [12861236.027950] RDX: ffff8801b112a000
RSI: 0000000000000001 RDI: ffff88017f761680
Apr 20 01:30:43 SERVER kernel: [12861236.027957] RBP: ffff88017f417858
R08: 0000000000000000 R09: 0000000000000000
Apr 20 01:30:43 SERVER kernel: [12861236.027964] R10: 0000000000000002
R11: 0000000000000000 R12: ffff880184756f48
Apr 20 01:30:43 SERVER kernel: [12861236.027971] R13: ffff88017f417a20
R14: 0000000000000004 R15: ffff88017f4178f0
Apr 20 01:30:43 SERVER kernel: [12861236.027983] FS:
  00007f1f6ae2f700(0000) GS:ffff8801bff73000(0000) knlGS:0000000000000000
Apr 20 01:30:43 SERVER kernel: [12861236.027991] CS:  e033 DS: 0000 ES:
0000 CR0: 000000008005003b
Apr 20 01:30:43 SERVER kernel: [12861236.027998] CR2: 0000000000000028
CR3: 0000000181465000 CR4: 0000000000002660
Apr 20 01:30:43 SERVER kernel: [12861236.028006] DR0: 0000000000000000
DR1: 0000000000000000 DR2: 0000000000000000
Apr 20 01:30:43 SERVER kernel: [12861236.028013] DR3: 0000000000000000
DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 20 01:30:43 SERVER kernel: [12861236.028021] Process apache2 (pid:
1568, threadinfo ffff88017f416000, task ffff88017f41adc0)
Apr 20 01:30:43 SERVER kernel: [12861236.028028] Stack:
Apr 20 01:30:43 SERVER kernel: [12861236.028032]  000000004e2a6741
0000000000000000 0000000000000000 000000004f90bc43
Apr 20 01:30:43 SERVER kernel: [12861236.028046]  000000000001584a
ffff880184756cc0 ffff88017f41adc0 ffff88017f417a20
Apr 20 01:30:43 SERVER kernel: [12861236.028059]  ffff880184756f48
ffff880184756cc0 ffff88017f417928 ffffffffa0068658
Apr 20 01:30:43 SERVER kernel: [12861236.028072] Call Trace:
Apr 20 01:30:43 SERVER kernel: [12861236.028092]  [<ffffffffa0068658>]
afs_FetchStatus+0x58/0x450 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028113]  [<ffffffffa004672b>] ?
afs_GetCellStale+0x3b/0x60 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028134]  [<ffffffffa0046a25>] ?
afs_IsPrimaryCell+0x25/0x40 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028157]  [<ffffffffa0082b80>] ?
afs_GetVolume+0x40/0x1d0 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028179]  [<ffffffffa006ae8d>]
afs_GetVCache+0x26d/0x5d0 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028200]  [<ffffffffa006b343>]
afs_VerifyVCache2+0x153/0x200 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028222]  [<ffffffffa006ccec>]
afs_getattr+0x29c/0x350 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028242]  [<ffffffffa009340f>]
afs_linux_dentry_revalidate+0x39f/0x470 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028265]  [<ffffffffa006bf43>] ?
afs_AccessOK+0x113/0x1e0 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028279]  [<ffffffff816552de>] ?
_raw_spin_lock+0xe/0x20
Apr 20 01:30:43 SERVER kernel: [12861236.028290]  [<ffffffff811818eb>]
do_lookup+0x18b/0x310
Apr 20 01:30:43 SERVER kernel: [12861236.028298]  [<ffffffff8129885c>] ?
security_inode_permission+0x1c/0x30
Apr 20 01:30:43 SERVER kernel: [12861236.028306]  [<ffffffff81182268>]
link_path_walk+0x138/0x870
Apr 20 01:30:43 SERVER kernel: [12861236.028313]  [<ffffffff811834ad>] ?
path_init+0x2ed/0x3c0
Apr 20 01:30:43 SERVER kernel: [12861236.028319]  [<ffffffff811835d8>]
path_lookupat+0x58/0x750
Apr 20 01:30:43 SERVER kernel: [12861236.028339]  [<ffffffffa006cb3c>] ?
afs_getattr+0xec/0x350 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028348]  [<ffffffff810067be>] ?
xen_pmd_val+0xe/0x10
Apr 20 01:30:43 SERVER kernel: [12861236.028355]  [<ffffffff81183d01>]
do_path_lookup+0x31/0xc0
Apr 20 01:30:43 SERVER kernel: [12861236.028362]  [<ffffffff81184809>]
user_path_at_empty+0x59/0xa0
Apr 20 01:30:43 SERVER kernel: [12861236.028369]  [<ffffffff8100aa32>] ?
check_events+0x12/0x20
Apr 20 01:30:43 SERVER kernel: [12861236.028377]  [<ffffffff8100a25d>] ?
xen_force_evtchn_callback+0xd/0x10
Apr 20 01:30:43 SERVER kernel: [12861236.028384]  [<ffffffff81184861>]
user_path_at+0x11/0x20
Apr 20 01:30:43 SERVER kernel: [12861236.028391]  [<ffffffff8117995a>]
vfs_fstatat+0x3a/0x70
Apr 20 01:30:43 SERVER kernel: [12861236.028398]  [<ffffffff8100aa1f>] ?
xen_restore_fl_direct_reloc+0x4/0x4
Apr 20 01:30:43 SERVER kernel: [12861236.028405]  [<ffffffff8100465d>] ?
xen_clts+0x8d/0x190
Apr 20 01:30:43 SERVER kernel: [12861236.028412]  [<ffffffff811799ae>]
vfs_lstat+0x1e/0x20
Apr 20 01:30:43 SERVER kernel: [12861236.028418]  [<ffffffff81179b4a>]
sys_newlstat+0x1a/0x40
Apr 20 01:30:43 SERVER kernel: [12861236.028427]  [<ffffffff810146e1>] ?
math_state_restore+0x51/0x80
Apr 20 01:30:43 SERVER kernel: [12861236.028435]  [<ffffffff816562fe>] ?
do_device_not_available+0xe/0x10
Apr 20 01:30:43 SERVER kernel: [12861236.028445]  [<ffffffff8165f8cb>] ?
device_not_available+0x1b/0x20
Apr 20 01:30:43 SERVER kernel: [12861236.028452]  [<ffffffff8165d8c2>]
system_call_fastpath+0x16/0x1b
Apr 20 01:30:43 SERVER kernel: [12861236.028458] Code: 89 ef 48 89 45 c8
e8 39 c4 01 00 48 8b 45 c8 48 83 c4 28 5b 41 5c 41 5d 41 5e 41 5f 5d c3
48 85 ff 0f 84 95 fe ff ff 48 8b 5f 58 <f6> 43 28 20 0f 85 87 fe ff ff
41 80 7d 12 00 7e 29 41 80 7d 13
Apr 20 01:30:43 SERVER kernel: [12861236.028543] RIP
  [<ffffffffa0048087>] afs_Conn+0x1e7/0x260 [openafs]
Apr 20 01:30:43 SERVER kernel: [12861236.028563]  RSP <ffff88017f417808>
Apr 20 01:30:43 SERVER kernel: [12861236.028568] CR2: 0000000000000028
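
For anyone comparing traces like this across several clients, here is a minimal sketch (not part of OpenAFS or the kernel tooling) that pulls the faulting address and symbol out of a hard-wrapped oops. The function name summarize_oops and the regular expressions are assumptions made for illustration, and they only cover the "NULL pointer dereference" and "IP:" line shapes seen in the log above. In this trace the fault address 0x28 together with RBX being zero is consistent with a field at offset 0x28 being read through a NULL pointer inside afs_Conn.

```python
import re

# Symbol part of an oops "IP:" line, e.g. "afs_Conn+0x1e7/0x260 [openafs]".
SYMBOL_RE = re.compile(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?:\s+\[(\w+)\])?")
# Faulting address reported on the "BUG:" line.
ADDR_RE = re.compile(r"NULL pointer dereference at ([0-9a-f]+)")


def summarize_oops(text):
    """Return the faulting address and symbol from a kernel oops, or None."""
    # Mail clients hard-wrap syslog lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    addr = ADDR_RE.search(flat)
    ip = flat.find("IP:")
    sym = SYMBOL_RE.search(flat, ip) if ip != -1 else None
    if not addr or not sym:
        return None
    return {
        "fault_address": int(addr.group(1), 16),
        "function": sym.group(1),
        "offset": int(sym.group(2), 16),
        "size": int(sym.group(3), 16),
        "module": sym.group(4),
    }


# First four lines of the trace above, wrapped exactly as in the mail.
sample = """
Apr 20 01:30:43 SERVER kernel: [12861236.027818] BUG: unable to handle
kernel NULL pointer dereference at 0000000000000028
Apr 20 01:30:43 SERVER kernel: [12861236.027836] IP:
[<ffffffffa0048087>] afs_Conn+0x1e7/0x260 [openafs]
"""
print(summarize_oops(sample))
```

Running the same function over the syslog of each affected client would show whether they all fault at the same offset in afs_Conn or in different places.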

--
The future is all around us, waiting in moments of transition to be born
in moments of revelation. No one knows the shape of that future or where
it will take us. We know only that it is always born in pain.
  -- G'Quan
Let's update the servers!
-----------------------------------------------------------------
Christof Hanke                          e-mail [email protected]
RZG (Rechenzentrum Garching)            phone +49-89-3299-1041
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut für Plasmaphysik (IPP)

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info