We fixed that bug in what will be 1.4.3rc2.
On Wed, 7 Feb 2007, Jakub Witkowski wrote:
On Sun, 04-02-2007, at 10:21 -0500, Derrick J Brashear wrote:
On Sun, 4 Feb 2007, Jakub Witkowski wrote:
Well, we haven't recommended 1.5.14 so I'm curious why you chose it, but,
do you have an oops?
No, no oops. The system just... blocks. You can interact with programs
already in memory and access open files, but not open new ones.
I chose .14 mostly because I was having problems building the module for
the Xen kernel, and this version was simply the first one I got to
compile. I may fall back to something more stable now that I know how to
get things running.
Which OpenAFS version do you recommend for installation on a client? On a
server?
For Linux, we haven't recommended any 1.5.x client. 1.4.2, generally,
though 1.4.3rc2 should be out in a day or so.
If you can get cmdebug information when it's hung, that'd be useful to
see.
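
For reference, a minimal sketch of how that capture might look. cmdebug
queries the client's cache manager over the network, so it can be run
from a second machine if the hung client cannot start new programs. The
helper name collect_cmdebug, the default host, and the output file name
are illustrative assumptions, not anything from this thread; -long asks
for all cache entries rather than only the locked ones.

```shell
# Hypothetical helper: save cmdebug output for a (possibly hung) AFS client.
#   $1 = client host to query (assumed default: localhost)
#   $2 = output file (assumed default: timestamped name in the current dir)
collect_cmdebug() {
    host="${1:-localhost}"
    out="${2:-cmdebug-$(date +%Y%m%d-%H%M%S).txt}"

    # Bail out early if the OpenAFS cmdebug binary is not installed here.
    if ! command -v cmdebug >/dev/null 2>&1; then
        echo "cmdebug not found in PATH" >&2
        return 127
    fi

    # Capture both stdout and stderr so error messages are kept too.
    cmdebug "$host" -long > "$out" 2>&1
    status=$?
    echo "wrote $out (cmdebug exit status $status)" >&2
    return "$status"
}
```

Example use from another machine, with a placeholder host name:
collect_cmdebug afsclient.example.org hang.txt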
I have done some experiments and my findings are not exactly optimistic.
First of all, I found out that the hang was actually caused by some
weird interaction between the OpenAFS client and the libnss-ldap library.
In a test environment I can reproduce the systemwide hang described
above when I set up nsswitch to look uids up in LDAP; if it is not
configured to do so, only the find process hangs, and then only for a
few minutes. Adding the -fakestat-all switch makes the problem less
pronounced (i.e. find lists more files) but does not make it go away.
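
A quick way to check which of the two configurations a client is in, as
a sketch only: the function name nss_uses_ldap is hypothetical, and the
pattern assumes the usual nsswitch.conf layout of a database name, a
colon, and a whitespace-separated list of sources.

```shell
# Hypothetical check: succeed (exit 0) if passwd or group lookups are
# configured to consult LDAP in the given nsswitch.conf.
#   $1 = path to the config file (assumed default: /etc/nsswitch.conf)
nss_uses_ldap() {
    conf="${1:-/etc/nsswitch.conf}"
    # Match uncommented "passwd:" or "group:" lines listing "ldap" as a source.
    grep -Eq '^[[:space:]]*(passwd|group):.*[[:space:]]ldap([[:space:]]|$)' "$conf"
}
```

Example use:
if nss_uses_ldap; then echo "uid/gid lookups go through LDAP"; fi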
On the other hand, 1.4.2 appears to be free of this problem; at least, I
have not yet found a way to crash or hang it.
1.4.3rc1 has a bug:
Unable to handle kernel paging request at ffff880040000000 RIP:
[<ffffffff880d3462>] :libafs:InstallUVolumeEntry+0x162/0x480
PGD e74067 PUD 1076067 PMD 1077067 PTE 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: libafs xt_tcpudp ip6table_filter ip6_tables xt_state
xt_pkttype iptable_raw xt_CLASSIFY xt_CONNMARK xt_connmark xt_policy
xt_multiport xt_conntrack iptable_mangle ipt_ULOG ipt_TTL ipt_ttl
ipt_TOS ipt_tos ipt_TCPMSS ipt_SAME ipt_REJECT ipt_REDIRECT ipt_recent
ipt_owner ipt_NETMAP ipt_MASQUERADE ipt_LOG ipt_iprange ipt_hashlimit
ipt_ECN ipt_ecn ipt_DSCP ipt_dscp ipt_CLUSTERIP ipt_ah ipt_addrtype
ip_nat_irc ip_nat_tftp ip_nat_ftp ip_conntrack_irc ip_conntrack_tftp
ip_conntrack_ftp iptable_nat ip_nat ip_conntrack nfnetlink
iptable_filter ip_tables x_tables xenbus_be xenblk
Pid: 11703, comm: find Tainted: P 2.6.18-xenU5 #6
RIP: e030:[<ffffffff880d3462>]
[<ffffffff880d3462>] :libafs:InstallUVolumeEntry+0x162/0x480
RSP: e02b:ffff88003a80bac8 EFLAGS: 00010206
RAX: 00000000006b5aa0 RBX: ffffc200001d8850 RCX: ffff88003f08e080
RDX: 000000005c0c0000 RSI: ffff88003a80ba94 RDI: ffff88003e20d9c0
RBP: 0000000000000000 R08: ffff88003a80bd68 R09: 0000000000000000
R10: 0000000000000020 R11: 0000000000000008 R12: ffff88003e529400
R13: 00000000006b5aa0 R14: ffff880045083e00 R15: ffffc200001d8850
FS: 00002b0b5e7986d0(0000) GS:ffffffff80582000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process find (pid: 11703, threadinfo ffff88003a80a000, task ffff88003f08e080)
Stack: ffffffff88134fa0 0000000000000034 ffff88003a80bb68 ffff88003e20d9e0
ffff88003a80bd68 ffff88003e20d9c0 000000a600000000 ffffffff8037c3a9
ffff88003e20d9c0 0000000000000000 ffff8800230883c0 ffffffff80290439
Call Trace:
[<ffffffff8037c3a9>] _atomic_dec_and_lock+0x39/0x58
[<ffffffff80290439>] dput+0x34/0x153
[<ffffffff802940eb>] mntput_no_expire+0x19/0x8b
[<ffffffff880d3ea2>] :libafs:afs_SetupVolume+0x372/0x440
[<ffffffff880d44d1>] :libafs:afs_NewVolumeByName+0x561/0x610
[<ffffffff880a7cb7>] :libafs:afs_TraverseCells_nl+0x37/0x60
[<ffffffff880d4607>] :libafs:afs_GetVolumeByName+0x87/0x140
[<ffffffff880ca557>] :libafs:EvalMountPoint+0x1d7/0x400
[<ffffffff880ca8ac>] :libafs:afs_EvalFakeStat_int+0x12c/0x3e0
[<ffffffff880c3a7c>] :libafs:afs_access+0x9c/0x380
[<ffffffff880f7faf>] :libafs:afs_linux_permission+0x7f/0xf0
[<ffffffff8028758a>] permission+0x81/0xc8
[<ffffffff802886a7>] may_open+0x58/0x21e
[<ffffffff8028ae4a>] open_namei+0x2b5/0x6c6
[<ffffffff802792e3>] do_filp_open+0x1c/0x38
[<ffffffff80279343>] do_sys_open+0x44/0xbe
[<ffffffff80209d7a>] system_call+0x86/0x8b
[<ffffffff80209cf4>] system_call+0x0/0x8b
Code: 43 8b 84 ac 80 01 00 00 85 44 24 4c 0f 84 cc 02 00 00 a8 20
RIP [<ffffffff880d3462>] :libafs:InstallUVolumeEntry+0x162/0x480
RSP <ffff88003a80bac8>
CR2: ffff880040000000
The failed command was find -L /afs/wszib.edu.pl/
I think the oops happened when I pressed Ctrl-C to kill it, but I'm not
exactly sure.
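
To pull just the OpenAFS frames out of a trace like the one above,
something along these lines may help when comparing oopses. It is a
sketch: libafs_frames is a made-up name, and the pattern assumes the
"[<address>] :libafs:symbol+0xoff/0xlen" layout seen in this 2.6.18 oops.

```shell
# Hypothetical filter: read an oops on stdin and print each libafs symbol
# once, in order of first appearance.
libafs_frames() {
    # Keep only lines with a ":libafs:" frame and reduce them to the symbol.
    sed -n 's|^.*\[<[0-9a-f]*>] :libafs:\([A-Za-z0-9_]*\)+0x[0-9a-f]*/0x[0-9a-f]*.*$|\1|p' |
        # Drop repeats (the faulting frame is printed several times).
        awk '!seen[$0]++'
}
```

Example use, with oops.txt standing in for a saved copy of the trace:
libafs_frames < oops.txt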
Jakub.
_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info