I just upgraded two SMP servers running Linux 2.6.32, a RW master and a RO backup, to the experimental 1.6.0-pre1 Debian packages. I have configured them both for demand attach fileserver using the appropriate 'da' prefix server processes.

Whenever I vos release, after a while ptserver and vlserver dump core on both machines either with signal 11 or signal 6 and bos reloads them. The backtraces are very similar in both. VLLog describes the crash but nothing pertinent is in PtLog. The servers are externally NAT'd, which wasn't a problem with earlier versions, which worked fine. Not sure how to debug this further.

ptserver:
#0  0xb7792b81 in free () from /lib/i686/cmov/libc.so.6
#1  0x0807a379 in rxi_CleanupConnection (conn=0xb7864ff4) at rx.c:980
#2  0x0807dbf9 in rxi_CheckCall (call=0x9cc33d8) at rx.c:6001
#3  0x0807e03d in rxi_GrowMTUEvent (event=0x0, arg1=0x9cc33d8, dummy=0x0) at rx.c:6233
#4  0x080876ad in rxevent_RaiseEvents (next=0xb7653f6c) at rx_event.c:499
#5  0x08077b08 in rxi_ListenerProc (rfds=<value optimized out>, tnop=<value optimized out>, newcallp=<value optimized out>) at rx_lwp.c:203
#6  0x08077e5a in rx_ListenerProc (dummy=0x0) at rx_lwp.c:335
#7  0x08088701 in Create_Process_Part2 () at ./lwp.c:805
#8  0xb775ccdb in makecontext () from /lib/i686/cmov/libc.so.6
#9  0x0d696910 in ?? ()
#10 0x08088e88 in LWP_MwaitProcess (event=0x9d4ef10) at ./lwp.c:756
#11 LWP_WaitProcess (event=0x9d4ef10) at ./lwp.c:708
#12 0x080809f0 in rx_GetCall (tno=10, cur_service=0x9cb3020, socketp=0xbfa59c8c) at rx.c:2027
#13 0x08080c3a in rxi_ServerProc (threadID=10, newcall=0x0, socketp=0xbfa59c8c) at rx.c:1619
#14 0x08077dfa in rx_ServerProc (unused=0x0) at rx_lwp.c:369
#15 0x08081228 in rx_StartServer (donateMe=1) at rx.c:793
#16 0x0804a8ba in main (argc=1, argv=0xbfa5a2c4) at ptserver.c:565

vlserver:
#0  0xb78da424 in __kernel_vsyscall ()
#1  0xb779f751 in raise () from /lib/i686/cmov/libc.so.6
#2  0xb77a2b82 in abort () from /lib/i686/cmov/libc.so.6
#3  0xb77d618d in ?? () from /lib/i686/cmov/libc.so.6
#4  0xb77e0281 in ?? () from /lib/i686/cmov/libc.so.6
#5  0xb77e1ad8 in ?? () from /lib/i686/cmov/libc.so.6
#6  0xb77e4bbd in free () from /lib/i686/cmov/libc.so.6
#7  0x080786ed in rxi_CleanupConnection (conn=0xb78b6ff4) at rx.c:990
#8  0x0807bf49 in rxi_CheckCall (call=0xa0914c8) at rx.c:6001
#9  0x0807c38d in rxi_GrowMTUEvent (event=0x0, arg1=0xa0914c8, dummy=0x0) at rx.c:6233
#10 0x08085aad in rxevent_RaiseEvents (next=0xb76a5f6c) at rx_event.c:499
#11 0x08075e58 in rxi_ListenerProc (rfds=<value optimized out>, tnop=<value optimized out>, newcallp=<value optimized out>) at rx_lwp.c:203
#12 0x080761aa in rx_ListenerProc (dummy=0x0) at rx_lwp.c:335
#13 0x08086b01 in Create_Process_Part2 () at ./lwp.c:805
#14 0xb77aecdb in makecontext () from /lib/i686/cmov/libc.so.6
#15 0x0d696910 in ?? ()
#16 0x08087288 in LWP_MwaitProcess (event=0xa1635b0) at ./lwp.c:756
#17 LWP_WaitProcess (event=0xa1635b0) at ./lwp.c:708
#18 0x0807ed40 in rx_GetCall (tno=15, cur_service=0xa082020, socketp=0xbfb74e7c) at rx.c:2027
#19 0x0807ef8a in rxi_ServerProc (threadID=15, newcall=0x0, socketp=0xbfb74e7c) at rx.c:1619
#20 0x0807614a in rx_ServerProc (unused=0x0) at rx_lwp.c:369
#21 0x0807f578 in rx_StartServer (donateMe=1) at rx.c:793
#22 0x0804a8c6 in main (argc=1, argv=0xbfb75714) at vlserver.c:407

Here is my bos config:
restrictmode 0
restarttime 11 0 4 0 0
checkbintime 3 0 5 0 0
bnode simple ptserver 1
parm /usr/lib/openafs/ptserver
end
bnode simple vlserver 1
parm /usr/lib/openafs/vlserver
end
bnode cron userbackup 1
parm /usr/bin/nice /afs/icequake.net/pub/adm/backup_afs.sh -d -u
parm 3:00
end
bnode dafs dafs 1
parm /usr/afs/bin/dafileserver -p 123 -pctspare 20 -L -busyat 50 -rxpck 2000 -rxbind -cb 4000000 -vattachpar 128 -vlruthresh 1440 -vlrumax 8 -vhashsize 11
parm /usr/afs/bin/davolserver -p 64 -log -rxbind
parm /usr/afs/bin/salvageserver
parm /usr/afs/bin/dasalvager -parallel all32
end

BosLog:
Sat Feb 26 19:28:11 2011: Core limits now -1 -1
Sat Feb 26 19:28:11 2011: Server directory access is okay
Sat Feb 26 19:31:29 2011: vlserver exited on signal 6 (core dumped)
Sat Feb 26 19:36:33 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 19:50:57 2011: vlserver exited on signal 6 (core dumped)
Sat Feb 26 19:51:49 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 20:16:07 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 20:18:33 2011: vlserver exited on signal 11 (core dumped)
Sat Feb 26 20:31:22 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 20:39:44 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 20:54:11 2011: vlserver exited on signal 11 (core dumped)
Sat Feb 26 20:54:59 2011: ptserver exited on signal 11 (core dumped)
Sat Feb 26 20:58:58 2011: vlserver exited on signal 6 (core dumped)
Sat Feb 26 21:13:19 2011: ptserver exited on signal 11 (core dumped)
Sat Feb 26 21:21:41 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 21:26:58 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 21:48:20 2011: ptserver exited on signal 11 (core dumped)
Sat Feb 26 21:53:37 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 21:58:54 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 22:17:10 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 22:22:27 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 22:27:44 2011: ptserver exited on signal 6 (core dumped)
Sat Feb 26 22:43:00 2011: ptserver exited on signal 11 (core dumped)
Sat Feb 26 22:48:42 2011: vlserver exited on signal 6 (core dumped)

PtLog:
Sat Feb 26 22:27:44 2011 Using 10.0.1.232 as my primary address
Sat Feb 26 22:27:58 2011 Starting AFS ptserver 1.1 (/usr/lib/openafs/ptserver)
Sat Feb 26 22:34:49 2011 ubik: A Remote Server has addresses:
Sat Feb 26 22:34:49 2011 10.0.1.230
Sat Feb 26 22:34:49 2011 65.38.17.159
Sat Feb 26 22:34:49 2011

VLLog:
Sat Feb 26 20:58:58 2011 Using 10.0.1.232 as my primary address
Sat Feb 26 20:59:12 2011 Starting AFS vlserver 4 (/usr/lib/openafs/vlserver)
*** glibc detected *** /usr/lib/openafs/vlserver: corrupted double-linked list: 0x086d8988 ***
======= Backtrace: =========
/lib/i686/cmov/libc.so.6(+0x6b281)[0xb7759281]
/lib/i686/cmov/libc.so.6(+0x6cb31)[0xb775ab31]
/lib/i686/cmov/libc.so.6(cfree+0x6d)[0xb775dbbd]
/usr/lib/openafs/vlserver[0x80786ed]
/usr/lib/openafs/vlserver[0x807bf49]
/usr/lib/openafs/vlserver[0x807c38d]
/usr/lib/openafs/vlserver[0x8085aad]
/usr/lib/openafs/vlserver[0x8075e58]
/usr/lib/openafs/vlserver[0x80761aa]
/usr/lib/openafs/vlserver[0x8086b01]
/lib/i686/cmov/libc.so.6(makecontext+0x4b)[0xb7727cdb]
/usr/lib/openafs/vlserver[0x8087288]
/usr/lib/openafs/vlserver[0x807ed40]
/usr/lib/openafs/vlserver[0x807ef8a]
/usr/lib/openafs/vlserver[0x807614a]
/usr/lib/openafs/vlserver[0x807f578]
/usr/lib/openafs/vlserver[0x804a8c6]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xb7704c76]
/usr/lib/openafs/vlserver[0x804a301]
======= Memory map: ========
08048000-0809a000 r-xp 00000000 fe:01 395449 /usr/lib/openafs/vlserver
0809a000-0809b000 rw-p 00052000 fe:01 395449 /usr/lib/openafs/vlserver
0809b000-080f3000 rw-p 00000000 00:00 0
085ee000-086f9000 rw-p 00000000 00:00 0 [heap]
b6f00000-b6f21000 rw-p 00000000 00:00 0
b6f21000-b7000000 ---p 00000000 00:00 0
b7031000-b704e000 r-xp 00000000 fe:01 260908 /lib/libgcc_s.so.1
b704e000-b704f000 rw-p 0001c000 fe:01 260908 /lib/libgcc_s.so.1
b7058000-b76d8000 rw-p 00000000 00:00 0
b76d8000-b76e2000 r-xp 00000000 fe:01 260884 /lib/i686/cmov/libnss_files-2.11.2.so
b76e2000-b76e3000 r--p 00009000 fe:01 260884 /lib/i686/cmov/libnss_files-2.11.2.so
b76e3000-b76e4000 rw-p 0000a000 fe:01 260884 /lib/i686/cmov/libnss_files-2.11.2.so
b76ed000-b76ee000 rw-p 00000000 00:00 0
b76ee000-b782e000 r-xp 00000000 fe:01 260891 /lib/i686/cmov/libc-2.11.2.so
b782e000-b7830000 r--p 0013f000 fe:01 260891 /lib/i686/cmov/libc-2.11.2.so
b7830000-b7831000 rw-p 00141000 fe:01 260891 /lib/i686/cmov/libc-2.11.2.so
b7831000-b7834000 rw-p 00000000 00:00 0
b7834000-b7844000 r-xp 00000000 fe:01 260851 /lib/i686/cmov/libresolv-2.11.2.so
b7844000-b7845000 r--p 00010000 fe:01 260851 /lib/i686/cmov/libresolv-2.11.2.so
b7845000-b7846000 rw-p 00011000 fe:01 260851 /lib/i686/cmov/libresolv-2.11.2.so
b7846000-b784a000 rw-p 00000000 00:00 0
b784a000-b784b000 rw-p 00000000 00:00 0
b784b000-b784f000 r-xp 00000000 fe:01 260868 /lib/i686/cmov/libnss_dns-2.11.2.so
b784f000-b7850000 r--p 00004000 fe:01 260868 /lib/i686/cmov/libnss_dns-2.11.2.so
b7850000-b7851000 rw-p 00005000 fe:01 260868 /lib/i686/cmov/libnss_dns-2.11.2.so
b7851000-b7853000 rw-p 00000000 00:00 0
b7853000-b7854000 r-xp 00000000 00:00 0 [vdso]
b7854000-b786f000 r-xp 00000000 fe:01 270748 /lib/ld-2.11.2.so
b786f000-b7870000 r--p 0001a000 fe:01 270748 /lib/ld-2.11.2.so
b7870000-b7871000 rw-p 0001b000 fe:01 270748 /lib/ld-2.11.2.so
bfe89000-bfe9e000 rw-p 00000000 00:00 0 [stack]
@(#) OpenAFS 1.6.0~pre1-1-debian built 2010-12-29

-- 
Ryan C. Underwood, <[email protected]>
