On Tue, 15 May 2001, Jan Harkes wrote:
> On Mon, May 14, 2001 at 10:18:31PM -0500, Ryan M. Lefever wrote:
> > Hi,
> >
> > I am trying to fix some RPC2 problems that I have when using volutil.
> >
> > When I do a "volutil setdebug", the following happens no matter whether I
> > do it locally or remotely, or to the SCM or a non-SCM. Also, a
> > /vice/srv/CRASH file is created.
> >
> > --
> > [root@nsx srv]# startserver -d 1000
> > [root@nsx srv]# volutil setdebug 100
> > V_BindToServer: binding to host nsx.crhc.uiuc.edu
> > VolSetDebug failed with RPC2_DEAD (F)
> ...
> > --
> >
> > The SrvErr file reads:
> >
> > --
> > could not open key 2 file: No such file or directory
> > Assertion failed: 0, file "srv.cc", line 336
> > EXITING! Bye!
> > --
>
> This is a generic assertion point where we always end up when a SIGSEGV
> is received. If you create the file /vice/srv/ZOMBIFY, the server should
> end up in an infinite loop at this point. Then you can easily attach gdb
> and get a stacktrace.
>
> # gdb /usr/sbin/codasrv `pidof codasrv`
> (gdb) bt
>
> The trace will be a bit funny, because the actual point where the
> segfault was triggered won't show up. The stack is clobbered by the
> signal handler. However, the function that called the function that
> crashed will show up and from the line number it is possible to figure
> out at least which function had a problem.
>
> It will probably be something like,
>
> #1 coda_assert function where we are waiting
> #2 sigsegv handler
> #3 ???
> #4 function before the segv was received.
> x/x/volutil/vol_setdebug.cc:666
>
>
I tried this method and got the following:
--
(gdb) bt
#0 0x40184c61 in __libc_nanosleep () from /lib/libc.so.6
#1 0x40184bed in __sleep (seconds=1) at
    ../sysdeps/unix/sysv/linux/sleep.c:82
#2 0x80c34a7 in coda_assert (pred=0x80c48e7 "0", file=0x80c48e0 "srv.cc",
    line=336) at coda_assert.c:45
#3 0x804be04 in zombie (sig=11) at srv.cc:336
#4 0x40111c68 in __restore ()
    at ../sysdeps/unix/sysv/linux/i386/sigaction.c:127
#5 0x40138986 in _IO_vfprintf (s=0x401e1ce0,
    format=0x4005bed5 "[%s]%s: \"%s\", line %d: ", ap=0x151a0f00)
    at vfprintf.c:1029
#6 0x40141047 in fprintf (stream=0x401e1ce0,
    format=0x4005bed5 "[%s]%s: \"%s\", line %d: ") at fprintf.c:32
#7 0x40047200 in RPC2_SendResponse (ConnHandle=505527757,
    Reply=0x8165ad0)
    at rpc2a.c:154
#8 0x8084958 in volUtil_ExecuteRequest (_cid=505527757, _reqbuffer=0x0,
    _bd=0x0) at volutil.server.c:1808
#9 0x8065ccc in VolUtilLWP (myindex=0xbffff8d0) at volutil.cc:135
#10 0x400829be in Create_Process_Part2 () at lwp.c:795
--
> The other (and perhaps easier) way to debug this is by running codasrv
> under the control of gdb at the time the segfault happens. That way the
> stacktrace shows up a lot nicer.
>
> # gdb /usr/sbin/codasrv `pidof codasrv`
> (gdb) continue
> /* trigger the volutil setdebug crash */
> SEGV received
> (gdb) bt
> #1 culprit function
> file.cc:line
I tried this method, and the backtrace gave the following:
--
Program received signal SIGSEGV, Segmentation fault.
0x401e1d88 in main_arena () from /lib/libc.so.6
(gdb) bt
#0 0x401e1d88 in main_arena () from /lib/libc.so.6
#1 0x3f3e002b in ?? ()
#2 0x40138986 in _IO_vfprintf (s=0x401e1ce0,
    format=0x4005bed5 "[%s]%s: \"%s\", line %d: ", ap=0x151a0f00)
    at vfprintf.c:1029
#3 0x40141047 in fprintf (stream=0x401e1ce0,
    format=0x4005bed5 "[%s]%s: \"%s\", line %d: ") at fprintf.c:32
#4 0x40047200 in RPC2_SendResponse (ConnHandle=1052104457,
    Reply=0x8165ad0)
    at rpc2a.c:154
#5 0x8084958 in volUtil_ExecuteRequest (_cid=1052104457, _reqbuffer=0x0,
    _bd=0x0) at volutil.server.c:1808
#6 0x8065ccc in VolUtilLWP (myindex=0xbffff8d0) at volutil.cc:135
#7 0x400829be in Create_Process_Part2 () at lwp.c:795
--
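
The two backtraces can be compared mechanically to see which frames both
methods agree on. What follows is only a rough sketch in Python, not part of
Coda or gdb: the helper names parse_frames and common_frames are made up for
illustration, and it assumes gdb's usual "#N addr in func (args) at file:line"
frame layout, with frames possibly wrapped over several lines. The sample
input is a short excerpt of the frames shown above.

```python
import re

def parse_frames(bt_text):
    """Return [(function, 'file:line' or None), ...] from gdb 'bt' output."""
    frames = []
    # A frame starts with '#<n>' at the beginning of a line; gdb may wrap the
    # rest of the frame, so split on frame starts and then join each chunk.
    for chunk in re.split(r'\n(?=#\d+\s)', bt_text):
        chunk = ' '.join(chunk.split())
        m = re.match(r'#\d+\s+(?:0x[0-9a-f]+ in )?(\S+) \(', chunk)
        if not m:
            continue  # not a frame line (e.g. the "(gdb) bt" prompt)
        loc = re.search(r' at (\S+:\d+)$', chunk)
        frames.append((m.group(1), loc.group(1) if loc else None))
    return frames

def common_frames(bt_a, bt_b):
    """Frames of bt_a (in order) that have source info and also occur in bt_b."""
    in_b = set(parse_frames(bt_b))
    return [f for f in parse_frames(bt_a) if f[1] and f in in_b]

# Excerpt of the trace taken via /vice/srv/ZOMBIFY.
ZOMBIE_BT = r"""
#2 0x80c34a7 in coda_assert (pred=0x80c48e7 "0", file=0x80c48e0 "srv.cc",
line=336) at coda_assert.c:45
#3 0x804be04 in zombie (sig=11) at srv.cc:336
#6 0x40141047 in fprintf (stream=0x401e1ce0,
format=0x4005bed5 "[%s]%s: \"%s\", line %d: ") at fprintf.c:32
#7 0x40047200 in RPC2_SendResponse (ConnHandle=505527757,
Reply=0x8165ad0)
at rpc2a.c:154
"""

# Excerpt of the trace taken with gdb attached before the crash.
LIVE_BT = r"""
#0 0x401e1d88 in main_arena () from /lib/libc.so.6
#1 0x3f3e002b in ?? ()
#3 0x40141047 in fprintf (stream=0x401e1ce0,
format=0x4005bed5 "[%s]%s: \"%s\", line %d: ") at fprintf.c:32
#4 0x40047200 in RPC2_SendResponse (ConnHandle=1052104457,
Reply=0x8165ad0)
at rpc2a.c:154
"""

for func, where in common_frames(ZOMBIE_BT, LIVE_BT):
    print(func, where)
```

On these excerpts the frames common to both traces are fprintf (fprintf.c:32)
called from RPC2_SendResponse (rpc2a.c:154), so both methods point at the same
call in rpc2a.c.
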
Since I didn't write any of the Coda code, it's kind of hard for me to
debug. Jan, does this help you at all?
Thanks,
Ryan