Hey guys,

I am running into repeated server crashes with the latest release. I have a
single RHEL 5.5 64-bit virtual machine with 4 GB of memory running four
daemons on network-attached storage. This is PVFS2 2.8.2 plus a couple of
recent patches.

The log output looks like this:

Sep 21 07:42:54 node1 PVFS2: [E] SM current state or trtbl is invalid (smcb = 0x11d9a390)
Sep 21 07:42:54 node1 PVFS2: [E]      [bt] /usr/sbin//pvfs2-server(PINT_state_machine_next+0x81) [0x43c2a0]
Sep 21 07:42:54 node1 PVFS2: [E]      [bt] /usr/sbin//pvfs2-server(PINT_state_machine_continue+0x1d) [0x43c4af]
Sep 21 07:42:54 node1 PVFS2: [E]      [bt] /usr/sbin//pvfs2-server(main+0x54f) [0x410ef2]
Sep 21 07:42:54 node1 PVFS2: [E]      [bt] /lib64/libc.so.6(__libc_start_main+0xf4) [0x368ae1d994]
Sep 21 07:42:54 node1 PVFS2: [E]      [bt] /usr/sbin//pvfs2-server [0x4108e9]

The GDB backtrace looks like this:

#0  0x000000368ae30265 in raise () from /lib64/libc.so.6
#1  0x000000368ae31d10 in abort () from /lib64/libc.so.6
#2  0x000000368ae296e6 in __assert_fail () from /lib64/libc.so.6
#3  0x000000000043c2b9 in PINT_state_machine_next (smcb=0x15d49f10, r=0x15c359e0) at ../pvfs2_src/src/common/misc/state-machine-fns.c:246
#4  0x000000000043c4af in PINT_state_machine_continue (smcb=0x15d49f10, r=0x15c359e0) at ../pvfs2_src/src/common/misc/state-machine-fns.c:327
#5  0x0000000000410ef2 in main (argc=6, argv=0x7fffb3baa138) at ../pvfs2_src/src/server/pvfs2-server.c:413

It looks like an assert is failing in this block of state-machine-fns.c:

if (!smcb->current_state || !smcb->current_state->trtbl)
{
    gossip_err("SM current state or trtbl is invalid "
           "(smcb = %p)\n", smcb);
    gossip_backtrace();
    assert(0);
    return -1;
}

Here is the GDB output for the two pointers checked in that if test:

(gdb) print smcb->current_state
$1 = (struct PINT_state_s *) 0x6d1990
(gdb) print smcb->current_state->trtbl
$2 = (struct PINT_tran_tbl_s *) 0x0
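
Going by the two values above, current_state is set and only its trtbl
pointer is NULL, so it is the second half of the if test that fires. As a
minimal sketch of that distinction, the check can be split so each case is
reported separately. The struct layouts and the names sm_check_state,
SM_NO_STATE and SM_NO_TRTBL are hypothetical stand-ins for illustration, not
the real PVFS2 definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration only. These are NOT the real
 * PVFS2 structures; only the two pointer fields used by the check are
 * modeled. */
struct tran_tbl { int return_value; };
struct state    { const char *name; const struct tran_tbl *trtbl; };
struct smcb     { const struct state *current_state; };

/* Hypothetical result codes, one per failure case. */
enum sm_check { SM_OK = 0, SM_NO_STATE = 1, SM_NO_TRTBL = 2 };

/* Same two conditions as the original if test, evaluated in the same
 * order, but reported separately instead of as one combined error. */
static enum sm_check sm_check_state(const struct smcb *smcb)
{
    assert(smcb != NULL);

    if (!smcb->current_state)
        return SM_NO_STATE;   /* no current state at all */
    if (!smcb->current_state->trtbl)
        return SM_NO_TRTBL;   /* state is set, but has no table */
    return SM_OK;
}
```

With the values from the core file (current_state = 0x6d1990, trtbl = 0x0),
a check like this would land in the SM_NO_TRTBL case.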

This particular server has crashed at least three times with the same log
output and core data, so it does not seem to be a fluke. Does anyone recognize
this error or know why that transition table pointer (trtbl) might be NULL?

Bart.
