Hi Randy,

I don't know offhand where the problem is, but could you try running the server under gdb? That may give you a better backtrace. The code that generates the backtrace and writes it to the log when a segfault occurs isn't that reliable and may not work on your system. You could also run the server under valgrind to see if there are memory errors elsewhere; that may pinpoint the problem better.
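Something along these lines (the config file path here is just an example, adjust it for your install; if your build daemonizes by default, check pvfs2-server's usage output for the option that keeps it in the foreground so gdb stays attached):

gdb --args pvfs2-server /etc/pvfs2-fs.conf
(gdb) run
  ... reproduce the crash ...
(gdb) bt full

valgrind pvfs2-server /etc/pvfs2-fs.conf

Keep in mind valgrind slows the server down considerably, so the crash may take longer to trigger.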

Was PVFS configured with optimizations (--enable-fast), or without (--enable-strict)? And did you specify CFLAGS=-g when running configure? For debugging environments I usually run configure like this:

CFLAGS=-g ./configure --enable-strict

That will enable the most debugging and should give you better backtraces.

-sam

On Jul 15, 2009, at 7:47 AM, Randall Martin wrote:

I occasionally get a server crash in what appears to be src/io/flow/flowproto-bmi-trove/flowproto-multiqueue.c. The backtrace is useless. I'm running off the head branch code that I compiled on 7/3.

[E 07/14 18:06] PVFS2 server: signal 11, faulty address is (nil), from (nil)
[E 07/14 18:06] [bt] [(nil)]
[D 07/15 08:19] PVFS2 Server version 2.8.1pre1-2009-07-03-123548 starting.

I added a few extra gossip_err statements in the handle_io_error routine and narrowed it down to the following few lines:

        else if (src == TROVE_ENDPOINT && dest == BMI_ENDPOINT)
        {
            ret = cancel_pending_trove(&flow_data->src_list,
                                       flow_data->parent->src.u.trove.coll_id);
            flow_data->cleanup_pending_count += ret;
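(The extra statements were just printf-style markers bracketing the call, along the lines of the sketch below; the message text here is approximate, not the exact statements I added:)

gossip_err("%s: TROVE->BMI error branch, flow_data %p\n", __func__, flow_data);
ret = cancel_pending_trove(&flow_data->src_list, flow_data->parent->src.u.trove.coll_id);
gossip_err("%s: cancel_pending_trove returned %d\n", __func__, ret);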

Any ideas?

Thanks,
Randy
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
