I occasionally get a server crash in what appears to be
src/io/flow/flowproto-bmi-trove/flowproto-multiqueue.c. The backtrace is
useless. I'm running off the head branch code that I compiled on 7/3.
[E 07/14 18:06] PVFS2 server: signal 11, faulty address is (nil), from (nil)
[E 07/14 18:06] [bt] [(nil)]
[D 07/15 08:19] PVFS2 Server version 2.8.1pre1-2009-07-03-123548 starting.
I added a few extra gossip_err statements in the handle_io_error routine and
narrowed it down to the following few lines:
    else if (src == TROVE_ENDPOINT && dest == BMI_ENDPOINT)
    {
        ret = cancel_pending_trove(&flow_data->src_list,
                                   flow_data->parent->src.u.trove.coll_id);
        flow_data->cleanup_pending_count += ret;
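
Since the faulty address is (nil), one thing I could check is whether a pointer in the flow_data->parent chain is NULL when these lines run. Below is a minimal standalone sketch of such a check, not the actual PVFS2 code. The demo_* structs are hypothetical stand-ins that model only the fields the excerpt touches, and check_flow_pointers is a made-up helper name. In the server the messages would presumably go through gossip_err rather than fprintf.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real flow structures; only the fields
 * dereferenced in the excerpt above are modelled here. */
struct demo_trove_endpoint { int coll_id; };
struct demo_endpoint { union { struct demo_trove_endpoint trove; } u; };
struct demo_flow_descriptor { struct demo_endpoint src; };
struct demo_flow_data {
    struct demo_flow_descriptor *parent;
    int cleanup_pending_count;
};

/* Hypothetical helper: returns 0 if flow_data and flow_data->parent are
 * both non-NULL, otherwise reports which pointer is NULL and returns -1. */
static int check_flow_pointers(const struct demo_flow_data *flow_data)
{
    if (flow_data == NULL) {
        fprintf(stderr, "handle_io_error: flow_data is NULL\n");
        return -1;
    }
    if (flow_data->parent == NULL) {
        fprintf(stderr, "handle_io_error: flow_data->parent is NULL\n");
        return -1;
    }
    return 0;
}
```

Calling something like this just before cancel_pending_trove would at least show whether the crash comes from one of these two pointers or from somewhere inside the call itself.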
Any ideas?
Thanks,
Randy
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers