Hello,

I'm trying to evaluate the performance of my PVFS installation over an
InfiniBand network, but from time to time a server crashes with this trace
in the log:

[E 03/26 21:58] Error: encourage_recv_incoming: mop_id 12952a0 in RTS_DONE message not found.
[E 03/26 21:58]     [bt] /usr/sbin/pvfs2-server(error+0xca) [0x46f55a]
[E 03/26 21:58]     [bt] /usr/sbin/pvfs2-server [0x46c88c]
[E 03/26 21:58]     [bt] /usr/sbin/pvfs2-server [0x46e485]
[E 03/26 21:58]     [bt] /usr/sbin/pvfs2-server(BMI_testunexpected+0x384) [0x421004]
[E 03/26 21:58]     [bt] /usr/sbin/pvfs2-server [0x41cf4a]
[E 03/26 21:58]     [bt] /lib/libpthread.so.0 [0x7f6422ff0fc7]
[E 03/26 21:58]     [bt] /lib/libc.so.6(clone+0x6d) [0x7f642295164d]

I've seen in the mailing list archives that other users have reported this
kind of error, but I couldn't find an answer there. Any idea how to solve
this problem?

In case it helps: I'm running 16 PVFS servers (each acting as both I/O
server and metadata server), and I'm benchmarking with the IOR program. For
now I have 648 processes, each writing 8 MB to a shared file, with a
transfer size equal to the strip size (64 KB).
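
To give a rough idea of the load this puts on the servers, here is a small
sketch of the arithmetic. It assumes binary units (8 MB = 8 * 2^20 bytes,
64 KB = 64 * 2^10 bytes) and that strips end up spread evenly across all 16
servers; the function and variable names are only illustrative and do not
come from PVFS or IOR.

```python
# Workload figures taken from the description above.
# Assumption: "MB" and "KB" are binary units (MiB / KiB).
NUM_SERVERS = 16
NUM_PROCESSES = 648
BLOCK_SIZE = 8 * 1024 * 1024   # bytes written by each process (8 MB)
TRANSFER_SIZE = 64 * 1024      # bytes per write call, equal to the strip size


def workload_summary(servers, processes, block_size, transfer_size):
    """Return basic counts for a shared-file write benchmark."""
    transfers_per_process = block_size // transfer_size
    total_transfers = transfers_per_process * processes
    total_bytes = block_size * processes
    return {
        "transfers_per_process": transfers_per_process,
        "total_transfers": total_transfers,
        "total_mib": total_bytes // (1024 * 1024),
        # Assumption: strips are distributed evenly over the servers.
        "strips_per_server": total_transfers // servers,
    }


summary = workload_summary(NUM_SERVERS, NUM_PROCESSES, BLOCK_SIZE, TRANSFER_SIZE)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under those assumptions each process issues 128 transfers, for 82944
transfers and 5184 MB in total, or about 5184 strips per server.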

Thank you,

Matthieu

-- 
Matthieu Dorier
ENS Cachan, Brittany (Computer Science dpt.)
IRISA Rennes, Office E324
http://perso.eleves.bretagne.ens-cachan.fr/~mdori307/wiki/
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
