Hi Matthieu,

If you could print the source code line associated with that crash address, that would help get us started. Something like:

gdb <path to pvfs2-server binary>
list *0x46f55a
Then with that, and the info from Kyle, we can work on getting it resolved.

As a side note, if you have the opportunity you should upgrade your installation to 2.8.3 (under the name OrangeFS at orangefs.org), which has additional functionality and bug fixes, although I don't believe any of those fixes are applicable to this issue.

Michael

On Sat, Mar 26, 2011 at 5:35 PM, Kyle Schochenmaier <[email protected]> wrote:

> Hi Matthieu -
>
> The last time I worked on this we ran into the same problem, and I think we
> narrowed it down to a mop_id reuse issue. We tried to insert some thread
> locking mechanisms into the mop_id 'cache', but I don't think it ever got
> resolved. This was years ago, and it only occurred under very heavy load of
> relatively small messages.
>
> That would be the place to start, I would imagine.
>
> Cheers,
> Kyle Schochenmaier
>
>
> On Sat, Mar 26, 2011 at 4:21 PM, Matthieu Dorier <[email protected]> wrote:
>
>> Hello,
>>
>> I'm trying to evaluate the performance of my PVFS installation over an
>> InfiniBand network, but from time to time a server crashes with this trace
>> in the log:
>>
>> [E 03/26 21:58] Error: encourage_recv_incoming: mop_id 12952a0 in RTS_DONE
>> message not found.
>> [E 03/26 21:58] [bt] /usr/sbin/pvfs2-server(error+0xca) [0x46f55a]
>> [E 03/26 21:58] [bt] /usr/sbin/pvfs2-server [0x46c88c]
>> [E 03/26 21:58] [bt] /usr/sbin/pvfs2-server [0x46e485]
>> [E 03/26 21:58] [bt] /usr/sbin/pvfs2-server(BMI_testunexpected+0x384) [0x421004]
>> [E 03/26 21:58] [bt] /usr/sbin/pvfs2-server [0x41cf4a]
>> [E 03/26 21:58] [bt] /lib/libpthread.so.0 [0x7f6422ff0fc7]
>> [E 03/26 21:58] [bt] /lib/libc.so.6(clone+0x6d) [0x7f642295164d]
>>
>> I've seen that some other users reported this kind of error in the mailing
>> list archives, but I didn't find any answer that solves it. Any idea how to
>> fix this?
>>
>> If it can be of any use: I'm working with 16 PVFS servers (each acting as
>> both I/O server and metadata server), and I'm benchmarking with the IOR
>> program. For now I have 648 processes, each writing 8MB to a shared file,
>> with a transfer size that corresponds to the strip size (64KB).
>>
>> Thank you,
>>
>> Matthieu
>>
>> --
>> Matthieu Dorier
>> ENS Cachan, Brittany (Computer Science dpt.)
>> IRISA Rennes, Office E324
>> http://perso.eleves.bretagne.ens-cachan.fr/~mdori307/wiki/
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
