Josh,

ompi-checkpoint and the subsequent restart are now working great, but the same error persists with ompi-migrate. I've also tried using "-r", but I get the same error.

Best regards.

Hugo Meyer

2011/1/26 Hugo Meyer <meyer.h...@gmail.com>

> Thanks Josh.
>
> I've already checked prelink and it is set to "no".
>
> I'm going to try with the trunk head, and then I'll let you know how it
> goes.
>
> Best regards.
>
> Hugo Meyer
>
> 2011/1/25 Joshua Hursey <jjhur...@open-mpi.org>
>
>> Can you try with the current trunk head (r24296)?
>> I just committed a fix for the C/R functionality in which restarts were
>> getting stuck. This will likely affect the migration functionality, but I
>> have not had an opportunity to test just yet.
>>
>> Another thing to check is that prelink is turned off on all of your
>> machines.
>> https://upc-bugs.lbl.gov//blcr/doc/html/FAQ.html#prelink
>>
>> Let me know if the problem persists, and I'll dig into it a bit more.
>>
>> Thanks,
>> Josh
>>
>> On Jan 24, 2011, at 11:37 AM, Hugo Meyer wrote:
>>
>> > Hello @ll
>> >
>> > I've got a problem when I try to use the ompi-migrate command.
>> >
>> > What I'm doing is running, for example, the following application on one node of a cluster (both processes will run on the same node):
>> >
>> > mpirun -np 2 -am ft-enable-cr ./whoami 10 10
>> >
>> > Then, on the same node, I try to migrate the processes to another node:
>> >
>> > ompi-migrate -x node9 -t node3 14914
>> >
>> > And then I get this message:
>> >
>> > [clus9:15620] *** Process received signal ***
>> > [clus9:15620] Signal: Segmentation fault (11)
>> > [clus9:15620] Signal code: Address not mapped (1)
>> > [clus9:15620] Failing at address: (nil)
>> > [clus9:15620] [ 0] /lib64/libpthread.so.0 [0x2aaaac0b8d40]
>> > [clus9:15620] *** End of error message ***
>> > Segmentation fault
>> >
>> > I assume that maybe there is something wrong with the thread level, but I have configured Open MPI like this:
>> >
>> > ../configure --prefix=/home/hmeyer/desarrollo/ompi-code/binarios/ --enable-debug --enable-debug-symbols --enable-trace --with-ft=cr --disable-ipv6 --enable-opal-multi-threads --enable-ft-thread --without-hwloc --disable-vt --with-blcr=/soft/blcr-0.8.2/ --with-blcr-libdir=/soft/blcr-0.8.2/lib/
>> >
>> > Checkpoint and restart work fine, but when I restore an application that has more than one process, it is restored and executes up to the last line before MPI_FINALIZE(), but the processes never finish. I assume that they never call MPI_FINALIZE(). With a single process, ompi-checkpoint and ompi-restart work great.
>> >
>> > Best regards.
>> >
>> > Hugo Meyer
>> > _______________________________________________
>> > devel mailing list
>> > de...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>> ------------------------------------
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey