Hi all,

I finally figured out the answer: I added the parameter "-machinefile host" to the "ompi-restart" command, and the application restarted correctly. Does this mean OpenMPI cannot restart a multi-threaded application on a single node?

Nguyen Toan

On Tue, Jun 8, 2010 at 12:07 AM, Nguyen Toan <nguyentoan1...@gmail.com> wrote:

> Sorry, I just want to add 2 more things:
> + I tried configure with and without --enable-ft-thread but nothing changed
> + I also applied this patch for OpenMPI here and reinstalled but I got the
> same error
>
> https://svn.open-mpi.org/trac/ompi/raw-attachment/ticket/2139/v1.4-preload-part1.diff
>
> Somebody helps? Thank you very much.
>
> Nguyen Toan
>
>
> On Mon, Jun 7, 2010 at 11:51 PM, Nguyen Toan <nguyentoan1...@gmail.com> wrote:
>
>> Hello everyone,
>>
>> I'm using OpenMPI 1.4.2 with BLCR 0.8.2 to test checkpointing on 2 nodes
>> but it failed to restart (Segmentation fault).
>> Here are the details concerning my problem:
>>
>> + OS: Centos 5.4
>> + OpenMPI configure:
>> ./configure --with-ft=cr --enable-ft-thread --enable-mpi-threads \
>> --with-blcr=/home/nguyen/opt/blcr
>> --with-blcr-libdir=/home/nguyen/opt/blcr/lib \
>> --prefix=/home/nguyen/opt/openmpi \
>> --enable-mpirun-prefix-by-default
>> + mpirun -am ft-enable-cr -machinefile host ./test
>>
>> I checkpointed the test program using "ompi-checkpoint -v -s PID" and the
>> checkpoint file was created successfully. However it failed to restart using
>> ompi-restart:
>> *"mpirun noticed that process rank 0 with PID 21242 on node rc014.local
>> exited on signal 11 (Segmentation fault)"
>> *
>> Did I miss something in the installation of OpenMPI?
>>
>> Regards,
>> Nguyen Toan
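
For readers following along, the sequence described in this thread can be sketched as a small dry-run script. This is only an illustration under assumptions the posts do not state: the PID passed to ompi-checkpoint is taken to be that of the running mpirun process, the snapshot is assumed to use the default handle name ompi_global_snapshot_<PID>.ckpt, and the PID 12345 and the helper functions are hypothetical. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only, not a verified procedure.
# Assumptions (not stated in the thread): the PID is that of the running
# mpirun process, and the snapshot uses the default handle name
# ompi_global_snapshot_<PID>.ckpt.

HOSTFILE=host   # the machinefile used in the original mpirun command

# Print the checkpoint command for a given mpirun PID.
checkpoint_cmd() {
    echo "ompi-checkpoint -v -s $1"
}

# Print the restart command, passing the same machinefile used at launch.
restart_cmd() {
    echo "ompi-restart -machinefile $HOSTFILE ompi_global_snapshot_$1.ckpt"
}

# Dry run: show the commands instead of executing them.
MPIRUN_PID=${MPIRUN_PID:-12345}   # hypothetical PID for illustration
echo "mpirun -am ft-enable-cr -machinefile $HOSTFILE ./test"
checkpoint_cmd "$MPIRUN_PID"
restart_cmd "$MPIRUN_PID"
```

The only change relative to the failing attempt is the "-machinefile host" argument on the restart line. Whether omitting it is expected to cause a segmentation fault is the open question raised above.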