*Good morning,*

In my experiments, I have shared a folder across my cluster of 4 computers.
I then run an MPI program (which calculates the number pi) on 2 computers,
and I checkpoint it with the ompi-checkpoint command.
For the restart, I modify the nodes file to change one computer name, so
that the two nodes used for the restart are not the same pair as the two
used for the original run.
When I restart the checkpointed application with the ompi-restart command
(and the modified nodes file), I get this error message: *mpirun noticed that
process rank 1 with PID 1619 on node oartwo exited on signal 11
(segmentation fault).*

I would like to know what causes this problem and how to solve it.



*Thank you*


*I look forward to your reply.*
*Best regards.*

-- 
DIBAMOU MBEUYO Cyrille
Computer Engineer, M.Sc.
Ph.D. Student in Computer Science
*Mobile* : (+237) 696 608 826 / 674 979 502
The University Of Ngaoundere,  CAMEROUN
*Other Email *: cdiba...@univ-ndere.cm
