Re: [OMPI users] ompi-restart fails with "found pid in use"

2010-05-18 Thread Josh Hursey
I recently hit this same problem while doing some scalability testing. I experimented with adding the --no-restore-pid option, but ran into the same problem you mention. Unfortunately, the problem is with BLCR, not Open MPI. BLCR will restart the process with a new PID, but the value ret

[OMPI users] ompi-restart fails with "found pid in use"

2010-05-14 Thread ananda.mudar
Hi, I am using Open MPI v1.3.4 with BLCR 0.8.2. I have been testing my Open MPI-based program on a 3-node cluster (each node is an Intel Nehalem-based dual quad-core) and I have been able to checkpoint and restart the program successfully multiple times. Recently I moved to a 15 node