Re: [OMPI users] change hosts to restart the checkpoint

2010-03-07 Thread Fernando Lemos
On Fri, Mar 5, 2010 at 12:03 PM, Josh Hursey wrote:
> This type of failure is usually due to prelink'ing being left enabled on one
> or more of the systems. This has come up multiple times on the Open MPI
> list, but is actually a problem between BLCR and the Linux kernel.

Re: [OMPI users] change hosts to restart the checkpoint

2010-03-05 Thread Josh Hursey
This type of failure is usually due to prelink'ing being left enabled on one or more of the systems. This has come up multiple times on the Open MPI list, but is actually a problem between BLCR and the Linux kernel. BLCR has a FAQ entry on this that you will want to check out:

[OMPI users] change hosts to restart the checkpoint

2010-03-05 Thread 马少杰
Dear Sir: I want to use Open MPI and BLCR for checkpointing. However, I want to restart the checkpoint on other hosts. For example, I run an MPI program using Open MPI on host1 and host2, and I save the checkpoint file to an NFS shared path. Then I want to restart the job