The document attached to the Open MPI Wiki discusses all of the MCA parameters for checkpoint/restart.
   http://svn.open-mpi.org/trac/ompi/wiki/ProcessFT_CR

There are two ways to save checkpoint file data. I would suggest setting these parameters in your $HOME/.openmpi/mca-params.conf file so you don't have to pass them to mpirun every time (assuming $HOME is shared on all machines).

1) If you save to a globally shared directory (e.g., an NFS directory), you can set the following MCA parameter for mpirun to point to that location. This overrides the default directory, which is $HOME.
  snapc_base_global_snapshot_dir=$HOME/my/ckpt/dir

2) You can save to the local disk and have Open MPI transfer the files from local disk to stable storage in a two-step process. You will need to set three MCA parameters for this. Set the directory to save to on the local disk:
  crs_base_snapshot_dir=/tmp
Set the global directory where all of the local checkpoints should be saved:
  snapc_base_global_snapshot_dir=$HOME/my/ckpt/dir
Activate the two-step process:
  snapc_base_store_in_place=0
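
As a rough sketch, the two-step setup above might look like this in $HOME/.openmpi/mca-params.conf. The directories are just the example values used above, and I am assuming $HOME is expanded in this file on your installation; if it is not, spell out the absolute path instead.
  # Example values only; adjust the paths for your site.
  # Local disk directory where each local checkpoint is written first
  crs_base_snapshot_dir = /tmp
  # Global (stable storage) directory where the local checkpoints are gathered
  snapc_base_global_snapshot_dir = $HOME/my/ckpt/dir
  # 0 activates the two-step process (local disk first, then transfer)
  snapc_base_store_in_place = 0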

The C/R User Document on the wiki covers many of these and other parameters in more detail. I would encourage you to look through it as well.

Best,
Josh

On Sep 13, 2008, at 7:49 PM, arun dhakne wrote:

Hi,

I have BLCR installed and I am able to dump checkpoints in $HOME
using ompi-checkpoint. I was wondering whether there is some option
that would let me dump the checkpoints to a
custom location, say /tmp?

--
Thanks and Regards,
Arun
