Hi Basma,

Sorry for the delayed response. Could you please try out the DMTCP 2.0
release that came out yesterday? We can proceed from there to find a fix
for your problem.
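
In the meantime, the error in your log itself suggests deleting the leftover
file (/run/shm/open_mpi.0001) before restarting. Below is a minimal sketch,
assuming the original mpirun job has already exited and nothing else is using
that file. remove_stale is a hypothetical helper name used for illustration,
not part of DMTCP, and this may not be the complete fix.

```shell
#!/bin/sh
# Hypothetical helper (not a DMTCP command): delete a leftover file from an
# earlier run so that a restart can recreate it from the checkpoint image.
remove_stale() {
    path="$1"
    if [ -e "$path" ]; then
        # Only removes the single path given; nothing else is touched.
        rm -f -- "$path" && echo "removed $path"
    else
        echo "nothing to remove at $path"
    fi
}

# Usage, with the path taken from the error message and the restart script
# from your listing (assumption: the original job is no longer running):
#   remove_stale /run/shm/open_mpi.0001
#   ./dmtcp_restart_script.sh
```

If the restart then fails on a different file, please send that output as well.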

Kapil


On Fri, Oct 4, 2013 at 3:02 PM, basma a.azeem
<basmaabdelaz...@hotmail.com> wrote:

> Any suggestions?
>
> ------------------------------
> From: basmaabdelaz...@hotmail.com
> To: ka...@ccs.neu.edu
> Subject: RE: [Dmtcp-forum] OpenMPI program Checkpoint restart
> Date: Tue, 1 Oct 2013 02:04:58 +0200
>
>
> Thank you for your reply
>
> OpenMPI version: 1.6.5
> DMTCP version: dmtcp-1.2.8
> gcc version: gcc-4.7
> Linux (Ubuntu) kernel version: 3.5.0-28-generic
> My program is the MPI/C Integer Sort from the NAS Parallel Benchmarks
> (NPB 3.3), which runs using 4 processes.
>
> Is this the right way, and the only way, to restart an OpenMPI program using
> DMTCP? Or what did I do wrong?
> And can I also try an MPI/Fortran program?
>
>
> Thank you
>
> ------------------------------
> From: ka...@ccs.neu.edu
> Date: Sun, 29 Sep 2013 11:42:29 -0400
> Subject: Re: [Dmtcp-forum] OpenMPI program Checkpoint restart
> To: basmaabdelaz...@hotmail.com
> CC: dmtcp-forum@lists.sourceforge.net
>
> Hi,
>
> Thank you for contacting us.
>
> Could you provide us with more information about the OpenMPI version that
> you are using? Also, the DMTCP, libc, gcc, and kernel versions.
>
> Thanks,
> Kapil
>
>
> On Sat, Sep 28, 2013 at 6:49 PM, basma a.azeem <
> basmaabdelaz...@hotmail.com> wrote:
>
>  I need to use DMTCP to checkpoint and restart an OpenMPI program.
> DMTCP is installed on my machine,
> and OpenMPI runs normally.
>
>
> So I ran the following command:
>
> :~$ dmtcp_checkpoint mpirun -np 4 /home/basma/NPB3.3/NPB3.3/NPB3.3-MPI/bin/is.A.4
>  My program is the MPI/C Integer Sort from the NAS Parallel Benchmarks, which
> runs using 4 processes.
>
> Then I created a manual checkpoint using the dmtcp_coordinator,
>
> so I had the following files in my home folder:
>
> ckpt_orterun_5721e6a7ff40367d-2937-52471e26.dmtcp
> ckpt_is.A.4_5721e6a7ff40367d-2942-52471e26.dmtcp
> ckpt_is.A.4_5721e6a7ff40367d-2944-52471e26.dmtcp
> ckpt_is.A.4_5721e6a7ff40367d-2947-52471e26.dmtcp
> ckpt_is.A.4_5721e6a7ff40367d-2950-52471e26.dmtcp
> dmtcp_restart_script.sh
> dmtcp_restart_script_5721e6a7ff40367d-2937-52471e26.sh
>
> I used the following command to restart:
>
> basma@basma-Satellite-A500:~$  ./dmtcp_restart_script.sh
>
> dmtcp_checkpoint (DMTCP + MTCP) 1.2.8
> Copyright (C) 2006-2011  Jason Ansel, Michael Rieker, Kapil Arya, and
>                                                        Gene Cooperman
> This program comes with ABSOLUTELY NO WARRANTY.
> This is free software, and you are welcome to redistribute it
> under certain conditions; see COPYING file for details.
> (Use flag "-q" to hide this message.)
>
> [3398] ERROR at connection.cpp:1137 in restore;
> REASON='JASSERT(jalib::Filesystem::FileExists(_path) == false) failed'
>      _path = /run/shm/open_mpi.0001
> Message:
> **** File already exists! Checkpointed copy can't be restored.
> ****Delete the existing file and try again!
> dmtcp_restart (3398): Terminating...
>
>
> Which file should I use to restart the OpenMPI program, and which command?
>
>
>
>
> ------------------------------------------------------------------------------
> October Webinars: Code for Performance
> Free Intel webinars can help you accelerate application performance.
> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most
> from
> the latest Intel processors and coprocessors. See abstracts and register >
> http://pubads.g.doubleclick.net/gampad/clk?id=60133471&iu=/4140/ostg.clktrk
> _______________________________________________
> Dmtcp-forum mailing list
> Dmtcp-forum@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dmtcp-forum
>
>
>