-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/07/14 17:06, Arjun J Rao wrote:

> Also, is the missing /usr/local/sbin/scch an integral part of the problem ? 

I think that's a red herring; the man page for slurm.conf says:

 checkpoint/blcr   Berkeley  Lab  Checkpoint Restart (BLCR).
                   NOTE: If a file is found at sbin/scch (relative
                   to the SLURM installation location), it will be
                   executed upon completion of the checkpoint. This
                   can  be  a  script used for managing the checkpoint
                   files.  NOTE: SLURM’s BLCR logic only supports batch
                   jobs.

*However*, I think the NOTE at the end may explain it. You say you are running:

srun -N2 -n24 --checkpoint 1 --checkpoint-dir /home/arjun/ACIM/Ctrl ./MPIJob

I think you'll need to do that inside an sbatch script for
this to work.
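
Something along these lines, perhaps. This is an untested sketch: it
assumes your version of sbatch accepts the same --checkpoint and
--checkpoint-dir options as srun (check "man sbatch" on your system), and
the filename ckpt_job.sh is just a placeholder I made up. The node count,
task count, checkpoint interval and directory are copied from your srun line.

```shell
# Write the batch script to a file (ckpt_job.sh is a hypothetical name).
cat > ckpt_job.sh <<'EOF'
#!/bin/bash
# Same resources as the interactive srun: 2 nodes, 24 tasks.
#SBATCH -N 2
#SBATCH -n 24
# Assumption: sbatch takes the same checkpoint options as srun.
#SBATCH --checkpoint=1
#SBATCH --checkpoint-dir=/home/arjun/ACIM/Ctrl

# Launch the MPI program as a job step inside the batch allocation.
srun ./MPIJob
EOF

# Syntax-check the script locally; this does not need Slurm.
bash -n ckpt_job.sh

# On the cluster, submit it with:
#   sbatch ckpt_job.sh
```

The point is only that the job is then a batch job, which is what the
man page says the BLCR logic supports.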

Caveat:  We've never used this, so YMMV.

All the best,
Chris
- -- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: [email protected] Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlO54jYACgkQO2KABBYQAh9sdACfWoq1EBZJD7efbiEnYdqxY53U
y3gAnjDnMw39y2IoGWaMV9DUftXhhJ8U
=QPs4
-----END PGP SIGNATURE-----
