I don't find anything with "grep -r RESTARTED slurm-2.2.7/". Did this not make it in?
What are people currently doing in the case where status=NODE_FAIL and a job gets requeued?

On Fri, Jan 23, 2009 at 11:05 AM, <[email protected]> wrote:
>> Hi,
>>
>> Is there an environment variable visible to the running job which
>> indicates that a slurm job has been restarted or requeued? Or is there
>> some other way for the job to determine this? Perhaps some
>> squeue/scontrol invocation? I've poked around a bit but have been
>> unable to find anything in the sbatch,srun,scontrol manpages or in the
>> environment of requeued jobs.
>>
>> I know grid engine uses the RESTARTED environment variable (from man
>> qsub):
>> RESTARTED This variable is set to 1 if a job was restarted
>> either after a system crash or after a migration in case of a
>> checkpointing job. The variable has the value 0 otherwise.
>>
>> I do see the requeue messages in slurmctld.log - those are very nice.
>> We are using slurm 1.3.12 currently. Hopefully there is a way and I
>> have just missed it, if not could you please add this if it isn't too
>> much trouble? It seems that an improvement on the above {1,0} method
>> would be to increment the value upon each rerun so that a job could
>> give up after some number of times.
>>
>> Thanks,
>> Chris
>
>
> Chris,
>
> Although this is a good idea, SLURM does not provide this information
> today. Since changes to the RPCs would be required to implement this,
> it will need to wait for the next major release of SLURM. We plan to
> release version 1.4 in May, and I'll plan to add a SLURM_RESTARTED
> environment variable with a counter per your suggestion.
> --
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Morris "Moe" Jette [email protected] 925-423-4856
> Integrated Computational Resource Management Group fax 925-423-6961
> Livermore Computing Lawrence Livermore National Laboratory
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> "The problem with the world is that we draw the circle of our family
> too small." - Mother Teresa
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
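
To make the "give up after some number of times" idea from the quoted request concrete, here is a minimal Python sketch of what a job could do at startup if a restart counter were exported into its environment. The variable names are assumptions: SLURM_RESTARTED is only the name proposed in the quoted reply, and SLURM_RESTART_COUNT is the name I believe later sbatch man pages document. Nothing in this thread confirms that either one is set in 2.2.7, so check "env" inside a requeued job first. The helper names restart_count and should_give_up are made up for illustration.

```python
import os

# Candidate variable names are ASSUMPTIONS, not confirmed for any release here:
#   SLURM_RESTARTED     - the name proposed in the quoted reply
#   SLURM_RESTART_COUNT - the name believed to appear in later sbatch man pages
CANDIDATE_VARS = ("SLURM_RESTART_COUNT", "SLURM_RESTARTED")


def restart_count(env=None, names=CANDIDATE_VARS):
    """Return the counter from the first candidate variable that holds a
    non-negative integer, or 0 if none of them is set or parseable."""
    env = os.environ if env is None else env
    for name in names:
        try:
            count = int(env.get(name, ""))
        except ValueError:
            continue  # unset, empty, or not a number: try the next name
        if count >= 0:
            return count
    return 0


def should_give_up(max_restarts, env=None):
    """True once the job has been restarted more than max_restarts times."""
    return restart_count(env) > max_restarts


# Typical use at the top of a job script (hypothetical limit of 3):
#     if should_give_up(3):
#         raise SystemExit("too many restarts, giving up")
```

Treating a missing variable as a count of 0 means the same script still runs unchanged on a release that does not export any counter; it just never gives up.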
