Hi,

In this case you can also use the comment field, but it's easier to run "scontrol show job JOBID" from the epilog, provided the configured purge time (MinJobAge in slurm.conf) is high enough and the epilog does not run for a long time.

The exit code consists of two parts (from Slurm's scontrol man page):

ExitCode=<exit>:<sig>
    Exit status reported for the job by the wait() function. The first number is the exit code, typically as set by the exit() function. The second number is the signal that caused the process to terminate, if it was terminated by a signal.
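As a rough sketch of how an epilog written in Python might pull that field out of the scontrol output: the helper names parse_exit_code and job_exit_code, and the sample string at the bottom, are made up for illustration and are not part of Slurm. It assumes the output contains the ExitCode=<exit>:<sig> field in the form the man page describes.

```python
import re
import subprocess

# Matches the standalone "ExitCode=<exit>:<sig>" field. The leading \b keeps
# it from matching inside a longer key such as "DerivedExitCode=".
EXIT_CODE_RE = re.compile(r"\bExitCode=(\d+):(\d+)")


def parse_exit_code(scontrol_output):
    """Return (exit_status, signal) parsed from scontrol text, or None."""
    match = EXIT_CODE_RE.search(scontrol_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def job_exit_code(job_id):
    """Run "scontrol show job JOBID" and parse its ExitCode field.

    subprocess.check_output raises CalledProcessError if scontrol exits
    with a non-zero status.
    """
    output = subprocess.check_output(
        ["scontrol", "show", "job", str(job_id)],
        universal_newlines=True,
    )
    return parse_exit_code(output)


if __name__ == "__main__":
    # Hypothetical fragment of scontrol output, for illustration only.
    sample = "JobState=FAILED Reason=NonZeroExitCode ExitCode=1:0"
    print(parse_exit_code(sample))
```

In an epilog you would call job_exit_code() with the job number you already grab there, and then forward the resulting pair over whatever channel you like. If the job record has already been purged, the field will simply not be available, so the caller should be ready for None or an exception.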
I hope this has been useful to you,

Regards,
Carles Fenoy
Barcelona Supercomputing Center

On Mon, Aug 13, 2012 at 11:05 PM, Sarah Mulholland <[email protected]> wrote:

> Thanks, I have read it several times, but I haven’t found a solution. I
> want my epilog to communicate the job exit status through a specific port.
> Thus I generate the epilog script on the fly before submitting the job. I
> tried running squeue in the epilog, but the job is already off the queue.
> Are there any other tricks I could use for getting the job exit status in
> the epilog?
>
> Thanks,
> Sarah
>
> From: gugga 4u [mailto:[email protected]]
> Sent: Monday, August 13, 2012 12:06 PM
> To: slurm-dev
> Subject: [slurm-dev] Re: exit code in epilog script?
>
> Refer to the section on "Prolog and Epilog Scripts" at
> http://www.schedmd.com/slurmdocs/slurm.conf.html.
>
> From: Sarah Mulholland [mailto:[email protected]]
> Sent: Monday, August 13, 2012 11:37 AM
> To: slurm-dev
> Subject: [slurm-dev] Re: exit code in epilog script?
>
> I should say that I am generating my epilog script on the fly because it
> communicates back to another process using a process-specific xmlrpc port
> to report exit status. Thus a slurmctldepilog that is generically
> specified in the slurm.conf won’t serve my purpose. Is there any way for
> my job-specific epilog to get the exit code?
>
> On Mon, Aug 13, 2012 at 1:15 PM, Sarah Mulholland <[email protected]> wrote:
>
> When I print the environment from my job epilog script, I don’t see either
> SLURM_JOB_DERIVED_EC or SLURM_JOB_EXIT_CODE. There are about a dozen
> environment variables set, but nothing that suggests the exit code. Any
> suggestions for how I can grab this value?
> I am running slurm-2.3.5
>
> My test (foo.py):
>
>     #!/usr/bin/env python
>     import sys
>     print 'running a test'
>     sys.exit(1)
>
> My epilog script (bar.py):
>
>     #!/usr/bin/env python
>     from os import environ as env
>     for k, v in env.iteritems():
>         print k, ':', v
>
> My command line:
>
>     srun -n 1 --epilog=bar.py foo.py | grep SLURM
>
> From: Lyn Gerner [mailto:[email protected]]
> Sent: Wednesday, June 27, 2012 12:03 PM
> To: slurm-dev
> Subject: [slurm-dev] Re: exit code in epilog script?
>
> Hi Sarah,
>
> You can get this thru $SLURM_JOB_DERIVED_EC (highest exit code from the
> job; sorry, can't locate it in the docs right now).
>
> Regards,
> Lyn
>
> On Wed, Jun 27, 2012 at 10:46 AM, Sarah Mulholland <[email protected]> wrote:
>
> I’m a newbie setting up slurm. I found the example epilog script, and I
> grabbed the user id and job number in my epilog script. I hunted through
> the documentation and source code, but I don’t see if it is possible to get
> the exit code of the job in the epilog script? Is it?
>
> Thanks in advance,
> Sarah

--
Carles Fenoy
