You have several options:
1. Epilog: a script automatically executed on job completion
http://slurm.schedmd.com/prolog_epilog.html
You would probably want to set EpilogSlurmctld in slurm.conf, which is
executed by slurmctld on the head node, rather than the regular
epilog executed by slurmd on each node.
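A minimal sketch (the script path, log path, and "myanalysis" command
are placeholders of my own, not anything Slurm ships):

In slurm.conf:
EpilogSlurmctld=/etc/slurm/job_finished.sh

/etc/slurm/job_finished.sh:
#!/bin/bash
# Runs on the head node (as SlurmUser) whenever a job completes;
# slurmctld exports SLURM_JOB_ID and related variables into the
# script's environment.
/usr/local/bin/myanalysis "${SLURM_JOB_ID}" >> /var/log/slurm/analysis.log 2>&1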
2. Add the analysis as a dependent job:
From the sbatch man page (http://slurm.schedmd.com/sbatch.html):
--dependency=<dependency_list>
    Defer the start of this job until the specified dependencies have
    been satisfied. <dependency_list> is of the form
    <type:job_id[:job_id][,type:job_id[:job_id]]>. Many jobs can share
    the same dependency and these jobs may even belong to different
    users. The value may be changed after job submission using the
    scontrol command.
    after:job_id[:jobid...]
        This job can begin execution after the specified jobs have
        begun execution.
    afterany:job_id[:jobid...]
        This job can begin execution after the specified jobs have
        terminated.
    afternotok:job_id[:jobid...]
        This job can begin execution after the specified jobs have
        terminated in some failed state (non-zero exit code, node
        failure, timed out, etc).
    afterok:job_id[:jobid...]
        This job can begin execution after the specified jobs have
        successfully executed (ran to completion with an exit code
        of zero).
    expand:job_id
        Resources allocated to this job should be used to expand the
        specified job. The job to expand must share the same QOS
        (Quality of Service) and partition. Gang scheduling of
        resources in the partition is also not supported.
    singleton
        This job can begin execution after any previously launched
        jobs sharing the same job name and user have terminated.
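For example, to submit a job and its analysis as a chained pair (the
script names are placeholders; sbatch prints "Submitted batch job <id>",
hence the awk):

#!/bin/bash
# Submit the main job and capture its job ID.
JOB_ID=$(sbatch myapp.sbatch | awk '{print $4}')
# The analysis stays pending until the job exits with code zero.
sbatch --dependency=afterok:${JOB_ID} myanalysis.sbatch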
3. Using strigger:
http://slurm.schedmd.com/strigger.html
Set a trigger on job completion with strigger --set --fini ...
From the strigger man page:
-f, --fini
    Trigger an event when the specified job completes execution.
-j, --jobid=<id>
    Job ID of interest. NOTE: The --jobid option cannot be used in
    conjunction with the --node option. When the --jobid option is
    used in conjunction with the --up or --down option, all nodes
    allocated to that job will be considered the nodes used as a
    trigger event.
-p, --program=<path>
    Execute the program at the specified fully qualified pathname when
    the event occurs. You may quote the path and include extra program
    arguments if desired. The program will be executed as the user who
    sets the trigger. If the program fails to terminate within 5
    minutes, it will be killed along with any spawned processes.
--set
    Register an event trigger based upon the supplied options. NOTE: An
    event is only triggered once. A new event trigger must be
    established for future events of the same type to be processed.
    Triggers can only be set if the command is run by the user
    SlurmUser unless SlurmUser is configured as user root.
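Putting those options together, a trigger for a single job might look
like this (the job ID and program path are placeholders; note the
SlurmUser restriction above):

strigger --set --jobid=12345 --fini --program=/usr/local/bin/myanalysis.sh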
Depending on your setup, #3 may be the least desirable option because:
1. the trigger is not immediate, but only fires after polling (which
might not be a concern)
2. --set is one-time only - you will have to create a new trigger each
time you want this functionality (though this can be automated)
In my opinion, point 2 marks the biggest difference between #1 and #2
for your particular use case.
If you always want to trigger analysis after each job, #1 is probably
the most suitable: define it once and forget it.
#2 is more suitable if only some of the jobs require analysis: create
your own sbatch template for starting a job+analysis pair, and use a
different sbatch template (job only) otherwise.
A simplified example of such a template (steps in a batch script run
sequentially, so no dependency is needed within a single job; '&&'
skips the analysis if the app fails, the equivalent of afterok):
#!/bin/bash
srun myapp && srun myanalysis
####
The other issue is default resource allocation: with #2 (unless you
modify the analysis step), the analysis will use the same allocation
as the app.
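If the analysis needs less than the full allocation, you can restrict
its step, e.g. (a sketch; the values are purely illustrative):

srun --ntasks=1 myanalysis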
On 05/13/2015 10:39 PM, Franco Broi wrote:
Try strigger.
On 14 May 2015 3:23 am, Trevor Gale <[email protected]> wrote:
No, I haven’t. What is epilog?
Thanks,
Trevor
> On May 13, 2015, at 3:21 PM, Daniel Letai <[email protected]> wrote:
>
>
> Have you looked into epilog as a means to start your analysis
> automatically?
>
> On 05/13/2015 05:33 PM, Trevor Gale wrote:
>> Hey all,
>>
>> I was just wondering if there is any mechanism built into slurm
>> to signal to the user when jobs are done (other than email). I’m
>> making a script to run a series of jobs and want to run some
>> analysis on the results after the jobs return. I was wondering if
>> there is a way to signal that a job submission has ended so my
>> program can just start the analysis and not have to have the
>> analysis executed separately.
>>
>> Thanks,
>> Trevor