You have several options:

1. Epilog: a script automatically executed on job completion
http://slurm.schedmd.com/prolog_epilog.html

Possibly you would want to set EpilogSlurmctld in slurm.conf, which is executed by slurmctld on the head node, rather than the regular Epilog, which is executed by slurmd on each compute node.
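A minimal sketch (the script path and log location here are just examples, not anything Slurm mandates): in slurm.conf set

EpilogSlurmctld=/etc/slurm/epilog_slurmctld.sh

and the script itself could be as simple as:

#!/bin/bash
# Runs on the head node as SlurmUser whenever a job finishes;
# slurmctld exports SLURM_JOB_ID into the script's environment.
echo "job ${SLURM_JOB_ID} finished" >> /var/log/slurm/finished_jobs.log

From there you could kick off the analysis, keeping in mind the script runs as SlurmUser, not as the job's owner.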

2. Add the analysis as a dependent job.
From the sbatch man page (http://slurm.schedmd.com/sbatch.html); a short submission example follows the excerpt:
*--dependency*=<dependency_list>
   Defer the start of this job until the specified dependencies have
   been satisfied. <dependency_list> is of the form
   <type:job_id[:job_id][,type:job_id[:job_id]]>. Many jobs can share
   the same dependency and these jobs may even belong to different
   users. The value may be changed after job submission using the
   scontrol command.

       *after:job_id[:jobid...]*
           This job can begin execution after the specified jobs have
           begun execution.
       *afterany:job_id[:jobid...]*
           This job can begin execution after the specified jobs have
           terminated.
       *afternotok:job_id[:jobid...]*
           This job can begin execution after the specified jobs have
           terminated in some failed state (non-zero exit code, node
           failure, timed out, etc).
       *afterok:job_id[:jobid...]*
           This job can begin execution after the specified jobs have
           successfully executed (ran to completion with an exit code
           of zero).
       *expand:job_id*
           Resources allocated to this job should be used to expand the
           specified job. The job to expand must share the same QOS
           (Quality of Service) and partition. Gang scheduling of
           resources in the partition is also not supported.
       *singleton*
           This job can begin execution after any previously launched
           jobs sharing the same job name and user have terminated.
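For example, a job+analysis pair could be submitted like this (job.sh and analysis.sh are placeholder script names; --parsable makes sbatch print just the job ID):

JOB_ID=$(sbatch --parsable job.sh)
sbatch --dependency=afterok:${JOB_ID} analysis.sh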

3. Using strigger:
http://slurm.schedmd.com/strigger.html

Set a trigger on job completion with strigger --set -f ...; an example follows the excerpt below.

*-f*, *--fini*
   Trigger an event when the specified job completes execution.
*-j*, *--jobid*=<id>
   Job ID of interest. NOTE: The *--jobid* option cannot be used in
   conjunction with the *--node* option. When the *--jobid* option is
   used in conjunction with the *--up* or *--down* option, all nodes
   allocated to that job will be considered the nodes used as a trigger
   event.
*-p*, *--program*=<path>
   Execute the program at the specified fully qualified pathname when
   the event occurs. You may quote the path and include extra program
   arguments if desired. The program will be executed as the user who
   sets the trigger. If the program fails to terminate within 5
   minutes, it will be killed along with any spawned processes.
*--set*
   Register an event trigger based upon the supplied options. NOTE: An
   event is only triggered once. A new event trigger must be
   established for future events of the same type to be processed.
   Triggers can only be set if the command is run by the user
   SlurmUser, unless SlurmUser is configured as user root.
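For example (the program path is a placeholder, and per the note above the command must be run as SlurmUser unless SlurmUser is root):

strigger --set --jobid=1234 --fini --program=/usr/local/bin/run_analysis.sh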

Depending on your setup, #3 may be the least desirable option, because:
1. the trigger is not immediate - it only fires after polling (might not be a concern)
2. a trigger fires one time only - you will have to create a new trigger each time you want this functionality (this can be automated; see the sketch below)
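A rough sketch of such automation - a wrapper that submits a job and immediately registers a completion trigger for it (the script names are placeholders, and the SlurmUser restriction quoted above still applies):

#!/bin/bash
# Submit the job and capture its ID, then set a one-shot trigger
# that runs the analysis program when that job completes.
JOB_ID=$(sbatch --parsable job.sh)
strigger --set --jobid=${JOB_ID} --fini --program=/usr/local/bin/run_analysis.sh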

Point 2 is, in my opinion, the biggest difference between #1 and #2 for your particular use case. If you always want to run the analysis after each job, #1 is probably the most suitable: define it once and forget it.

#2 is more suitable if only some of the jobs require analysis: create your own sbatch template to use when starting a job+analysis pair, and use a different sbatch template (job only) otherwise.

A simplified example of such a template:

#!/bin/bash

# Job steps in a batch script run sequentially, so myanalysis starts
# only after myapp has finished; && gives afterok-like semantics
# (the analysis runs only if myapp exited with code 0).
srun myapp && srun myanalysis
####

The other issue is default resource allocation: with #2 (unless you modify the analysis step), the app and the analysis are allocated the same resources.
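If the analysis needs far fewer resources than the app, one partial workaround within a single allocation (assuming the analysis is serial) is to restrict the analysis step, e.g.:

srun myapp && srun -n 1 myanalysis    # analysis uses one task of the allocation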

On 05/13/2015 10:39 PM, Franco Broi wrote:

Try strigger.

On 14 May 2015 3:23 am, Trevor Gale <[email protected]> wrote:


    No, I haven’t. What is epilog?

    Thanks,
    Trevor

    > On May 13, 2015, at 3:21 PM, Daniel Letai <[email protected]> wrote:
    >
    >
    > Have you looked into epilog as a means to start your analysis
    automatically?
    >
    > On 05/13/2015 05:33 PM, Trevor Gale wrote:
    >> Hey all,
    >>
    >> I was just wondering if there is any mechanism built into slurm
    to signal to the user when jobs are done (other than email). I’m
    making a script to run a series of jobs and want to run some
    analysis on the results after the jobs return. I was wondering if
    there is a way to signal that a job submission has ended so my
    program can just start the analysis and not have to have the
    analysis executed separately.
    >>
    >> Thanks,
    >> Trevor

