My reading is that this behavior was added in Slurm 14.11.0pre1, and I don’t see 
any later changes to it, though I could have missed one:

-- Sent SIGCONT/SIGTERM when a job is selected for preemption with GraceTime
    configured rather than waiting for GraceTime to be reached before notifying
    the job.

Does anyone have a similar setup that is working?

Naveed


On 9/13/16, 3:49 AM, "Near-Ansari, Naveed" <nav...@caltech.edu> wrote:

    We are setting up preemption using QOS on our cluster.  The documentation seems 
to say that a job should get a SIGCONT and SIGTERM 
when selected for preemption, and then a SIGTERM, SIGCONT, and SIGKILL at the 
end of GraceTime.
    
    We have checked all of this and we are sent the signals at the end of 
GraceTime, but not when selected for preemption.  We are listening for these 
signals to checkpoint when preempted. We are checking for the signals both in 
the script and the launched executables in case we are wrong about what catches 
the signals.
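
For anyone trying to reproduce this, here is a minimal sketch of the kind of listener described above, assuming the launched executable is a Python program. The names `run_until_preempted`, `work_step` and `checkpoint` are hypothetical placeholders, not part of Slurm or of our actual code. It only records SIGCONT and SIGTERM (SIGKILL cannot be caught) and hands them to a checkpoint callback once SIGTERM has arrived:

```python
import os
import signal

# Catchable signals the preemption documentation mentions.
WATCHED = (signal.SIGCONT, signal.SIGTERM)

received = []  # names of the signals seen so far


def on_signal(signum, frame):
    # Keep the handler tiny: just record which signal arrived.
    received.append(signal.Signals(signum).name)


def install_handlers():
    # Replace the default actions so SIGTERM no longer kills the process.
    for sig in WATCHED:
        signal.signal(sig, on_signal)


def run_until_preempted(work_step, checkpoint):
    """Run work_step() repeatedly until SIGTERM is seen, then checkpoint.

    Both arguments are hypothetical callables supplied by the application:
    work_step() does one unit of work, checkpoint(signals) saves state.
    """
    install_handlers()
    while "SIGTERM" not in received:
        work_step()
    checkpoint(list(received))
```

Logging a timestamp from the handler would make it possible to tell whether a signal arrives when the job is selected for preemption or only at the end of GraceTime.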
    
    I am unclear whether the problem lies in my understanding of how it is 
supposed to work, my configuration, or the documentation.
    
    The doc that mentions the signals sent is 
http://slurm.schedmd.com/preempt.html.
    
    This is the qos setup:
    
          Name  GraceTime    Preempt PreemptMode 
    ---------- ---------- ---------- ----------- 
        normal   00:00:00                cluster 
        sxs-lo   00:20:00                 cancel 
        sxs-hi   00:20:00     sxs-lo      cancel 
    
    
    /etc/slurm/slurm.conf:
    
    …
    PreemptType=preempt/qos
    PreemptMode=CANCEL
    …
    
    
    What am I doing wrong on this?
    
    Thanks,
    
    Naveed 
    
    
