I've found that using srun to launch the Python application allows it to
receive the signals from SLURM.  I'm unsure whether that's the intended
behavior, but it works.
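
For anyone reading this in the archive, here is a minimal sketch of the
kind of SIGUSR1 handling described above.  It is not the actual
application: StopFlag and run are made-up names, and the sleep loop only
stands in for real work.  It assumes the process is started as a job step
so that SLURM delivers the signal to it.

```python
import signal
import time


class StopFlag:
    """Callable signal handler that records that a signal arrived.

    Hypothetical helper for illustration; not part of any SLURM API.
    """

    def __init__(self):
        self.requested = False

    def __call__(self, signum, frame):
        # Keep the handler tiny: only set a flag and let the main loop react.
        self.requested = True


def run(flag, max_steps=5, step_seconds=0.01):
    """Do up to max_steps units of placeholder work, stopping early if flagged.

    Returns the number of steps completed before the stop request was seen.
    """
    completed = 0
    for _ in range(max_steps):
        if flag.requested:
            # A real application would checkpoint or clean up here.
            break
        time.sleep(step_seconds)  # stand-in for one unit of real work
        completed += 1
    return completed


def install(flag):
    """Register the flag as the SIGUSR1 handler (main thread only)."""
    signal.signal(signal.SIGUSR1, flag)
    return flag
```

With the script saved under a placeholder name such as app.py, the batch
script would start it with "srun python app.py" and the program would call
run(install(StopFlag())).  The sbatch man page describes --signal as
signaling the job's steps by default rather than the batch shell itself,
which would be consistent with what I saw.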

- Trey

=============================

Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: [email protected]
Jabber: [email protected]

On Fri, Jan 16, 2015 at 1:59 PM, Trey Dockendorf <[email protected]> wrote:

> I'm attempting to have a batch script receive SIGUSR1 60 seconds before
> walltime is reached.  The Python program that runs in the job handles
> the signals.  When I run my jobs interactively and send "kill -s USR1
> <pid>", the Python code responds as I'd expect.  However, when I run
> either interactively or via a batch script and use scancel to send the
> USR1 signal, nothing seems to happen.  I even added this to my batch
> script just to see if signals are being sent:
>
> trap 'echo "SIGNAL CAUGHT"' SIGUSR1
>
> I then run 'scancel --signal=USR1 --batch <jobID>' and nothing prints.
>
> I've used "#SBATCH --signal=USR1" in my batch scripts.  I'm unsure if
> there is something I'm missing that is the key to making these signals
> work.  We are using cgroups for ProctrackType and TaskPlugin.  The Python
> code I'm running is not executed via srun.
>
> Thanks,
> - Trey
