Hi all,

I saw a post earlier today (or yesterday) about jobs in a dependency chain
starting while the prior job's epilogue is still running.  I have a related,
but more general, case of this.

I've been using a test configuration of Slurm on a Cray XC30 in hybrid
mode.  I've seen that the end-of-reservation nodehealthcheck (a Cray thing)
will often run at the same time as, or before, a SPANK plugin epilogue
runs.  This creates a race between the two, which is a particular problem
because I use the nodehealthcheck to validate that the epilogue properly
cleaned up after the job.

Is it feasible to run the job/SPANK epilogues *before* releasing the
resources?  Or is this already the behavior, and I'm misdiagnosing the
problem?

Thanks,
Doug

----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
National Energy Research Scientific Computing Center <http://www.nersc.gov>
[email protected]

------------- __o
---------- _ '\<,_
----------(_)/  (_)__________________________
