[slurm-dev] Re: Parallel epilogs

Taras Shapovalov Tue, 28 Feb 2012 09:07:10 -0800

Hi Carles,

On 02/28/2012 04:11 PM, Carles Fenoy wrote:

    ...

    So, at first jobs 1586 and 1587 had NumCPUs=1, then these jobs
    finished in parallel on node ts-sl5slurm, then their NumCPUs were
    automatically increased to the maximum value (12 CPUs), and then
    the state of 1587 was changed to PD, because all CPUs are held by
    the epilog of job 1586.
As far as I can see here, job 1587 waits until job 1586 finishes. In
the squeue output you added, 1586 is completing, probably because its
epilog is running. When it finishes, job 1587 starts. As you have
Shared=NO in your partition configuration, Slurm considers that the
job has used all the CPUs in the node and sets NumCPUs=12. So the 2
jobs are NOT running in parallel, but sequentially, because of the
non-shared partition.
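
To check the two fields discussed above (job state and NumCPUs) without
reading the output by eye, here is a minimal Python sketch. It assumes
the text comes from `scontrol show job <jobid>` and that the fields are
whitespace-separated KEY=VALUE pairs. The sample record is made up for
illustration and is not the output from this thread. The helper names
parse_scontrol_job and holds_whole_node are hypothetical.

```python
def parse_scontrol_job(text):
    """Parse one job record into a dict of KEY=VALUE fields.

    Assumption: fields are separated by whitespace. Values that
    themselves contain spaces (e.g. a command line) would be cut short.
    """
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not KEY=VALUE
            fields[key] = value
    return fields


def holds_whole_node(fields, node_cpus):
    """True if the job is charged with every CPU of a node_cpus-CPU node."""
    return int(fields.get("NumCPUs", 0)) >= node_cpus


# Made-up sample record, for illustration only.
SAMPLE = """\
JobId=1586 Name=test
   JobState=COMPLETING Reason=None
   NumNodes=1 NumCPUs=12
"""

job = parse_scontrol_job(SAMPLE)
print(job["JobId"], job["JobState"], holds_whole_node(job, 12))
```

A job that is COMPLETING and is charged with all 12 CPUs matches the
situation described above: the next job on that node stays pending
until the epilog ends.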


You are right, the "issue" was the Shared parameter.
Now the epilogs are running in parallel.
Thanks!

--
Best regards,
  Taras
