[slurm-dev] Re: Jobs cancelled "DUE TO TIME LIMIT" long before actual timelimit

2017-08-30 Thread Christopher Samuel
On 30/08/17 04:34, Brian W. Johanson wrote: > Any idea on what would cause this? It looks like the job *step* hit the timelimit, not the job itself. Could you try the sacct command without the -X flag to see what the timelimit for the step was according to Slurm please? $ sacct -S 071417 -a

[slurm-dev] Re: openmpi, slurm and pmix

2017-08-30 Thread Phil K
OK, I infer from your answer that, yes, the two pmix libraries (internal on ompi, external on slurm) are cooperating to run the jobs. My inclination would be to configure ompi to use the same external pmix 1.2 as slurm (one ring to rule them all), but apparently some people reported problems

[slurm-dev] Re: openmpi, slurm and pmix

2017-08-30 Thread r...@open-mpi.org
SLURM currently supports PMIx v1.2, which is what you’d find in the OMPI v2.x series. As long as you stay within that OMPI release series, you should be fine as the internal OMPI library will match what you used for SLURM. I’m afraid that OMPI will always use its internal version unless you

[slurm-dev] openmpi, slurm and pmix

2017-08-30 Thread Phil K
I've got slurm running with openmpi, and pmi2 and pmix are working just fine. I notice when using pmix, e.g. srun --mpi=pmix_v1 ..., the slurm plugin mpi_pmix.so (mpi_pmix_v1.so) is used, which is itself linked to the pmix library I have installed on the cluster (libpmix.so.2). My openmpi

[slurm-dev] Slurm accounting: limit by max nodes period

2017-08-30 Thread Jacob Chappell
Hi All, Our center was previously using Moab, and we are trying to move to Slurm exclusively now. One of the features Moab gave us was the ability to limit the maximum number of nodes a user/group could use across their entire allocation. In Slurm (via the accounting enforcement), GrpNodes