On 30/08/17 04:34, Brian W. Johanson wrote:
> Any idea on what would cause this?
It looks like the job *step* hit the timelimit, not the job itself.
Could you please try the sacct command without the -X flag, so we can see what
time limit Slurm recorded for the step?
$ sacct -S 071417 -a
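For reference, a sketch of what that could look like (the job ID and field
list here are hypothetical; any valid sacct format fields will do):

$ sacct -j 12345 -o JobID,JobName,Timelimit,Elapsed,State

Without -X, sacct prints a row per step (12345.batch, 12345.0, ...) as well as
the allocation itself, so you can compare each step's elapsed time against the
recorded time limit.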
OK, I infer from your answer that, yes, the two PMIx libraries (internal in
OMPI, external in Slurm) are cooperating to run the jobs. My inclination would
be to configure OMPI to use the same external PMIx 1.2 as Slurm (one ring to
rule them all), but apparently some people have reported problems with that setup.
SLURM currently supports PMIx v1.2, which is what you'd find in the OMPI v2.x
series. As long as you stay within that OMPI release series you should be fine,
as the internal OMPI library will match what you used for SLURM. I'm afraid
that OMPI will always use its internal version unless you explicitly configure
it to build against an external one.
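If you do want to try the external route anyway, the usual approach is to point
both builds at the same PMIx installation at configure time, e.g. (the paths
here are hypothetical, and depending on the OMPI version you may also need a
matching external libevent):

$ ./configure --with-pmix=/opt/pmix/1.2 ...    # building Open MPI
$ ./configure --with-pmix=/opt/pmix/1.2 ...    # building Slurm, to get the mpi_pmix plugin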
I've got Slurm running with OpenMPI, and both pmi2 and pmix are working just fine.
I notice that when using pmix,
e.g. srun --mpi=pmix_v1 ..., the Slurm plugin mpi_pmix.so (mpi_pmix_v1.so) is
used, which is itself linked to the pmix library I have installed on the cluster
(libpmix.so.2). My OpenMPI build, on the other hand, uses its own internal pmix.
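As a quick sanity check, srun can report which MPI plugin types a given Slurm
build provides:

$ srun --mpi=list

If the pmix plugin was built and linked as described above, pmix (and pmix_v1)
should show up alongside pmi2.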
Hi All,
Our center was previously using Moab, and we are trying to move to Slurm
exclusively now. One of the features Moab gave us was the ability to limit
the maximum number of nodes a user/group could use across their entire
allocation. In Slurm (via accounting enforcement), GrpNodes appears to be the
closest equivalent.
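If GrpNodes turns out to be the right knob, a sketch of setting it with
sacctmgr (the account name and limit are hypothetical, and
AccountingStorageEnforce in slurm.conf must include "limits" for it to be
enforced):

$ sacctmgr modify account where name=physics set GrpNodes=32

On recent Slurm releases the same limit is expressed through TRES instead,
e.g. set GrpTRES=node=32.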