Hi everyone,

I have managed to solve the first part of this problem. It was caused
by the quota on /tmp, which is where openmpi stored its session
directory. There is a default XFS quota of 100 MB to prevent users
from filling up /tmp. Instead of an over-quota message, the result
was an openmpi crash with a bus error.

After setting TMPDIR in slurm, I was finally able to run IMB-MPI1 with
1024 cores and openmpi 1.10.6.
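
One way to catch this kind of silent limit before launching a large job
is to try writing more than the suspected quota into the directory that
will hold the session directory. The following is only a minimal Python
sketch and is not part of openmpi or slurm; the function name
probe_tmpdir and the 200 MiB default are assumptions chosen for
illustration, picked to exceed the 100 MB quota mentioned above.

```python
import errno
import os
import tempfile


def probe_tmpdir(path, size_mb=200, chunk_mb=1):
    """Try to write size_mb MiB into path and report whether it worked.

    Returns a tuple (ok, message). The probe file is created with
    tempfile.NamedTemporaryFile, so it is removed again on close.
    """
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    # EDQUOT is not defined on every platform, so look it up defensively.
    full_errnos = {errno.ENOSPC, getattr(errno, "EDQUOT", errno.ENOSPC)}
    try:
        with tempfile.NamedTemporaryFile(dir=path, prefix="quota_probe_") as f:
            for _ in range(size_mb // chunk_mb):
                f.write(chunk)
            # Force the data out so that a deferred error surfaces here.
            f.flush()
            os.fsync(f.fileno())
    except OSError as exc:
        if exc.errno in full_errnos:
            return False, f"{path}: quota or space exhausted ({exc.strerror})"
        return False, f"{path}: {exc.strerror}"
    return True, f"{path}: wrote {size_mb} MiB without error"
```

Calling probe_tmpdir(os.environ.get("TMPDIR", "/tmp")) from inside a
job and printing the returned message would show whether the directory
the job actually sees can hold that much data.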

But now for the new problem: with openmpi3, the same test (IMB-MPI1,
1024 cores, 32 nodes) hangs after about 30 minutes of runtime. Does
anyone have an idea what could cause this?

Regards, Götz Waschk
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
