On 29/01/21 15:58, Gilles Gouaillardet via users wrote:

Hi Gilles.

Thanks for the answer.

> the mpirun command line starts 2 MPI tasks, but the error log mentions
> rank 56, so unless there is a copy/paste error, this is highly
> suspicious.
Hmm... I'll re-check. Most likely I just made a mistake when substituting
a variable, but it's worth checking again.

> I invite you to check the filesystem usage on this node, and make sure
> there is a similar amount of available space in /tmp and /dev/shm (or
> other filesystem if you use a non-standard $TMPDIR)
Well, on all those nodes /tmp is disk-based (~52G available), while
/dev/shm is a tmpfs with 239G "available", mounted as
    tmpfs /dev/shm tmpfs defaults,size=95%
(we had to raise the size from the default to 95% because the new
release of a library that "mixes" MPI and OpenMP requires it; the
library does this to squeeze out a bit more speed by reducing
communication overhead).
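
For comparing the two filesystems on each node, something like the
following could be used. It is only a minimal sketch, assuming Python 3
is available on the nodes; the helper name free_space_report and the
list of candidate paths are made up for illustration, and it reports
only what the OS says is free, which for a tmpfs is an upper bound tied
to RAM rather than guaranteed space.

```python
import os
import shutil


def free_space_report(paths):
    """Return {path: (total_GiB, free_GiB)} for every path that exists.

    Hypothetical helper: the name and return layout are illustrative only.
    """
    report = {}
    for path in paths:
        if not os.path.exists(path):
            continue  # skip filesystems that are missing on this node
        usage = shutil.disk_usage(path)
        report[path] = (usage.total / 2**30, usage.free / 2**30)
    return report


if __name__ == "__main__":
    # /tmp and /dev/shm are the paths discussed above; $TMPDIR is added
    # only if it is set, in case a non-standard location is in use.
    candidates = ["/tmp", "/dev/shm"]
    if os.environ.get("TMPDIR"):
        candidates.append(os.environ["TMPDIR"])
    for path, (total, free) in free_space_report(candidates).items():
        print(f"{path}: {free:.1f} GiB free of {total:.1f} GiB")
```

Running it on every node (for example through the batch system) would
show whether /tmp and /dev/shm really differ that much everywhere.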

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
