Hi Alan,
Thanks a lot for the feedback! I've opened a new issue now:
https://github.com/easybuilders/easybuild-easyconfigs/issues/15651
Best regards,
Ole
On 6/9/22 10:52, Alan O'Cais wrote:
Ole,
Can you please copy this over to an issue in
https://github.com/easybuilders/easybuild-easyconfigs/issues so we can
keep track of things there? It is also being discussed in Slack but we
should really have the discussion and progress in a location where anyone
can find it.
If you don't have a GitHub account, can you give me permission to copy
over the content of your email to create the issue?
Thanks,
Alan
On Wed, 25 May 2022 at 10:54, Ole Holm Nielsen <[email protected]> wrote:
Hi Easybuilders,
I'm testing the upgrade of our compute nodes from AlmaLinux 8.5 to 8.6
(a RHEL 8 clone similar to Rocky Linux).
We have found that *all* MPI codes built with either of the Intel toolchains
intel/2020b or intel/2021b fail after the 8.5 to 8.6 upgrade. The codes
also fail on login nodes, so the Slurm queue system is not involved.
The FOSS toolchains foss/2020b and foss/2021b work perfectly on EL 8.6,
however.
My simple test uses the attached trivial MPI Hello World code running on a
single node:
$ module load intel/2021b
$ mpicc mpi_hello_world.c
$ mpirun ./a.out
Now the mpirun command appears to enter an infinite loop (it runs for many
minutes) and we see these processes with "ps":
/bin/sh /home/modules/software/impi/2021.4.0-intel-compilers-2021.4.0/mpi/2021.4.0/bin/mpirun ./a.out
mpiexec.hydra ./a.out
The mpiexec.hydra process doesn't respond to 15/SIGTERM and I have to kill
it with 9/SIGKILL. I've tried to enable debugging output with
export I_MPI_HYDRA_DEBUG=1
export I_MPI_DEBUG=5
but nothing gets printed from this.
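For anyone reproducing this, here is a minimal sketch of bounding the run so
that a hung mpirun does not need manual cleanup. It assumes GNU coreutils
"timeout" is available; the helper name run_bounded and the time limits are
illustrative only and not part of the original test:

```shell
# Hypothetical helper (not from the original report): run a command,
# send SIGTERM after $1 seconds, then SIGKILL $2 seconds later if the
# command is still alive. Relies on GNU coreutils "timeout".
run_bounded() {
    term_after=$1
    kill_after=$2
    shift 2
    timeout --signal=TERM --kill-after="$kill_after" "$term_after" "$@"
    status=$?
    case $status in
        124) echo "timed out; terminated after SIGTERM" ;;
        137) echo "killed with SIGKILL (SIGTERM was probably ignored)" ;;
        *)   echo "exited with status $status" ;;
    esac
    return $status
}
```

Usage would be something like "run_bounded 60 10 mpirun ./a.out". A status
of 137 would match the behaviour described above, where only SIGKILL stops
the process.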
Question: Has anyone tried the Intel toolchain and mpiexec.hydra on an
EL 8.6 system? Can you suggest how I might debug this issue?
OS information:
$ cat /etc/redhat-release
AlmaLinux release 8.6 (Sky Tiger)
$ uname -r
4.18.0-372.9.1.el8.x86_64
Thanks a lot,
Ole
--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark
--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark,
Fysikvej Building 309, DK-2800 Kongens Lyngby, Denmark
E-mail: [email protected]
Homepage: http://dcwww.fysik.dtu.dk/~ohnielse/
Mobile: (+45) 5180 1620