Re: [OMPI users] psec warning when launching with srun

2023-05-18 Thread christof.koehler--- via users
Hello again,

I should add that the openmpi configure decides to use the internal pmix:

configure: WARNING: discovered external PMIx version is less than internal version 3.x
configure: WARNING: using internal PMIx
...
...
checking if user requested PMI support... yes
checking for pmi.h in /usr/include... not found
checking for pmi.h in /usr/include/slurm... found
checking pmi.h usability... yes
checking pmi.h presence... yes
checking for pmi.h... yes
checking for libpmi in /usr/lib64... found
checking for PMI_Init in -lpmi... yes
checking for pmi2.h in /usr/include... not found
checking for pmi2.h in /usr/include/slurm... found
checking pmi2.h usability... yes
checking pmi2.h presence... yes
checking for pmi2.h... yes
checking for libpmi2 in /usr/lib64... found
checking for PMI2_Init in -lpmi2... yes
checking for pmix.h in ... not found
checking for pmix.h in /include... not found
checking can PMI support be built... yes
checking if user requested internal PMIx support(yes)... no
checking for pmix.h in /usr... not found
checking for pmix.h in /usr/include... found
checking libpmix.* in /usr/lib64... found
checking PMIx version... version file found
checking version 4x... found
checking PMIx version to be used... internal

I am not sure how it decides that; the external one is already quite a
new version.
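
One way to cross-check what configure saw is to read the version straight
from the installed header. The following is only a sketch: it assumes the
external install ships /usr/include/pmix_version.h with
PMIX_VERSION_MAJOR/MINOR/RELEASE defines, and the function name
pmix_header_version is made up for illustration.

```shell
# Print "major.minor.release" as found in a pmix_version.h style header.
# Assumption: the header has lines such as "#define PMIX_VERSION_MAJOR 4L".
pmix_header_version() {
    hdr=$1
    for key in MAJOR MINOR RELEASE; do
        # Keep only the leading digits of the value (drops a trailing "L").
        sed -n "s/^#define[[:space:]]*PMIX_VERSION_${key}[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$hdr"
    done | paste -sd. -
}

# Only query the system header if it is actually present.
if [ -r /usr/include/pmix_version.h ]; then
    pmix_header_version /usr/include/pmix_version.h
fi
```

If this reports a 4.x version, the "less than internal version 3.x"
warning would seem to come from the version test itself rather than from
an old install.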

# srun --mpi=list
MPI plugin types are...
pmix
cray_shasta
none
pmi2
specific pmix plugin versions available: pmix_v4


Best Regards

Christof

-- 
Dr. rer. nat. Christof Köhler   email: c.koeh...@uni-bremen.de
Universitaet Bremen/FB1/BCCMS   phone:  +49-(0)421-218-62334
Am Fallturm 1/ TAB/ Raum 3.06   fax: +49-(0)421-218-62770
28359 Bremen  


[OMPI users] psec warning when launching with srun

2023-05-18 Thread christof.koehler--- via users
Hello everybody,

we are seeing the symptoms described in
https://github.com/open-mpi/ompi/issues/11557

However, according to the system's package manager (dnf), all
munge-related packages on the build node and the execution node are
identical; see the details at the bottom. So the explanation given by
Ralph Castain in the GitHub issue, which I read as referring to a missing
munge library, does not appear to explain the warning. I would like to
note that setting PMIX_MCA_psec=native is sufficient to make the warning
go away; it is not necessary to disable munge completely. Launching with
srun --mpi=pmi2 instead works fine anyway.
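
For reference, the two workarounds just mentioned can be sketched as
follows. The application name and the task count are placeholders, not
part of our setup.

```shell
# Workaround 1: restrict PMIx to the "native" security module.
export PMIX_MCA_psec=native

# Placeholder application and task count, purely for illustration.
APP=./my_mpi_app
NTASKS=4

# Only attempt a launch where srun and the application actually exist.
if command -v srun >/dev/null 2>&1 && [ -x "$APP" ]; then
    srun --mpi=pmix -n "$NTASKS" "$APP"
    # Workaround 2: launch through the PMI2 plugin instead of PMIx.
    # srun --mpi=pmi2 -n "$NTASKS" "$APP"
fi
```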

Related to that: pmix has a configure switch to disable munge support.
Would it be possible and/or advisable to disable munge in the pmix
build?

I have to note that we are also seeing
PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c at line 168
for each rank launched. This might be a separate issue or related.
Setting PMIX_MCA_gds=^ds12 makes this go away.
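
To confirm that the message really shows up once per rank, the captured
job output can be counted before and after applying the setting. This is
a rough sketch: the helper name count_gds_errors and the log file name in
the comment are hypothetical.

```shell
# Exclude the ds12 component from the PMIx gds framework, as described above.
# Quoted so that "^" is passed through literally by any shell.
export PMIX_MCA_gds='^ds12'

# Count how many lines of a saved job log contain the ds12 lock error.
count_gds_errors() {
    grep -c 'PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c' "$1"
}

# Example (hypothetical log file name):
#   count_gds_errors slurm-12345.out
```

A count equal to the number of ranks without the setting, and zero with
it, would match what we observe.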

With or without setting PMIX_MCA_psec and PMIX_MCA_gds, the job launches
and runs. Still, I would like to understand this better in case
we have a broken slurm or mpi setup that needs to be corrected.

Best Regards

Christof

System Details:

Rocky Linux 9.1, slurm 23.02.2, open pmix 4.2.4rc1, openmpi 4.1.5.

slurm and pmix were built as rpms in the same step in a mock
build environment, using the spec files included with them. I can
provide the build logs of the rpms. Neither is from the distribution
repositories.

openmpi was configured with
./configure --enable-mpi1-compatibility \
  --enable-orterun-prefix-by-default \
  --with-ofi=/cluster/libraries/libfabric/1.18.0/ --with-slurm \
  --with-pmix --with-pmix-libdir=/usr/lib64 --with-pmi \
  --with-pmi-libdir=/usr/lib64

Installed munge packages from the distribution, according to dnf:

on build node
 munge.x86_64   0.5.13-13.el9
 munge-devel.x86_64 0.5.13-13.el9
 munge-libs.x86_64  0.5.13-13.el9

on execution node
 munge.x86_64        0.5.13-13.el9
 munge-devel.x86_64  0.5.13-13.el9
 munge-libs.x86_64   0.5.13-13.el9
