Just trying to understand - why are you saying this is a pmix problem? Obviously, something to do with mpirun is failing, but I don't see any indication here that it has to do with pmix.
Can you add --enable-debug to your configure line and inspect the core file from the dump?

> On Jan 31, 2021, at 8:18 PM, Andrej Prsa via devel <devel@lists.open-mpi.org> wrote:
>
> Hello list,
>
> I just upgraded openmpi from 4.0.3 to 4.1.0 to see if it would solve a weird
> openpmix problem we've been having; I configured it using:
>
> ./configure --prefix=/usr/local --with-pmix=internal --with-slurm
> --without-tm --without-moab --without-singularity --without-fca
> --without-hcoll --without-ime --without-lustre --without-psm --without-psm2
> --without-mxm --with-gnu-ld
>
> (I also have an external pmix version installed and tried using that instead
> of internal, but it doesn't change anything). Here's the output of configure:
>
> Open MPI configuration:
> -----------------------
> Version: 4.1.0
> Build MPI C bindings: yes
> Build MPI C++ bindings (deprecated): no
> Build MPI Fortran bindings: mpif.h, use mpi, use mpi_f08
> MPI Build Java bindings (experimental): no
> Build Open SHMEM support: false (no spml)
> Debug build: no
> Platform file: (none)
>
> Miscellaneous
> -----------------------
> CUDA support: no
> HWLOC support: external
> Libevent support: external
> PMIx support: Internal
>
> Transports
> -----------------------
> Cisco usNIC: no
> Cray uGNI (Gemini/Aries): no
> Intel Omnipath (PSM2): no
> Intel TrueScale (PSM): no
> Mellanox MXM: no
> Open UCX: no
> OpenFabrics OFI Libfabric: no
> OpenFabrics Verbs: no
> Portals4: no
> Shared memory/copy in+copy out: yes
> Shared memory/Linux CMA: yes
> Shared memory/Linux KNEM: no
> Shared memory/XPMEM: no
> TCP: yes
>
> Resource Managers
> -----------------------
> Cray Alps: no
> Grid Engine: no
> LSF: no
> Moab: no
> Slurm: yes
> ssh/rsh: yes
> Torque: no
>
> OMPIO File Systems
> -----------------------
> DDN Infinite Memory Engine: no
> Generic Unix FS: yes
> IBM Spectrum Scale/GPFS: no
> Lustre: no
> PVFS2/OrangeFS: no
>
> Once configured, make and sudo make install worked without a glitch; but when
> I run mpirun, I get this:
>
> andrej@terra:~/system/openmpi-4.1.0$ mpirun --version
> mpirun (Open MPI) 4.1.0
>
> Report bugs to http://www.open-mpi.org/community/help/
> andrej@terra:~/system/openmpi-4.1.0$ mpirun
> malloc(): corrupted top size
> Aborted (core dumped)
>
> No matter what I try to run, it always segfaults. Any suggestions on what I
> can try to resolve this?
>
> Oh, I should also mention that I tried to remove the global libevent; openmpi
> configured its internal copy but then failed to build.
>
> Thanks,
> Andrej
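
For the debug rebuild and core inspection requested at the top of this message, a rough sketch of the steps might look like the following. It assumes a bash-like shell, that gdb is installed, and that the kernel writes the dump as a file named core in the current directory. That last point depends on kernel.core_pattern; on systemd-based systems the dump may instead have to be retrieved with coredumpctl. The variable and function names are made up for illustration. The configure options are copied from the quoted message with --enable-debug appended.

```shell
# Configure options copied from the quoted message, plus --enable-debug.
OMPI_CONFIGURE_ARGS="--prefix=/usr/local --with-pmix=internal --with-slurm \
--without-tm --without-moab --without-singularity --without-fca \
--without-hcoll --without-ime --without-lustre --without-psm --without-psm2 \
--without-mxm --with-gnu-ld --enable-debug"

# Hypothetical helper: rebuild Open MPI from its source tree with debug symbols.
ompi_rebuild_debug() {
    # Left unquoted on purpose so the options split into separate words.
    ./configure $OMPI_CONFIGURE_ARGS && make && sudo make install
}

# Hypothetical helper: reproduce the abort and print a backtrace from the core.
ompi_backtrace() {
    core_file="${1:-core}"   # assumed name; the real one depends on kernel.core_pattern
    ulimit -c unlimited      # allow core files of any size in this shell
    mpirun                   # expected to abort as in the report
    gdb -batch -ex "thread apply all bt" "$(command -v mpirun)" "$core_file"
}
```

Run ompi_rebuild_debug from the openmpi-4.1.0 source directory, then ompi_backtrace, optionally passing the path to the core file as its first argument. The resulting backtrace should show which library the corrupted malloc call comes from.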